
Spam blog

Link spam (also called blog spam or comment spam) is a form of spamming or spamdexing that became widely publicized when it began targeting weblogs (blogs), but it also affects wikis (where it is often called wikispam), guestbooks, and online discussion boards. Any web application that displays hyperlinks submitted by visitors, or the referring URLs of its visitors, may be a target.


A spamblog is an automated weblog that exists solely to send spam (particularly comment or trackback spam) or to facilitate the sending of such spam.


Adding links that point to the spammer's web site raises that site's ranking in the Google search engine. A higher page rank means the spammer's commercial site is listed ahead of other sites for certain Google searches, increasing the number of potential visitors and paying customers.


History

Link spamming originally appeared in internet guestbooks, where spammers repeatedly filled a guestbook with links to their own site, with no relevant comment, in order to increase search engine rankings. If an actual comment is given, it is often just "cool page", "nice website", or keywords of the spammed link.


In 2003, spammers began to take advantage of the open nature of comments in blogging software such as Movable Type by repeatedly posting comments on various blog posts that provided nothing more than a link to the spammer's commercial web site. Jay Allen created a free plugin for Movable Type, called MT-BlackList, that attempts to alleviate this problem. Many blog software packages now have methods of preventing or reducing the effect of blog spam.


Migration to wikis

Because of improved prevention in blog software, link spam is now increasingly concentrated on wikis around the World Wide Web, including Wikipedia, the largest wiki on the Internet (see [1]). Wiki spam sometimes appears only on a wiki's sandbox page, but is often found defacing multiple pages. The website chongqed.org lists URLs and IP addresses of offending wiki spammers.


Possible solutions

Instead of displaying a direct hyperlink submitted by a visitor, a web application could display a link to a script on its own website that redirects to the correct URL. This does not stop spam from being posted, since spammers do not always check whether links are redirected, but it has proven very effective at making the spam worthless: redirected links prevent Google from factoring the link into its PageRank calculation for the target site. An added benefit is that the redirection script can count how many people visit external URLs, although it will increase the load on the site.
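The redirect approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the script path /redirect, the query parameter name url, and the function names are all hypothetical, and the sketch only computes the response rather than running inside a real web server.

```javascript
// Hypothetical path of the site's own redirect script (an assumption).
const REDIRECT_PATH = "/redirect";

// Visits per external URL, the counting side benefit mentioned in the text.
const visitCounts = new Map();

// Build the link shown on the page: it points at the local redirect
// script and carries the visitor-submitted URL as a query parameter.
function toRedirectLink(submittedUrl) {
  return REDIRECT_PATH + "?url=" + encodeURIComponent(submittedUrl);
}

// Resolve a request to the redirect script. Returns the status and
// Location value to send, or status 400 when the target is missing
// or is not a plain http(s) URL.
function resolveRedirect(requestPathAndQuery) {
  // The base is only needed so the relative path can be parsed.
  const parsed = new URL(requestPathAndQuery, "http://localhost");
  const target = parsed.searchParams.get("url");
  let targetUrl;
  try {
    targetUrl = new URL(target);
  } catch (err) {
    return { status: 400 };
  }
  if (targetUrl.protocol !== "http:" && targetUrl.protocol !== "https:") {
    return { status: 400 };
  }
  visitCounts.set(target, (visitCounts.get(target) || 0) + 1);
  return { status: 302, location: target };
}
```

A page would then print toRedirectLink(url) in place of the submitted URL, and the server would answer requests to that path with the result of resolveRedirect. Rejecting non-http(s) targets is a design choice of this sketch, not something the text above prescribes.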


Another option is for the script to be client-side JavaScript. For example,

 <a href="javascript:window.location.href='http://www.wiki.org'">Link</a> 

would work as a link but not be picked up by Google. Moreover, the JavaScript could be made more complicated, with the URL encoded, to ensure that the link would never be picked up. For example,

 <a href="javascript:redirectFunction('hfksksgjlsll')">Link</a> 

where 'hfksksgjlsll' is an encoded URL that is decoded by the JavaScript function redirectFunction, which presumably is defined in the HEAD element of the page. A downside of this is that visitors who have disabled JavaScript in their browser would be unable to follow the links.
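The text does not say how the URL is encoded, so the following is only a minimal sketch of what redirectFunction might look like, assuming ROT13 as a stand-in encoding (chosen here purely for illustration). With this encoding the link would read href="javascript:redirectFunction('uggc://jjj.jvxv.bet')".

```javascript
// ROT13 is a stand-in encoding chosen for this sketch (an assumption).
// It is its own inverse, so the same function encodes and decodes.
function rot13(text) {
  return text.replace(/[a-zA-Z]/g, (ch) => {
    const base = ch <= "Z" ? 65 : 97; // char code of "A" or "a"
    return String.fromCharCode(((ch.charCodeAt(0) - base + 13) % 26) + base);
  });
}

// Decode the argument and, when running in a browser, navigate to it.
// Outside a browser (no `window`), the decoded URL is only returned.
function redirectFunction(encodedUrl) {
  const url = rot13(encodedUrl);
  if (typeof window !== "undefined") {
    window.location.href = url;
  }
  return url;
}
```

Any reversible encoding would serve the same purpose; the point is only that the target URL never appears in the page source in plain form.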


This kind of redirection can also be done via the .htaccess file in Apache, thus saving the load of a script.


nofollow

In early 2005 Google introduced an HTML attribute value, rel="nofollow", that disables the assignment of ranking credit for a particular link. This is a much easier solution that makes the improvised techniques above irrelevant. Most weblog software now enables it by default (with no option to disable it short of modifying the code), adding the nofollow attribute to reader-submitted links:

 <a href="http://www.wiki.org/" rel="nofollow">Link</a> 

However, some weblog authors object to using the attribute, due to concerns over the motives for its introduction (the large amount of inter-linking between blogs makes search engine algorithms less accurate) and over its effectiveness, since a spambot does not know whether its target is using 'nofollow' or not.
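As a rough illustration of how weblog software might add the attribute when rendering a reader-submitted link, consider the sketch below. The function names are hypothetical and do not come from any particular blogging tool.

```javascript
// Escape characters that are special inside HTML text and attributes.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a reader-submitted link with rel="nofollow" always present.
// Note: this sketch escapes the URL but does not validate its scheme.
function renderUserLink(url, label) {
  return '<a href="' + escapeHtml(url) + '" rel="nofollow">' + escapeHtml(label) + "</a>";
}
```

Calling renderUserLink("http://www.wiki.org/", "Link") produces the markup shown in the example above.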


Turing tests

Various methods that force spammers to post by hand have been attempted. A variety of captcha gateways have been implemented in an effort to prevent bots from submitting entries. Drawbacks are the annoyance to regular users, the lack of any alternative for visually impaired users, and the ability of some advanced bots to fool simple captchas most of the time.
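The server-side bookkeeping behind such a gateway might look like the sketch below. It covers only issuing and checking a code; rendering the code as a distorted image, which is what actually makes it hard for bots, is left out. The alphabet, the code length, and the function names are assumptions, and Math.random is not a cryptographically secure source.

```javascript
// Letters used for codes; I and O are left out to avoid confusion
// with 1 and 0 (a choice made for this sketch).
const LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ";

// Create a challenge holding a random code of the given length.
// `random` is injectable so the function can be tested.
function createChallenge(random = Math.random, length = 5) {
  let code = "";
  for (let i = 0; i < length; i++) {
    code += LETTERS[Math.floor(random() * LETTERS.length)];
  }
  return { code, used: false };
}

// Check an answer. Each challenge can be attempted only once, so a
// bot cannot keep guessing against the same code.
function verifyChallenge(challenge, answer) {
  if (challenge.used) {
    return false;
  }
  challenge.used = true;
  return answer.trim().toUpperCase() === challenge.code;
}
```

A submission form would store the challenge server-side, show the code to the visitor in a form bots cannot easily read, and accept the entry only if verifyChallenge returns true.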


Specific anti-spam methods

Particularly popular software products such as Movable Type and MediaWiki have developed their own custom anti-spam measures, as spammers focus more attention on targeting those platforms. Whitelists and blacklists that control which IP addresses may post, or that block content matching certain filters, are common defenses. More advanced access-control lists require various forms of validation before users can contribute anything like linkspam.
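A whitelist and blacklist check of this kind could be sketched as follows. This is not how Movable Type or MediaWiki actually implement their filters; the list entries are made-up examples using reserved documentation addresses, and the names are hypothetical.

```javascript
// All entries below are made-up examples, not real offenders.
const filterConfig = {
  ipWhitelist: new Set(["192.0.2.10"]),
  ipBlacklist: new Set(["203.0.113.7"]),
  contentBlacklist: [/cheap-pills\.example/i, /casino/i],
};

// Decide whether a submission may be posted. Whitelisted IPs skip
// all other checks; blacklisted IPs and matching content are refused.
function checkSubmission(ip, content, config = filterConfig) {
  if (config.ipWhitelist.has(ip)) {
    return { allowed: true, reason: "whitelisted IP" };
  }
  if (config.ipBlacklist.has(ip)) {
    return { allowed: false, reason: "blacklisted IP" };
  }
  const hit = config.contentBlacklist.find((pattern) => pattern.test(content));
  if (hit) {
    return { allowed: false, reason: "content matches " + hit };
  }
  return { allowed: true, reason: "no filter matched" };
}
```

Letting the whitelist take precedence over the content filters is a design choice of this sketch; a stricter setup could apply the content filters to everyone.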


The goal in every case is to allow good users to continue to add links to their comments, as that is considered by some to be a valuable aspect of any comments section.


The Wikipedia article included on this page is licensed under the GFDL.