Spam is the posting of advertisements, abusive messages, or otherwise unwanted messages on Internet forums. It is generally posted by automated spambots.
Spambots
Spambots are automated programs designed to register on forums, disseminate spam, and leave. They usually supply a fake name and a free email address, and sometimes mask their true IP address. The spammer sets the message that the spambot will post. Most spambots target one specific forum software package or hosting company. Spambots are easy to identify by the nature of the message they leave or by the links in their signature. A typical post contains no topical content and is accompanied by spam links, either in the post itself or in the user's signature. Some spambots never post, and rely on the links in their signature to increase their search engine visibility. Looking up a spambot's user name with a search engine will often reveal thousands of registrations on unrelated forums.
An example of a spambot that has gained some notoriety since November 2006 is XRumer, a Windows program that posts link spam on forums, blogs, and wikis in order to boost search engine rankings. XRumer attempts to bypass the anti-spam mechanisms put in place by forum administrators, with some success. It uses a database of known HTTP proxies to mask the IP address of the poster, which makes naive IP-banning mechanisms ineffective.
Types of spam
Most spambot forum spam consists of links, with the dual goals of increasing search engine visibility in highly competitive areas such as weight loss, pharmaceuticals, gambling, pornography, real estate, or loans, and of generating more traffic for these commercial websites. When the spammer behind the spambot works on commission, some of these links contain code that identifies the spambot if a sale goes through.
Spam posts may contain anything from a single link to dozens of links. Text content is minimal, usually innocuous, and unrelated to the forum's topic. A more recent type of forum spam consists of full banner ads and blatant advertisements unrelated to the thread's topic. Such posts are disruptive: they distract from the thread creator's topic and interrupt the flow of discussion. They also consume bandwidth, server storage, and the time of whoever has to delete them, especially when the poster goes from thread to thread posting the same message with no regard for the topic or the rules of the forums concerned. Alternatively, the spam links are placed in the user's signature, in which case the spambot never needs to post. The link sits quietly in the signature field, where it is more likely to be harvested by search engine spiders than discovered by forum administrators and moderators.
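The pattern described above (many links, little surrounding text) can be turned into a rough screening heuristic. The following Python sketch is illustrative only: the thresholds `MAX_LINKS` and `MIN_WORDS_PER_LINK` and the function names are assumptions, not values taken from any forum package, and a real forum would need to tune them and combine them with other signals. The same check could be applied to a signature field as well as to a post body.

```python
import re

# Hypothetical thresholds, chosen only for illustration.
MAX_LINKS = 3             # more links than this is treated as suspicious
MIN_WORDS_PER_LINK = 10   # fewer non-link words per link is treated as suspicious

# Simple pattern for http/https URLs; it does not try to cover every URL form.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def link_stats(text):
    """Return (number of links, number of non-link words) for a piece of text."""
    links = URL_PATTERN.findall(text)
    words = URL_PATTERN.sub(" ", text).split()
    return len(links), len(words)


def looks_like_link_spam(text):
    """Flag text that is mostly links with little accompanying content."""
    n_links, n_words = link_stats(text)
    if n_links == 0:
        return False
    if n_links > MAX_LINKS:
        return True
    return n_words / n_links < MIN_WORDS_PER_LINK
```

A heuristic of this kind only flags candidates for review. Legitimate posts that consist of a single link and a few words would also be flagged, so it is better suited to queuing posts for a moderator than to automatic deletion.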
Effects of spam
Spam prevention and deletion measurably increase the workload of forum administrators and moderators. The time and resources spent keeping a forum spam-free contribute significantly to labour costs and to the skill required to run a public forum. Marginally profitable or smaller forums may be permanently closed by their administrators as a result. Forums that do not require registration are becoming rare.
Spam prevention
- Flood control: This forces users to wait for a short interval between making posts to the forum, thus preventing spambots from flooding the forum with repeated spam messages.
- Registration control: Some forums employ CAPTCHA (visual confirmation) routines on their registration pages to prevent spambots from carrying out automated registrations. Simple CAPTCHA systems that display plain alphanumeric characters have proven vulnerable to optical character recognition software, but those that scramble the characters appear to be far more effective.
- Posting limits: Limit posting to registered users and/or require that the user pass a CAPTCHA test before posting.
- Registration restrictions: Applying careful restrictions can seriously reduce bogus and spambot registrations. One approach is to deny registration from certain domain extensions that are a major source of spambots, such as .ru, .br, or .biz, or from free email providers such as "gawab.com". Another, more labor-intensive approach is manual examination of new registrants, which looks at several indicators. First, spambots often delay email confirmation by several hours, while humans tend to confirm promptly. Second, spambots tend to create user names that are unique and unlikely to be in use on the forum already, preferring "John84731" or "JohnbassKeepsie" to the much more common "John". Third, a search engine query for the login name finds hundreds, if not thousands, of profiles using the same name, sometimes accompanied by the telltale spam post or a "banned" label.
- Software changes: Changing technical details of the forum software can confuse bots, for example changing "agreed=true" to "mode=agreed" in the registration page of phpBB.
- Block posts or registrations that contain certain blacklisted words.
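- Nofollow: Sites such as Wikipedia mark user-submitted links with nofollow, an HTML attribute value that instructs search engines not to let a hyperlink influence the link target's ranking. This removes the search engine benefit that spammers are seeking.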
- IP reputation: Be wary of IP addresses used by untrusted posters (anonymous posters or newly registered users). A useful technique for proactive detection of well-known spammer proxies is to count the number of Google hits for the IP address. If a phrase search on the address returns a large number of hits, the post should be rejected: more than 100,000 hits would suggest that the address has posted to some 100,000 forums, which is implausible for a human. If the poster is a freshly registered user rather than anonymous, the account should probably be banned as well.
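Three of the measures listed above (flood control, registration restrictions by email domain, and word blacklists) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the mechanism of any particular forum package: the class and function names, the 30-second interval, and the blacklisted words are hypothetical, while the domain extensions and "gawab.com" are the examples given in the list.

```python
import re
import time

FLOOD_INTERVAL_SECONDS = 30              # hypothetical minimum gap between posts
BLOCKED_EXTENSIONS = (".ru", ".br", ".biz")  # examples named in the list above
BLOCKED_EMAIL_DOMAINS = {"gawab.com"}        # example named in the list above
BLACKLISTED_WORDS = {"casino", "pills"}      # hypothetical placeholder entries


class FloodControl:
    """Reject a post if the same user posted less than `interval` seconds ago."""

    def __init__(self, interval=FLOOD_INTERVAL_SECONDS, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the logic can be tested
        self._last_post = {}        # user id -> time of last accepted post

    def allow_post(self, user_id):
        now = self.clock()
        last = self._last_post.get(user_id)
        if last is not None and now - last < self.interval:
            return False            # rejected attempts do not reset the timer
        self._last_post[user_id] = now
        return True


def registration_allowed(email):
    """Deny registration from blocked extensions or blocked email domains."""
    if "@" not in email:
        return False
    domain = email.rsplit("@", 1)[1].lower()
    if domain in BLOCKED_EMAIL_DOMAINS:
        return False
    return not domain.endswith(BLOCKED_EXTENSIONS)


def contains_blacklisted_word(text):
    """Return True if any whole word of the text is on the blacklist."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return any(word in BLACKLISTED_WORDS for word in words)
```

Blocking whole domain extensions also turns away legitimate users from those domains, so a forum might prefer to route such registrations to manual examination rather than reject them outright.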
See also
- Signal-to-noise ratio