

CAPTCHA :- Completely Automated Public Turing test to tell Computers and Humans Apart

Introduction :-

A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot.

A CAPTCHA is a type of challenge-response test used in computing to ensure that the response is not generated by a computer. A typical CAPTCHA requires the user to type letters or digits from a distorted image that appears on the screen. Any user who enters a correct solution is presumed to be human; otherwise the user is presumed to be a bot and is denied access. It is sometimes described as a reverse Turing test. CAPTCHAs are designed so that OCR (Optical Character Recognition) software cannot read them reliably.

Characteristics :-

A CAPTCHA is a means of automatically generating new challenges which:

• Current software is unable to solve accurately.
• Most humans can solve.
• Do not rely on the type of CAPTCHA being new to the attacker.

CAPTCHAs rely on difficult problems in artificial intelligence.

Origin :-

Primitive CAPTCHAs seem to have been first developed at AltaVista in 1997 by Andrei Broder, Martin Abadi, Krishna Bharat, and Mark Lillibridge to prevent bots from adding URLs to the search engine. The term was coined in 2000 by Luis von Ahn, Manuel Blum and Nicholas J. Hopper of Carnegie Mellon University and John Langford of IBM.

Turing Test :-

Proposed by Alan Turing to test a machine's level of intelligence. A human judge asks questions of two participants, one of which is a machine; the judge does not know which is which. If the judge cannot tell which is the machine, the machine passes the test.

CAPTCHA employs a reverse Turing test:

judge = CAPTCHA program
participant = user
If the user passes the CAPTCHA, the user is presumed to be human.
If the user fails, the user is presumed to be a machine.

Types of CAPTCHAs :-

1. Text Based CAPTCHAs

2. Graphics Based CAPTCHAs
3. Audio or Sound Based CAPTCHAs

1. Text Based CAPTCHAs :-

Typically rely on sophisticated distortion of text images, rendering them unrecognizable to state-of-the-art pattern recognition programs but recognizable to humans.


• Simple, normal language questions:

 What is the sum of three and thirty-five?
 If today is Saturday, what is the day after tomorrow?
 Very effective, but needs a large question bank
 Users with cognitive disabilities may find them hard
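
The two sample questions above can be produced from templates. The following is a minimal sketch, assuming a question is represented as a (question, answer) pair; the helper names `number_to_words`, `sum_question`, `day_question` and `random_question` are illustrative and do not come from any CAPTCHA library.

```python
import random
from typing import Tuple

# Illustrative word tables (hypothetical names, not from any CAPTCHA library).
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..99."""
    if not 0 <= n <= 99:
        raise ValueError("only 0..99 is supported in this sketch")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def sum_question(a: int, b: int) -> Tuple[str, str]:
    """Build an addition question with the numbers written as words."""
    question = f"What is the sum of {number_to_words(a)} and {number_to_words(b)}?"
    return question, str(a + b)


def day_question(today: str) -> Tuple[str, str]:
    """Build a weekday question; the answer is two days after `today`."""
    answer = DAYS[(DAYS.index(today) + 2) % 7]
    return f"If today is {today}, what is the day after tomorrow?", answer


def random_question(rng: random.Random) -> Tuple[str, str]:
    """Pick one of the two templates at random."""
    if rng.random() < 0.5:
        return sum_question(rng.randint(1, 49), rng.randint(1, 49))
    return day_question(rng.choice(DAYS))
```

Two templates are far from the large question bank the text calls for; a bot that learns both patterns can answer every question, so a real deployment would need many more question types.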

I. Gimpy:

• Originally designed by Yahoo and CMU.

• Based on human ability to read heavily distorted and corrupted text.
• Works by choosing a certain number of words from a dictionary and displaying them corrupted and distorted in an image; Gimpy then asks the user to type the words displayed in that image.
II. EZ-Gimpy:

• A modified version of Gimpy.

• Used in Yahoo Messenger Service.
• It contains only one random character string.
• The word is random and not picked from the dictionary.
• It is not a good implementation of CAPTCHA and has already been broken by OCR programs.

III. MSN Passport service CAPTCHAs:

• It is provided for Microsoft MSN services.

• Uses 8 characters.
• Warping is used to distort the characters.
• It is very strongly implemented and hasn't been broken.

2. Graphics Based CAPTCHAs

Require the user to perform an image recognition test. Examples:


• A CAPTCHA that requires two steps to be passed.

• In the first step, the visitor clicks somewhere on a picture composed of several images, thereby selecting a single image.
• In the second step, the selected image is loaded, enlarged but heavily distorted. Candidate answers are also loaded on the client side, and the visitor must select the correct answer from the set of proposed words.


• Bongo, named after M. M. Bongard, a pattern recognition expert.

• The user has to solve a pattern recognition problem.


• Asirra (Animal Species Image Recognition for Restricting Access).

• It is a HIP (Human Interactive Proof) that works by asking users to identify photographs of cats and dogs.
• Difficult for computers, but humans can accomplish it very quickly and accurately.

3. Audio or Sound Based CAPTCHAs

• Require the user to solve a speech recognition test.

• In this version of CAPTCHA, letters are read aloud instead of being displayed in an image.
• Helps visually impaired users.


• 3DCaptcha is the "captcha nice to humans, bad to machines".

• It is written in PHP.
• A new approach to CAPTCHAs, using humans' spatial cognition abilities to differentiate humans from machines.
• It uses a Markov chain to generate words that resemble human language and are easy to type, yet avoid dictionary lookups.
• It filters profane language.
• It is easy to deploy.
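
The Markov-chain word generation mentioned above can be sketched as follows. This is not 3DCaptcha's actual PHP code; it is an illustrative Python sketch that assumes a letter-level chain of order 2 trained on a small word list, with a `banned` set standing in for both the dictionary check and the profanity filter. The names `build_chain` and `generate_word` are hypothetical.

```python
import random
from collections import defaultdict


def build_chain(corpus, order=2):
    """Map each `order`-letter prefix to the letters observed after it.

    "^" marks the start of a word and "$" marks its end, so the corpus
    words themselves must not contain those two characters.
    """
    chain = defaultdict(list)
    for word in corpus:
        padded = "^" * order + word.lower() + "$"
        for i in range(len(padded) - order):
            chain[padded[i:i + order]].append(padded[i + order])
    return chain


def generate_word(chain, rng, order=2, min_len=5, max_len=8,
                  banned=frozenset(), max_tries=1000):
    """Random-walk the chain until an acceptable pseudo-word comes out."""
    for _ in range(max_tries):
        state = "^" * order
        letters = []
        while len(letters) < max_len:
            nxt = rng.choice(chain[state])
            if nxt == "$":          # the chain decided the word is finished
                break
            letters.append(nxt)
            state = state[1:] + nxt
        word = "".join(letters)
        # Reject words that are too short, real dictionary words, or blocked.
        if len(word) >= min_len and word not in banned:
            return word
    raise RuntimeError("could not generate an acceptable word")
```

A possible use, with a toy corpus chosen only for illustration:

```python
corpus = ["banana", "bandana", "cabana", "canada",
          "panama", "savanna", "nirvana", "montana"]
chain = build_chain(corpus)
word = generate_word(chain, random.Random(), banned=frozenset(corpus))
```

Passing the corpus itself as `banned` keeps the output from being a word an attacker could find in the same dictionary.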

Applications :-

1. Preventing Comment Spam in Blogs

2. Protecting Website Registration
3. Protecting Email Addresses From Scrapers
4. Online Polls
5. Preventing Dictionary Attacks
6. Search Engine Bots
7. Worms and Spam

Constructing CAPTCHAs :-

• Things to keep in mind:

 Don't store the CAPTCHA solution in the Web page's metadata

 A text CAPTCHA is no good if it doesn't distort the text
 Need a large database of different CAPTCHA questions
 Avoid repetition of questions

• CAPTCHA Logic:

 Generate the question

 Persist the correct answer
 Present the question to user
 Evaluate the answer; if incorrect, start again and generate a different CAPTCHA
 If correct, allow access to user
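
The steps above can be sketched as a small server-side service. This is a minimal illustration, not a production design: it assumes an in-memory dictionary in place of real session storage, a question bank of (question, answer) string pairs, and a hypothetical class name `CaptchaService`.

```python
import hmac
import secrets


class CaptchaService:
    """Issue CAPTCHA questions and check answers, keeping answers server-side."""

    def __init__(self, question_bank):
        # question_bank: iterable of (question, answer) string pairs (assumed format)
        self._bank = list(question_bank)
        self._pending = {}  # token -> expected answer; never sent to the client

    def issue(self):
        """Generate a question, persist its answer, return what the user sees."""
        question, answer = secrets.choice(self._bank)
        token = secrets.token_urlsafe(16)
        self._pending[token] = answer.strip().lower()
        return token, question

    def verify(self, token, response):
        """Evaluate the answer. Each token works once, pass or fail."""
        expected = self._pending.pop(token, None)
        if expected is None:
            return False
        given = response.strip().lower()
        # Constant-time comparison on bytes avoids leaking timing information.
        return hmac.compare_digest(given.encode("utf-8"), expected.encode("utf-8"))
```

Because `verify` removes the token whether or not the answer is right, a failed attempt forces the caller to call `issue` again, which matches the "generate a different CAPTCHA" step. Only the token and the question reach the page, so the solution is never stored in the page itself.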

• Guidelines:

 Accessibility
 Image security
 Script security
 Security after widespread adoption
 Custom implementation or a general CAPTCHA?

Breaking CAPTCHAs :-

• Cracking CAPTCHAs through programs

 Convert CAPTCHA into greyscale

 Detect patterns in the image corresponding to characters
 Or, read the session files of that user to learn the CAPTCHA word

Solution: store only a hash of the CAPTCHA word in session files
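
A minimal sketch of this countermeasure follows. It goes one step beyond the text by using a keyed hash (HMAC-SHA256) with a server-side secret, an assumption made here because a plain hash of a short word can be reversed by trying every candidate. `SERVER_KEY`, `captcha_digest`, `store_challenge` and `check_response` are hypothetical names, and a plain dictionary stands in for the session file.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in practice it would be loaded from
# configuration and kept out of the session store.
SERVER_KEY = secrets.token_bytes(32)


def captcha_digest(word: str, key: bytes = SERVER_KEY) -> str:
    """Return a keyed hash of the normalized CAPTCHA word."""
    normalized = word.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def store_challenge(session: dict, word: str) -> None:
    """Persist only the digest, never the word itself."""
    session["captcha_digest"] = captcha_digest(word)


def check_response(session: dict, response: str) -> bool:
    """Hash the user's response and compare it with the stored digest."""
    expected = session.pop("captcha_digest", None)  # single use
    if expected is None:
        return False
    return hmac.compare_digest(captcha_digest(response), expected)
```

With this layout, someone who reads the session data sees only a hex digest and cannot recover the word without the server key.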

Issues with CAPTCHAs :-

• Usability issues:
 W3C guidelines call for the Web to be accessible to all people
 Some CAPTCHAs are inaccessible to visually impaired or cognitively challenged people

• Compatibility issues:
 JavaScript may need to be activated in browsers
 Some may need Adobe Flash plugin installed


Conclusion :-

• CAPTCHAs are an effective way to counter bots and reduce spam

• They serve a dual purpose: they also help advance AI knowledge
• Applications are varied, from stopping bots to character recognition and pattern recognition
• Some issues with current implementations remain challenges for the future