
SUBMITTED TO: Ms. Kavita Khanna, H.O.D (CSE Department), PDM, Bahadurgarh
SUBMITTED BY: Shruty Ahuja, 02/MT/10, CE (2nd Sem)





2 What are CAPTCHAs?
- CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart.
- Web-based protection mechanisms
  - Only humans are allowed to perform certain tasks: opening e-mail accounts, voting online, etc.
- Prevent automated attacks by bots
  - To avoid eating up resources
  - To avoid biasing results, etc.
- Most current systems are text-based.
[Image: text-based CAPTCHAs]

3 Why did the need for CAPTCHAs arise?
- Preventing comment spam in blogs
- Protecting website registration
- Protecting e-mail addresses from scrapers
- Worms and spam
- Search engine bots
- Preventing dictionary attacks
- Online polls

4 Background
- First used by AltaVista in 1997
  - Reduced spam submissions to its add-URL form by over 95%
- CMU/Yahoo!
  - Automated the creation and grading of challenges
- PARC
  - Relies on document image degradation to prevent successful OCR
  - Conducted user-focused studies to assess the effectiveness of CAPTCHAs

5 Background - Papers
- Pessimal Print: A Reverse Turing Test. Allison L. Coates, Henry S. Baird, Richard J. Fateman
- Telling Humans and Computers Apart Automatically. Luis von Ahn, Manuel Blum, and John Langford
- CAPTCHA: Using Hard AI Problems for Security. Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John Langford
- Using Machine Learning to Break Visual Human Interaction Proofs (HIPs). Kumar Chellapilla, Patrice Y. Simard

6 Types of CAPTCHAs
- Text based
  - Gimpy, EZ-Gimpy
  - Gimpy-r, Google CAPTCHA
  - Simard's HIP (MSN)
- Graphic based
  - Bongo
  - Pix
- Audio based

7 Text Based CAPTCHAs
- Gimpy, EZ-Gimpy
  - Pick a word or words from a small dictionary
  - Distort them and add noise and background
- Gimpy-r, Google's CAPTCHA
  - Pick random letters
  - Distort them, add noise and background
- Simard's HIP
  - Pick random letters and numbers
  - Distort them and add arcs
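The three ways of choosing the challenge text can be sketched as follows. This is only an illustration of the text-selection step: the word list, the lengths, and all function names are assumptions, and the distortion, noise, background, and arc rendering named on the slide are not shown.

```python
import random
import string

# Hypothetical word list standing in for the "small dictionary" on the slide.
SMALL_DICTIONARY = ["apple", "river", "stone", "cloud", "tiger"]


def gimpy_style_text(rng, word_count=1):
    """Pick one or more words from a small dictionary (Gimpy, EZ-Gimpy)."""
    return " ".join(rng.choice(SMALL_DICTIONARY) for _ in range(word_count))


def random_letters_text(rng, length=6):
    """Pick random letters (Gimpy-r, Google's CAPTCHA). Length is an assumption."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def hip_style_text(rng, length=8):
    """Pick random letters and digits (Simard's HIP). Length is an assumption."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def check_answer(expected, typed):
    """Compare the typed answer with the challenge, ignoring case and outer spaces."""
    return expected.strip().lower() == typed.strip().lower()
```

A dictionary-based challenge has only as many possible answers as the dictionary has words, whereas random strings grow with the alphabet size raised to the string length.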

8 Text Based CAPTCHAs

9 ISSUES OF TEXT-BASED CAPTCHAS

10 Audio CAPTCHA
- In an audio CAPTCHA, the challenge text is typically synthesized as speech and mixed with background noise, such as music.
- Audio CAPTCHAs were initially created so that visually impaired people could register for or use a service that requires solving a CAPTCHA.
- They are also used to restrict spam over Internet telephony.

11 Spam over IP Telephony: SPIT
- Fear of SPIT
  - With VoIP, the cost per call initiation will drop dramatically.
  - Very low costs are the main reason why e-mail spam is proliferating in the Internet age.
  - It is therefore reasonable to assume that SPIT will become a problem once VoIP is massively deployed.
- SPIT is much more obtrusive than e-mail spam
  - E-mails get "pulled" from a server by the user; VoIP calls are "pushed" to the user.
  - Your telephone might ring in the middle of the night.
- Most successful approaches against spam from the e-mail world will probably not work
  - Content filtering needs to be done in real time.

12 Elements of Audio CAPTCHAs
There are three elements in audio CAPTCHAs:
1) Vocabulary
2) Background noise
3) Audio production

13 THE PROBLEM WITH CURRENT AUDIO CAPTCHAS
- In some cases the human passing rate is only 70%!
- To make the CAPTCHAs secure, noise was injected into the audio files, making them harder for both computers and humans to pass.
- A CAPTCHA is considered broken once a program can pass it 5% of the time.
- Since the current audio CAPTCHAs use a limited vocabulary, it was possible for us to collect enough data to train a system that could pass the current audio CAPTCHAs more than 45% of the time.
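To put these percentages side by side, here is a small sketch of the arithmetic. The 5% threshold comes from the slide; the function names are illustrative, and treating attempts as independent with a fixed pass rate is an assumption made only for this example.

```python
# From the slide: a CAPTCHA counts as broken once a program passes 5% of the time.
BROKEN_THRESHOLD = 0.05


def is_broken(solver_pass_rate, threshold=BROKEN_THRESHOLD):
    """True when an automated solver meets or exceeds the break threshold."""
    return solver_pass_rate >= threshold


def expected_attempts(pass_rate):
    """Mean number of tries until the first pass.

    Assumes independent attempts with a constant pass rate
    (geometric distribution), which is a simplification.
    """
    if not 0 < pass_rate <= 1:
        raise ValueError("pass_rate must be in (0, 1]")
    return 1.0 / pass_rate
```

Under that assumption, a solver at the 45% rate quoted above needs a little over two attempts on average, while a human at a 70% passing rate needs about 1.4, so the gap between the two is small.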

14 HOW DID WE TEST THE CURRENT AUDIO CAPTCHAs?
- Selected three different types of audio CAPTCHAs: Google, reCAPTCHA, and Digg
- Collected 1000 CAPTCHAs per type of audio CAPTCHA to use for training and testing
- Created an automatic speech recognition (ASR) system using machine learning techniques

15 THE ALGORITHM
- Input: audio CAPTCHA as an audio file
- Segmentation
  - Find the highest energy peak, and extract a fixed-size segment centered at that peak
- Recognition
  - Extract features from the segment
  - Give the segment to the classifier and obtain a label
- Stop extracting segments once all segments have been labeled or a maximum solution size is reached.
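The steps above can be sketched in plain Python. This is a minimal illustration and not the presenters' implementation: the window size, segment length, energy floor, the toy feature extractor, and every function name are assumptions, and the classifier is passed in as a callable.

```python
def window_energies(samples, window):
    """Sum of squared samples for each non-overlapping window."""
    return [
        sum(x * x for x in samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]


def toy_features(segment):
    """Placeholder features (peak amplitude, total energy); an assumption."""
    return (max(abs(x) for x in segment), sum(x * x for x in segment))


def solve_audio_captcha(samples, classify, extract_features=toy_features,
                        window=4, segment_windows=3,
                        min_energy=1.0, max_solution_size=8):
    """Label fixed-size segments centered on successive energy peaks."""
    energies = window_energies(samples, window)
    half = segment_windows // 2
    found = []  # (first window index of the segment, label)
    while energies and len(found) < max_solution_size:
        # Segmentation: locate the highest remaining energy peak.
        peak = max(range(len(energies)), key=energies.__getitem__)
        if energies[peak] < min_energy:
            break  # nothing left that looks like speech
        lo = max(0, peak - half)
        hi = min(len(energies), peak + half + 1)
        segment = samples[lo * window:hi * window]
        # Recognition: extract features, then ask the classifier for a label.
        found.append((lo, classify(extract_features(segment))))
        for i in range(lo, hi):
            energies[i] = 0.0  # mask the segment so it is not picked again
    found.sort()  # report labels in the order they occur in the audio
    return [label for _, label in found]
```

Peaks are found loudest-first, so the labels are re-sorted by position before being returned as the proposed solution.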

16 ANALYSIS OF CURRENT AUDIO CAPTCHAs
Three machine learning techniques were used to perform ASR on the CAPTCHAs:
- AdaBoost
- Support Vector Machines (SVM)
- k-Nearest Neighbor (k-NN)
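Of the three techniques, k-NN is the simplest to show in a few lines. The sketch below is a generic k-nearest-neighbor classifier over feature vectors, using Euclidean distance and majority vote; the feature layout, the value of k, and the function name are assumptions, and AdaBoost and SVM are not shown.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs; all vectors must
    have the same length as query.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In the audio setting, each training pair would be the features of one labeled segment, and the query would be the features of a segment cut from a new CAPTCHA.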

17 THE GOAL
- Make a secure audio CAPTCHA that is easier for a human to pass and harder for a computer to pass.
- Equate solving a CAPTCHA with doing some useful work.
- In other words, create an audio reCAPTCHA.

18 WHAT IS reCAPTCHA? reCAPTCHA helps digitize text on which OCR fails by using the text as its CAPTCHA. Since millions of people solve CAPTCHAs each day, millions of words get digitized each day!

19

20 THE AUDIO RECAPTCHA Takes advantage of the human ability to understand words through context. Will help transcribe digital audio on which ASR systems fail. The audio being used was originally recorded with the intention that it should be easily understood by humans.

21 Graphic Based CAPTCHAs: Bongo
- Display two series of blocks
- The user must find the characteristic that sets the two series apart
- The user is then asked to determine which series each of four single blocks belongs to
- Example difference: thick vs. thin lines

22 Graphic Based CAPTCHAs: PIX
- Create a large database of labeled images
- Pick a concrete object
- Pick four images of the object from the image database
- Distort the images
- Ask the user to pick the object from a list of words
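The PIX steps can be sketched as follows. The tiny in-memory database, the file names, and the function names are all hypothetical, and the image distortion step is left out.

```python
import random

# Hypothetical labeled image database: object label -> image file names.
IMAGE_DB = {
    "tiger": ["tiger1.jpg", "tiger2.jpg", "tiger3.jpg", "tiger4.jpg", "tiger5.jpg"],
    "boat": ["boat1.jpg", "boat2.jpg", "boat3.jpg", "boat4.jpg", "boat5.jpg"],
    "flower": ["flower1.jpg", "flower2.jpg", "flower3.jpg", "flower4.jpg", "flower5.jpg"],
}


def make_pix_challenge(rng, db, images_per_challenge=4):
    """Pick an object, then four of its images, plus the word list to show."""
    label = rng.choice(sorted(db))                        # pick a concrete object
    images = rng.sample(db[label], images_per_challenge)  # pick four images of it
    return {"answer": label, "images": images, "choices": sorted(db)}


def grade(challenge, picked_word):
    """The user passes by picking the object's word from the list."""
    return picked_word == challenge["answer"]
```

The server would keep `answer` private and send only the images and the word list to the client.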

23 Why image-based CAPTCHAs?
- Computer vision techniques have broken text-based CAPTCHAs
  - Text-based CAPTCHAs also suffer from confusing characters
- Possible solutions
  - More noise, which is harder for humans too
  - Natural-image-based CAPTCHAs
- Image-based CAPTCHAs
  - Present an image to the user
  - The user labels the content
- Hard to attack
  - Image recognition is a hard problem
  - Hence more secure CAPTCHAs

24 The IMAGINATION System
- IMAGINATION: Image Generation for Internet Authentication.
- Exploits the difference between human perception and the current level of machine perception.
- Generates a CAPTCHA based on a hard AI problem.
- Breaking IMAGINATION, though highly unlikely, would in turn advance the state of the art in AI.
- Uses a two-phase click-and-annotate process to achieve a very low chance of attack.
  - Click phase: select the center of an image
  - Annotate phase: select the best label from a list
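The two-phase check can be sketched as below. This illustrates the idea only and is not the IMAGINATION system's actual geometry or scoring: the tile layout, the pixel tolerance, and the function names are assumptions.

```python
import math


def tile_near_click(click, tiles, tolerance):
    """Click phase: return the tile whose center lies within tolerance of the click.

    tiles is a list of (x, y, width, height, label) rectangles that make
    up the composite image; returns None when no center is close enough.
    """
    for tile in tiles:
        x, y, w, h, _label = tile
        cx, cy = x + w / 2, y + h / 2
        if math.hypot(click[0] - cx, click[1] - cy) <= tolerance:
            return tile
    return None


def passes(click, chosen_label, tiles, tolerance=10.0):
    """Annotate phase: the chosen label must match the clicked tile's label."""
    tile = tile_near_click(click, tiles, tolerance)
    return tile is not None and tile[4] == chosen_label
```

A bot has to succeed at both phases, so its overall chance of passing is roughly the product of its chances at each phase.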

25 Composite Image Generation Composite image generation by re-partitioning and dithering using different randomly chosen base colors

26 Composite Distortion Selection
- Enforce probabilistic constraints on what counts as a good distortion
- Make some realistic assumptions
- Generate many distortions
- Choose a subset that satisfies these constraints
- Include that subset in the IMAGINATION system
[Image: a tiger image distorted by four acceptable composite distortions]

27 Composite Distortions: Probabilistic Constraints
An image distortion is considered acceptable if, probabilistically, potential attack algorithms are unable to significantly reduce the uncertainty associated with the labeling of the distorted images.

28 Benefits of IMAGINATION
- Likely to be more robust against attacks
- Promise of a more secure Internet
- Web servers become more reliable
- Has great potential for commercialization

29 Captcha Creator
- Captcha Creator is an easy-to-use PHP script that generates strong CAPTCHAs, which have not been broken yet. The script is updated very often.
- It is easy to install on any website with PHP support, and can be used to stop web form submissions made by spam bots on guestbooks, blogs, wikis, comment sections, feedback forms, etc.
- The online Captcha Customization Tool lets you select which letters and numbers are used, the face, size, and color of the font, the background image, the noise, and more.
- After uploading the script to your website, you can add it to an existing web form in a few easy steps.

30 Benefits
- The database already exists and is public.
- The database is constantly being updated and maintained.
- Distortion prevents caching hacks.
- Quick expiration limits streaming hacks.
- Works even with servers not configured to generate images or sound.
- The server sends an encrypted OTP to the service, which sends the image to the client.
- Code is easy to embed.
- Saves bandwidth and processor time.
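One way to picture the quick-expiration point is a challenge token that carries its issue time and stops being accepted after a short window. The sketch below is an assumption for illustration only: it signs the token with an HMAC rather than encrypting an OTP as the slide describes, and the secret, the 60-second lifetime, and the function names are all hypothetical.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical server-side key, for illustration only


def issue_token(challenge_id, now, secret=SECRET):
    """Bind a challenge id to its issue time and sign the pair."""
    message = f"{challenge_id}:{int(now)}"
    signature = hmac.new(secret, message.encode(), hashlib.sha256).hexdigest()
    return f"{message}:{signature}"


def token_is_valid(token, now, max_age=60, secret=SECRET):
    """Accept a token only if it is untampered and not older than max_age seconds."""
    try:
        challenge_id, issued, signature = token.rsplit(":", 2)
        issued_at = int(issued)
    except ValueError:
        return False  # malformed token
    message = f"{challenge_id}:{issued}"
    expected = hmac.new(secret, message.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False  # signature mismatch: token was altered
    return 0 <= now - issued_at <= max_age
```

With a short lifetime, a challenge that is relayed or replayed later is rejected even if the answer attached to it is correct.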

31 Drawbacks
- Not accessible to people with disabilities (which is the case for most CAPTCHAs).
- Relies on Google's infrastructure.
- Unlike CAPTCHAs using random letters and numbers, the number of challenge words is limited.
- People have written bots that use OCR (Optical Character Recognition) to handle these tests.
- A CAPTCHA is only one layer of protection against spam bots. You should consider using the other protections available for the latest release of Geeklog: the Bad Behavior plugin, Dirk's SLV Spam-X class, and trackback validation.
- When embedded in web pages, audio CAPTCHAs can also cause compatibility issues.

32 CONCLUSION
- Sites with attractive resources and millions of users will always need access control systems that limit widespread abuse. At that scale, it is reasonable to employ many concurrent approaches, including audio and visual CAPTCHAs.
- We need to refine our understanding of the design of usable and secure CAPTCHAs, for which current collective knowledge is limited.
- A lot more can be explored for sound-based and image-based CAPTCHAs.
- The design of CAPTCHAs is still an art rather than a science.

33 References
- en.wikipedia.org/wiki/CAPTCHA
- www.captcha.net
- http://cups.cs.cmu.edu/soups/2008/proceedings/p44Yan.pdf
- http://cseweb.ucsd.edu/~savage/papers/UsenixSec10.pdf
- http://www.cs.sfu.ca/~mori/research/papers/mori_cvpr03.pdf
- http://www.richgossweiler.com/projects/rotcaptcha/rotcaptcha.pdf
- http://webinsight.cs.washington.edu/papers/captchachi.pdf
- wang.ist.psu.edu/imagination/imagination.ppt
- www.cse.psu.edu/~datta/Present/mir05.ppt
- www.richhall.com/isc4350/captcha.ppt

