By: Navid Bahrani, Niloufar Azmi, Majid Mafi


1 Keystroke Biometric
By: Navid Bahrani, Niloufar Azmi, Majid Mafi. Submitted to Professor El Saddik in partial fulfillment of the requirements for the course ELG 5121, November 03, 2009.

2 Outline
• Introduction
• Overview of Biometrics
• Various approaches of research on keystroke dynamics
• Features/Attributes
• Feature Extraction
• Classification methods
• Advantages of keystroke dynamics
• Conclusion
• Future Vision

3 User Authentication Approaches
The first and foremost step in preventing unauthorized access is user authentication, the process of verifying a claimed identity. Conventionally, user authentication is categorized into three classes:
• Knowledge-based
• Object- or token-based
• Biometric-based
Knowledge-based authentication relies on something one knows; passwords and PIN codes are the common examples. Object-based authentication relies on something one has and is characterized by possession; traditional door keys belong to this category. Biometric-based authentication relies on what a person is or does; behavioural characteristics such as voiceprint and gait relate to what a person does or how the person uses the body.
The token-based approach is usually combined with the knowledge-based approach, for example a bank card with a PIN code. In both approaches, passwords and tokens can be forgotten, lost, or stolen. They also have usability limitations: managing multiple passwords and PINs, and memorizing and recalling strong passwords, are not easy tasks. Biometric-based person recognition overcomes these difficulties.

4 What is Biometric Authentication?
An automatic method that identifies a user or verifies a claimed identity.
Involves something one is or does.
Types of biometrics:
• Physiological
• Behavioural

5 Physiological characteristics
Biological/chemical based:
• Fingerprints
• Iris and retinal scanning
• Hand shape geometry
• Blood vessel/vein pattern
• Facial recognition
• Ear image
• DNA
Physiological characteristics refer to what the person is; in other words, they measure physical parameters of a certain part of the body.

6 Behavioral characteristics
A reflection of an individual's psychology:
• Handwritten signatures
• Voice pattern
• Mouse movement dynamics
• Gait (way of walking)
• Keystroke dynamics
Because most behavioural characteristics vary over time, a biometric system built on them needs to be more dynamic and accept some degree of variability. On the other hand, behavioural biometrics are associated with less intrusive systems, which leads to better user acceptance. Behavioural characteristics such as handwritten signatures, voice patterns, and keystrokes comprise behaviour patterns that can also be considered unique.
Keystroke dynamics (or typing rhythm) is a behavioural biometric technique that refers to a user's habitual typing characteristics. These characteristics are believed to be unique among large populations. One advantage is the low level of distraction from regular computer work, because the user is already entering keystrokes when giving a password to the system. Since the input device is the existing keyboard, the technology also costs less than biometrics that need expensive acquisition devices.
(When attempting to access a computerized system, people are normally authenticated through a username and password. It has been demonstrated that this form of authentication is not completely safe because, through a series of different attacks, an intruder can determine the username and password, which then allows unauthorized access [3, 4]. Behavioural biometrics, such as measures based on keystroke analysis, could be used as complementary metrics to defeat intrusion attempts. Keystroke measures differ from many other behavioural biometrics because of the specific human-computer interaction component. Behavioural biometrics such as handwritten signatures do not directly involve the target device (i.e., the computer keyboard), whereas keystroke patterns have a direct relationship with the specific input device through which authentication takes place.)
In systems that implement keystroke patterns as part of their authentication mechanisms, the login process requires not only the correct username and password but also a match with the keystroke patterns that have been identified and stored for that individual. Such a mechanism could make it harder for intruders to be authenticated, as the unique typing pattern would be very difficult to reproduce.

7 Comparison of various biometric techniques
Parameters:
• Universality: how commonly the biometric is found across individuals.
• Uniqueness: how well the biometric separates one individual from another.
• Permanence: how well the biometric resists aging.
• Collectability: ease of acquisition for measurement.
• Performance: accuracy, speed, and robustness; the practical aspects of using a specific biometric technology.
• Acceptability: degree of approval of the technology.
• Circumvention: ease of using a substitute; in other words, how easy it is to spoof the biometric characteristic.

8 Keystroke History
Typing rhythm is an idea whose origin lies in the observation (made in 1897) that telegraph operators have distinctive patterns of keying messages over telegraph lines. In keeping with these early observations, British radio interceptors during World War II identified German radio-telegraph operators by their "fist," the personal style of tapping out a message.

9 Keystroke Applications
A behavioural measurement that aims to identify users based on typing patterns, rhythms, or attributes. Keystroke dynamics systems can run in two modes:
• Identification mode (find): one-to-many
• Verification mode (check): one-to-one
A further application is non-repudiation.
Identification is the process of finding out a person's identity by examining a biometric pattern calculated from that person's biometric features. A larger amount of keystroke dynamics data is collected, and the user of the computer is identified from the previously collected keystroke dynamics profiles of all users. A biometric template is calculated for each user in the training stage. A pattern to be identified is matched against every known template, yielding a score or a distance that describes the similarity between the pattern and the template. The system assigns the pattern to the person with the most similar template. To prevent impostor patterns (here, patterns from anyone not known to the system) from being assigned to an enrolled user, the similarity has to exceed a certain level; if it does not, the pattern is rejected.
Verification checks a person's claimed identity. The pattern being verified is compared only with that person's individual template.
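
The difference between the two modes can be sketched in a few lines. This is a minimal illustration only: the Manhattan distance, the threshold value, the user names, and the timing vectors are assumptions chosen for the example, not values from the presentation.

```python
from math import inf

def manhattan(a, b):
    """Sum of absolute differences between two equal-length timing vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def verify(pattern, template, threshold):
    """One-to-one check: accept only if the pattern is close to the claimed user's template."""
    return manhattan(pattern, template) <= threshold

def identify(pattern, templates, threshold):
    """One-to-many search: return the closest enrolled user, or None if nobody is close enough."""
    best_user, best_dist = None, inf
    for user, template in templates.items():
        dist = manhattan(pattern, template)
        if dist < best_dist:
            best_user, best_dist = user, dist
    return best_user if best_dist <= threshold else None

# Hypothetical templates: mean key timings in milliseconds.
templates = {"alice": [100, 140, 90], "bob": [160, 80, 130]}

print(identify([105, 135, 95], templates, threshold=30))       # alice
print(identify([300, 300, 300], templates, threshold=30))      # None (rejected)
print(verify([105, 135, 95], templates["bob"], threshold=30))  # False
```

Because a distance is used here rather than a similarity score, the rejection rule is inverted: a pattern is rejected when its distance exceeds the threshold.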

10 Keystroke Verification Techniques
Static verification (fixed-text mode):
• Based only on the typing rhythm of the password
• Authentication is carried out only at login time
• Detecting a change of user after login is impossible
Dynamic verification (free-text mode):
• Uses the typing pattern regardless of the text that has been typed
• Continuous or periodic monitoring (on-the-fly user authentication), not limited to login time
• The user is not required to memorize a predetermined text such as a username and password

11 Biometric System
Both physiological and behavioural systems can be logically divided into two phases: enrolment and authentication/verification. During the enrolment phase, as shown in Fig. 1, the user's biometric data is acquired, processed, and stored as a reference file in a database. This file is treated as a template by the system in subsequent authentication operations. During the authentication/verification phase, the user's biometric data is acquired and processed again, and the authentication decision is based on matching the newly presented biometric against the stored reference templates.
(All studies use two stages: 1) learning the user's keystroke dynamics (enrolment), and 2) comparing new data to the profile collected in stage 1. Stage 1 consists of typing the username and password several times, though sometimes only the username is used, and forming a profile. The type of profile depends on the classification method used. These methods include traditional statistical techniques, Bayesian classifiers, neural networks, and fuzzy systems.)
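
As a rough sketch of the two phases, the example below enrols a user from a few repeated typings and then checks a new sample against the stored template. The mean/standard-deviation profile, the z-score rule, the `max_score` and `min_std` parameters, and the sample timings are all illustrative assumptions; the studies surveyed use a variety of profile types.

```python
from statistics import mean, stdev

def enroll(samples):
    """Enrolment: build a template (per-feature mean and std) from repeated typings."""
    columns = list(zip(*samples))
    return {
        "mean": [mean(c) for c in columns],
        "std": [stdev(c) for c in columns],
    }

def authenticate(template, sample, max_score=2.0, min_std=1.0):
    """Authentication: accept if the average absolute z-score is within max_score."""
    scores = [
        abs(x - m) / max(s, min_std)  # min_std guards against a zero deviation
        for x, m, s in zip(sample, template["mean"], template["std"])
    ]
    return sum(scores) / len(scores) <= max_score

# Hypothetical enrolment data: three typings of the same password, timings in ms.
samples = [[100, 140, 90], [110, 150, 95], [105, 145, 85]]
template = enroll(samples)

print(authenticate(template, [104, 148, 92]))   # True: close to the profile
print(authenticate(template, [160, 80, 130]))   # False: far from the profile
```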

12 Currently, there are 4 primary methods for user authentication:
Continuous Biometric User Authentication in Online Examination (dynamic). The four methods are:
• Knowledge factors: something unique that the user knows
• Ownership factors: something unique that the user has
• Something unique that the user is
• Something unique that the user does

13 Some metrics for user verification in online authentication:
• Typing speed
• Keystroke seek-time
• Flight time
• Characteristic sequences of keystrokes
• Examination of characteristic errors

14 Keystrokes Dynamics (Features)
Feature extraction is the process of extracting distinctive data from the sample to create the template: it converts biometric data into a feature vector that can be used for classification. In developing a scheme that uses keystroke dynamics for identity verification, it is necessary to decide which features make up the individual's typing pattern. The most commonly used characteristics are:
• Keystroke latency ('flight' time): the pause between consecutive keystrokes
• Keystroke duration ('dwell' or key-hold time): how long a specific key is held down
• Pressure: the force applied to the keys
• Typing speed: the average number of keystrokes per time interval
• Frequency of errors: in practice detected by use of the 'Delete' or 'Backspace' keys
• Overlapping of specific key combinations: caused by fast typing, or by using 'Shift' to switch between capital and small letters
• Method of error correction: selecting text before deleting it, or deleting letters one by one
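
Several of these features can be computed directly from key press and release timestamps. The sketch below assumes a hypothetical event format of `(key, press_ms, release_ms)` tuples ordered by press time; pressure is left out because it needs special hardware. Flight time is taken here as the gap between one key's release and the next key's press, so a negative value indicates overlapping keys.

```python
def extract_features(events):
    """Compute timing features from (key, press_ms, release_ms) tuples ordered by press time."""
    # Dwell: how long each key is held down.
    dwell = [release - press for _, press, release in events]
    # Flight: next key's press time minus the current key's release time.
    flight = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    elapsed_s = (events[-1][2] - events[0][1]) / 1000.0
    return {
        "dwell_ms": dwell,
        "flight_ms": flight,
        "overlaps": sum(1 for f in flight if f < 0),
        "error_keys": sum(1 for key, _, _ in events if key in ("Backspace", "Delete")),
        "keys_per_second": len(events) / elapsed_s,
    }

# Hypothetical recording: "a" and "s" overlap, and one Backspace is used.
events = [
    ("p", 0, 90),
    ("a", 150, 230),
    ("s", 210, 300),
    ("Backspace", 400, 470),
    ("s", 520, 600),
]
features = extract_features(events)
print(features["dwell_ms"])   # [90, 80, 90, 70, 80]
print(features["flight_ms"])  # [60, -20, 100, 50]
```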

15 Keystroke analysis
Variety of methods:
• Mean typing rate
• Inter-interval comparison
• Digraph
• Trigraph
• Mean error rate
• etc.
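
A digraph is a pair of consecutive keys and a trigraph is a run of three; a typical feature is the mean time a user takes to type each one. The following sketch assumes keystrokes are given as hypothetical `(key, press_ms)` pairs and measures each n-graph from the press of its first key to the press of its last key, which is one of several conventions in use.

```python
from collections import defaultdict
from statistics import mean

def ngraph_latencies(keystrokes, n=2):
    """Return {n-graph: mean latency in ms} from a list of (key, press_ms) pairs."""
    buckets = defaultdict(list)
    for i in range(len(keystrokes) - n + 1):
        window = keystrokes[i:i + n]
        graph = "".join(key for key, _ in window)
        # Latency from the first key press to the last key press of the n-graph.
        buckets[graph].append(window[-1][1] - window[0][1])
    return {graph: mean(times) for graph, times in buckets.items()}

# Hypothetical press times (ms) for the text "the then".
keystrokes = [("t", 0), ("h", 120), ("e", 200), (" ", 350),
              ("t", 500), ("h", 600), ("e", 690), ("n", 800)]

digraphs = ngraph_latencies(keystrokes, n=2)
trigraphs = ngraph_latencies(keystrokes, n=3)
print(digraphs["th"], digraphs["he"])  # 110 85
print(trigraphs["the"])                # 195
```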

16 Features & feature extraction method

17 Features & feature extraction method

18 Figures of Merit
• False Rejection Rate (FRR): type I error, false alarm
• False Acceptance Rate (FAR): type II error, missed alarm
• Equal Error Rate (EER), or Crossover Error Rate (CER)
Two metrics are commonly used to assess the reliability of a biometric system and to express its security: the false rejection rate (FRR) and the false acceptance rate (FAR). The FAR is the percentage of fraudulent users incorrectly accepted (in statistics, a type II error) and the FRR is the percentage of legitimate users incorrectly denied (a type I error). Both error rates should ideally be 0%. From a security point of view, type II errors should be minimized so that an unauthorized user has no chance to log in. However, type I errors should also be infrequent, because valid users get annoyed if the system rejects them incorrectly. The acceptable values for FAR and FRR depend on the application. In a banking application, for example, the FRR must be close to zero to avoid inconveniencing the bank's customers.
Since different values of the operating threshold result in different values of FRR and FAR, a third metric is commonly used to ensure comparability across systems: the Equal Error Rate (EER), or Crossover Error Rate (CER), the point at which FAR and FRR are equal. The lower the EER, the higher the accuracy of the biometric system.
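
To make the threshold trade-off concrete, the sketch below computes FAR and FRR at a given threshold and scans for the threshold where the two are closest. It assumes the matcher produces distance scores (lower means more similar), and the score lists are made-up illustrative values, not measured data.

```python
def far_frr(genuine, impostor, threshold):
    """Scores are distances: a sample is accepted when its score <= threshold."""
    frr = sum(1 for s in genuine if s > threshold) / len(genuine)
    far = sum(1 for s in impostor if s <= threshold) / len(impostor)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Scan the observed scores as thresholds; return (threshold, EER estimate)."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]

# Hypothetical distance scores from genuine attempts and impostor attempts.
genuine = [10, 12, 15, 18, 30]
impostor = [14, 25, 28, 35, 40]

print(far_frr(genuine, impostor, threshold=15))  # (0.2, 0.4)
print(equal_error_rate(genuine, impostor))       # (18, 0.2)
```

Lowering the threshold reduces FAR at the cost of a higher FRR, and raising it does the opposite; with a finite score set the two rates may never be exactly equal, which is why the scan keeps the threshold with the smallest gap.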

19 Classification methods
• Minimum distance
• Bayesian classifier
• Random forest classifier
• Neural nets
• "Combined" neural net
• Multi-Layer Perceptron
• RBFN
• Fuzzy (ANFIS)
• Support-vector machines
• Decision trees
• Markov models (hidden Markov model)
• Statistical methods (mean, std)

20 Classification Categories
• Statistical methods
• Neural networks
• Pattern recognition techniques
• Hybrid techniques
• Other approaches
Classification aims to find the class that is closest to the pattern being classified. A table is maintained that holds each user's details along with the reference signature collected during enrolment. When access details are entered, the system looks up the corresponding entry and computes a similarity measure: the keystroke dynamics extracted during the log-in session are compared with the stored reference signature. If they are within a prescribed tolerance limit, the user is authenticated. If not, the system can lock the workstation or take some other suitable action. The major publications on classifying a legitimate user's features fall into five classes: statistical methods, neural networks, pattern recognition techniques, hybrid techniques, and other approaches.
Statistical methods: the basic idea is to compare a reference set of a user's typing characteristics with a test set from the same user or from an impostor. The distance between the two sets should be below a certain threshold; otherwise the user is treated as an impostor. (From another paper:) a reference signature is mostly calculated using distance measures on keystroke latency and duration, and standard statistical measures of the reference signature, such as the mean and standard deviation, are used to generate the template and to classify.
Neural networks: the process first builds a prediction model from historical data, and then uses this model to predict the outcome of a new trial (or to classify a new observation). (From another paper:) rather than performing a sequential set of instructions, neural networks are capable of exploring many competing hypotheses in parallel. Because of this quality, neural networks are considered to have the greatest potential in the area of biometrics.
Pattern recognition: the scientific discipline whose goal is the classification of objects or patterns into a number of categories or classes.
Hybrid techniques: many researchers have proposed combinations of neural networks, pattern recognition, statistical measures, and so on.
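
The "build a model from historical data, then classify a new observation" idea can be illustrated with the simplest neural model, a single perceptron. This is a toy sketch, not any of the surveyed systems: the two-feature vectors (timings in units of 100 ms), the labels, and the training parameters are assumptions chosen so the data is linearly separable.

```python
def activation(weights, bias, x):
    """Step activation: 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Perceptron learning rule. labels: 1 = genuine user, 0 = impostor."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = y - activation(weights, bias, x)
            # Weights change only when the current prediction is wrong.
            weights = [w + lr * err * v for w, v in zip(weights, x)]
            bias += lr * err
    return weights, bias

def predict(model, x):
    weights, bias = model
    return activation(weights, bias, x)

# Hypothetical historical data: [mean dwell, mean flight] in units of 100 ms.
samples = [[1.0, 1.4], [1.6, 0.8], [1.1, 1.5], [1.7, 0.9]]
labels = [1, 0, 1, 0]

model = train_perceptron(samples, labels)
print(predict(model, [1.05, 1.45]))  # 1: classified as the genuine user
print(predict(model, [1.65, 0.85]))  # 0: classified as an impostor
```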

21 Statistical Methods
• Mean, standard deviation and digraph
• Geometric distance, Euclidean distance
• Degree of disorder
• k-nearest neighbour approach
• Hidden Markov model
• N-graphs
• Manhattan distance
• Mean reference signature (mean & std)
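
Of the measures listed, the degree of disorder is the least self-explanatory. The slide does not define it, so the sketch below follows one formulation commonly used for free-text keystroke analysis, stated here as an assumption: sort the shared n-graphs by latency in each sample, sum how far each n-graph moves between the two orderings, and divide by the maximum possible displacement. The latency values are invented for illustration.

```python
def degree_of_disorder(reference, sample):
    """reference, sample: {n-graph: latency}. Returns a normalized disorder in [0, 1]."""
    shared = sorted(set(reference) & set(sample))
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared n-graphs")
    # Rank the shared n-graphs by latency in each sample.
    ref_order = sorted(shared, key=lambda g: reference[g])
    smp_rank = {g: i for i, g in enumerate(sorted(shared, key=lambda g: sample[g]))}
    disorder = sum(abs(i - smp_rank[g]) for i, g in enumerate(ref_order))
    max_disorder = (n * n) // 2  # n^2/2 for even n, (n^2 - 1)/2 for odd n
    return disorder / max_disorder

# Hypothetical digraph latencies in ms.
reference = {"th": 110, "he": 85, "in": 140, "er": 95}
similar = {"th": 100, "he": 90, "in": 150, "er": 105}
reversed_order = {"th": 90, "he": 150, "in": 70, "er": 120}

print(degree_of_disorder(reference, reference))       # 0.0
print(degree_of_disorder(reference, similar))         # 0.25
print(degree_of_disorder(reference, reversed_order))  # 1.0
```

Because only the relative ordering of latencies matters, this measure is unaffected when a user types uniformly faster or slower than at enrolment.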

22 Neural Networks
• Perceptron algorithm
• Auto-associative neural network
• Deterministic RAM network (DARN)
• Back-propagation model
• BPNN and RMSE
• Adaline and BPNN

23 Pattern Recognition Techniques

24 Hybrid Techniques
Fuzzy logic: there are many adjustable elements, such as membership functions and fuzzy rules.
Advantage: the many adjustable elements increase the flexibility of fuzzy-based authentication.
Disadvantage: they also increase the complexity of designing a fuzzy-based authentication system.
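
To show what those adjustable elements look like, here is a minimal sketch with one triangular membership function and one fuzzy rule. The breakpoints, the rule ("IF dwell deviation is small AND flight deviation is small THEN the user is genuine"), and the use of `min` for AND are illustrative assumptions; a real fuzzy or ANFIS system would have many more sets and rules, with parameters tuned from data.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Adjustable element 1: the breakpoints of the "small deviation" set (hypothetical, in ms).
SMALL = (-1, 0, 30)

def genuine_degree(dwell_dev_ms, flight_dev_ms):
    """Adjustable element 2: the rule itself. AND is modelled as the minimum."""
    return min(triangular(dwell_dev_ms, *SMALL), triangular(flight_dev_ms, *SMALL))

print(genuine_degree(6, 12))  # 0.6
print(genuine_degree(6, 45))  # 0.0
```

Every breakpoint and every rule is a design choice, which is the source of both the flexibility and the design complexity noted above.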

25 Other Approaches

26 Some Opportunities:
Login information:
• Computer
• Cell phones
• Automated Teller Machine
• Digital telephone dial
• Digital electronic security keypad at a building entrance
Continuous authentication:
• Online examination

27 Advantages of keystroke dynamics
• Software-only method (no additional hardware except a keyboard)
• Simple to deploy and use (usernames and passwords are universally accepted)
• Unobtrusive, non-invasive, cost-effective
• No end-user training
• A simple, natural way to increase computer security
• Can be used over the internet
Unobtrusive: more convenient than physiological methods, since the features can be collected without special hardware and with little distraction from regular computer work. Keystroke patterns have a direct relationship with the input device through which authentication takes place, whereas in other behavioural biometric methods the device used for verification differs from the target device.
Username/password: password authentication is an inexpensive and familiar paradigm that most operating systems support.
Increased security: in systems that implement keystroke patterns as part of their authentication mechanisms, the login process requires not only the correct username and password but also a match with the keystroke patterns that have been identified and stored for that individual. Such a mechanism could make it harder for intruders to be authenticated, as the unique typing pattern would be very difficult to reproduce. Keystroke dynamics for verification thus acts as a two-factor authentication mechanism: even if the password is stolen, the keystroke pattern still has to match the stored profile.

28 Keystroke drawbacks:
• User's susceptibility to fatigue
• Dynamic change in typing patterns
• Injury, skill of the user
• Change of keyboard hardware, for example using another workstation with a different keyboard
Unlike physiological biometrics such as fingerprints, retinas, and facial features, all of which remain fairly consistent over long periods of time, typing patterns can be rather erratic. Although any biometric can change over time, typing patterns change on a smaller time scale. Not only are typing patterns inconsistent compared with other biometrics; a person's hands can also get tired or sweaty after prolonged periods of typing. This often results in major pattern differences over the course of a day.

29 Keystroke Challenges
• Lack of a shared set of standards for data collection, benchmarking, and measurement
• Which methods have the lower error rate? Error rate comparison is difficult
• Systems work with very short sample texts
• There are no identical biometric samples
• Requires adaptive learning
1. The lack of a shared set of standards for data collection, benchmarking, and measurement has prevented, to some degree, growth from collaboration and independent confirmation of techniques.
2. Methods are not evaluated uniformly, so it is imprudent to declare explicitly which methods have the lowest error rates.
3. Although error rates are reported for each method when available, tests are often done with different test subjects, feature sets, and classification methods, so comparisons are difficult.
4. Many of the systems described in this survey strive to work with very short sample texts, so comparing their outcomes may be unfair.
What are our concerns about this method? As with every behavioural method, the characteristic may change with time, which forces the system to keep learning from new correct samples, and this is extremely difficult. If the system learns from an incorrect sample, classification error grows. A further problem follows from the observation that "there are no identical biometric samples": classifying correct samples is itself hard.
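
One simple way to picture adaptive learning is a template that drifts slowly toward recent samples, but only when the sample was accepted. The exponential moving average and the `rate` parameter below are assumptions made for illustration; the presentation does not prescribe an update rule.

```python
def update_template(template, sample, accepted, rate=0.1):
    """Move the template a small step toward an accepted sample; ignore rejected ones."""
    if not accepted:
        # Learning from a rejected (possibly impostor) sample would corrupt the profile.
        return list(template)
    return [(1 - rate) * t + rate * s for t, s in zip(template, sample)]

template = [100.0, 140.0]  # hypothetical mean timings in ms
template_after_accept = update_template(template, [110.0, 130.0], accepted=True)
template_after_reject = update_template(template, [300.0, 20.0], accepted=False)

print(template_after_accept)  # approximately [101.0, 139.0]
print(template_after_reject)  # [100.0, 140.0]
```

The risk described above still applies: if an impostor sample is wrongly accepted, the update pulls the template toward the impostor, so a small `rate` limits the damage of any single mistake.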

30 Conclusions
• Extremely different typing patterns among examinees
• The approach seems promising but still needs more effort, especially for identification
• Iris scanners provide the lowest total error rate, on the order of 10^-6 in many cases
• Even fingerprints provide an error rate on the order of 10^-2

31 Conclusions
Several commercial systems are on offer: BioPassword (now AdmitOne), PSYLock, and Trustable Passwords, but no evaluation data are publicly available for these systems.
Combining maximum pressure with latency is a considerably more effective way to verify the authorized person, owing to the unique typing biometric of each individual.
Combining Artificial Neural Network (ANN) and Adaptive Neuro-Fuzzy Inference System (ANFIS) classifiers gives the most promising results for improving verification accuracy compared with a standalone classifier.

32 Future work
• Using longer fixed texts
• Testing on an extensive database
• Combining many features increases the accuracy of keystroke analysis
• Finding the most efficient features
• Adding mouse dynamics
• Helpful for identification
• Special characters & character overlapping
• Typing pattern as digital signature

33 Future work
• Researchers focus mostly on user verification; there is little work on user identification
• Gathering a big database may be an obstacle
• Trends in classifiers show that many people use ANNs, which work on a black-box basis and make adding a new user to the database costly
• Future research should reduce FAR & FRR
• A maturing field

34

35 Comparison of Classifiers
The random forest classifier is robust against noise, and its tree-classification rules enable it to find informative signatures in small subsets of the data (i.e., automatic feature selection). In contrast, SVMs do not perform variable selection and can perform poorly when the classes are distributed in a large number of different but simple ways.

36 Methods to measure the users typing biometric:
Fuzzy logic: there are many adjustable elements, such as membership functions and fuzzy rules.
Advantage: the many adjustable elements increase the flexibility of fuzzy-based authentication.
Disadvantage: they also increase the complexity of designing a fuzzy-based authentication system.

37 A: Methods to measure the users typing biometric:
RBFN (radial basis function network): an alternative neural network architecture.
Major advantage: it can be trained with fast convergence to a single global minimum for a given set of fixed hidden-node parameters.
Disadvantage: when a new user is added, the whole system has to be designed and trained all over again.
5. Trends in classifiers also show that many people use artificial neural networks, which work on a black-box basis: we do not know which features carry the most importance. A major problem with ANNs in user identification is that adding a new user to the database requires the whole database to be recalculated, which takes a lot of time.

