QoI: Assessing Participation in Threat Information Sharing

Presentation transcript:

QoI: Assessing Participation in Threat Information Sharing Jeman Park

Outline
- Threat Information (TI) Sharing
- Quality of Indicator
- System Architecture
- Methodology (Numerical Scoring)
- Dataset
- Results
- Conclusion

Threat Information (TI) Sharing
[Figure: Example of structured TI sharing [1]]
Threat information is shared with trusted partners using information-sharing standards.
[1] https://stixproject.github.io
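To make the sharing setting concrete, here is a minimal sketch of the kind of record a participant might contribute. The field names and values are hypothetical simplifications for illustration, not a spec-complete object from a standard such as STIX [1].

```python
# Hypothetical, simplified threat-indicator record a participant might share.
# Real deployments would use a structured standard (e.g., STIX, see [1]);
# these field names are illustrative assumptions only.
indicator = {
    "id": "indicator--0001",                          # hypothetical identifier
    "type": "malware-sample",
    "label": "trojan",                                # attribution label from the provider
    "sha256": "9f86d081884c7d659a2feaa0c55ad015...",  # truncated, for illustration
    "first_seen": "2016-05-01T12:00:00Z",
    "provider": "vendor_07",
}
```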

Threat Information Sharing, cont.
A better countermeasure can be found only if users actively share meaningful threat information. There are two ways to evaluate a user's contribution:
- Quantity: How much information does the user contribute?
- Quality: How useful is the information the user contributes?
However, current contribution measurements focus mainly on the quantitative aspect.

Quality of Indicator (QoI)
We identified four metrics for the qualitative evaluation of a user's contribution:
- Correctness: captures whether the attributes of an indicator (e.g., the label used for attribution) are consistent with the assessor's reference.
- Relevance: measures the extent to which an indicator is contextual and of interest to the rest of the community.
- Utility: captures whether an indicator characterizes prominent features of cyber threats.
- Uniqueness: measures how different an indicator is from previously seen indicators.
Aggregating the scores of these four metrics yields the overall QoI score, as sketched below.
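As a concrete illustration of the aggregation step, this minimal sketch combines the four per-metric scores into one QoI value as a weighted sum. The weights and the assumption that every score lies in [0, 1] are illustrative choices, not the paper's exact configuration.

```python
# Minimal sketch: aggregate the four per-metric scores into one QoI value.
# The weights are hypothetical; in practice they encode how much the
# assessor values each quality dimension.
METRIC_WEIGHTS = {
    "correctness": 0.4,
    "relevance": 0.2,
    "utility": 0.2,
    "uniqueness": 0.2,
}

def aggregate_qoi(scores: dict) -> float:
    """Weighted sum of per-metric scores, each assumed to lie in [0, 1]."""
    return sum(weight * scores[metric] for metric, weight in METRIC_WEIGHTS.items())

# Example: a correct, relevant indicator with a generic label and a common hash.
example = {"correctness": 1.0, "relevance": 0.8, "utility": 0.5, "uniqueness": 0.2}
print(aggregate_qoi(example))  # ~0.70
```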

System Architecture

System Architecture, cont.
Defining Metrics: Quality metrics are defined as the measurement criteria used to score the threat indicators that users provide.
Defining Labels: Annotations can be labels capturing the type of threat, its level (of severity, timeliness, etc.), or the quality type of an indicator. Using these annotations, a scoring method converts the quality labels into a numeric score for the indicator, as sketched below.
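A minimal sketch of the label-to-score conversion, assuming a small set of hypothetical quality labels; the label names and numeric values are placeholders rather than the scheme actually used in the system.

```python
# Hypothetical mapping from annotation labels to numeric quality scores.
QUALITY_LABEL_SCORES = {"high": 1.0, "medium": 0.5, "low": 0.1}

def label_to_score(label: str) -> float:
    """Convert a quality label to a numeric score; unknown labels score 0.0."""
    return QUALITY_LABEL_SCORES.get(label.lower(), 0.0)

print(label_to_score("Medium"))  # 0.5
```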

System Architecture, cont.
Building a Reference: The reference dataset is used to evaluate QoI for a sample of indicators submitted by a given provider. To build the initial reference dataset, data collected through security operations is vetted for validity and applicability.
Extrapolating: Extrapolation allows each assessor to predict the label of an indicator from its feature set and a classifier model. The classifier is trained with supervised learning on features extracted from the reference dataset. We use the Nearest Centroid Classifier (NCC), a model that assigns to each observation the label of the class whose training-sample mean (centroid) is closest to that observation; a minimal sketch follows.
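Below is a minimal sketch of the NCC idea, assuming indicators have already been mapped to numeric feature vectors; the feature representation and the toy data are assumptions, not the paper's actual model.

```python
import numpy as np

class NearestCentroidClassifier:
    """Assign each observation the label of the closest class centroid."""

    def fit(self, X: np.ndarray, y: np.ndarray) -> "NearestCentroidClassifier":
        self.labels_ = np.unique(y)
        # One centroid per class: the mean of that class's training vectors.
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        # Euclidean distance from every sample to every centroid.
        dists = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[dists.argmin(axis=1)]

# Toy usage with 2-D feature vectors and two hypothetical threat classes.
X_ref = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y_ref = np.array(["ddos", "ddos", "trojan", "trojan"])
ncc = NearestCentroidClassifier().fit(X_ref, y_ref)
print(ncc.predict(np.array([[0.15, 0.15], [0.85, 0.9]])))  # ['ddos' 'trojan']
```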

Numerical Scoring
To illustrate QoI in practice, we applied QoI scoring to anti-virus (AV) vendors and the malware family labels they assign.
Correctness: The reference dataset serves as the benchmark for determining the correct label of an arbitrary sample. After the classifier is built and trained, the label assigned by the vendor is compared with the label predicted by the classifier, and a positive score is given if they match.
Relevance: Weight values are chosen based on the interests of community members, and a mapping function assigns those weights, for example giving a higher weight to trojans and a lower weight to DDoS. Both scores are sketched below.
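A sketch of how these two scores might be computed per indicator; the binary correctness rule and the relevance weights (higher for trojans, lower for DDoS, as described above) use hypothetical values.

```python
def correctness_score(vendor_label: str, predicted_label: str) -> float:
    """1.0 if the vendor's label matches the classifier's prediction, else 0.0."""
    return 1.0 if vendor_label.lower() == predicted_label.lower() else 0.0

# Hypothetical community-interest weights used by the relevance mapping function.
RELEVANCE_WEIGHTS = {"trojan": 1.0, "apt": 0.9, "rootkit": 0.7, "ddos": 0.4}

def relevance_score(threat_type: str) -> float:
    """Map a threat type to its community-interest weight (0.5 if unknown)."""
    return RELEVANCE_WEIGHTS.get(threat_type.lower(), 0.5)

print(correctness_score("zeus", "zeus"), relevance_score("ddos"))  # 1.0 0.4
```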

Numerical Scoring, cont.
Utility: The utility score is assigned according to the type of label submitted by the vendor. We use three classes:
- complete labels: an industrially popular family name.
- generic labels: commonly used names such as 'generic', 'worm', and 'trojan'.
- incomplete labels: 'suspicious', 'malware', and 'unclassified'.
Uniqueness: Malware samples that have not been seen before (e.g., whose hashes do not appear in the current dataset) are given a high uniqueness score.
Aggregated QoI: Weights are set according to the importance of each metric, and the weighted scores are summed to produce the aggregated QoI score, as in the aggregation sketch shown earlier. The utility and uniqueness scores are sketched below.
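A minimal sketch of the utility and uniqueness scores described above; the keyword sets follow the slide's examples, while the numeric values are illustrative assumptions.

```python
# Hypothetical keyword sets defining the label classes, and their scores.
GENERIC_LABELS = {"generic", "worm", "trojan"}
INCOMPLETE_LABELS = {"suspicious", "malware", "unclassified"}

def utility_score(label: str) -> float:
    """Complete (specific) family names score highest, incomplete labels lowest."""
    word = label.lower()
    if word in INCOMPLETE_LABELS:
        return 0.1
    if word in GENERIC_LABELS:
        return 0.5
    return 1.0  # assumed to be a complete, industrially popular name

def uniqueness_score(sample_hash: str, seen_hashes: set) -> float:
    """High score when the sample's hash does not appear in the current dataset."""
    return 1.0 if sample_hash not in seen_hashes else 0.2

print(utility_score("zeus"), utility_score("generic"), utility_score("suspicious"))
# -> 1.0 0.5 0.1
print(uniqueness_score("def456", {"abc123"}))  # -> 1.0 (not seen before)
```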

Dataset
For the evaluation of QoI, we used a dataset of 11 malware families submitted to VirusTotal by 48 AV vendors (family, # of samples, type):
- Avzhan (3,458, DDoS)
- Darkness (1,878, DDoS)
- Ddoser (502, DDoS)
- Jkddos (333, DDoS)
- N0ise (431, DDoS)
- ShadyRAT (1,287, APT)
- DNSCalc (403, APT)
- Lurid (399, APT)
- Getkys (953, APT)
- Zero Access (568, rootkit)
- Zeus (1,975, banking trojan)

Results
Correctness: Some vendors (4, 27, and 30) outperformed the others, with scores ranging from the 80s to the high 90s. The majority of vendors have significantly lower correctness-based contribution measures than volume-based scores.
Relevance: Certain contributors (42 and 43) with high volume-based scores have low relevance scores, while others (10, 16, and 27) show the opposite pattern.

Results, cont.
Utility: Certain vendors (39 through 41) are rated as high-utility indicator providers, surpassing their volume-based scores.
Aggregated QoI: Some vendors (39 and 46) with low QoI scores have higher volume-based scores, potentially alluding to free-riding. Other vendors (1, 8, and 33) contribute a small volume of indicators but receive high QoI scores, meaning they share relevant and useful information.

Conclusion
In this paper, we take a first look at the notion of the quality of indicators (QoI). As our empirical analysis shows, the level of a user's contribution cannot be adequately expressed by a volume-based measure alone. By validating our metrics on real-world antivirus scan data, we show that contribution measured by volume is not always consistent with contribution measured by quality.

Thank You