Unconstrained Endpoint Profiling (Googling the Internet)
Ionut Trestian, Supranamaya Ranjan, Aleksandar Kuzmanovic, Antonio Nucci
Reviewed by Lee Young Soo

Introduction
- Understanding what people are doing on the Internet normally means analyzing operational network traces.
- Obtaining 'raw' packet traces from operational networks can be very hard.
- Accurately classifying traffic online at high speeds is an inherently hard problem.

Unconstrained Endpoint Profiling
- Introduces a novel methodology, applied in three scenarios:
  - No operational traces are available
  - Packet-level traces are available
  - Sampled flow-level traces are available
- Internet access trend analysis for four world regions.

Methodology: Rule Generation
- Query Google using a sample 'seed set' of random IP addresses from the networks in four world regions.
- Restrict the results to the top N keywords that can be meaningfully used for endpoint classification.
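The keyword-selection step can be sketched as a plain frequency count over the text of the search hits. This is only an illustration under stated assumptions: the stopword list, the function name `top_keywords`, and the sample hit texts are all made up, and the sketch does not decide which keywords are "meaningful" for classification, a judgment the slides do not describe.

```python
import re
from collections import Counter

# Hypothetical stopword list; a realistic one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "for", "to", "in", "on", "is", "at"}


def top_keywords(hit_texts, n):
    """Return the n most frequent non-stopword terms across all hit texts."""
    counts = Counter()
    for text in hit_texts:
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            # Skip stopwords and bare numbers (e.g. the octets of an IP address).
            if word not in STOPWORDS and not word.isdigit():
                counts[word] += 1
    return [word for word, _ in counts.most_common(n)]


# Made-up hit texts standing in for search results returned for seed IPs.
seed_hits = [
    "Counter-Strike game server list 192.0.2.1",
    "game server status: online",
    "mail server blacklist for spam",
    "forum post about game server lag",
]

print(top_keywords(seed_hits, 2))
```

In this toy input "server" and "game" come out on top, which is the kind of keyword that could later be mapped to an endpoint class.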

Methodology: Web Classifier
- Rapid URL search
- Hit text search
- Example URL :

Methodology: IP Tagging
- URL-based tagging
- General hit-text-based tagging
- Hit-text-based tagging for forums
  - Post date and username in the vicinity of the IP address => forum user
  - Presence of the following keywords: ftp:\, ppstream:\, mms:\ => http share, ftp share, streaming node
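The tagging rules above can be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the rule tables `URL_RULES` and `PROTOCOL_TAGS`, the mapping of each protocol marker to one tag, the date and username patterns, and the 80-character "vicinity" window are not given in the slides.

```python
import re

# Hypothetical URL keyword rules; the slides do not list the real rule set.
URL_RULES = {"forum": "forum", "mail": "mail server", "proxy": "proxy"}

# Protocol markers from the slide; which tag each one maps to is an assumption.
PROTOCOL_TAGS = {
    "ftp:\\": "ftp share",
    "ppstream:\\": "streaming node",
    "mms:\\": "streaming node",
}

# Assumed shapes for a post date and a username near the IP address.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
USER_RE = re.compile(r"\b(?:posted by|username:?)\s+\w+", re.IGNORECASE)


def tag_ip(ip, url, hit_text, window=80):
    """Return the set of tags inferred for `ip` from one search hit."""
    tags = set()

    # URL-based tagging: a keyword in the URL decides the tag directly.
    lowered_url = url.lower()
    for keyword, tag in URL_RULES.items():
        if keyword in lowered_url:
            tags.add(tag)

    # Forum-specific hit text tagging.
    if "forum" in tags:
        pos = hit_text.find(ip)
        if pos != -1:
            nearby = hit_text[max(0, pos - window): pos + len(ip) + window]
            if DATE_RE.search(nearby) and USER_RE.search(nearby):
                tags.add("forum user")
        lowered_text = hit_text.lower()
        for marker, tag in PROTOCOL_TAGS.items():
            if marker in lowered_text:
                tags.add(tag)
    return tags


post_hit = tag_ip(
    "192.0.2.7",
    "http://forum.example.org/thread/1",
    "Posted by alice on 2008-05-01 from 192.0.2.7",
)
share_hit = tag_ip(
    "192.0.2.9",
    "http://forum.example.org/thread/2",
    "movies at ftp:\\192.0.2.9\\pub",
)
print(sorted(post_hit), sorted(share_hit))
```

The first hit carries a date and a username next to the address, so the endpoint is tagged as a forum user; the second only carries a protocol marker, so it is tagged as a share.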

Methodology: Examples
- inforum.insite.com: URL-based tagging
- ttzai.com: hit-text-based tagging for forums

Where the Information Comes From
- Web logs
- Proxy logs
- Forums
- Malicious lists
- Server lists
- P2P communication

Evaluation
- When no traces are available.
- When packet-level traces are available.
- When sampled traces are available.

When No Traces Are Available
- Apply the unconstrained endpoint approach to a subset of the IP range belonging to the four ISPs shown in the table on the original slide.

When No Traces Are Available
- Correlation with operational traces.
- Correlation with other sources.
- The unconstrained endpoint profiling approach can be used effectively to estimate application popularity trends.

When Packet-Level Traces Are Available
- BLINC
  - Off-line tool
  - Cannot classify traffic, particularly at the application level
  - Quality of results varies across traces
- UEP
  - Superior classification results
  - Operates efficiently online

When Packet-Level Traces Are Available
- Collect the most popular 5% of IP addresses and tag them by applying the methodology.
- Use these tags to classify the traffic flows.
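A minimal sketch of these two steps, assuming that "popular" means "appears in the most flows" (the slides do not define the popularity metric) and that a flow is labeled with the tag of whichever endpoint has one. The function names, the flow tuples, and the `ip_tags` dictionary are hypothetical; in practice the tags would come from the tagging methodology.

```python
import math
from collections import Counter


def popular_endpoints(flows, fraction=0.05):
    """Return the most frequently seen IPs, covering `fraction` of all distinct IPs."""
    counts = Counter()
    for src, dst in flows:
        counts[src] += 1
        counts[dst] += 1
    # Always keep at least one endpoint so tiny traces still yield a result.
    keep = max(1, math.ceil(len(counts) * fraction))
    return [ip for ip, _ in counts.most_common(keep)]


def classify_flows(flows, ip_tags):
    """Label each flow with the tag of a tagged endpoint, or 'unknown'."""
    return [ip_tags.get(dst) or ip_tags.get(src) or "unknown" for src, dst in flows]


# Made-up flows as (source IP, destination IP) pairs.
flows = [
    ("198.51.100.1", "192.0.2.10"),
    ("198.51.100.2", "192.0.2.10"),
    ("198.51.100.3", "192.0.2.10"),
    ("198.51.100.1", "203.0.113.5"),
]

top = popular_endpoints(flows)
# Stand-in for the result of tagging the popular endpoints.
ip_tags = {"192.0.2.10": "web server"}
labels = classify_flows(flows, ip_tags)
print(top, labels)
```

Tagging only the popular endpoints keeps the number of search queries small while still covering most flows, since most flows touch a popular endpoint.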

When Sampled Traces Are Available
- Because of sampling, too little data remains in the trace, so the graphlets approach simply does not work.
- Popular endpoints are still present in the trace despite sampling.

When Sampled Traces Are Available
- The endpoint approach remains largely unaffected by sampling.
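One way to see why popular endpoints survive sampling is a back-of-the-envelope calculation. The sketch below assumes independent random 1-in-N packet sampling, which is a simplification and not necessarily the sampling scheme behind the traces in the slides; the function name is hypothetical.

```python
def survival_probability(packets, sampling_rate):
    """Probability that at least one of `packets` packets is kept
    under independent 1-in-`sampling_rate` sampling."""
    return 1.0 - (1.0 - 1.0 / sampling_rate) ** packets


# A popular endpoint (1000 packets) versus a rarely seen one (10 packets)
# under 1-in-100 sampling.
popular = survival_probability(1000, 100)
rare = survival_probability(10, 100)
print(f"popular: {popular:.5f}, rare: {rare:.3f}")
```

Under this assumption an endpoint with a thousand packets is almost certain to appear in the sampled trace, while one with ten packets shows up less than one time in ten. An approach that only needs to see an endpoint once is therefore far less sensitive to sampling than one that needs enough flows per host to reconstruct its connection patterns.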

Endpoint Profiling: Endpoint Clustering
- Clustering has been employed in networking before; the algorithm used is Autoclass.
- Input to the endpoint clustering algorithm: a set of tagged IP addresses from a region's network.
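To make the input and output of this step concrete, the sketch below groups endpoints that carry exactly the same set of tags. This is a deliberately simple stand-in and not Autoclass, which is an unsupervised Bayesian classifier; the function name, addresses, and tags are made up for illustration.

```python
from collections import defaultdict


def group_by_tag_set(tagged_ips):
    """Group IPs whose tag sets are identical (a naive stand-in for clustering)."""
    clusters = defaultdict(list)
    for ip, tags in tagged_ips.items():
        clusters[frozenset(tags)].append(ip)
    return dict(clusters)


# Made-up tagged endpoints from one region's network.
tagged_ips = {
    "198.51.100.1": {"browsing"},
    "198.51.100.2": {"browsing", "chat"},
    "198.51.100.3": {"browsing"},
    "198.51.100.4": {"browsing", "mail"},
}

clusters = group_by_tag_set(tagged_ips)
for tags, ips in clusters.items():
    print(sorted(tags), ips)
```

A real clustering algorithm would also merge endpoints with similar but not identical tag sets and would choose the number of clusters itself.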

Endpoint Profiling
- Browsing, alone or combined with chat or mail, seems to be the most common behavior.

Endpoint Profiling: Traffic Locality

Conclusion
- UEP
  - Accurately predicts application and protocol usage trends when no network traces are available.
  - Dramatically outperforms state-of-the-art classification tools when packet traces are available.
  - Retains high classification capabilities when only sampled flow-level traces are available.
- Profiles endpoints residing in four different world regions:
  - Network applications and protocols used in these regions.
  - Characteristics of endpoint classes that share similar access patterns.
  - Clients' locality properties.