Hao Hu, Luo Qi, Fazhi Qi IHEP 22 Mar. 2018

Slides:



Advertisements
Similar presentations
IP Router Architectures. Outline Basic IP Router Functionalities IP Router Architectures.
Advertisements

Top-Down Network Design Chapter Nine Developing Network Management Strategies Copyright 2010 Cisco Press & Priscilla Oppenheimer.
Module 13: Performance Tuning. Overview Performance tuning methodologies Instance level Database level Application level Overview of tools and techniques.
Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)
Progress Report: Metering NSLP (M-NSLP) 66th IETF meeting, NSIS WG.
OpenSketch Slides courtesy of Minlan Yu 1. Management = Measurement + Control Traffic engineering – Identify large traffic aggregates, traffic changes.
Multi-Layer Switching Layers 1, 2, and 3. Cisco Hierarchical Model Access Layer –Workgroup –Access layer aggregation and L3/L4 services Distribution Layer.
Connect. Communicate. Collaborate Click to edit Master title style MODULE 1: perfSONAR TECHNICAL OVERVIEW.
Network Management Overview IACT 918 July 2004 Gene Awyzio SITACS University of Wollongong.
Monitoring a Large-Scale Network: Selecting the Right Tool Sayadur Rahman United International University & Network Manager, Financial Service.
1 ITC242 – Introduction to Data Communications Week 12 Topic 18 Chapter 19 Network Management.
The new The new MONARC Simulation Framework Iosif Legrand  California Institute of Technology.
NetFlow Analyzer Drilldown to the root-QoS Product Overview.
Data Warehousing: Defined and Its Applications Pete Johnson April 2002.
Design and Implementation of SIP-aware DDoS Attack Detection System.
Chapter 7 Configuring & Managing Distributed File System
What is FORENSICS? Why do we need Network Forensics?
Fraunhofer FOKUSCompetence Center NET T. Zseby, CC NET1 IPFIX – IP Flow Information Export Overview Tanja Zseby Fraunhofer FOKUS, Network Research.
Top-Down Network Design Chapter Nine Developing Network Management Strategies Oppenheimer.
Session 2 Security Monitoring Identify Device Status Traffic Analysis Routing Protocol Status Configuration & Log Classification.
POSTECH DP&NM Lab. Internet Traffic Monitoring and Analysis: Methods and Applications (1) 5. Passive Monitoring Techniques.
1 Detecting Malicious Flux Service Networks through Passive Analysis of Recursive DNS Traces Speaker: Jun-Yi Zheng 2010/03/29.
NetFlow: Digging Flows Out of the Traffic Evandro de Souza ESnet ESnet Site Coordinating Committee Meeting Columbus/OH – July/2004.
The University of Bolton School of Games Computing & Creative Technologies LCT2516 Network Architecture CCNA Exploration LAN Switching and Wireless Chapter.
Adaptive Data Visualization Packet Information Collection and Transformation for Network Intrusion Detection and Prevention Richard A. Aló,
Management of the LHCb DAQ Network Guoming Liu * †, Niko Neufeld * * CERN, Switzerland † University of Ferrara, Italy.
Jennifer Rexford Princeton University MW 11:00am-12:20pm Measurement COS 597E: Software Defined Networking.
PART3 Data collection methodology and NM paradigms 1.
1 | © 2015 Infinera Open SDN in Metro P-OTS Networks Sten Nordell CTO Metro Business Group
International Symposium on Grid Computing (ISGC-07), Taipei - March 26-29, 2007 Of 16 1 A Novel Grid Resource Broker Cum Meta Scheduler - Asvija B System.
Multi-objective Topology Synthesis and FPGA Prototyping Framework of Application Specific Network-on-Chip m Akram Ben Ahmed Xinyu LI, Omar Hammami.
Library Online Resource Analysis (LORA) System Introduction Electronic information resources and databases have become an essential part of library collections.
Management of the LHCb DAQ Network Guoming Liu *†, Niko Neufeld * * CERN, Switzerland † University of Ferrara, Italy.
Module 11: Configuring and Managing Distributed File System.
Module 11 Configuring and Managing Distributed File System.
NetFlow Analyzer Best Practices, Tips, Tricks. Agenda Professional vs Enterprise Edition System Requirements Storage Settings Performance Tuning Configure.
Golubev Alexandr, Chisinau 2016
Partner Billing and Reporting
Presenting and aggregating network statistics with Stager
How Alluxio (formerly Tachyon) brings a 300x performance improvement to Qunar’s streaming processing Xueyan Li (Qunar) & Chunming Li (Garena)
University of Maryland College Park
WP18, High-speed data recording Krzysztof Wrona, European XFEL
System Monitoring with Lemon
Pilot Walktour Operation Guide V3.5 (Android)
Distributed Network Traffic Feature Extraction for a Real-time IDS
Golubev Alexandr, MAGIC project 2016
MongoDB Er. Shiva K. Shrestha ME Computer, NCIT
Data Streaming in Computer Networking
Basic Policy Overview Palo Alto.
Chapter 4: Routing Concepts
Pilot Walktour Operation Guide V3.4 (Android)
CHAPTER 3 Architectures for Distributed Systems
Top-Down Network Design Chapter Nine Developing Network Management Strategies Copyright 2010 Cisco Press & Priscilla Oppenheimer.
Algorithms for Big Data Delivery over the Internet of Things
© 2002, Cisco Systems, Inc. All rights reserved.
Data collection methodology and NM paradigms
Introducing EasyVoIZ 5 Version 5.
Tutorial 8 Objectives Continue presenting methods to import data into Access, export data from Access, link applications with data stored in Access, and.
Cover page.
Nettest An implementation of BEREC’s recommendations
Chapter 8: Monitoring the Network
Overview of big data tools
Information Technology Ms. Abeer Helwa
Dynamic Management for End-to-end IP QoS
DATA COMMUNICATION Lecture-3.
IP Control Gateway (IPCG)
The design and development of Vulnerability management system
Zhihui Sun , Fazhi Qi, Tao Cui
Overview of Computer system
Top-Down Network Design Chapter Nine Developing Network Management Strategies Copyright 2010 Cisco Press & Priscilla Oppenheimer.
Presentation transcript:

Hao Hu, Luo Qi, Fazhi Qi IHEP 22 Mar. 2018 Design and Development of the Platform for Network Traffic Statistics and Analysis Hao Hu, Luo Qi, Fazhi Qi IHEP 22 Mar. 2018 ISGC 2018, Taipei

Outline Motivation Platform design Function modules Future Plan Summary

Outline Motivation Platform design Function modules Future Plan Summary

Bandwidth utilization of the links between IHEP and USA/EUR/ASIA? IHEP WAN Topology IHEP- USA IHEP-CSTNet-CERNet-USA 10Gbps IHEP- EUR IHEP-CSTNet-CERNet-London-EUR IHEP- Asia IHEP-CSTNet-HKIX-Asia 2.5Gbps IHEP- Domestic Univ IHEP-CSTNet-CERNet-Univ review Bandwidth utilization of the links between IHEP and USA/EUR/ASIA? 1

Motivation--IHEP Traffic Status IPv4 average traffic: 2.5Gbps(in+out), Max. 7Gbps IPv6 average traffic: 1Gbps(in+out), Max. 1.5Gbps Data exchange: over 7PB/year 2

Motivation Know clearly for: Who, When, What, Where, Why in the network traffic Network optimization--Reliability/Performance/Efficiency Historical trends for strategic planning  Network security analysis .... WHERE Who: IP addresses / Users When: on which time What: protocols, ports, data traffic, applications, etc. Where: flow direction, which countries/regions (max.volume) Why: malicious attacks or normal data transfer 3

Motivation--IHEP traffic statistics status Traffic statistics based on IP address range:Daya Bay、CMS、ATLAS Traffic statistics based on experimental data transfer system:PhEDEx Lack of overall fine-grained network traffic statistics and analysis Lack of user behavior analysis and intrusion detection in cyber security 4

Outline Motivation Platform design Function modules Future Plan Summary

General design principles Traffic of high-speed network can be captured without missing:10Gbps Traffic flow records should include the following elements: 5-Tuple(src_ip, src_port, dst_ip, dst_port, protocal) Large amount of historical data can be stored and queried efficiently: at least 1 year raw data Data analysis module should be extensible (add or remove analysis plugins) Flexible and friendly user interface 5

Architecture Data Sources + Data preprocessing(ingest & fusion)+ Storage + Data Query + Graphic Display Principle: loose coupling between layers--extensible 6

Key tool— pmacct Open source software A small set of multi-purpose passive network monitoring tools which can account, classify, aggregate, replicate and export forwarding-plane data, ie. IPv4 and IPv6 traffic; Collect data through: libpcap, Netlink/NFLOG, NetFlow v1/v5/v7/v8/v9, sFlow v2/v4/v5 and IPFIX Save data to backends including: Relational Databases, NoSQL databases, RabbitMQ, Kafka, memory tables, flat files 7

pmacct workflow Workflow: 8 » Receives sFlow/netflow/IPFIX data from devices » Network flow aggregate » Reduces data diversity » Precompiles statistics -> reduces amount of data » Looks like RRD (round robin database) » Resolution of statistics » sFlow 15 seconds » IPFIX 5 minutes » Output with tab-spaced/CSV/Avro/JSON format 8

Outline Motivation Platform design Function modules Future Plan Summary

Realization: Function modules Data acquisition:aggregate, classify--pmacct Data storage:Message broker(kafka)+Database (MySQL/MongoDB) Data analysis: Spark cluster according to GeoIP database Analysis Layer Data presentation:web service + Echarts 9

Data Acquisition--pmacct Data source: collected from border router of IHEP Key points: PMACCT server collects data through libpcap (according to the network device ) Mirror port: avoids the impact on the performance of network devices Data preprocessing purpose:improving the efficiency of data storage and data reading PMACCT configuration:aggregation, filtering, classification aggregation:5mins rules:src_ip, dst_ip, src_port, dst_port, proto filtering:pcap_filter(collect IN/OUT source data) 10

Data storage: Kafka + MongoDB pmacct Kafka plugin Kafka-mongodb connector pmacct Kafka MongoDB Kafka fast, scalable, durable and distributed for big data: Millions of records within 5 minutes(IN+OUT) The data within 1 week are preserved MongoDB Distributed and document-oriented database(15G/month) With high insert and query performance, the average insert speed is up to 7000 records/s Source data are saved as: pmacct_IN_topic pmacct_OUT_topic 11

Data analysis Processing Storage Open-source & cluster-computing framework: Spark cluster. Task: timed cron jobs are set to calculate IN/OUT cumulative traffic between IHEP and domestic/international IP addresses within 10mins, 1hour, 1day separately. GeoLite2 database is used to classify the regional information of the traffic. Connector task scheduling source data results MongoDB Processing Storage 12

Data analysis—processing efficiency Flows/ Records Last 24 hours Time(s) The amount of data records every 10 mins (IN): 200,000 – 900,000 Processing time is stable: less than 200 seconds Computing source can be increased for larger volumes of data 13

Data Presentation International/National/Total traffic Daily/Weekly/Monthly/Yearly 14

Outline Motivation Platform design Function modules Future Plan Summary

Future Plan Data Analysis Display Rule-based network intrusion detection module will be added to identify the malicious action in network(DDoS attack, Scans,Worms) User behavior analysis(P2P Apps, Botnets) Display GeoIP plugin will be used to display regional traffic data on a map. 15

Summary A framework with network traffic data acquisition, storage, analysis and graphic interface has been finished. The network traffic statistics based on IP prefix and GeoLite2 database has been realized. Network security detection plugin and network traffic statistics display on map are under developing. 16

Thank you for your attention!