1
Network Data Visualization
Neal Patwari, Assistant Professor, Dept. of Electrical & Computer Engineering
http://span.ece.utah.edu
CS 5480/6480 - Computer Networks, 14 October 2007
2
Slide 2 (Computer Networks, October 16, 2007, © 2007): Network Data Measurement
[Figure: network of sensors S1 through S9]
Unlimited measurements, limited communication
3
Slide 3: Internet Monitoring
Detect ‘zero-day’ attacks on the Internet
Attacks change the distribution of traffic
Collaboration
Abilene backbone network: 11 routers
Flow features: packet size, protocol, port
$y_{t,l}(s_i, d_i, p_i)$
4
Slide 4: Work: Network Visualization
Spatial view: network connectivity / geography (Skitter plot, CAIDA)
Temporal view: traffic by feature vs. time (FlowScan, D. Plonka [1])
[1] Plonka, D. "FlowScan: A Network Traffic Flow Reporting and Visualization Tool". In Proc. LISA 2001.
5
Slide 5: Internet Monitoring
Relationships
–Temporal
–Spatial
–Feature
Abilene backbone network: 11 routers
[Figure: per-router plots of value vs. time]
6
Slide 6: What data? Or distances?
Assume high-dimensional measurements
–at nodes i and j, in range 1…n
–vectors $v_i$ and $v_j$
We calculate a distance $\delta_{ij} = \| v_i - v_j \|$
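A minimal sketch of the distance calculation on this slide, assuming the measurement vectors are plain lists of numbers and the norm is Euclidean. The function name `pairwise_distances` and the sample vectors are hypothetical, not part of Map-tools.

```python
import math


def pairwise_distances(vectors):
    """Return the n x n matrix of Euclidean distances ||v_i - v_j||."""
    n = len(vectors)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(vectors[i], vectors[j])
            dist[i][j] = dist[j][i] = d  # distances are symmetric
    return dist


# Hypothetical measurement vectors at three nodes.
v = [[3.0, 0.0, 0.0], [0.0, 4.0, 0.0], [3.0, 4.0, 0.0]]
delta = pairwise_distances(v)
print(delta[0][1])  # 5.0
```

Only the distances are passed on to the embedding step, so the dimension of the original vectors does not matter downstream.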
7
Slide 7: Outline
Manifold Learning and Dimension Reduction
Map-tools Operation
Map-tools Example
8
Slide 8: Manifold Learning: Introduction
Extract low-dimensional structure from high-dimensional data
Data may lie on a curved (but locally linear) subsurface
9
Slide 9: MDS/PCA: Project to Linear Subspace
Find the linear subspace (plane) that best fits the data points when they are projected onto it
Result: may be inaccurate when the data lie on a curved surface
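As a rough illustration of linear projection, the sketch below finds the first principal direction with power iteration on the sample covariance and projects the centered points onto it. It is a simplified stand-in for a full PCA or classical MDS routine; the function name, the fixed starting vector, and the sample points are assumptions made for the example.

```python
import math


def pca_first_component(points, iters=200):
    """Return (direction, scores) for the top principal component."""
    n, d = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(d)]
    centered = [[p[k] - mean[k] for k in range(d)] for p in points]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    # Power iteration; assumes the start vector is not orthogonal
    # to the top eigenvector and that the data are not all identical.
    w = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * w[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    scores = [sum(c[k] * w[k] for k in range(d)) for c in centered]
    return w, scores


# Hypothetical 2-D points lying exactly on the line y = 2x.
direction, scores = pca_first_component([(0, 0), (1, 2), (2, 4), (3, 6)])
```

For points on a straight line the one-dimensional projection loses nothing; for points on a curve it distorts distances, which is the inaccuracy the slide refers to.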
10
Slide 10: Manifold Learning: Preserve Neighbors
Keep local structure (nearest neighbors)
–Preserve distances or similarities
–Only use nearest neighbors
11
Slide 11: Manifold Learning: Preserve Neighbors
Isomap [1]:
–Preserve shortest-path distances in the kNN graph
–Find coordinates z to minimize [equation]
[1] J.B. Tenenbaum, V. de Silva, J.C. Langford, "A Global Geometric Framework for Nonlinear Dimensionality Reduction", Science, 22 Dec 2000.
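The first Isomap step can be sketched as follows: build a symmetric k-nearest-neighbor graph, then take shortest-path lengths in that graph as the distances to preserve. This is only an illustration under simple assumptions (Euclidean input, Floyd-Warshall for shortest paths, hypothetical function name and sample points); the second step, embedding those distances, is omitted.

```python
import math


def knn_geodesic_distances(points, k):
    """Shortest-path distances in a symmetric kNN graph."""
    n = len(points)
    g = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        g[i][i] = 0.0
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            d = math.dist(points[i], points[j])
            g[i][j] = g[j][i] = d  # keep the edge in both directions
    # Floyd-Warshall all-pairs shortest paths.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g


# Hypothetical points along an L-shaped path.
pts = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
geo = knn_geodesic_distances(pts, k=2)
straight = math.dist(pts[0], pts[4])
```

Here the graph distance between the two ends of the path is 4, while the straight-line distance is about 2.83; the graph distance follows the curve that the points lie on.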
12
Slide 12: Manifold Learning: Preserve Neighbors
Laplacian eigenmaps (LE):
–Preserve similarity, i.e., inverse distance
–Locally Linear Embedding (LLE) [1], Hessian-based LLE
–Find coordinates z to minimize [equation]
[1] S.T. Roweis and L.K. Saul, "Nonlinear Dimensionality Reduction by Locally Linear Embedding", Science, 22 Dec 2000.
13
Slide 13: Manifold Learning: Preserve Neighbors
Weighted multi-dimensional scaling (WMDS):
–Preserve weighted distances (weight = 0 for non-neighbors)
–Find coordinates z to minimize [equation], with a weight for each pair (i,j) and a prior weight for each i
–Non-increasing cost via majorization; computation O(k)
–Local communication, measurement
[12] J. Costa, N. Patwari, A.O. Hero III, "Distributed Weighted Multidimensional Scaling for Node Localization in Sensor Networks", IEEE/ACM Trans. Sensor Networks, Feb. 2006.
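The cost equation did not survive in this transcript. The sketch below evaluates a weighted stress of the general kind used in weighted MDS: a weighted squared error between measured and embedded distances, plus a weighted penalty for moving away from prior coordinates. The exact form and scaling in [12] may differ; the names `weights`, `prior`, and `prior_weights` and all sample values are assumptions made for illustration.

```python
import math


def wmds_cost(coords, delta, weights, prior=None, prior_weights=None):
    """Weighted stress plus an optional prior-coordinate penalty."""
    n = len(coords)
    cost = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = weights[i][j]
            if w == 0.0:
                continue  # non-neighbors contribute nothing
            d = math.dist(coords[i], coords[j])
            cost += w * (delta[i][j] - d) ** 2
    if prior is not None:
        for i in range(n):
            cost += prior_weights[i] * math.dist(prior[i], coords[i]) ** 2
    return cost


# Hypothetical example: nodes 1 and 2 are not neighbors (weight 0),
# so their (arbitrary) measured distance is ignored.
truth = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
delta = [[0.0, 3.0, 4.0], [3.0, 0.0, 9.9], [4.0, 9.9, 0.0]]
weights = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
moved = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]

cost_truth = wmds_cost(truth, delta, weights)
cost_moved = wmds_cost(moved, delta, weights)
cost_prior = wmds_cost(moved, delta, weights, prior=truth,
                       prior_weights=[0.5, 0.5, 0.5])
```

An iterative majorization method lowers a cost of this kind at every step; because each node's terms involve only its neighbors, the update can be computed locally.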
14
Slide 14: Manifold Learning Localization Algs.
                        | MDS-MAP [11] or Isomap          | WMDS [12]                           | LEAN [13,14]
Distance or Similarity? | Distance                        | Distance                            | Similarity
Cost to Minimize        | [equation]                      | [equation]                          | [equation]
Algorithm Basis         | Eigen-decomposition             | Iterative, distributed majorization | Eigen-decomposition
Notes                   | Sensitive to large range errors | Can incorporate prior info          | Natural for connectivity
[11] Y. Shang, W. Ruml, Y. Zhang, M.P.J. Fromherz, "Localization from mere connectivity", in Mobihoc ’03, June 2003, pp. 201–212.
[13] N. Patwari, A.O. Hero III, "Adaptive neighborhoods for manifold learning-based sensor localization", IEEE SPAWC 2005, June 2005.
[14] N. Patwari, A.O. Hero III, J. Costa, "Learning Sensor Location from Signal Strength and Connectivity", (to appear) Springer edited volume, Radha Poovendran, Cliff Wang, and Sumit Roy, eds.
15
Slide 15: Analogy: Distance / Similarity
Distance-based algorithms: springs have natural length = measured distance, and can push or pull.
Similarity-based algorithms: rubber bands have thickness (similarity), and can only pull. Constraint: average distance from center ≠ 0.
Mesh is rotated (and scaled) to match a priori coordinate information.
16
Slide 16: Two Perspectives on One Solution
Equivalent problems:
–Find coordinates for a sensor’s data
–Find the location of the sensor
Figure 4.6: The intrinsic geometric structure (represented using Isomap, K=6) of a sequence of 64x64 pixel images of a face rendered with different poses and lighting directions. [6]
[Figure labels: Right, Left, Camera, High, Low]
[6] J.B. Tenenbaum, V. de Silva, J.C. Langford, "A Global Geometric Framework for Nonlinear Dimensionality Reduction", Science, 22 Dec 2000.
17
Slide 17: Outline
Manifold Learning and Dimension Reduction
Map-tools Operation
Map-tools Example
18
Slide 18: Tool Operation
Detailed installation procedure is online: http://span.ece.utah.edu/pmwiki/pmwiki.php?n=Main.Map-tools
Installation requirements
–wnlib (Will Naylor & Bill Chapman): a C code library for computation and data structures
–Map-tools code
For use with network data
–Flow-tools: Mark Fullmer’s tools for processing NetFlow data
–NetFlow (Cisco): saves 1 of every N packet headers
19
Slide 19: Map-tools operation
1. NetFlow measurement files
2. Scripts using flow-tools (sensorRouter, sensorPort, and sensorTime) produce sensor data vectors $\{v_i\}$ for $i = 1 \ldots N$
3. spl2dist produces sensor data distances $\{\delta_{i,j}\}$ for $(i,j) \subset \{1 \ldots N\}^2$
4. wmds, given prior coordinate information $\{x_i\}$ and $r_i$ for $i = 1 \ldots N$, produces sensor coordinates $\{x_i\}$ for $i = 1 \ldots N$
5. coords2eps produces the sensor map graphic
20
Slide 20: NetFlow Data Storage
Abilene has directory structure
$NETFLOW/
  ATLAng  CHINng  ….  WASHng
    2005  2006  2007
      2006-01  2006-02  …  2006-12
        2006-02-01  2006-02-02  …  2006-02-28
          24 hours of 5 minute files
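A small sketch of how a script might locate one day of data in this layout. The nesting (router, year, year-month, year-month-day) is read off the slide; the helper name `netflow_day_dir` and the root path `/data/netflow` are hypothetical, and the names of the 5-minute files themselves are not given in the slides, so they are not constructed here.

```python
from datetime import date
from pathlib import PurePosixPath


def netflow_day_dir(root, router, day):
    """Directory holding one router's 5-minute files for one day."""
    return (PurePosixPath(root)
            / f"{router}ng"          # e.g. CHINng
            / f"{day:%Y}"            # 2006
            / f"{day:%Y-%m}"         # 2006-02
            / f"{day:%Y-%m-%d}")     # 2006-02-01


# 24 hours of 5-minute files per day.
FILES_PER_DAY = 24 * 60 // 5

day_dir = netflow_day_dir("/data/netflow", "CHIN", date(2006, 2, 1))
print(day_dir)  # /data/netflow/CHINng/2006/2006-02/2006-02-01
```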
21
Slide 21: Example Operation: sensorRouter
Usage:   sensorRouter [year-mo-da] [hr:mn] [reportNum] [columnSelected] {filterString}
Example: sensorRouter 2005-01-06 17:55 8 1 > temp.sdat
Report Numbers:
5 UDP/TCP destination port
6 UDP/TCP source port
7 UDP/TCP port
8 Destination IP
9 Source IP
10 Source/Destination IP
11 Source or Destination IP
12 IP protocol
16 IP Next Hop
17 Input interface
18 Output interface
19 Source AS
20 Destination AS
21 Source/Destination AS
22 IP ToS
23 Input/Output Interface
24 Source Prefix
25 Destination Prefix
26 Source/Destination Prefix
Column Numbers: 1 Flows, 2 Octets, 3 Packets.
22
Slide 22: Example Operation: temp.sdat
#date 2005-01-06
#time 17:55
#command flow-stat
#arguments -f9-S1
#flow-filter-args //
#begin ATLA
130.91.40.0 2542
130.14.24.0 1675
171.66.120.0 1373
207.68.176.0 1037
207.68.168.0 1027
198.32.152.0 857
207.46.248.0 772
128.2.192.0 756
199.77.128.0 735
160.36.56.0 731
64.4.16.0 724
169.229.48.0 683
129.241.56.0 650
152.2.208.0 623
…
#end ATLA
#begin CHIN
140.113.144.0 65312
140.135.8.0 25459
207.68.176.0 4222
207.68.168.0 3658
207.46.248.0 3032
128.208.128.0 2802
171.66.120.0 2686
64.4.16.0 2585
207.46.104.0 2363
64.4.56.0 2149
204.179.120.0 2140
130.14.24.0 1847
195.148.248.0 1800
64.4.48.0 1726
65.54.192.0 1694
209.175.40.0 1678
128.223.216.0 1618
65.54.200.0 1597
…
#end CHIN
#begin WASH
130.91.40.0 13407
130.14.24.0 7420
128.230.32.0 5109
152.2.208.0 2693
136.142.88.0 2113
140.90.192.0 2037
128.112.136.0 2012
205.156.48.0 1619
207.68.176.0 1514
207.68.168.0 1389
128.208.128.0 1279
128.2.192.0 1236
136.142.136.0 1153
204.179.120.0 1153
129.241.56.0 1139
130.91.56.0 1068
193.204.72.0 1066
216.165.104.0 1039
152.3.136.0 979
207.46.104.0 948
…
#end WASH
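To make the file layout concrete, here is a hedged sketch of a reader for this kind of file. It assumes one record per line, `#key value` header lines, and `#begin NAME` / `#end NAME` markers around each sensor's block, as the listing suggests. The function `parse_sdat` is hypothetical and is not part of Map-tools; the sample text reuses a few values from the slide.

```python
def parse_sdat(text):
    """Parse header fields and per-sensor {key: count} tables."""
    header, sensors, current = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#begin "):
            current = line.split(None, 1)[1]
            sensors[current] = {}
        elif line.startswith("#end"):
            current = None
        elif line.startswith("#"):
            key, _, value = line[1:].partition(" ")
            header[key] = value
        elif current is not None:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip anything that is not "key count"
            sensors[current][parts[0]] = int(parts[1])
    return header, sensors


sample = """#date 2005-01-06
#time 17:55
#begin ATLA
130.91.40.0 2542
130.14.24.0 1675
#end ATLA
#begin CHIN
140.113.144.0 65312
140.135.8.0 25459
#end CHIN
"""
header, sensors = parse_sdat(sample)
```

Each sensor's table is a sparse, high-dimensional vector indexed by the report key (here an IP prefix), which is the form of data vector the distance step works on.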
23
Slide 23: Example Operation: spl2dist
cat temp.sdat | spl2dist > temp.dst
0 0.25219 0.085844 0.044975 0.070635 0.081589 0.06363 0.069737 0.07584 0.10461 0.075869
0.25219 0 0.25439 0.25471 0.25021 0.25306 0.25486 0.25274 0.25355 0.2608 0.26588
0.085844 0.25439 0 0.094413 0.029012 0.016077 0.092319 0.089151 0.084442 0.039092 0.12688
0.044975 0.25471 0.094413 0 0.081154 0.090056 0.058425 0.082722 0.076684 0.11365 0.07889
0.070635 0.25021 0.029012 0.081154 0 0.02096 0.080102 0.072976 0.072323 0.055431 0.11501
0.081589 0.25306 0.016077 0.090056 0.02096 0 0.089066 0.084622 0.081436 0.043225 0.1235
0.06363 0.25486 0.092319 0.058425 0.080102 0.089066 0 0.088622 0.041734 0.11924 0.10538
0.069737 0.25274 0.089151 0.082722 0.072976 0.084622 0.088622 0 0.085641 0.10708 0.086854
0.07584 0.25355 0.084442 0.076684 0.072323 0.081436 0.041734 0.085641 0 0.11318 0.1203
0.10461 0.2608 0.039092 0.11365 0.055431 0.043225 0.11924 0.10708 0.11318 0 0.13997
0.075869 0.26588 0.12688 0.07889 0.11501 0.1235 0.10538 0.086854 0.1203 0.13997 0
Row and column order: ATLA CHIN DNVR HSTN IPLS KSCY LOSA NYCM SNVA STTL WASH
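One quick way to read a matrix like this is to average each router's distance to all the others. The sketch below does that for the first four routers, using values copied from the slide. The averaging is only an illustrative heuristic and the function name is hypothetical; it is not what wmds computes.

```python
def mean_distance_per_sensor(names, dist):
    """Average distance from each sensor to all other sensors."""
    n = len(names)
    return {
        names[i]: sum(dist[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    }


# Upper-left 4 x 4 block of the matrix shown on the slide.
names = ["ATLA", "CHIN", "DNVR", "HSTN"]
dist = [
    [0.0, 0.25219, 0.085844, 0.044975],
    [0.25219, 0.0, 0.25439, 0.25471],
    [0.085844, 0.25439, 0.0, 0.094413],
    [0.044975, 0.25471, 0.094413, 0.0],
]
means = mean_distance_per_sensor(names, dist)
farthest = max(means, key=means.get)
print(farthest)  # CHIN
```

CHIN sits roughly 0.25 from every other router in this block, while the others are within about 0.1 of each other, which matches the anomaly noted for CHIN at this timestamp.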
24
Slide 24: Example Operation: wmds
cat temp.dst | wmds -n 11 -K 5 -p fourWeekJanAvg.f8.S1.K5.r10-3.crds -r 0.001 -w loess -ND > temp.crds
Output (three columns, one row per router):
ATLA   0.34765    -0.48442   2.3660
CHIN   0.87826    3.9062     11.460
DNVR   -0.15339   0.75528    7.2655
HSTN   0.25189    -1.058     6.7590
IPLS   -0.25142   0.25641    2.1818
KSCY   -0.093909  0.53271    5.6226
LOSA   -0.45259   -0.87555   6.7851
NYCM   0.80821    0.22546    4.1489
SNVA   -0.84686   -0.38275   5.1869
STTL   -0.73001   0.90408    7.8140
WASH   1.3737     -0.93036   8.0149
25
Slide 25: Example Operation: coords2eps
cat temp.crds | coords2eps -n 11 -m fourWeekJanAvg.f8.S1.K5.r10-3.crds -z -c abilenePrior.conn > temp.eps
Map legend: 4-week router average; current router coordinate; network connection; CITY = router name
26
Slide 26: Example Operation: Notes
Thursday, 6-Jan-2005 at 17:55 UTD: anomaly totaling 90,000 flows at the CHIN router
–Single-packet, 40-byte flows
–From two source IP addresses in Taiwan to a small range of destination IPs in Hungary
–Observed on CHIN and no other router
27
Slide 27: Outline
Manifold Learning and Dimension Reduction
Map-tools Operation
Map-tools Examples
28
Slide 28: Results: Sensor Map Example
Typical router map, 18-Jan 17:00 UTD
Sensors (routers) as positioned by dwMDS
Coordinates are normalized (flows), so they are unitless
Lines show physical Abilene links
Small dots (- - -) show distance from the 4-week mean coordinate
29
Slide 29: Maps Respond to Anomaly
Wed. 19-Jan 2005, 0:00-1:00 UTD
At 0:30 and 0:35: large network scan
–22,000 anomalous flows observed at STTL, DNVR, KSCY, IPLS, ATLA
–60-byte, TCP
–From a few Miss. State U. IPs, Src Port < 1024
–To a range of Microsoft IPs, Dest Port 113
30
Slide 30: Time Series: Small Change
[Figure: Abilene backbone total flows, 18-19 Jan ‘05, with the network scan marked]
31
Slide 31: 2-D Data: Large Change
Test with a multivariate t-test
[Figure annotations: network scan; 2: 45 kflow port scan from .tw to .dk; 3: 46 kflow port scan from .tw to .pl]
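The slides do not say which multivariate test was used. As one plausible reading, the sketch below computes a Hotelling-style T² statistic for a new 2-D map coordinate against a history of earlier coordinates: the squared Mahalanobis distance from the historical mean, scaled by n/(n+1) for a single new observation. The function name and the sample history are hypothetical, and turning T² into a p-value (via an F distribution) is left out.

```python
def hotelling_t2(history, x):
    """T^2 of one new 2-D point against a sample of past 2-D points."""
    n = len(history)
    mx = sum(p[0] for p in history) / n
    my = sum(p[1] for p in history) / n
    # Unbiased sample covariance entries.
    sxx = sum((p[0] - mx) ** 2 for p in history) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in history) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in history) / (n - 1)
    det = sxx * syy - sxy * sxy  # assumed nonzero (non-degenerate history)
    dx, dy = x[0] - mx, x[1] - my
    # (x - m)^T S^{-1} (x - m) using the closed-form 2 x 2 inverse.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return n / (n + 1) * maha


# Hypothetical history of a router's map coordinates.
history = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
t2_near = hotelling_t2(history, (0.1, 0.0))
t2_far = hotelling_t2(history, (2.0, 0.0))
```

A coordinate close to the historical mean gives a small statistic and a displaced one gives a large statistic. Using the 2-D map coordinate rather than a single total-flow count is what lets a scan that barely moves the total still stand out.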
32
Slide 32: More Examples
Sunday, 02-Jan-2005 at 3:00 UTD: A port scan occurs between 2:45 and 3:30, involving two source IP addresses that send a total of about 61,000 flows per 5 minutes. The traffic is measured only at CHIN, IPLS, DNVR, and KSCY. The flows come from source IPs 198.59.80.0 (unknown) and 140.113.200.0 (nctu.edu.tw), from port 48775, to destination IP 140.113.200.0 (du.se, Högskolan Dalarna, Sweden). The source AS number is zero. Almost all of the flows are single, 29-byte UDP packets to a wide range of destination ports. There are a few larger flows (100-300 kB) to ports 22, 53, 6667, and 6669. Because of the low traffic level (it is a Sunday and the day after New Year's Day), this traffic corresponds to 40% of the total number of flows.
33
Slide 33: More Examples
Wednesday, 5-Jan-2005 at 08:55 UTD: scheduled maintenance on the CHIN-IPLS link
During downtime, traffic reroutes, e.g., through WASH and ATLA
34
Slide 34: More Examples
Four-week summary map
Center dot shows the average for each ellipse
Dashed ellipses show the 1-σ range of each router
35
Slide 35: More Examples
Destination port map, 01-Jan-2005 3:35 UTD
Labels: dest port #
Past 1 hour of map history shown as dotted-line circles
Sensors attached to the top 30 dest ports measure flows per source IP
36
Slide 36: Conclusion
We can visualize measurements by their relationship with other measurements
–Space, time, feature
One way to do this:
–WMDS method, Map-tools