Diagram labels: observation, theory, reality, models, experiment, instruments, virtual reality, predictions, test. Other disciplines are similar: whole genomes, satellite maps, sensor networks, etc.
To understand a complex reality, we need complex models. To verify complex models, we need a lot of data (a large statistical sample).
Virtual Observatories: a few key elements. Modern web and database technologies to standardize and share data; advanced indexing techniques for multi-terabyte, multidimensional archives; distributed computing for analysis and modeling (move computation to data); visualization. Same challenges for different disciplines: we can share the tools and solutions.
Internet: although man-made, there is no “blueprint”. “Astronomical” number of components, complex non-linear interactions. We need methods similar to those of the natural sciences to understand its behavior: we need to observe / experiment, and to model. Future internet: self-aware, self-managing, self-healing …
Measurement/experiment facilities: PlanetLab / OneLab. A network of open computers distributed across the world and available for the development of new network services. A realistic platform for trial deployment and experimentation, with services such as distributed storage, network mapping, peer-to-peer systems, distributed hash tables and query processing. OneLab2: EU Integrated Project, 29 participants Europe-wide + 1 from Japan, budget 7.5 M€.
Measurement/experiment facilities: ETOMIC. An open measurement infrastructure in Europe for carrying out high temporal resolution (~10 nanosecond), globally synchronized, active measurements. Provides a high-resolution, spatially extended, dynamic picture of fast changes in the network traffic.
ETOMIC hardware. Server: PC architecture, Linux; Endace DAG 3.6 GE card or ARGOS FPGA measurement card with packet sending capability (packet offset ~60 ns); GPS antenna for global time synchronization. New low-cost version (300 €): based on a Blackfin microcontroller; dedicated IP packet time stamping module (< 1 μs); can be called as SOAP web services or from stored procedures.
Web interface: experiment bundle
Scheduling experiments: time slot reservation in a calendar system for the experiments.
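Time slot reservation comes down to an interval-overlap check per measurement node. The following is a minimal sketch under that assumption, not the scheduler of the actual system; the `Calendar` class, its methods, and the node names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Calendar:
    """Hypothetical per-node reservation calendar (illustration only)."""
    slots: dict = field(default_factory=dict)  # node name -> list of (start, end)

    def is_free(self, node, start, end):
        # Two half-open intervals [s, e) and [start, end) overlap
        # exactly when s < end and start < e.
        return all(not (s < end and start < e) for s, e in self.slots.get(node, []))

    def reserve(self, nodes, start, duration):
        """Book every node for [start, start + duration), or none of them."""
        end = start + duration
        if not all(self.is_free(n, start, end) for n in nodes):
            return False
        for n in nodes:
            self.slots.setdefault(n, []).append((start, end))
        return True
```

Booking all nodes or none matters for synchronized active measurements, where an experiment is only meaningful if every participating node is available in the same slot.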
Science archive, based on CasJobs. CasJobs: database + batch + web services (developed for SDSS; Alex Szalay, William O’Mullane, Nolan Li, María Nieto-Santisteban, Ani Thakar, Jim Gray).
Network Measurement Virtual Observatory. Most network measurement projects: use a single dedicated infrastructure; scan only narrow sub-segments; analyze a limited set of network characteristics; are centralized and separated from each other. Key idea: try to interconnect separate measurement efforts (large-scale behavior, long-term evolution).
The challenge: seamless integration of multiple tools and data sources. We need easy-to-use standards for data access/exchange; searchable metadata, organized access; working prototypes, use cases. Unified interface: a standardized data model (NetXML); a wrapper for each platform (to convert its data into NetXML); a standardized way of reaching services (Web services). Mediator: metadata registry; service registry (e.g. time slice service); joint queries: query broker. Many ideas and concepts come from the astronomical VO.
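The wrapper idea can be illustrated with a small converter that turns one platform-specific delay sample into a common XML record and reads it back. This is only a sketch: the element and attribute names below are invented for the example and are not the real NetXML schema, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def wrap_delay_record(platform, src, dst, timestamp, delay_ns):
    """Convert one platform-specific delay sample into a common XML record.

    Element and attribute names are invented for illustration;
    they are NOT the real NetXML schema.
    """
    root = ET.Element("measurement", {"platform": platform, "type": "one-way-delay"})
    ET.SubElement(root, "source").text = src
    ET.SubElement(root, "destination").text = dst
    ET.SubElement(root, "timestamp").text = timestamp
    ET.SubElement(root, "delay", {"unit": "ns"}).text = str(delay_ns)
    return ET.tostring(root, encoding="unicode")


def parse_record(xml_text):
    """Read the common record back, independent of the originating platform."""
    root = ET.fromstring(xml_text)
    return {
        "platform": root.get("platform"),
        "src": root.findtext("source"),
        "dst": root.findtext("destination"),
        "timestamp": root.findtext("timestamp"),
        "delay_ns": int(root.findtext("delay")),
    }
```

Once every platform has such a wrapper, client tools only need to understand the common record, not each platform's native format.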
Architecture diagram: three archives (traceroute archive, delay archive, BGP archive). Each archive has a database, processing functions, stored procedures, search tools, visualization tools, a Web service interface and a graphical user interface, and exchanges SQL data subsets and processed data (XML, SOAP, WSDL). Mediator/query broker (GUI, WS).
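A joint query across archives can be sketched with two in-memory SQLite databases standing in for the delay and traceroute archives. Everything here is an assumption made for illustration: the table layouts, the `time_slice` and `joint_query` functions, and the use of SQLite instead of the archives' real database and Web service interfaces.

```python
import sqlite3


def make_archive(table, columns, rows):
    """Create an in-memory stand-in for one measurement archive."""
    db = sqlite3.connect(":memory:")
    db.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    marks = ", ".join("?" for _ in columns)
    db.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)
    return db


def time_slice(db, table, t_start, t_end):
    """Each archive answers with a small SQL subset, not its bulk data."""
    cur = db.execute(
        f"SELECT * FROM {table} WHERE ts >= ? AND ts < ?", (t_start, t_end)
    )
    return cur.fetchall()


def joint_query(delay_db, trace_db, t_start, t_end):
    """Broker: pair each delay sample with the hop count seen on the same path."""
    hops = {
        (src, dst): n
        for _ts, src, dst, n in time_slice(trace_db, "traceroute", t_start, t_end)
    }
    return [
        (src, dst, delay_ns, hops[(src, dst)])
        for _ts, src, dst, delay_ns in time_slice(delay_db, "delay", t_start, t_end)
        if (src, dst) in hops
    ]
```

The broker only ever sees the filtered time slices, which is the point of moving the selection to the archive side.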
Use cases: what do we measure? One-way delay (60 ns resolution); tracking topology changes; available bandwidth meter; transport protocol testing; queuing delay tomography; geolocalization; …
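With globally synchronized clocks, a one-way delay is simply the receive timestamp minus the send timestamp. A common way to estimate the queuing part, used here as an assumption rather than as the method of the talk, is to subtract the smallest delay observed on the path, which is taken as the baseline without queuing. Function names are hypothetical; timestamps are integers in nanoseconds.

```python
def one_way_delays(sent_ns, received_ns):
    """Per-packet one-way delay in ns; both clocks are assumed GPS-synchronized."""
    if len(sent_ns) != len(received_ns):
        raise ValueError("need one receive timestamp per sent packet")
    return [rx - tx for tx, rx in zip(sent_ns, received_ns)]


def queuing_delays(delays_ns):
    """Subtract the minimum delay, taken as the no-queue baseline of the path."""
    base = min(delays_ns)
    return [d - base for d in delays_ns]
```

For example, send times `[0, 1000, 2000]` and receive times `[5000, 6300, 7060]` give delays `[5000, 5300, 5060]` and queuing estimates `[0, 300, 60]`.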
Geolocalization (figure: nodes A and B)
Useful for: content distribution, user identification, network diagnostics, language selection, … WHOIS and DNS databases are not reliable. Method: probabilistic triangulation from landmark nodes. More landmarks: more precise localization, but more computation.
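The triangulation idea can be sketched as a maximum-likelihood search over a latitude/longitude grid. This is a simplified stand-in, not necessarily the method used in the talk: the Gaussian error model, its `sigma_km` width, the fixed propagation speed of about 2/3 of the speed of light, and all function names are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0
SPEED_KM_PER_MS = 200.0  # assumed effective propagation speed (about 2/3 c)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def log_likelihood(lat, lon, observations, sigma_km=100.0):
    """Sum of Gaussian log-likelihood terms (assumed error model) over landmarks."""
    total = 0.0
    for lm_lat, lm_lon, delay_ms in observations:
        expected = delay_ms * SPEED_KM_PER_MS      # distance implied by the delay
        actual = great_circle_km(lat, lon, lm_lat, lm_lon)
        total += -0.5 * ((actual - expected) / sigma_km) ** 2
    return total


def locate(observations, lat_range, lon_range, step=0.5):
    """Return the grid point (lat, lon) with the highest likelihood."""
    best, best_ll = None, -math.inf
    lat = lat_range[0]
    while lat <= lat_range[1]:
        lon = lon_range[0]
        while lon <= lon_range[1]:
            ll = log_likelihood(lat, lon, observations)
            if ll > best_ll:
                best, best_ll = (lat, lon), ll
            lon += step
        lat += step
    return best
```

Each observation is a `(landmark_lat, landmark_lon, one_way_delay_ms)` tuple. The cost per candidate point grows linearly with the number of landmarks, which is the trade-off between precision and computation noted above.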
Hierarchical Triangular Mesh: hierarchical subdivision of spherical triangles, represented as a quad tree. Developed for indexing the sky (SDSS; A. Szalay, G. Fekete). Standalone library + SQL Server integration. Quick routines for intersecting spherical polygons: speed up geolocalization calculations.
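The quad-tree subdivision can be sketched as follows: start from the eight spherical triangles of an octahedron and repeatedly split the triangle containing the point into four children through its edge midpoints. This is an illustrative reimplementation, not the SDSS library; the root names and child numbering follow the usual HTM convention as far as recalled here, so the resulting names may not match the official library's identifiers.

```python
import math

# Octahedron vertices and the eight root triangles,
# each listed counterclockwise as seen from outside the sphere.
V = [(0, 0, 1), (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, -1)]
ROOTS = {
    "S0": (1, 5, 2), "S1": (2, 5, 3), "S2": (3, 5, 4), "S3": (4, 5, 1),
    "N0": (1, 0, 4), "N1": (4, 0, 3), "N2": (3, 0, 2), "N3": (2, 0, 1),
}


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _midpoint(a, b):
    """Midpoint of the great-circle arc between a and b, back on the unit sphere."""
    m = (a[0] + b[0], a[1] + b[1], a[2] + b[2])
    n = math.sqrt(_dot(m, m))
    return (m[0] / n, m[1] / n, m[2] / n)


def _inside_score(tri, p):
    """Smallest of the three edge tests; non-negative when p lies in the triangle."""
    a, b, c = tri
    return min(_dot(_cross(a, b), p), _dot(_cross(b, c), p), _dot(_cross(c, a), p))


def to_unit_vector(lat_deg, lon_deg):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))


def htm_name(lat_deg, lon_deg, depth):
    """Name of the triangle containing the point: a root name plus `depth` digits."""
    p = to_unit_vector(lat_deg, lon_deg)
    roots = {name: tuple(V[i] for i in idx) for name, idx in ROOTS.items()}
    # Picking the best score (instead of requiring score >= 0) keeps points
    # on a shared edge from being rejected by rounding errors.
    name = max(roots, key=lambda n: _inside_score(roots[n], p))
    tri = roots[name]
    for _ in range(depth):
        v0, v1, v2 = tri
        w0, w1, w2 = _midpoint(v1, v2), _midpoint(v0, v2), _midpoint(v0, v1)
        children = [(v0, w2, w1), (v1, w0, w2), (v2, w1, w0), (w0, w1, w2)]
        best = max(range(4), key=lambda i: _inside_score(children[i], p))
        name += str(best)
        tri = children[best]
    return name
```

Because each name is a prefix of the names of its descendants, a region of the sphere maps to ranges of names, which is what makes the mesh usable as a database index.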
Collaborators, acknowledgements. P. Mátray, P. Hága, S. Laki, J. Stéger, B. Hullár, G. Vattay (Eötvös University). Collaborations: Universidad Autonoma de Madrid, Universidad Publica de Navarra, Ericsson Research, Tel Aviv University. This work was partially supported by the National Office for Research and Technology (NAP 2005/ KCKHA005), the EU ICT Moment (#215225) and the OneLab2 Integrated Project (#224263).
Happy birthday, Sanyi!
Plans (IP Geolocation, © Sándor Laki): probabilistic geolocation using the distribution of the velocity of signal propagation; pre-clustering delay measurements.
New kind of science: mega-surveys. New technology has allowed new kinds of surveys. All-sky surveys: GALEX, SDSS, IRAS, WMAP, NVSS, … Multi-wavelength: radio, infrared, optical, UV, X-ray, gamma-ray; gravitational waves, neutrinos & cosmic particles. Time domain: Pan-STARRS, LSST, GAIA, … Collecting data in (public) digital archives. Instead of targeting special objects: observe everything. New challenge: store, organize, analyze huge amounts of data. Virtual Observatories / e-science.
Main goals. Create large archives to: share available measurement data & analysis results; provide easy-to-use “online” network data analysis tools. Integrate existing measurement and monitoring infrastructures. Toward a common and open platform: standardisation of network measurement data; design and implementation of a system to share network data.
Network Measurement Virtual Observatory (nmVO). Store & share raw data: joint analysis of different types of measurement data; reanalysis (with new statistical methods); reference data (historical comparison). Share analysis tools: server-side processing simplifies client applications; no need to transfer bulk data packages (online processing). Exchange data in a standardized way: existing efforts (OGF, IETF, KML). Database, Web services, netXML.
Science archive, based on CasJobs (diagram: database, Web service interface, stored procedure)