A year & a summer of June 21 2012 – August 31 2013.

Intro

The plan: use UW data to estimate operational parameters for an XRootd caching proxy.
– UW was chosen because it uses XRootd for internal access too … so we have all the monitoring data.
For a cross-check, I also analyzed:
– remote access to UW;
– jobs running at UW and reading from elsewhere.
I also separate by /store/XXX/ top-level directory (a sketch of this split follows below):
– user & group vs. all the rest (PhEDEx);
– but I have histograms for individual subdirectories, too.
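For reference, a minimal Python sketch of that /store top-level split. This is not the actual analysis code: the /store/<area>/... layout follows the usual CMS convention and the helper name is made up here.

```python
# Minimal sketch of the /store top-level split: "user" and "group" areas vs.
# everything else (PhEDEx-managed data). The path layout and function name
# are assumptions, not taken from the real analysis code.
def store_category(path):
    """Classify a file path by its /store top-level directory."""
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "store":
        return "other"
    return "user+group" if parts[1] in ("user", "group") else "phedex"

# e.g. store_category("/store/user/alice/file.root")     -> "user+group"
#      store_category("/store/data/Run2012A/AOD/f.root") -> "phedex"
```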

Data

Filtered out monitoring accesses (Brian & Dan).
Things seemed a bit weird:
– the majority of accesses read the whole file;
– a pronounced peak in read rate at 10 MB/s;
– a pronounced peak in average read request size at 128 MB.
So I also cut out accesses with (a sketch of the cuts follows below):
– duration < 100 s;
– bytes read < 1 MB.
This is what is shown in the histograms that follow … and it didn’t make the weirdness go away.
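A minimal sketch of those cuts in Python; the record field names (duration, bytes_read, is_monitoring) are assumptions, not the real CMS XRootd monitoring schema.

```python
# Sketch of the access-record cuts described on the slide above.
MIN_DURATION_S = 100      # drop accesses shorter than 100 s
MIN_BYTES_READ = 1 << 20  # drop accesses that read less than ~1 MB

def passes_cuts(record):
    """True if an access record survives the tighter cuts."""
    return (record["duration"] >= MIN_DURATION_S
            and record["bytes_read"] >= MIN_BYTES_READ)

def filter_accesses(records):
    """Keep only non-monitoring accesses that pass the cuts."""
    for rec in records:
        if rec.get("is_monitoring", False):
            continue  # monitoring traffic is filtered out first
        if passes_cuts(rec):
            yield rec
```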

Number of file accesses

Total number of records: 107,586,853 (all CMS XRootd monitoring).

                      weed out monitoring    tighter cuts: 100 s, 1 MB
UW  -> UW                  56,528,302                 17,108,892
UW  -> XXX                  1,009,351                    694,696
XXX -> UW                     855,761                    702,402

Wow, the tighter cuts bring the UW -> UW sample down to about a third! Somebody at UW is doing funny things.
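Just re-doing the arithmetic on the table above (no new data), to make the “down to a third” remark explicit:

```python
# Survival fractions after the tighter cuts, from the numbers in the table.
samples = {
    "UW -> UW":  (56_528_302, 17_108_892),
    "UW -> XXX": (1_009_351, 694_696),
    "XXX -> UW": (855_761, 702_402),
}
for name, (monitoring_only, tighter) in samples.items():
    print(f"{name}: {tighter / monitoring_only:.2f}")
# UW -> UW : 0.30  (the "down to a third")
# UW -> XXX: 0.69
# XXX -> UW: 0.82
```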

Hourly traffic & its histogram

Just to give you an impression of scale: UW as a whole serves about 1 GByte/s.
This is “cumulative” information, obtained by summing up individual transfers.
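A minimal sketch of how such a cumulative hourly-traffic curve can be built by summing individual transfer records. The field names (start_time, bytes_read) are assumptions, and binning each transfer by its start time is an approximation.

```python
from collections import defaultdict

def hourly_traffic(records):
    """Sum bytes read per wall-clock hour, binned by the access start time."""
    traffic = defaultdict(int)
    for rec in records:
        hour_bin = int(rec["start_time"] // 3600)  # seconds since epoch -> hour
        traffic[hour_bin] += rec["bytes_read"]
    # dividing each value by 3600 gives an average rate in bytes/s for that hour
    return traffic
```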

Fraction of file read

Note the log scale: the bin around 1 is about 100 times higher than the rest, and there are 200 bins from 0 to 2.
So … about 50% of accesses read the file in full! I thought this was way lower …
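A sketch of the histogram itself, under the same assumed field names (bytes_read, file_size); the 200 bins over [0, 2) match the slide, and fractions above 1 come from parts of a file being read more than once.

```python
# Fraction-of-file-read histogram: 200 bins from 0 to 2, as on the slide.
N_BINS, LO, HI = 200, 0.0, 2.0

def fraction_read_histogram(records):
    counts = [0] * N_BINS
    for rec in records:
        frac = rec["bytes_read"] / rec["file_size"]  # >1 means re-reads
        if LO <= frac < HI:
            counts[int((frac - LO) / (HI - LO) * N_BINS)] += 1
    return counts
```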

Average read rate

This isn’t so dramatic … but the highest peak is at 10 MB/s! What are those jobs? Skimming?
Funny thing: this peak is not there for the UW -> XXX sample (but it is there for XXX -> UW), so it almost seems like a UW peculiarity.
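The quantity histogrammed here is, presumably, just the total bytes read divided by the access duration; a one-line sketch with assumed field names:

```python
def average_read_rate_mbps(record):
    """Average read rate of one file access, in MB/s (duration in seconds)."""
    return record["bytes_read"] / max(record["duration"], 1) / 1e6
```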

Average request size

Again a well-pronounced peak (about an order of magnitude above the rest) at 100 – 128 MB request size.
I assume this is the XRootd maximum. Do we really manage to make requests this big?
This is a bit of a pain for a caching proxy …
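Similarly, the average request size per access is presumably the total bytes read divided by the number of read requests issued; field names are again assumptions.

```python
def average_request_size_mb(record):
    """Average size of a single read request within one access, in MiB."""
    n_reads = max(record["read_requests"], 1)  # guard against zero requests
    return record["bytes_read"] / n_reads / (1024 * 1024)
```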

Proto-conclusions & confusions

All three of these observations are a bit of bad news for both the caching proxy and less disk-full T2 operation.
100 Gbps networks are coming to the rescue – but this will not be a free lunch, based on all the issues Alja and I see with proxy operation on a 1 Gbps node.
I’m a bit confused about the high per-job data rates compared to the average output of the whole of UW, about 1 GB/s.