Large Scale Computing Systems Data Computations Infrastructures A new ERA: BIG x 3 Revisit: algorithms, architectures, distributed systems, parallel computing,


1 Large Scale Computing Systems
Data, Computations, Infrastructures
A new era: BIG x 3
Revisit: algorithms, architectures, distributed systems, parallel computing, scalable DBs

2 Big Data
‘Moore's’ Law: data doubles every 18 months
90% of today’s data was created in the last 2 years
– Facebook: 20 TB/day compressed
– CERN/LHC: 40 TB/day (15 PB/year)
– NYSE: 1 TB/day
Many more
– Web logs, financial transactions, medical records, etc.
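The daily and yearly CERN/LHC figures above can be cross-checked with a short unit conversion. This is a minimal sketch, assuming decimal units (1 PB = 1000 TB) and a 365-day year; the function name is illustrative and not part of the slides.

```python
def tb_per_day_to_pb_per_year(tb_per_day, days_per_year=365):
    """Convert a daily rate in TB to a yearly volume in PB (decimal units)."""
    return tb_per_day * days_per_year / 1000.0


# Rates quoted on the slide, in TB/day.
cern_pb_per_year = tb_per_day_to_pb_per_year(40)
facebook_pb_per_year = tb_per_day_to_pb_per_year(20)

print(f"CERN/LHC: {cern_pb_per_year:.1f} PB/year")
print(f"Facebook: {facebook_pb_per_year:.1f} PB/year")
```

Under these assumptions 40 TB/day comes to 14.6 PB/year, which is consistent with the rounded 15 PB/year on the slide.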

3 Data Growth
1 EB (exabyte) = 1000 PB (petabytes): US mobile data traffic last year (2010)
0.8 ZB (zettabytes) = 800 EB: entire global mass of digital data in 2009, according to IDC
35 ZB (zettabytes): IDC’s forecast for all digital data in 2020
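The two IDC figures imply an average doubling time, which can be worked out directly. This is a rough sketch that assumes steady exponential growth between 2009 and 2020; the helper name is hypothetical.

```python
import math


def doubling_time_months(start, end, years):
    """Doubling time implied by growing from `start` to `end` over `years`,
    assuming constant exponential growth."""
    doublings = math.log2(end / start)
    return years * 12 / doublings


# 0.8 ZB in 2009 growing to a forecast 35 ZB in 2020.
months = doubling_time_months(0.8, 35.0, 2020 - 2009)
print(f"Implied doubling time: {months:.1f} months")
```

Under that assumption the forecast corresponds to a doubling roughly every two years, somewhat slower than the 18-month figure quoted on slide 2.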

4 MapReduce
A programming model
A software framework for writing applications that
– rapidly process vast amounts of data in parallel
– on large clusters of compute nodes
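The programming model can be illustrated with the classic word-count example. The sketch below runs in a single process and only mimics the map, shuffle, and reduce stages; a real framework would distribute each stage across cluster nodes. All function names are illustrative and do not belong to any particular MapReduce implementation.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


counts = word_count(["big data big compute", "big infrastructure"])
print(counts)
```

Because each map call looks at one record and each reduce call looks at one key, both stages can run in parallel on different nodes without sharing state.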

5 Cloud computing
Big Data pushes databases to their limits
NoSQL databases
– Horizontally scalable, schema-free, multi-datacenter data stores that can handle PBs of data
– Google’s BigTable, Facebook’s Cassandra, LinkedIn’s Voldemort, Amazon’s Dynamo, and many more
Cloud Computing
– Virtualized resources from distant data centers
– Elastic, “pay as you go” resource provisioning
– Easy resource manipulation through an API
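One common way to get horizontal scalability is to partition keys across nodes with consistent hashing, the approach used by Dynamo-style stores. The sketch below is a toy ring, not the code of any of the systems named above; the class name, node names, and the choice of 64 virtual nodes per server are assumptions made for illustration.

```python
import bisect
import hashlib


def _position(key):
    """Map a string to a point on the hash ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each server owns several points so that load spreads evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def node_for(self, key):
        if not self._ring:
            raise ValueError("ring is empty")
        # First ring point clockwise from the key's position, wrapping around.
        idx = bisect.bisect_right(self._ring, (_position(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # scale out by one server
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new node")
```

The point of the design is that adding a server only moves the keys that the new server takes over; the remaining keys stay where they are, so the cluster can grow without a full reshuffle.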

6 Big computations
Challenges for exascale computing:
– Scalability up to millions of cores
– Programmability (revisit traditional parallel programming models)
– Fault tolerance (with thousands or millions of nodes, several may fail every day)
– Low power consumption (maximize GFLOPS/W)
It’s not High-Performance Computing (HPC) anymore… it’s High-Efficiency Computing (HEC)
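The fault-tolerance point follows from simple arithmetic. The sketch below assumes independent node failures at a constant rate and a per-node mean time between failures (MTBF) of 3 years; that MTBF is a made-up figure for illustration, not a number from the slides.

```python
def expected_failures_per_day(num_nodes, node_mtbf_years):
    """Expected node failures per day, assuming independent failures
    at a constant rate of 1/MTBF per node."""
    mtbf_days = node_mtbf_years * 365
    return num_nodes / mtbf_days


# Hypothetical 3-year MTBF per node.
for nodes in (1_000, 100_000, 1_000_000):
    rate = expected_failures_per_day(nodes, node_mtbf_years=3)
    print(f"{nodes:>9} nodes: about {rate:.1f} failures/day")
```

Even with reliable individual nodes, the expected failure count grows linearly with machine size, so at scale failures are routine events that software has to handle.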

7 Exascale applications
(Huge) graph algorithms: shortest paths, PageRank, etc.
Computations on sparse matrices: the heart of scientific and engineering simulations
Regular grids: solving PDEs with millions of unknowns
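Sparse matrices and regular grids meet in the sparse matrix-vector product. The sketch below builds the standard second-difference matrix for a 1D grid (the tridiagonal stencil -1, 2, -1) in compressed sparse row (CSR) form and multiplies it by a vector. It is a small pure-Python illustration with hypothetical function names, not production HPC code.

```python
def laplacian_1d_csr(n):
    """Build the n x n tridiagonal matrix [-1, 2, -1] in CSR form."""
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        if i > 0:
            values.append(-1.0)
            col_idx.append(i - 1)
        values.append(2.0)
        col_idx.append(i)
        if i < n - 1:
            values.append(-1.0)
            col_idx.append(i + 1)
        row_ptr.append(len(values))  # where the next row starts
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x, touching only the stored nonzeros of A."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y


values, col_idx, row_ptr = laplacian_1d_csr(5)
y = csr_matvec(values, col_idx, row_ptr, [1.0] * 5)
print(y)
```

Storage and work are proportional to the number of nonzeros (3n - 2 here) rather than n squared, which is what makes problems with millions of unknowns tractable.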

8 Big Infrastructures
OS, architectures revisited
Virtualization
Cloud
Facilities - Datacenters
Distributed storage: 100s of PBs using commodity disks
HPC clusters: exascale computing using scalable ‘ingredients’

