© 2012 Unisys Corporation. All rights reserved. 1 Unisys Corporation. Proprietary and Confidential.


Presentation transcript:

1 © 2012 Unisys Corporation. All rights reserved. 1 Unisys Corporation. Proprietary and Confidential.

2 Data Technology Landscape Is Rapidly Evolving

Relational hegemony is over
–Disruptive data technologies abound
–Open source, new data models, NoSQL systems
–One size no longer fits all
Focus expanded from write- to read-intensive applications
Old constraints are falling away
–Big memory, big storage, big CPU farms, big interconnect
–Virtual machines everywhere
–New applications with massive data volumes (social networking, BI)
–Less restrictive transaction models promote scalability
Mike Stonebraker: “It’s time for a complete rewrite”
–UC Berkeley, MIT
–Ingres, Postgres, Illustra, Streambase, Vertica, VoltDB and more
[Figure labels: OLTP, Analytics, 40-odd years]

3 Hadoop Mimics Google as Big Data Store

[Diagram: the Google stack beside its Hadoop (Apache Software Foundation) counterpart, layer by layer]
Distributed File System: Google File System (Google), Hadoop Distributed File System (Hadoop)
Data Access Technique: Map/Reduce
Table-like Data Model: BigTable (Google), HBase (Hadoop)
Applications: Megastore, Google App Engine (Google); Pig Latin, Hive, Zookeeper, Vendor Analytics (Hadoop)
Your Data Everywhere

4 How HDFS and GFS Work

Data ‘sharded’ across nodes
“Shared Nothing” Data Nodes
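
The sharding idea on this slide can be illustrated with a toy sketch. It is not the actual HDFS or GFS placement policy: the function names `shard_blocks` and `reassemble`, the round-robin placement, and the `replicas` parameter are all assumptions made for illustration. The sketch only shows a file being cut into fixed-size blocks that are spread over independent ("shared nothing") nodes, each holding its own blocks.

```python
from collections import defaultdict


def shard_blocks(data: bytes, block_size: int, nodes: list[str],
                 replicas: int = 1) -> dict[str, list[tuple[int, bytes]]]:
    """Cut data into fixed-size blocks and place them on nodes round-robin.

    Hypothetical helper: real systems use more elaborate placement rules.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not 1 <= replicas <= len(nodes):
        raise ValueError("replicas must be between 1 and the number of nodes")

    placement: dict[str, list[tuple[int, bytes]]] = defaultdict(list)
    for block_id, start in enumerate(range(0, len(data), block_size)):
        block = data[start:start + block_size]
        # Each copy of a block lands on a different node.
        for r in range(replicas):
            node = nodes[(block_id + r) % len(nodes)]
            placement[node].append((block_id, block))
    return dict(placement)


def reassemble(placement: dict[str, list[tuple[int, bytes]]]) -> bytes:
    """Rebuild the original data from the blocks held by the nodes."""
    blocks: dict[int, bytes] = {}
    for held in placement.values():
        for block_id, block in held:
            blocks[block_id] = block  # duplicates from replicas are identical
    return b"".join(blocks[i] for i in sorted(blocks))


if __name__ == "__main__":
    layout = shard_blocks(b"abcdefghij", block_size=4,
                          nodes=["n1", "n2", "n3"], replicas=2)
    for node, held in layout.items():
        print(node, held)
    print(reassemble(layout))
```

No node needs to see another node's storage to serve its own blocks, which is the property the "shared nothing" label refers to.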

5 Map/Reduce Algorithm

A programming pattern
–Inspired by functional programming languages
–For large scale parallel applications
Parallel Algorithm
–Map preps input data into key/value pairs, here (word, "1")
–Merge (or Combine) phase gathers relevant pairs, arranging them by word
–Reduce sums counts for each word, constructs final result
Optimized for unstructured data
–Minimum metadata stored in dist. file system
–Data knowledge resides in map and reduce programs
Parts of the algorithm are patented by Google
–US Patent #7,650,331
–Filed June 18, 2004, granted January 19, 2010
–Licensed to Hadoop in April, 2010
Standard example is word counting

    void map(String name, String document):
        // name: document name
        // document: document contents
        for each word w in document:
            EmitIntermediate(w, "1");

    void reduce(String word, Iterator wordCounts):
        // word: a word
        // wordCounts: list of aggregated counts
        int sum = 0;
        for each pc in wordCounts:
            sum += ParseInt(pc);
        Emit(word, AsString(sum));
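
The word-counting pseudocode on this slide can be made runnable as a minimal single-process sketch. It is not Hadoop's or Google's API: the names `map_words`, `shuffle`, `reduce_counts`, and `word_count` are hypothetical, and the merge step that a real framework performs across machines is modeled here with an in-memory dictionary.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_words(name: str, document: str) -> Iterator[tuple[str, str]]:
    """Map: emit a (word, "1") pair for every word in the document."""
    # name is the document name; it is unused here, as in the pseudocode.
    for word in document.split():
        yield word, "1"


def shuffle(pairs: Iterable[tuple[str, str]]) -> dict[str, list[str]]:
    """Merge: gather the intermediate pairs, arranging them by word."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return dict(sorted(grouped.items()))


def reduce_counts(word: str, counts: Iterable[str]) -> tuple[str, str]:
    """Reduce: sum the counts for one word and emit the total as a string."""
    return word, str(sum(int(c) for c in counts))


def word_count(documents: dict[str, str]) -> dict[str, str]:
    """Run map, merge, and reduce over a set of named documents."""
    intermediate = [pair
                    for name, text in documents.items()
                    for pair in map_words(name, text)]
    return dict(reduce_counts(word, counts)
                for word, counts in shuffle(intermediate).items())


if __name__ == "__main__":
    docs = {"a.txt": "the cat the dog", "b.txt": "the end"}
    print(word_count(docs))
```

Because each `map_words` call touches only its own document and each `reduce_counts` call touches only one word's counts, both phases could run in parallel on separate nodes; only the merge step needs to move data between them.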


