Google and Cloud Computing. Wang Yonggang (王咏刚), Senior Engineer, Google.


1 Google and Cloud Computing. Wang Yonggang (王咏刚), Senior Engineer, Google

2 Agenda
The Internet: From Hardware to Community
The Innovation: A Computing Cloud
Breakthroughs for Cloud Computing
Google Apps for Cloud Computing
Google Infrastructure for Cloud Computing

3 The Internet From Hardware to Community

4 The Internet: From Hardware to Community
MySpace, Facebook, Kaixin001 (开心网), Xiaonei (校内网), …

5 What Do Today’s Users Want?
Accessibility – Access from anywhere and from multiple devices
Shareability – Make sharing as easy as creating and saving
Freedom – Users don’t want their data held hostage
Simplicity – Easy to learn, easy to use
Security – Trust that data will not be lost or seen by unwanted parties

6 The Innovation: A Computing Cloud

7 Cloud Computing

8 Attributes of Cloud Computing
Data stored on the cloud
Software & services on the cloud – access via web browser
Based on standards and protocols – Linux, AJAX, LAMP, etc.
Accessible from any device
[Diagram labels: Hardware Centric: Personal PC; Software Centric: Client Server; Service Centric: Cloud Computing]

9 Breakthroughs for Cloud Computing

10 1. User-Centric 2. Task-Centric 3. Powerful 4. Intelligent 5. Affordable 6. Programmable

11 User Centric
Data stored in the “Cloud”
Data follows you & your devices
Data accessible anywhere
Data can be shared with others
[Diagram labels: music, preferences, maps, news, contacts, messages, mailing lists, photo, e-mails, calendar, phone numbers, investments]

12 Example: Gmail
– Just a web browser and your account with password!
– Once you log in, the device is “yours”.
– Data stored on remote servers in the “cloud” (with large capacity)
[Diagram labels: Beijing, on travel; San Francisco, Monday; Home, Wednesday]

13 Use Google Docs to Solve a Task
Task = “Teachers creating a departmental curriculum”
Access your docs from anywhere
Chat with others in real time
Changes instantly appear to other collaborators

14 Communication Task – Email, Chat, Contacts, Chat History

15 Task: Collaborate on Spreadsheet – Communicate Chat with others editing the spreadsheet

16 Task: Collaborate on Spreadsheet – Collaborate Invite others to collaborate on the spreadsheet

17 Task: Collaborate on Spreadsheet – Publish Invite others to view the spreadsheet

18 You can also easily organize all your common tasks

19 Cloud Computing is Powerful: It can do what no PC can do
Example: Google Search
Is Google Search faster than search in Windows/Outlook/Word? Yet Google Search must be a much harder problem.
How much storage does it take to store all of the web pages? 100B pages × 10 KB per page = 1,000 TB of disk!
Cloud computing has at its disposal:
– An essentially infinite amount of disk
– An essentially infinite amount of computation (assuming the work can be parallelized)
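
The storage estimate on this slide can be checked with a few lines of arithmetic. This is only a sketch: it assumes decimal units (1 K = 10^3 bytes, 1 T = 10^12 bytes), which the slide does not state, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the slide's storage estimate.
# Assumption (not stated on the slide): decimal units are intended.
PAGES = 100 * 10**9           # 100B pages
BYTES_PER_PAGE = 10 * 10**3   # 10K per page

total_bytes = PAGES * BYTES_PER_PAGE
total_terabytes = total_bytes // 10**12

print(total_terabytes)  # 1000
```

The result, 1,000 TB, is one petabyte, far beyond what a single PC's disk could hold.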

20 Web Page Search → Universal Search
1st Generation: era of single search – not diverse
2nd Generation: era of vertical search – too complex
3rd Generation: an era of Universal Search
[Diagram labels: W, A, B, C, D, E]

21 From vertical search to universal search
Integration of user experience
[Diagram labels: A, B, C, D, E]

22 Universal Search Example

23

24 Cloud Computing Infrastructure

25 GFS Architecture
Files broken into chunks (typically 64 MB)
Master manages metadata
Data transfers happen directly between clients/chunkservers
[Diagram labels: Client, Replicas, Masters, GFS Master; Chunkserver 1 holding C0, C1, C2, C5; Chunkserver 2 holding C1, C3, C5; …; Chunkserver N holding C0, C2, C5]
[Chart labels: Google 48%, MSN 19%, Yahoo 33%]
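
A minimal sketch of the idea on this slide, not the real GFS API: a file is addressed in fixed 64 MB chunks, the master only answers "which chunkservers hold this chunk", and the client then reads from a chunkserver directly. The `ToyMaster` class, the file path, and the server names are all hypothetical.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the typical chunk size named on the slide


def chunk_index(byte_offset: int) -> int:
    """Return the index of the chunk that holds the given byte offset."""
    return byte_offset // CHUNK_SIZE


def chunk_count(file_size: int) -> int:
    """Return how many chunks a file of file_size bytes occupies."""
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE


class ToyMaster:
    """Hypothetical master: stores metadata only, never file contents."""

    def __init__(self):
        # (path, chunk index) -> names of chunkservers holding a replica
        self.locations = {}

    def locate(self, path: str, byte_offset: int):
        idx = chunk_index(byte_offset)
        return idx, self.locations[(path, idx)]


master = ToyMaster()
master.locations[("/data/crawl", 0)] = ["chunkserver-1", "chunkserver-N"]
master.locations[("/data/crawl", 1)] = ["chunkserver-1", "chunkserver-2"]

# The client asks the master where the byte at offset 100 MB lives,
# then would contact one of the returned chunkservers for the data itself.
idx, servers = master.locate("/data/crawl", 100 * 1024 * 1024)
print(idx, servers)
```

Keeping the master out of the data path is what lets a single metadata server coordinate many chunkservers.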

26 Typical Cluster
[Diagram labels: Scheduling masters, GFS master, Lock service; Linux Machine 1 running GFS chunkserver, Scheduler slave, User app1, User app2; Linux Machine 2 running GFS chunkserver, Scheduler slave, User app3; …; Linux Machine N running GFS chunkserver, Scheduler slave, User app1, User app2, User app3]

27 MapReduce

28 More specifically…
Programmer specifies two primary methods:
– map(k, v) → <k', v'>*
– reduce(k', <v'>*) → <k', v'>*
All v' with same k' are reduced together, in order.
Usually also specify:
– partition(k', total partitions) → partition for k'
  often a simple hash of the key; allows reduce operations for different k' to be parallelized
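
The three functions on this slide can be illustrated with a single-process word count. This is a sketch of the programming model only, not Google's implementation: `run_mapreduce` is a hypothetical driver, and CRC32 is an assumed stand-in for "a simple hash of the key".

```python
import zlib
from collections import defaultdict


def map_fn(key, value):
    """map(k, v): emit one (word, 1) pair per word in the document text."""
    for word in value.split():
        yield word, 1


def reduce_fn(key, values):
    """reduce(k', values): sum all counts emitted for one word."""
    yield key, sum(values)


def partition(key, total_partitions):
    """Pick a partition for k' with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % total_partitions


def run_mapreduce(inputs, total_partitions=2):
    """Hypothetical sequential driver; a real system spreads this over machines."""
    partitions = [defaultdict(list) for _ in range(total_partitions)]
    for k, v in inputs:
        for k2, v2 in map_fn(k, v):
            # All values for the same k' land in the same partition.
            partitions[partition(k2, total_partitions)][k2].append(v2)
    results = {}
    for part in partitions:        # each partition could be reduced in parallel
        for k2 in sorted(part):    # keys within a partition are reduced in order
            for out_key, out_value in reduce_fn(k2, part[k2]):
                results[out_key] = out_value
    return results


counts = run_mapreduce([("doc1", "the cloud"), ("doc2", "the web the cloud")])
print(counts["the"], counts["cloud"], counts["web"])  # 3 2 1
```

Because every occurrence of a key hashes to the same partition, the reduce step for different partitions needs no coordination.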

29 BigTable
Distributed multi-level map – with an interesting data model
Fault-tolerant, persistent
Scalable
– Thousands of servers
– Terabytes of in-memory data
– Petabytes of disk-based data
– Millions of reads/writes per second, efficient scans
Self-managing
– Servers can be added/removed dynamically
– Servers adjust to load imbalance

30 BigTable: Basic Data Model
Distributed multi-dimensional sparse map: (row, column, timestamp) → cell contents
Good match for most of our applications
[Diagram labels: ROWS (www.cnn.com), COLUMNS (“contents”), TIMESTAMPS (t1, t2, t3); cell value “ …”]
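
The (row, column, timestamp) → cell contents mapping can be pictured as a nested dictionary. The sketch below is an in-memory toy, not the Bigtable client API: `SparseTable`, its methods, and the cell values are hypothetical, and only the row and column names come from the slide.

```python
class SparseTable:
    """Toy sparse map: only cells that were written take up space."""

    def __init__(self):
        # (row, column) -> {timestamp: contents}
        self._cells = {}

    def put(self, row, column, timestamp, contents):
        self._cells.setdefault((row, column), {})[timestamp] = contents

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before timestamp (newest overall if None)."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            return versions[max(versions)]
        eligible = [t for t in versions if t <= timestamp]
        return versions[max(eligible)] if eligible else None


table = SparseTable()
# Placeholder values; the slide elides the actual cell contents.
table.put("www.cnn.com", "contents", 1, "version at t1")
table.put("www.cnn.com", "contents", 3, "version at t3")

print(table.get("www.cnn.com", "contents"))      # version at t3
print(table.get("www.cnn.com", "contents", 2))   # version at t1
print(table.get("www.cnn.com", "anchor"))        # None
```

Keeping several timestamped versions per cell is what lets one row hold, for example, successive crawls of the same page.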

31 BigTable: System Architecture
Bigtable client uses the Bigtable client library: Open()
Bigtable cell
– Bigtable master performs metadata ops, load balancing
– Bigtable tablet servers serve data
Cluster scheduling master handles failover, monitoring
GFS holds tablet data, logs
Lock service holds metadata, handles master election

32 Thanks Q&A

