Presentation on theme: "Distributed Computing with Turing Machine. Turing machine  Turing machines are an abstract model of computation. They provide a precise, formal definition."— Presentation transcript:

1 Distributed Computing with Turing Machine

2 Turing machine  Turing machines are an abstract model of computation. They provide a precise, formal definition of what it means for a function to be computable.  A Turing machine is similar to a finite automaton, but with unlimited and unrestricted memory.  It uses an infinite tape as its unlimited memory.  It has a head that can read and write symbols and move right or left on the tape.  Initially the tape contains only the input string and is blank everywhere else.
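The slide's model can be sketched as a small simulator. This is a minimal, illustrative sketch, not part of the original slides: the state names (`q0`, `qa`, `qr`), the blank symbol `_`, and the bit-flipping transition table are all made up for the demo.

```python
# Minimal single-tape deterministic Turing machine simulator (illustrative
# sketch; states, symbols, and the transition table are hypothetical).

def run_tm(transitions, tape, state="q0", accept="qa", reject="qr"):
    """Simulate a deterministic TM until it accepts or rejects."""
    cells = {i: s for i, s in enumerate(tape)}  # load the input onto the tape
    head = 0
    while state not in (accept, reject):
        symbol = cells.get(head, "_")           # "_" is the blank symbol
        state, write, move = transitions[(state, symbol)]
        cells[head] = write                     # the head writes a symbol...
        head += 1 if move == "R" else -1        # ...then moves right or left
    out = "".join(cells[i] for i in sorted(cells)).strip("_")
    return state == accept, out

# Toy machine: flip every bit of the input, accept at the first blank.
flip = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
    ("q0", "_"): ("qa", "_", "R"),
}
ok, out = run_tm(flip, "1011")  # ok is True, out is "0100"
```

The transition table plays the role of the "certain function" the later slides mention: it alone decides what the head writes and which way it moves.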

3 Distributed computing illustrated with big data  A big-data system is a distributed system: its distributed file system, HDFS, can store large data sets across a cluster and manage them.  If the input is a very large string, then even when a Turing machine can compute the answer in polynomial time, it still takes a long time to solve on a single machine.

4 Similarities between big data and the Turing machine  Mass storage: The Turing machine model uses an infinite tape as its unlimited memory, so it can store arbitrarily long input and instructions. Big-data systems draw their data from the internet, which is likewise ample.  Main control system: A Turing machine has a transition function that controls the head's direction as it reads and writes the tape. A big-data system also has a main control component, the NameNode, which assigns blocks to DataNodes and lets clients operate on them.

5 Differences between big data and the Turing machine  Some problems can be solved on a deterministic Turing machine in polynomial time; the running time depends on the size of the input and on the transition function that controls the movement of the head.  All input and all computation happen on the machine's own single tape.  Big data uses HDFS to manage the data, so the computation can execute on many computers at once.

6 HDFS  Distributed File System  Large Data Assets  HDFS Parts NameNode ◦manages the filesystem namespace ◦manages opening, closing, renaming, etc. ◦maps blocks to DataNodes DataNodes ◦store blocks – create/delete ◦serve reads/writes for data blocks
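The NameNode/DataNode split above can be mimicked with a toy block map. This is a sketch of the idea only, not the real HDFS API: the tiny block size, the round-robin placement, and the `NameNode` class are all invented for illustration. The NameNode holds metadata only; the DataNodes would hold the actual bytes.

```python
# Toy sketch of a NameNode's block map (illustrative; not the HDFS API).

class NameNode:
    def __init__(self, datanodes, block_size=3, replication=2):
        self.datanodes = datanodes    # available DataNode ids
        self.block_size = block_size  # bytes per block (tiny, for the demo)
        self.replication = replication
        self.block_map = {}           # path -> list of (block, [datanode ids])

    def put(self, path, data):
        """Split data into blocks; assign each to `replication` DataNodes."""
        blocks = [data[i:i + self.block_size]
                  for i in range(0, len(data), self.block_size)]
        self.block_map[path] = [
            (b, [self.datanodes[(i + r) % len(self.datanodes)]
                 for r in range(self.replication)])
            for i, b in enumerate(blocks)]

    def get(self, path):
        """Reassemble a file from its blocks (real reads go to DataNodes)."""
        return "".join(b for b, _ in self.block_map[path])

nn = NameNode(["dn1", "dn2", "dn3"])
nn.put("/logs/a.txt", "abcdefgh")   # stored as blocks "abc", "def", "gh"
```

The point of the sketch: the NameNode never touches file contents during reads and writes, it only answers "which DataNodes hold block i of this path".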

7 HDFS: Data loading

8 Key/value pairs  Take a collection of (key, value) pairs  Map it onto a different collection of (key, value) pairs: Map(k1,v1) -> (k2,v2), then shuffle. Mapper outputs: (A,1),(B,2),(C,3) (B,3),(A,2),(C,1) (D,1),(C,1),(B,2),(A,5) Shuffled: (A,(1,2,5)) (B,(2,3,2)) (C,(3,1,1)) (D,(1))
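The shuffle step on this slide is just "group all values by key". A minimal sketch reproducing the slide's example:

```python
# Shuffle: group mapper output by key (matches the slide's example data).
from collections import defaultdict

def shuffle(pairs):
    """Collect every (key, value) pair into a per-key list of values."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

# The mapper outputs from the slide, in order.
mapped = [("A", 1), ("B", 2), ("C", 3), ("B", 3), ("A", 2),
          ("C", 1), ("D", 1), ("C", 1), ("B", 2), ("A", 5)]
shuffled = shuffle(mapped)
# shuffled == {"A": [1, 2, 5], "B": [2, 3, 2], "C": [3, 1, 1], "D": [1]}
```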

9 Map-Reduce Map-Reduce decomposes a large computing problem into many small blocks. The Map function distributes them to many computers over the network, so every machine can compute over its own data at the same time. Reduce is a combining step; it depends on the key/value model.

10 Map-Reduce process 1. In the mapping phase, MapReduce takes the input data and feeds each data element to the mapper. 2. In the reducing phase, the reducer processes all the outputs from the mapper and arrives at a final result. 3. In simple terms, the mapper is meant to filter and transform the input into something that the reducer can aggregate over.
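The three phases above can be sketched end to end with a classic word count. This is a single-process, in-memory sketch for illustration; in a real cluster each phase would run on many machines.

```python
# Minimal in-memory MapReduce (word count), following the three phases above.
from collections import defaultdict

def map_phase(documents):
    """Mapper: filter/transform each input record into (key, value) pairs."""
    for doc in documents:
        for word in doc.split():
            yield word, 1

def shuffle_phase(pairs):
    """Group the intermediate pairs by key before reducing."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: aggregate all values that share a key."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data big cluster", "big tape"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
# counts == {"big": 3, "data": 1, "cluster": 1, "tape": 1}
```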

11

12 Distributed Task Execution Problem Statement: There is a large computational problem that can be divided into multiple parts, and the results from all parts can be combined to obtain a final result. Case Study: Simulation of a Digital Communication System. There is a software simulator of a digital communication system, such as WiMAX, that passes some volume of random data through the system model and computes the error probability or throughput. Each Mapper runs the simulation for a specified amount of data, 1/Nth of the required samples, and emits an error rate. The Reducer computes the average error rate. Solution: The problem description is split into a set of specifications, and the specifications are stored as input data for the Mappers. Each Mapper takes a specification, performs the corresponding computations, and emits its results. The Reducer combines all emitted parts into the final result.
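The case study can be sketched as follows. Everything here is a stand-in: the channel model (a fixed bit-flip probability of 0.1), the per-mapper seeds, and the sample counts are invented for the demo; a real simulator like the WiMAX model would replace `mapper`'s body. The reducer averages per-mapper error rates, which is valid here because every mapper handles the same number of samples.

```python
# Sketch of the simulation case study: each "mapper" simulates 1/Nth of the
# samples and emits an error rate; the "reducer" averages the rates.
# The channel model (fixed bit-flip probability) is a made-up stand-in.
import random

def mapper(n_samples, seed, flip_prob=0.1):
    """Pass n_samples bits through a noisy channel; emit the error rate."""
    rng = random.Random(seed)
    errors = sum(rng.random() < flip_prob for _ in range(n_samples))
    return errors / n_samples

def reducer(rates):
    """Average the per-mapper error rates (equal sample counts assumed)."""
    return sum(rates) / len(rates)

N = 4                                           # number of parallel mappers
rates = [mapper(10_000, seed=i) for i in range(N)]
overall = reducer(rates)                        # close to flip_prob
```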

13 Conclusion If there is a lot of input, say 10 TB or more, a single Turing machine will take a very long time to solve the problem. With distributed computing, the computation is split into many blocks: each block is one computer (a Turing machine), and each computer solves a relatively small input, just 1/N of the original. This takes less time and makes the computation more efficient.

14 Thank you

