CSE838 Lecture notes copy right: Moon Jung Chung


1 CSE838 Lecture notes copy right: Moon Jung Chung
Theory Moon Jung Chung 12/9/2018 CSE838 Lecture notes copy right: Moon Jung Chung

2 Parallel Minimum Spanning Tree (Deterministic)
Initially, each node is a super node.
Repeat until only one super node remains:
  For each super node, among the edges that connect it to another super node, select an edge of minimum weight.
  Merge the two super nodes joined by each selected edge into one super node.
How many phases? --> O(log n) phases.
Each phase: O(log n) time in CRCW. Actually, with priority CRCW, O(1) time.
Complexity: O(log n) time with O(m) PEs with priority CRCW, where m is the number of edges.
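The phase structure above can be sketched sequentially. This is a minimal illustration, not the PRAM algorithm itself: the function name `boruvka_mst`, the `(weight, u, v)` edge format, and the union-find bookkeeping are assumptions made for this sketch, and the loop over edges stands in for what the O(m) PEs would do concurrently.

```python
def boruvka_mst(n, edges):
    """Sequentially simulate the super-node phases.

    n: number of nodes, labelled 0..n-1.
    edges: list of distinct (weight, u, v) tuples (hypothetical input format).
    Returns (tree_edges, number_of_phases).
    """
    parent = list(range(n))  # parent[v] leads to the super node of v

    def find(v):
        # Follow parent links to the super node, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    phases = 0
    num_super = n
    while num_super > 1:
        # Step 1: every super node selects its minimum edge to another super node.
        # Ties are broken by comparing the whole (weight, u, v) tuple.
        best = {}
        for e in edges:
            _, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if r not in best or e < best[r]:
                    best[r] = e
        if not best:
            break  # no edge leaves any super node: the graph is disconnected
        # Step 2: merge the super nodes joined by the selected edges.
        for w, u, v in best.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((w, u, v))
                num_super -= 1
        phases += 1
    return tree, phases
```

Every super node is merged with at least one other in each phase, so the number of super nodes at least halves, which is where the O(log n) phase count comes from.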

3 Parallel Minimum Spanning Tree (Deterministic: detailed)
Repeat until there is only one super node:
  For each edge (x, y), if x and y are in different components, write component(x) = y and component(y) = x.
  For each node, with priority CW, accept the minimum value of component.
  Merge the two super nodes into a single super node.
Complexity: O(log n) time with O(m) PEs with priority CRCW, where m is the number of edges.
How to avoid priority CW? ==> What if the tree only has to be a spanning tree?

4 Parallel Spanning Tree (Probabilistic)
For each edge, if it connects two different super nodes, add the edge to the spanning tree and merge the two super nodes into a single node.
Trouble: for two super nodes, two edges connecting them may be selected at the same time. What about cycles?
To prevent these troubles:
  For each super node, randomly select an edge that connects it to another super node.
  Verify whether two different super nodes selected the same edge.
  Verify that there are no cycles.
  If the selected edge is OK, include the edge in the spanning tree and merge the two super nodes.
How many phases? --> O(log n) phases on average.
Each phase: O(1) time on average.
Complexity: O(log n) time with O(m) CREW PEs.
Parallel Connected Components in EREW => Use matrix multiplication: O(log^2 n) time using O(n^2) PEs.
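The matrix-multiplication approach to connected components can be illustrated by repeated Boolean squaring of the adjacency matrix. This is a sequential sketch under assumptions of its own: the name `connected_components`, the edge-list input, and the choice of the smallest reachable node as a component label are not from the slides. Each squaring doubles the path length covered, so about log n squarings are enough.

```python
def connected_components(n, edges):
    """Label components by repeated Boolean matrix squaring.

    n: number of nodes, labelled 0..n-1.
    edges: list of undirected (u, v) pairs (hypothetical input format).
    Returns (labels, number_of_squarings).
    """
    # reach[i][j] is True when j is reachable from i by a path of length <= 1.
    reach = [[i == j for j in range(n)] for i in range(n)]
    for u, v in edges:
        reach[u][v] = True
        reach[v][u] = True

    squarings = 0
    covered = 1  # longest path length currently accounted for
    while covered < n - 1:
        # Boolean product reach * reach: paths of length <= 2 * covered.
        reach = [
            [any(reach[i][k] and reach[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        covered *= 2
        squarings += 1

    # Label each node with the smallest node it can reach.
    labels = [min(j for j in range(n) if reach[i][j]) for i in range(n)]
    return labels, squarings
```

On a PRAM, each of the n^2 entries of one Boolean product is an n-way OR, which is where an O(log n) cost per squaring would come from.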

5 Parallel Models
(i) Shared memory (PRAM) -- deterministic. How about probabilistic? Example: minimum spanning tree.
(ii) Circuit: depth and size
(iii) Alternating Turing machine
Brent's theorem: Any depth-d, size-n combinational circuit with bounded fan-in can be simulated by a p-processor CREW algorithm in O(n/p + d) time.
Proof: Store the inputs to the combinational circuit in the PRAM. Each gate evaluates its output once all of its inputs are ready. If there are not enough PEs, evaluate the gates in order of depth (depth of a gate: length of the longest path from the primary inputs).
Complexity: Let n_i be the number of gates at depth i. The simulation takes $\lceil n_i/p \rceil$ steps for the gates at depth i.
Total time: $\sum_i \lceil n_i/p \rceil \le \sum_i (n_i/p + 1) = n/p + d$.
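The counting argument in the proof can be checked on a small circuit. The sketch below only counts simulation steps and does not evaluate gates; the function name `brent_schedule` and the dictionary encoding of the circuit are assumptions for illustration, and every gate is assumed to have at least one input.

```python
import math


def brent_schedule(gates, p):
    """Count the steps of a depth-by-depth simulation with p processors.

    gates: dict mapping a gate name to the list of its input names;
           names that are not keys of the dict are primary inputs.
    Returns (steps, depth, size).
    """
    depth = {}

    def gate_depth(g):
        # Primary inputs sit at depth 0; a gate is one deeper than its deepest input.
        if g not in gates:
            return 0
        if g not in depth:
            depth[g] = 1 + max(gate_depth(x) for x in gates[g])
        return depth[g]

    for g in gates:
        gate_depth(g)

    # n_i = number of gates at depth i.
    per_level = {}
    for d in depth.values():
        per_level[d] = per_level.get(d, 0) + 1

    # Level i needs ceil(n_i / p) steps when only p gates can fire per step.
    steps = sum(math.ceil(count / p) for count in per_level.values())
    return steps, max(per_level), len(gates)
```

For a circuit with three gates at depth 1, one at depth 2, and one at depth 3, two processors need 2 + 1 + 1 = 4 steps, within the bound n/p + d = 5/2 + 3.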

6 Parallel Models
Brent's theorem for EREW: Any depth-d, size-n combinational circuit with bounded fan-in and fan-out can be simulated by a p-processor EREW algorithm in O(n/p + d) time.
Proof: For exclusive reading, each output value is copied to every gate where it is used. With bounded fan-in and fan-out, this takes constant time. Reading the inputs one by one also takes constant time.

7 Uniform Circuit
Let L be a language. What is the circuit complexity of L?
Definition 1: f(n) = number of gates of a circuit accepting the strings of length n in L.
Definition 1 may not be an acceptable one:
  L = {0^n | the n-th TM accepts the n-th input}
  L is not recursive (it is undecidable).
  But L has circuit complexity 1: for each length n there are only two candidate circuits, "yes" and "no", for the string of length n.

8 Uniform Circuit
Let L_n = {w | w is in L and |w| = n}.
Uniform: there is a family of circuits {C_n}, and C_n can be generated in polynomial time using O(log n) space. Each gate has bounded fan-in.
Example of non-uniform: Division circuit => O(log n) time, but generating it will require polynomial space!
NC^k = {L | there is a uniform circuit of polynomial size and (log n)^k depth}
$NC = \bigcup_k NC^k$
Note: SC^k = {L | there is a TM with polynomial time and (log n)^k space}
Relationship between SC and NC?

9 Alternating TM
The TM forks at each state. Subprocesses cannot communicate!
The TM has two types of states: universal and existential.
  Universal: all branches must accept.
  Existential: one branch should lead to an accepting state.
That is, each computation can be represented as a computation tree.
Depth of the computation tree: time complexity.
Note:
  Deterministic TM: a path.
  Parallel random access machine: processes can communicate.
ASPACE(log n) = P
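The acceptance rule for universal and existential states can be illustrated on an explicit computation tree. This is a toy sketch, not a Turing machine simulator: the tuple encoding of nodes and the names `accepts` and `tree_depth` are assumptions made here.

```python
def accepts(node):
    """Evaluate a computation tree given as nested tuples.

    Leaves are ("accept",) or ("reject",); inner nodes are
    ("forall", [children]) for universal states and
    ("exists", [children]) for existential states.
    """
    kind = node[0]
    if kind == "accept":
        return True
    if kind == "reject":
        return False
    children = node[1]
    if kind == "forall":
        # Universal state: every branch must accept.
        return all(accepts(c) for c in children)
    if kind == "exists":
        # Existential state: at least one branch must accept.
        return any(accepts(c) for c in children)
    raise ValueError(f"unknown node kind: {kind}")


def tree_depth(node):
    """Depth of the computation tree, i.e. the alternating time used."""
    if node[0] in ("accept", "reject"):
        return 0
    return 1 + max(tree_depth(c) for c in node[1])
```

A deterministic computation is the special case in which every inner node has exactly one child, so the tree degenerates into a path.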

10 Parallel Computation Thesis
Parallel time is polynomially equivalent to sequential space.
Examples: parallel time of a vector machine vs. sequential space; ATIME(f(n)) vs. DSPACE(f(n)).
ATM(S(n), T(n)): the languages accepted by an ATM with space S(n) and time T(n).
Theorem: ATM(log n, (log n)^k) = NC^k

11 NC-algorithm and P-complete problems
Let f be a function. Input: the input of f. Output: the value of f.
NC^1 reducible from f to g: using oracles for g, we can construct NC^1 circuits computing f. An oracle gate is counted as depth log n, size n.
Division: Input: x and y. Output: x/y
Reciprocal: Input: x. Output: x^-1
Powering: Input: x. Output: x^i expressed in n^2 bits
Example: Division < Reciprocal. Using reciprocal, compute y^-1, then compute x*y^-1.
Reciprocal < Powering: trivial
How to construct a log-depth powering circuit? ==> seems not easy
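The slide leaves "Reciprocal < Powering" as trivial. One common way to see it, offered here as a sketch and not necessarily the argument intended in the lecture, is a truncated geometric series. The normalization of x to the interval [1/2, 1) and the symbol u are assumptions of this sketch.

```latex
\[
  \tfrac12 \le x < 1, \quad u = 1 - x \in (0, \tfrac12]
  \;\Longrightarrow\;
  x^{-1} = \frac{1}{1-u} = \sum_{i \ge 0} u^{i},
\]
\[
  0 \;\le\; x^{-1} - \sum_{i=0}^{n-1} u^{i}
  \;=\; \frac{u^{n}}{1-u}
  \;\le\; 2 \cdot 2^{-n} \;=\; 2^{-(n-1)} .
\]
```

Under these assumptions, the powers u^i come from the powering oracle, and summing n of them gives the reciprocal to about n bits of precision.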

12 NC-algorithm and P-complete problems
Special case of a function: language recognition.
Let A and B be languages. A is NC^1 reducible to B if, using an oracle for B, A can be solved in NC^1.
A log-space reduction is an NC reduction.
If A is NC reducible to B and B is in NC, then A is also in NC.
A is complete for P <=> A is in P and, for any B in P, B <log A (B is log-space reducible to A).
Theorem: Let A be a P-complete problem (with respect to log-space reduction). If A is in NC, then $P \subseteq NC$.

13 P-complete and hard problems to make parallel
Examples of P-complete problems:
  Monotone Circuit Value problem:
    Input: a monotone circuit (and, or gates, but without “not” gates) and values for the primary inputs.
    Question: Is the output of the circuit 0 for the given primary input values?
  Generating the lexicographically smallest depth-first search tree
These P-complete problems may not be parallelizable!
Open questions: perfect matching, depth-first search (directed, undirected), integer GCD, modular exponentiation
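The Monotone Circuit Value problem is easy to solve sequentially by evaluating the gates in topological order, which is why it lies in P; the difficulty is doing it in polylogarithmic depth. The sketch below shows the sequential evaluation only. The function name `monotone_circuit_value` and the list-of-triples circuit encoding are assumptions for illustration.

```python
def monotone_circuit_value(inputs, gates, output):
    """Evaluate a monotone circuit gate by gate.

    inputs: dict mapping primary input names to booleans.
    gates: list of (name, op, operand_names) in topological order,
           where op is "and" or "or" (hypothetical encoding).
    output: name of the gate whose value is returned.
    """
    value = dict(inputs)
    for name, op, operands in gates:
        args = [value[x] for x in operands]
        if op == "and":
            value[name] = all(args)
        elif op == "or":
            value[name] = any(args)
        else:
            # A "not" gate would make the circuit non-monotone.
            raise ValueError("monotone circuits allow only and/or gates")
    return value[output]
```

Each gate waits for the values of earlier gates, so this loop follows the circuit's dependency chain one step at a time.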

