(Pages 554 – 564) Ping Perez, CS 147, Summer 2001

Alternative Parallel Architectures
- Dataflow
- Systolic arrays
- Neural networks

To understand how dataflow computers work, it is first necessary to understand dataflow graphs. As a computer program is compiled, it is converted into its equivalent dataflow graph, which shows the data dependencies between statements and is used by the dataflow computer to generate the structures it needs to execute the program.

A code segment and its dataflow graph:
1. A ← B + C
2. D ← E + F
3. G ← A + H
4. I ← D + G
5. J ← I + K
(Figure: the corresponding dataflow graph, with input values B, C, E, F, H, and K entering at the top.)

As shown in the figure, each vertex of the graph corresponds to the operation performed by one of the instructions. The directed edges entering a vertex correspond to the operands of the function performed by that vertex, and the directed edge leaving the vertex represents the result generated by the function.

Single Assignment Rule

This code segment has four violations of the single assignment rule, starting with statement 2. The value stored by this statement, B, was used as an operand in statement 1, so it must be renamed. We can rename it B1 and change all later references to it in this code. Similarly, the values C and D, set by statements 3 and 4, are also used as operands in prior statements and must be renamed.

1. A ← B + C
2. B ← A + D
3. C ← A + B
4. D ← C + B
5. A ← A + C

Single Assignment Rule (cont'd)

Finally, statement 5 stores its result in A, the same variable used to store the result of statement 1, so we must also rename this variable. Note that statements 2, 3, and 5 all use A as an operand: this is not a violation of the single assignment rule, since an operand may be read many times.

1. A ← B + C
2. B ← A + D
3. C ← A + B
4. D ← C + B
5. A ← A + C

The renamed code and its dataflow graph:
1. A ← B + C
2. B1 ← A + D
3. C1 ← A + B1
4. D1 ← C1 + B1
5. A1 ← A + C1
(Figure: the corresponding dataflow graph, with input values B, C, and D.)
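The renaming step can be mechanized. Below is a minimal Python sketch (my own illustration, not from the text) of the rule just described: a destination whose name has already appeared, as a result or as an operand, receives a fresh numbered name, and later operand references are rewritten to use the newest version.

def enforce_single_assignment(stmts):
    """Each statement is a (dest, op1, op2) tuple, e.g. ('A', 'B', 'C')."""
    seen = set()      # every name that has appeared so far
    latest = {}       # original name -> its current version
    count = {}        # original name -> number of renames so far
    out = []
    for dest, a, b in stmts:
        a, b = latest.get(a, a), latest.get(b, b)  # operands read the newest version
        seen.update((a, b))
        if dest in seen:                           # name already used: rename it
            count[dest] = count.get(dest, 0) + 1
            new = f"{dest}{count[dest]}"
        else:
            new = dest
        latest[dest] = new
        seen.add(new)
        out.append((new, a, b))
    return out

stmts = [('A', 'B', 'C'), ('B', 'A', 'D'), ('C', 'A', 'B'),
         ('D', 'C', 'B'), ('A', 'A', 'C')]
print(enforce_single_assignment(stmts))
# Produces A = B + C, B1 = A + D, C1 = A + B1, D1 = C1 + B1, A1 = A + C1,
# matching the renamed code above.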

Tokens

The dataflow graph describes the dependencies between statements and how data flows between statements. An edge, however, does not show when data flows from one statement to another. The data that traverses an edge is called a token. When a token is available, it is represented as a dot on the edge.

A vertex is ready to fire, or execute its instruction, when all of its incoming edges have tokens, that is, when the instruction's operands are all available. (Figure: the dataflow graph with tokens on the input edges for B, C, and D.)
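As a rough illustration of the firing rule (a sketch of my own, not the book's implementation), the renamed five-statement example can be simulated directly: a vertex fires as soon as tokens are present on all of its input edges, and its result becomes a token for the vertices that consume it.

# vertex -> (the operand tokens it waits for, the result token it produces)
graph = {
    1: (('B', 'C'), 'A'),
    2: (('A', 'D'), 'B1'),
    3: (('A', 'B1'), 'C1'),
    4: (('C1', 'B1'), 'D1'),
    5: (('A', 'C1'), 'A1'),
}

tokens = {'B': 1, 'C': 2, 'D': 3}   # initial input tokens (sample values)
fired = set()
while len(fired) < len(graph):
    for v, (ops, dest) in graph.items():
        # ready to fire: not yet fired, and a token on every input edge
        if v not in fired and all(o in tokens for o in ops):
            tokens[dest] = tokens[ops[0]] + tokens[ops[1]]
            fired.add(v)
            print(f"vertex {v} fires: {dest} = {tokens[dest]}")

Note that no program counter orders the execution; the availability of data alone determines when each vertex fires.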

I-Structures

Within the computer system, dataflow vertices are usually stored as I-structures. Each I-structure includes the operation to be performed, its operands, and a list of destinations for its result.

(Figure: an I-structure and the dataflow graph annotated with I-structures. Each entry holds the operation, slots for its operands, and a destination list; a destination such as 2/1 denotes operand 1 of statement 2, giving lists like {2/1, 3/1, 4/2}.)
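One way to picture an I-structure in software (field names are my own; the figure's exact contents did not survive extraction) is as a record holding the operation, its operand slots, and the destination list:

from dataclasses import dataclass, field

@dataclass
class IStructure:
    op: str                 # the operation to perform, e.g. '+'
    operands: list          # operand slots; None means still waiting for a token
    destinations: list = field(default_factory=list)  # (statement, operand slot) pairs

# Statement 1 of the renamed code, A = B + C: its result A is forwarded to
# operand 1 of statements 2, 3, and 5, per the dataflow graph above.
stmt1 = IStructure(op='+', operands=[None, None],
                   destinations=[(2, 1), (3, 1), (5, 1)])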

The architectures of dataflow systems:
1. Static architectures
2. Dynamic architectures

Static dataflow computer organization

This figure shows the organization of the static dataflow computer. The I-store unit has two sections: the memory section, which stores the I-structures of the dataflow program, and the update/ready section. (Figure: the I-store unit, with its memory section and update/ready section, connected to the processors and the firing queue.)

What is a systolic array?

A systolic array incorporates several processing elements into a regular structure, such as a linear array or mesh. Each processing element performs a single, fixed function and communicates only with its neighboring processing elements.

A 2 × 2 systolic array to multiply two matrices. (Figure: four processing elements, PE 1,1 through PE 2,2, each with up (U), left (L), right (R), and down (D) ports; A values enter from the left and B values from the top.)

During the first clock cycle, we input A1,1 to input L and B1,1 to input U of processing element 1,1. This processing element calculates A1,1B1,1 and adds it to its running total; the running totals of the other elements remain 0. (Figure: totals after cycle 1: PE 1,1 holds A1,1B1,1; all other totals are 0.)

During the second clock cycle, we input A1,2 to L and B2,1 to U of processing element 1,1. It multiplies them and adds the product to its running total, which becomes A1,1B1,1 + A1,2B2,1, the final value of C1,1. Meanwhile, A1,1 has moved right and B1,1 has moved down, so PE 1,2 computes A1,1B1,2 and PE 2,1 computes A2,1B1,1. (Figure: totals after cycle 2: PE 1,1: A1,1B1,1 + A1,2B2,1; PE 1,2: A1,1B1,2; PE 2,1: A2,1B1,1; PE 2,2: 0.)

Clock cycle 3 continues the matrix multiplication. Since C1,1 has already been calculated, we input 0 to the inputs of processing element 1,1 so its running total is not changed. The final values of C1,2 and C2,1 are calculated during this clock cycle, and the first part of C2,2 is generated. (Figure: totals after cycle 3: PE 1,1: A1,1B1,1 + A1,2B2,1; PE 1,2: A1,1B1,2 + A1,2B2,2; PE 2,1: A2,1B1,1 + A2,2B2,1; PE 2,2: A2,1B1,2.)

The final value of C2,2 is calculated during clock cycle 4, as shown in the figure; at this point, the product of the two matrices has been computed. (Figure: final totals: PE 1,1: A1,1B1,1 + A1,2B2,1; PE 1,2: A1,1B1,2 + A1,2B2,2; PE 2,1: A2,1B1,1 + A2,2B2,1; PE 2,2: A2,1B1,2 + A2,2B2,2.)
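The whole four-cycle walkthrough can be captured in a few lines of Python (a sketch of my own, not the text's). With the inputs skewed by one cycle per row and column, PE i,j sees the pair A[i][k], B[k][j] at clock cycle i + j + k and accumulates their product into its stationary total, C[i][j]:

def systolic_matmul(A, B):
    n = len(A)
    C = [[0] * n for _ in range(n)]   # one running total per processing element
    for t in range(3 * n - 2):        # 4 cycles when n = 2
        for i in range(n):
            for j in range(n):
                k = t - i - j         # which operand pair reaches PE i,j this cycle
                if 0 <= k < n:        # otherwise the PE receives 0s and is unchanged
                    C[i][j] += A[i][k] * B[k][j]
        print(f"totals after cycle {t + 1}: {C}")
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert systolic_matmul(A, B) == [[19, 22], [43, 50]]

Printing the totals after each cycle reproduces the four figures above: PE 1,1 finishes at cycle 2, PE 1,2 and PE 2,1 finish at cycle 3, and PE 2,2 finishes at cycle 4.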

1) Neural networks are different from any other computing structure.
2) They incorporate thousands or millions of simple processing elements, called neurons.
3) Each neuron has far less processing power than a CPU.

Unlike traditional computers, which are programmed, neural networks are trained. Training consists of defining the system's input data and the desired system outputs for that input data.

System outputs are generated as a function of the outputs of individual neurons. Each neuron's output, in turn, is a function of the outputs of the neurons to which it is connected. First, the output of each connected neuron is multiplied by its weighting factor, and all of these weighted values are added together (step 1).

This sum is then compared to the threshold value for that neuron. If the weighted sum is greater than or equal to the threshold value, the neuron's output is 1; otherwise, its output is 0 (step 2).

(Figure: a neuron N with labeled, weighted inputs. The weighted sum of the inputs works out to 0.7, which is greater than N's threshold value of 0.65, so neuron N outputs a logical value of 1.)
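The two-step rule is simple enough to state directly in code. This sketch uses hypothetical inputs and weights chosen so the weighted sum comes out to 0.7, matching the figure's example (the exact values in the figure did not survive extraction):

def neuron_output(inputs, weights, threshold):
    # Step 1: multiply each input by its weight and sum the products.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Step 2: compare the sum against the neuron's threshold.
    return 1 if weighted_sum >= threshold else 0

# Weighted sum = 1*0.2 + 1*0.1 + 1*0.4 = 0.7 >= 0.65, so neuron N fires.
print(neuron_output([1, 1, 1], [0.2, 0.1, 0.4], threshold=0.65))   # -> 1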

Where would a neural network be used? A neural network is not appropriate for general-purpose computing; you won't find a neural network running Windows on a personal computer. Instead, neural networks have found applications in tasks that do not run well on conventional architectures, including control systems and artificial intelligence applications.