1 INTRODUCTION-1 (Computer Engg, IIT(BHU), 3/12/2013)

2 HIGH PERFORMANCE COMPUTING Any computing that involves multiple computers/processors to solve a compute-intensive problem. A supercomputer solves a problem at extremely high speed compared with other computers built at the same time.

3 Supercomputing ● The first HPC systems were vector-based systems (e.g., Cray), called 'supercomputers' because they were an order of magnitude faster than other machines of their time. ● Now 'supercomputer' has little meaning: large systems are just scaled-up variants of smaller systems. 'High performance computing', however, has many meanings.

4 HPC Defined High performance computing: ● Can mean high flop count per processor ● Totaled over many processors working on the same problem ● Totaled over many processors working on related problems

5 HPC Defined contd. ● Can mean faster turnaround time: ➢ a more powerful system ➢ a job scheduled to the first available system(s) ➢ multiple systems used simultaneously

6 Motivation There is continuous demand for more computing power than we have. Many engineering and science applications require huge calculations on very large amounts of data. As the complexity of a system grows, its simulation takes more time. Many real-time applications have specific deadlines to meet.

7 How is performance increased? A traditional computer has a single processor. There are mainly two ways to increase performance: - by using multiple processors in a single machine (multiprocessor) - by applying multiple computers to solve a single problem

8 Importance of HPC ● HPC has tremendous impact on all areas of computational science and engineering in academia, government and industry. ● Many problems that were impossible to solve on individual workstations or personal computers have been solved with HPC techniques.

9 Definitions ● HPC: any computational technique that solves a large problem faster than is possible using single, commodity systems, e.g., ➢ Parallel computing ➢ Distributed computing ➢ Grid computing

10 Parallel Computing vs. Distributed Computing The main objective of parallel computing is to provide power (processing or memory), while distributed computing aims at convenience: availability and reliability. In parallel computing, resources are generally present on one motherboard; in distributed computing, resources are physically distributed. Parallel computing involves frequent communication; distributed computing involves infrequent communication.

11 Problem decomposition A problem is divided into different sub-problems to be solved simultaneously by different components, reducing the running time. Ideally the sub-problems are equal in size (this is rarely achieved in practice).

12 Problem decomposition contd. ● When a problem does not divide perfectly into independent parts, interaction is needed between the parts, both for data transfer and for synchronization of the computers. ● There are mainly two decomposition methods: ➢ Data decomposition and Recursive decomposition

13 Data decomposition Generally used when data structures with large amounts of similar data need to be processed. A task consists of a group of data. The data can be partitioned in various ways: it can be input data, output data or even intermediate data. Each processor performs the same operation on its data, which are often independent of one another.

14 Data decomposition - An Example C[1..1000] = A[1..1000] + B[1..1000]: the array of calculations is divided into smaller sub-arrays, and each sub-array is assigned to a thread to perform its work, as in the sketch below.
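A minimal sketch of this example in C, using OpenMP to split the loop across threads (an illustrative choice; the slide does not name a threading mechanism). Compile with: gcc -fopenmp add.c

#include <stdio.h>

#define N 1000

int main(void) {
    double a[N], b[N], c[N];

    /* Fill the input arrays with sample data. */
    for (int i = 0; i < N; i++) {
        a[i] = i;
        b[i] = 2.0 * i;
    }

    /* Data decomposition: each thread is assigned a chunk of the
     * index range; the iterations are independent, so no explicit
     * synchronization is needed. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        c[i] = a[i] + b[i];
    }

    printf("c[999] = %.1f\n", c[N - 1]); /* expect 2997.0 */
    return 0;
}

Because every element of C depends only on the corresponding elements of A and B, the sub-arrays need no communication beyond the implicit barrier at the end of the parallel loop.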

15 Recursive decomposition Problems that are solved using the divide-and-conquer strategy are best suited. The problem is first decomposed into a set of sub-problems. These sub-problems are recursively decomposed further, down to the desired granularity, as in the sketch below.
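A minimal recursive-decomposition sketch in C, again with OpenMP, here using tasks; the array-sum workload and the CUTOFF granularity are illustrative assumptions, not from the slide. Compile with: gcc -fopenmp sum.c

#include <stdio.h>

#define CUTOFF 64 /* desired granularity: below this, solve serially */

static long sum(const int *a, int n) {
    if (n <= CUTOFF) { /* base case: small enough to solve directly */
        long s = 0;
        for (int i = 0; i < n; i++) s += a[i];
        return s;
    }
    long left, right;
    /* Decompose into two sub-problems; the first may run as a
     * separate task while this thread handles the second. */
    #pragma omp task shared(left)
    left = sum(a, n / 2);
    right = sum(a + n / 2, n - n / 2);
    #pragma omp taskwait /* synchronize before combining results */
    return left + right;
}

int main(void) {
    int a[1000];
    for (int i = 0; i < 1000; i++) a[i] = 1;
    long total;
    #pragma omp parallel
    #pragma omp single
    total = sum(a, 1000);
    printf("total = %ld\n", total); /* expect 1000 */
    return 0;
}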

16 Speedup An algorithm designed for a single-processor machine is known as a sequential algorithm; an algorithm designed for a multiprocessor system is known as a parallel algorithm. Let A be the best sequential method to solve a problem and B be a parallel method to solve the same problem. We define the speedup S(p) on p processors as the ratio of the running time of A, i.e. T(A), to the running time of B, i.e. T(B): S(p) = T(A)/T(B). For example, if A takes 100 s and B takes 25 s, then S(p) = 4.
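A minimal sketch of measuring S(p) empirically in C with OpenMP; the summation workload and problem size are illustrative assumptions. Compile with: gcc -O2 -fopenmp speedup.c

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N 10000000

int main(void) {
    double *a = malloc(N * sizeof *a);
    if (!a) return 1;
    for (long i = 0; i < N; i++) a[i] = 1.0;

    /* T(A): the sequential method. */
    double t0 = omp_get_wtime();
    double serial = 0.0;
    for (long i = 0; i < N; i++) serial += a[i];
    double t_a = omp_get_wtime() - t0;

    /* T(B): a parallel method for the same problem. */
    t0 = omp_get_wtime();
    double parallel = 0.0;
    #pragma omp parallel for reduction(+:parallel)
    for (long i = 0; i < N; i++) parallel += a[i];
    double t_b = omp_get_wtime() - t0;

    printf("sums: %.0f %.0f\n", serial, parallel);
    printf("S(p) = T(A)/T(B) = %.2f\n", t_a / t_b);
    free(a);
    return 0;
}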

17 Speedup contd.. Theoretically, the speedup can be given in terms of computational steps: S(p) = (number of computational steps using a single processor) / (number of computational steps using p processors)

18 Amdahl's Law For an algorithm, if f is the fraction of steps that cannot be parallelized, then the maximum speedup S_max is S_max = 1/(f + (1-f)/p), where p is the number of processors. As p grows, S_max approaches 1/f.
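A small worked example of the formula in C; the serial fraction f = 0.1 and the processor counts are illustrative values:

#include <stdio.h>

/* Amdahl's law: maximum speedup for serial fraction f on p processors. */
static double amdahl(double f, int p) {
    return 1.0 / (f + (1.0 - f) / p);
}

int main(void) {
    /* With 10% of the steps serial (f = 0.1), the speedup is bounded
     * by 1/f = 10 no matter how many processors are added. */
    printf("p = 10:    S_max = %.2f\n", amdahl(0.1, 10));    /* ~5.26 */
    printf("p = 100:   S_max = %.2f\n", amdahl(0.1, 100));   /* ~9.17 */
    printf("p = 10000: S_max = %.2f\n", amdahl(0.1, 10000)); /* ~10.0 */
    return 0;
}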

19 Instruction set design The design of a machine is based on the instructions necessary for solving problems on that machine. The instruction set of a computer contains the commands that can be used for programming the machine. Based on the number of instructions, two designs have evolved: CISC (Complex Instruction Set Computer) and RISC (Reduced Instruction Set Computer).

20 CISC More and more commands were built into the machine. With the popularity of microprogramming, the number of instructions increased. A typical CISC machine has approximately 120-350 instructions. Compiler development becomes easier with a large number of instructions.

21 CISC contd. The main characteristics of CISC microprocessors are: 1. Extensive instructions. 2. Complex and efficient machine instructions. 3. Microencoding of the machine instructions. 4. Extensive addressing capabilities for memory operations. 5. Relatively few registers.

22 RISC This category contains machines whose instruction set holds only the most necessary instructions.

23 RISC contd. 1. Reduced instruction set computers. 2. Limited and simple instruction set. 3. RISC processors have a CPI (cycles per instruction) of one. 4. Large number of registers. 5. Hardwired control unit and machine instructions.

24 RISC vs. CISC In comparison, RISC processors are more or less the opposite of CISC: 1. Reduced instruction set. 2. Less complex, simple instructions. 3. Hardwired control unit and machine instructions. 4. Few addressing schemes for memory operands, with only two basic instructions, LOAD and STORE. 5. Many symmetric registers, which are organised into a register file.

25 Classification of parallel machines Michael Flynn (1972) classified parallel computers on the basis of the number of instruction streams and data streams they operate on: SISD (SINGLE INSTRUCTION SINGLE DATA) SIMD (SINGLE INSTRUCTION MULTIPLE DATA) MISD (MULTIPLE INSTRUCTION SINGLE DATA) MIMD (MULTIPLE INSTRUCTION MULTIPLE DATA)

26 SISD 1. Single instruction, single data. 2. Serial. 3. Only one instruction and data stream is acted on during one clock cycle. 4. Instruction fetching and pipelined execution of instructions are common in modern SISD computers. 5. This corresponds to the von Neumann architecture.

27 SIMD 1. Single instruction, multiple data. 2. All processing units execute the same instruction at any given clock cycle. 3. Each processing unit operates on a different data element. 4. Exploits data-level parallelism. 5. Only algorithms that can be vectorized exploit its advantage; see the sketch below.
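A minimal SIMD illustration in C using x86 SSE intrinsics (an illustrative ISA choice; any vector unit behaves analogously): a single add instruction operates on four data elements at once. Compile on x86 with: gcc simd.c

#include <stdio.h>
#include <immintrin.h>

int main(void) {
    float a[4] = {1, 2, 3, 4};
    float b[4] = {10, 20, 30, 40};
    float c[4];

    __m128 va = _mm_loadu_ps(a);    /* load 4 floats into one register */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vc = _mm_add_ps(va, vb); /* single instruction, multiple data */
    _mm_storeu_ps(c, vc);

    for (int i = 0; i < 4; i++) printf("%.0f ", c[i]); /* 11 22 33 44 */
    printf("\n");
    return 0;
}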

28 MISD 1. Multiple instruction, single data. 2. Very few practical uses for this class. 3. Example: multiple cryptography algorithms attempting to crack a single coded message. 4. Different instructions operate on a single data element. 5. A systolic array is an example of a MISD structure.

29 MIMD 1. Multiple instruction, multiple data. 2. Can execute different instructions on different data elements. 3. Most common type of parallel computer. 4. Used in a number of application areas such as computer-aided design/computer-aided manufacturing, simulation and modeling. 5. MIMD machines can be of either shared-memory or distributed-memory categories.

