Multicore experiment: Plurality Hypercore Processor Performed by: Anton Fulman Ze’ev Zilberman Supervised by: Mony Orbach Characterization presentation.

Presentation transcript:

Multicore experiment: Plurality Hypercore Processor
Performed by: Anton Fulman, Ze’ev Zilberman
Supervised by: Mony Orbach
Characterization presentation, Winter 2008

Overview
- Plurality has developed the Hypercore, a 256-core processor with a unique architecture and programming model, suited to parallel algorithms
- The main project goal is to build and optimize algorithms for the Plurality system that will later be used in the lab's multicore experiment

Motivation
- Parallel algorithms running on a multicore processor can lead to significant performance improvements
- Programming parallel algorithms differs from programming serial algorithms
- The main goals of the experiment:
  - Understand the principles of multicore processing
  - Learn to work with parallel algorithms
  - Evaluate and compare performance

System architecture
- The cores
  - 256 SPARC-based RISC cores
  - Perform basic arithmetic operations
- Helper units
  - One helper unit for every 4 cores
  - Perform multiplication and division
- Synchronizer/scheduler
  - Distributes the tasks among the cores with minimal overhead
- Shared memory system
  - Allows any number of cores to access data and instruction memory on every clock cycle

System architecture - cont.

Programming model
- Task-oriented programming model (TOP)
- The algorithm is partitioned into regular and duplicable tasks
- The algorithm can be described by a task map that records the dependencies between the tasks
- Resource synchronization between tasks is managed automatically (unlike common multithreaded programming)
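The task-map idea can be sketched in a few lines of Python. This is an illustration only, not Plurality's API: the function `run_task_map`, the task names, and the dictionary format are all hypothetical, and the tasks run one after another rather than on hardware cores. It shows how a scheduler can release a task once all of the tasks it depends on have finished.

```python
from collections import deque


def run_task_map(tasks, deps):
    """Run tasks in an order that respects the task map's dependencies.

    tasks: dict mapping a task name to a callable taking no arguments
    deps:  dict mapping a task name to the list of tasks it must wait for
    Returns the task names in the order they were executed.
    (Hypothetical helper for illustration; not part of any Plurality tool.)
    """
    # Number of unfinished prerequisites per task.
    pending = {name: len(deps.get(name, [])) for name in tasks}
    # Reverse edges: which tasks are waiting on a given task.
    dependents = {name: [] for name in tasks}
    for name, prereqs in deps.items():
        for prereq in prereqs:
            dependents[prereq].append(name)

    ready = deque(sorted(n for n, count in pending.items() if count == 0))
    order = []
    while ready:
        name = ready.popleft()
        tasks[name]()                 # a real scheduler would dispatch this to a free core
        order.append(name)
        for waiting in dependents[name]:
            pending[waiting] -= 1
            if pending[waiting] == 0:
                ready.append(waiting)

    if len(order) != len(tasks):
        raise ValueError("task map contains a dependency cycle")
    return order


# Example task map: "sum" and "max" both depend on "load"; "report" depends on both.
results = {}
tasks = {
    "load": lambda: results.update(data=[1, 2, 3, 4]),
    "sum": lambda: results.update(total=sum(results["data"])),
    "max": lambda: results.update(largest=max(results["data"])),
    "report": lambda: results.update(report=(results["total"], results["largest"])),
}
deps = {"sum": ["load"], "max": ["load"], "report": ["sum", "max"]}
order = run_task_map(tasks, deps)
```

In this sketch "sum" and "max" become ready at the same moment, which is the point at which a parallel scheduler could hand them to two different cores.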

Programming model – cont.

Working environment
- Eclipse and the HAL simulator, running under the Cygwin environment
- The HAL debugger is integrated into Eclipse and allows monitoring of the system state:
  - Variables
  - Registers
  - Assembly code
  - Cycles

Project goals
- Build a few parallel algorithms
- Write documentation
- Provide the knowledge base for the multicore experiment

Timeline
- Environment installation and basic training by Plurality representatives: done
- Build a simple matrix multiplication algorithm: done
- Gain a better understanding of the debugging capabilities
- Build more, and more complex, algorithms
- Midterm presentation: beginning of June
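As a rough illustration of how matrix multiplication splits into duplicable tasks, the Python sketch below treats each output row as one task instance. It is not the project's implementation: `row_task` and `parallel_matmul` are hypothetical names, and a thread pool stands in for the hardware scheduler. In standard CPython the threads will not give a real speedup for this arithmetic; the sketch only shows the partitioning.

```python
from concurrent.futures import ThreadPoolExecutor


def row_task(instance_id, a, b):
    """One instance of the duplicable task: compute row `instance_id` of a * b."""
    inner = len(b)
    cols = len(b[0])
    return [
        sum(a[instance_id][k] * b[k][j] for k in range(inner))
        for j in range(cols)
    ]


def parallel_matmul(a, b, workers=4):
    """Multiply matrices a and b (lists of rows), one task instance per output row."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the rows in instance order, whatever order they finish in.
        return list(pool.map(lambda i: row_task(i, a, b), range(len(a))))


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = parallel_matmul(a, b)
```

The row tasks do not depend on one another and write to separate outputs, so they need no locking; this is the kind of regular, independent work that a task-oriented model is meant to distribute.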

After the midterm
- Finish writing all the algorithms
- Write documentation