Multi-Cores: Architecture/VLSI Perspective
The Hardware-Software Relationship: Date or Dump?
University of Utah, Oct 31st 2007



Embedded Applications -- Spencer
- From discreet cochlear implants to high-end biomedical imaging!
- Multi-cores speed up performance by 50x!
- Creating new application domains!

How to use “multiple” cores?
- Parallel programming
- Synchronization
- Deadlock
- Livelock
- Memory management
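To make the deadlock item concrete, here is a minimal sketch of one common way to avoid it: always acquire locks in a fixed global order. This is an illustration added for this transcript, not something taken from the slides; the account-transfer scenario and the names `make_locks` and `transfer` are hypothetical.

```python
import threading

def make_locks(n):
    """Create one lock per shared resource (hypothetical helper)."""
    return [threading.Lock() for _ in range(n)]

def transfer(accounts, locks, src, dst, amount):
    """Move `amount` from accounts[src] to accounts[dst].

    Both locks are always taken in ascending index order, so two threads
    transferring in opposite directions cannot each hold one lock while
    waiting for the other.
    """
    if src == dst:
        return  # nothing to move, and taking the same lock twice would block
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        accounts[src] -= amount
        accounts[dst] += amount
```

If each thread instead locked `src` first and `dst` second, a transfer 0 -> 1 running alongside a transfer 1 -> 0 could block forever. The fixed order removes that cycle, at the cost of requiring every piece of code to agree on the ordering.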

How to use “multiple” cores?
Program = Communication + Computation
- Global restructuring and parallelization
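One way to read "Program = Communication + Computation" is as a pipeline in which each stage only computes and all data movement goes through explicit channels. The sketch below uses Python threads and queues as stand-ins for cores and on-chip links; that mapping, and the names `stage` and `run_pipeline`, are assumptions made for illustration. CPython threads will not give a real multi-core speedup here; the point is the structure.

```python
import queue
import threading

_DONE = object()  # end-of-stream marker (a convention assumed for this sketch)

def stage(compute, inbox, outbox):
    """One pipeline stage: receive an item, compute on it, send the result."""
    while True:
        item = inbox.get()            # communication: receive
        if item is _DONE:
            outbox.put(_DONE)         # pass the shutdown signal downstream
            return
        outbox.put(compute(item))     # computation, then communication: send

def run_pipeline(values, computations):
    """Chain `computations` with one FIFO queue between neighbouring stages."""
    queues = [queue.Queue() for _ in range(len(computations) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(computations)
    ]
    for t in threads:
        t.start()
    for v in values:
        queues[0].put(v)
    queues[0].put(_DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 2])` pushes each value through an add stage and then a multiply stage. Because the stage functions never touch shared state, restructuring the program means rewiring queues rather than rewriting the computations.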

Structured “Communication”
- Languages: StreamIt, MPI
- Compilers: RAW, CoGenE
- Architecture: TRIPS, HWRT
- Key: Help other levels and leverage communication

Another Constraint?
- Parallel programming
- Synchronization
- Deadlock
- Livelock
- Memory management
- Hey.. Surprise!!! Communication Scheduling

Another Constraint?
- Oh God!!! Communication Scheduling

Focus of Architecture Research
- Reduce the load on programmers
  - Hardware transactional memory
  - Aggressive pre-fetching
  - Dynamic reconfiguration at every possible level
- Keep the architectural innovations transparent to compilers and programmers
  - Learn from the mistakes of Itanium!
  - Remember the success of out-of-order (OOO) execution

Reliability Issues -- Niti
- Shrinking transistor sizes and lower voltages
- Increased transient faults, process variations (leakage power and frequency variations), hard errors, interconnect noise
- Many-core: “many cores” may not work reliably
- Some cores will end up providing redundancy
- Heterogeneous cores may be able to help
- Simple in-order cores can provide redundancy at low cost
- The compute power gain of many-core can be offset by the reliability requirements of the system
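The redundancy idea can be sketched as running the same task on several cores and voting on the outputs. The slides do not say which scheme is meant; majority voting is used here purely as an illustration, and `vote`, `run_redundant`, `healthy_core` and `faulty_core` are hypothetical names with the cores simulated as plain functions.

```python
from collections import Counter

def vote(results):
    """Return the value held by a strict majority, or raise if there is none.

    Results must be hashable so they can be counted.
    """
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no majority among redundant results")
    return value

def run_redundant(task, arg, cores):
    """Run `task(arg)` on every simulated core and vote on the outputs.

    Each entry in `cores` is a callable (task, arg) -> result standing in
    for one physical core.
    """
    return vote([core(task, arg) for core in cores])

def healthy_core(task, arg):
    """A core that computes correctly."""
    return task(arg)

def faulty_core(task, arg):
    """Models a transient fault by flipping the lowest bit of an integer result."""
    return task(arg) ^ 1
```

With two healthy cores and one faulty core, `run_redundant(lambda x: x * x, 7, [healthy_core, faulty_core, healthy_core])` still returns the correct square. It also shows the cost named in the last bullet: three cores are spent to produce one trusted result.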

On-Chip Sensor Networks -- Nathaniel, Amlan
- Analog sensors everywhere!
- Need to monitor power, voltage droop, variation, critical paths, delays, slew rates, etc.
- Control system to react to changes.
- In multi-core, sensor network will only grow.
- Xeon and Itanium processors (JSSC Jan 06 & 07)
- On-chip sensors
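A "control system to react to changes" could be as simple as a threshold rule with hysteresis applied to each sensor reading. The sketch below picks a clock frequency from a temperature reading; the choice of temperature as the monitored quantity, the function name `next_frequency_mhz`, and every threshold and frequency value are assumptions for illustration, not figures from the slides or from the cited processors.

```python
def next_frequency_mhz(current_mhz, temperature_c, *, hot_c=85.0, cool_c=70.0,
                       step_mhz=200, min_mhz=800, max_mhz=3000):
    """Return the next clock frequency for one temperature reading.

    Above `hot_c` the clock steps down, below `cool_c` it steps up, and in
    between it holds, so readings near a threshold do not cause oscillation.
    All default values are illustrative assumptions.
    """
    if temperature_c > hot_c:
        return max(min_mhz, current_mhz - step_mhz)
    if temperature_c < cool_c:
        return min(max_mhz, current_mhz + step_mhz)
    return current_mhz
```

Calling the function once per sensor sample, and feeding its result back in as `current_mhz`, gives a basic closed loop. A real design would combine several sensors per core and would need to be tuned against measured behaviour.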