ClearSpeed CSX620 Overview

References
- ClearSpeed Technical Training Slides for ClearSpeed Accelerator 620, software version 3.0, Slide Sets 1-6. Presenter: Brian Summers (senior engineer), December 2007.
  - Acknowledgement: Many slides used here are from Slide Set 1.
- ClearSpeed Introductory Programming Manual, January 2008.

Topics
Overview of ClearSpeed Board
- ClearSpeed Technology Company
- Accelerators
- ClearSpeed and HPC
- Hardware Overview
- Performance
- Software Development Kit (SDK)
- Application Examples
- Help and Support
Topics omitted from ClearSpeed Overview
- Installing Hardware and Software
- Most topics in the SDK overview (some will be covered later), e.g., the Cn language, Cn libraries, compiler, debugging Cn, assembler, linker, simulator, graphics profiler, libraries
- Moving Data
- Tuning Tips

ClearSpeed CSX600 Accelerator Board
- A PCI-X card equipped with two ClearSpeed CSX600 coprocessors

Performance Specifications of CSX600
- Sustained double-precision performance of 25 GFLOPS on DGEMM
- 10 W maximum power consumption
- 250 MHz clock speed
- Transfer speed of internal memory: 96 Gbytes/s
- Transfer speed of external memory: 3.2 Gbytes/s

Multi-threaded Array Processing (MTAP) architecture of CSX600
Mono execution unit
- processes non-parallel data
- handles program flow control
Poly execution unit
- 96 PEs
- 6KB SRAM
- dual 64-bit FPU
- integer ALU
- 32/64-bit floating-point multiplier & adder
- 128B register files

Cn language
- Similar to standard C
- Main difference is poly variables

Example code:

    #include // Output support
    #include // Extra functions to support features of hardware

    int main() {
        poly int n;
        n = get_penum();                 // individual PE number
        printfp("PE number: %d\n", n);   // Output different message per PE
        return 0;
    }

Library functions used:
- poly short get_penum(): number of current PE
- mono short get_num_pes(): number of PEs on CSX processor

Note: Do not contact ClearSpeed for help with a homework problem or a class question. They expect professional-level questions from owners of their CSX620 boards, not student questions about a class or homework.