Published by Arthur Evrard
Modified about 1 year ago
The Current Challenges in DataFlow Supercomputer Programming. Berlin, January 2013
A Classification of Supercomputer Systems
Intel Nehalem E5520 Quad-core CPU
  Computation capacity: # cores 4; clock frequency 2.27 GHz; peak performance 36 GFLOPs; power 80 W
  Memory capacity: L1 cache 128 kB, bandwidth 291 GB/s; memory size not limited, bandwidth 25 GB/s

Cell/B.E
  Computation capacity: # cores 1+8 hetero; clock frequency 3.2 GHz; peak performance 230.4 GFLOPs; power 135 W
  Memory capacity: CPU cache 512 kB, bandwidth 44 GB/s; local store 8*256 KB, bandwidth 204.8 GB/s; memory size 16 GB, bandwidth 25 GB/s

ClearSpeed CSX700
  Computation capacity: # cores 2+192 hetero; clock frequency 250 MHz; peak performance 96 GFLOPs; power 11.4 W
  Memory capacity: CPU cache 24 KB; local store 2*128 KB, bandwidth 192 GB/s; memory 2*8 GB, bandwidth 2*4 GB/s; bandwidth to host 4 GB/s

SGI RASC Accelerator board (2 x Virtex4 LX200), max 120 W
  Computation capacity: # LUTs / # FFs 200448 x2; # DSP48E 96 x2; clock frequency 200 MHz; peak performance 47 GFLOPs; power 120 W
  Memory capacity: Block RAMs # 336, size 0.7 MB; on-board memory 40 MB, bandwidth 16 GB/s; bandwidth to host 6.4 GB/s
Maxeler Max2 FPGA Acceleration Card
  Computation capacity: # LUTs / # FFs 414720; # DSP48Es 384; clock frequency 150 MHz; peak performance 116 GFLOPs; power 55 W
  Memory capacity: Block RAMs # 648, size 2.8 MB, bandwidth 1519 GB/s; on-board memory 12 GB, bandwidth 28 GB/s; bandwidth to host 4 GB/s

Convey coprocessor HC-1
  Computation capacity: # LUTs / # FFs 4*207360; # DSP48E 4*192; clock frequency n/a; peak performance 80 GFLOPs; power 100 W
  Memory capacity: Block RAMs # 4*288, size 4*1.25 MB, bandwidth n/a; on-board memory 8 GB, bandwidth 80 GB/s; bandwidth to host 1066 MT/s

NVidia GTX580
  Computation capacity: # multiprocessors 16; # cores 512; clock frequency 1.54 GHz; peak performance 1.58 TFLOPs; power 244 W
  Memory capacity: shared memory 768 KB, bandwidth N/A; on-board memory 6 GB, bandwidth 192.4 GB/s; bandwidth to host 8 GB/s

AMD ATI HD5870
  Computation capacity: # multiprocessors 20; # cores 1600; clock frequency 850 MHz; peak performance 2.72 TFLOPs; power 188 W
  Memory capacity: shared memory 640 KB, bandwidth 2176 GB/s; on-board memory 6 GB, bandwidth 153.6 GB/s; bandwidth to host 8 GB/s
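One way to compare the systems listed above is to divide each peak performance figure by the corresponding power figure. The sketch below is only an illustration: the function and variable names are hypothetical, the numbers are copied from the slides as listed, and peak figures say nothing about sustained performance on a real workload.

```python
# Peak performance (GFLOPs) and power (W), copied from the slide tables.
# The dictionary layout and all names here are illustrative assumptions.
SYSTEMS = {
    "Intel Nehalem E5520": (36.0, 80.0),
    "Cell/B.E": (230.4, 135.0),
    "ClearSpeed CSX700": (96.0, 11.4),
    "SGI RASC (2 x Virtex4 LX200)": (47.0, 120.0),
    "Maxeler Max2": (116.0, 55.0),
    "Convey HC-1": (80.0, 100.0),
    "NVidia GTX580": (1580.0, 244.0),   # 1.58 TFLOPs
    "AMD ATI HD5870": (2720.0, 188.0),  # 2.72 TFLOPs
}


def gflops_per_watt(peak_gflops, power_watts):
    """Return peak GFLOPs delivered per watt of listed power."""
    return peak_gflops / power_watts


def ranked_by_efficiency(systems):
    """Return (name, GFLOPs/W) pairs, most efficient first."""
    pairs = [(name, gflops_per_watt(*spec)) for name, spec in systems.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, efficiency in ranked_by_efficiency(SYSTEMS):
        print(f"{name:30s} {efficiency:6.2f} GFLOPs/W")
```

Running the script prints the eight systems ordered by peak GFLOPs per watt, using only the listed peak and power values.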
Table of Contents
0. Classification.pptx
1. Anegdotic.pptx
2. MaxelerAlabamaSlidesWoVeljkoFinal.pptx
3. Maxeler-examples1.pptx
4. 01_Introduction.pptx
5. 02_ProgrammingMaxCompiler.pptx
6. 03_MoreMaxCompiler.pptx
7. 04_Numerics.pptx
8. 05_Scheduling.pptx
9. 06_LoopsAndCyclicGraphs.pptx
10. 07_ElementaryFunctions.pptx
11. Maxeler-examples.pptx
12. StudentsWorldwide.pptx
13. AlgGrossPitaevskii-real.pptx
14. paperCACM.pdf
15. Discusion.pptx