DSP Lecture Series DSP Memory Architecture Dr. E.W. Hu Nov. 28, 2000

Computer Architecture and VLSI Technology Pioneer: Lynn Conway In the 1960s, while working at IBM, Lynn Conway conceived the idea of multi-issue processors, a forerunner of today’s VLIW processors.

Fixed-point DSP datapath

What is memory architecture? The organization of memory and its interconnection with the processor’s datapath is called the memory architecture. The memory architecture determines the memory bandwidth, which is a critical factor in the performance of a DSP.

Memory bandwidth In general, bandwidth is defined as the rate at which words can be written to (stored in) or read from memory. For a DSP, it is convenient to think of how many instruction cycles are needed to complete a read or write operation. All else being equal, the fewer instruction cycles per access, the higher the bandwidth.
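
A minimal sketch of this relationship in C, assuming an illustrative (hypothetical) 100 MHz clock and treating bandwidth simply as clock rate divided by cycles per access:

```c
#include <stdio.h>

/* Effective memory bandwidth in words per second, given the clock rate
 * and the number of instruction cycles each memory access consumes. */
static double bandwidth_words_per_s(double clock_hz, double cycles_per_access)
{
    return clock_hz / cycles_per_access;
}

int main(void)
{
    double clock_hz = 100e6;   /* hypothetical 100 MHz DSP clock */
    printf("1 cycle/access : %.0f Mwords/s\n",
           bandwidth_words_per_s(clock_hz, 1.0) / 1e6);
    printf("2 cycles/access: %.0f Mwords/s\n",
           bandwidth_words_per_s(clock_hz, 2.0) / 1e6);
    return 0;
}
```

Halving the number of cycles per access doubles the achievable bandwidth, which is why the memory architectures discussed below aim to complete one or more accesses per instruction cycle.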

Why DSP applications need large memory bandwidth A high-performance datapath is only part of a high-performance processor. DSP applications are typically computation-intensive, which requires large amounts of data to be moved quickly between the datapath(s) and the memory module(s), as described in the next slide.

A typical DSP application: the FIR (finite impulse response) filter

At each “tap”, four memory accesses are needed for an FIR application: (1) fetch the MAC instruction from memory; (2) read the data value from memory (a ‘sample’ of the signal); (3) read the appropriate coefficient from memory (a known constant for a particular filter); (4) write the data value to memory (the next location in the delay line).
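
As a minimal C sketch (the names `delay_line` and `coeff` are illustrative, not taken from any particular device), one tap can be written so that the four accesses are visible; the fetch of the MAC instruction itself is implicit in executing the statement:

```c
/* One FIR "tap": multiply a delayed sample by its coefficient and accumulate.
 * Each tap implies four memory accesses:
 *   (1) fetch of the MAC instruction itself (implicit),
 *   (2) read of the sample from the delay line,
 *   (3) read of the filter coefficient,
 *   (4) write of the sample to the next delay-line location.
 * The caller must size delay_line so that index k + 1 is valid. */
float fir_tap(float *delay_line, const float *coeff, int k, float acc)
{
    float x = delay_line[k];     /* (2) data read: signal sample           */
    float c = coeff[k];          /* (3) data read: known filter constant   */
    acc += x * c;                /* (1) the MAC operation                  */
    delay_line[k + 1] = x;       /* (4) data write: advance the delay line */
    return acc;
}
```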

The Von Neumann architecture for general-purpose processors

The Harvard architecture: the design basis for most DSPs; allows two or more memory accesses per instruction cycle

Variations of the Harvard architecture allow still more memory accesses per instruction cycle

Typical DSPs with two or three independent memory banks: Analog Devices ADSP-21xx; AT&T DSP16xx; Zilog Z893xx; Motorola DSP5600x, DSP563xx, and DSP96002

Other approaches to achieving multiple memory accesses per cycle: (1) multiple, sequential accesses per instruction cycle over a single set of buses (each access takes less than one cycle), e.g., the Zoran ZR3800; (2) multi-ported memories that allow multiple concurrent memory accesses over two or more independent sets of buses (Fig. 5.4), e.g., the AT&T DSP32xx; (3) allowing a read and a write to proceed at the same time under restricted circumstances, e.g., the AT&T DSP16xx.

Using cache memory to reduce memory accesses An on-chip program cache reduces accesses to program memory. Program caches come in many different implementations: a single-instruction repeat buffer; a multiple-instruction cache (e.g., one that stores a block of 16 instructions); or a single-sector instruction cache that stores some number of the most recently used instructions.

Using the “modulo addressing” technique to reduce memory accesses To be discussed in the next seminar: memory addressing modes

Using “algorithmic approaches” to reduce memory accesses Algorithms can be structured to exploit data locality and thereby reduce memory accesses. DSP algorithms that operate on blocks of input data often fetch the same data from memory multiple times during execution, as in the case of FIR filter computation. In the example that follows, the filter operates on a block of two input samples: instead of computing output samples one at a time, it computes two output samples at a time, allowing it to reuse previously fetched data. This effectively reduces the required memory bandwidth from one instruction fetch and two data fetches per instruction cycle to one instruction fetch and one data fetch per instruction cycle.

Illustration of algorithmic approach
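
Since the original figure is not reproduced here, the following is a minimal C sketch of the block approach described above (the function and variable names, and the block size of two, are illustrative): each coefficient and each input sample is fetched from memory once and then reused for both output samples.

```c
/* Block FIR: compute two output samples, y[n] and y[n+1], in one pass so
 * that each fetched coefficient c[k] and each fetched input sample feed
 * two MACs.  N is the filter length; x[] must contain valid samples at
 * indices n-(N-1) through n+1. */
void fir_block2(const float *x, const float *c, float *y, int n, int N)
{
    float acc0  = 0.0f;         /* accumulator for y[n]                     */
    float acc1  = 0.0f;         /* accumulator for y[n+1]                   */
    float x_new = x[n + 1];     /* newest sample, fetched once              */

    for (int k = 0; k < N; k++) {
        float ck    = c[k];     /* one coefficient fetch ...                */
        float x_old = x[n - k]; /* ... and one data fetch per iteration     */
        acc1 += ck * x_new;     /* both values are reused in two MACs       */
        acc0 += ck * x_old;
        x_new = x_old;          /* keep this sample in a register for reuse */
    }
    y[0] = acc0;                /* y[n]   */
    y[1] = acc1;                /* y[n+1] */
}
```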

Memory wait states Wait states are cycles in which the processor cannot execute its program because it is waiting for access to memory, due to, for example, slow memory or bus sharing.

On-chip ROM for low-cost embedded applications On-chip ROM (usually small, 256 to 36K words) is used to store small application programs and constant data for low-cost embedded applications.

External memory interfaces

External memory interfaces: manual caching If a section of often-used program code is stored in slow, off-chip memory, it is the programmer’s responsibility to move the code to faster on-chip RAM, either at system start-up or when that section of the program is needed.
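
A minimal sketch of such a start-up copy in C. The linker-defined symbols used here (`_fast_code_load`, `_fast_code_start`, `_fast_code_end`) are hypothetical; a real project would define them in its linker script for the section placed in on-chip RAM.

```c
#include <string.h>

/* Hypothetical linker-script symbols: where the section is stored
 * (off-chip) and where it should run from (on-chip RAM). */
extern const unsigned char _fast_code_load[];   /* load address, off-chip    */
extern unsigned char       _fast_code_start[];  /* run address, on-chip RAM  */
extern unsigned char       _fast_code_end[];    /* end of the on-chip region */

/* Call once at system start-up, before the copied code is first used. */
void copy_fast_code_to_onchip_ram(void)
{
    memcpy(_fast_code_start, _fast_code_load,
           (size_t)(_fast_code_end - _fast_code_start));
}
```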

Dynamic memory Most DSPs use static RAM, which is faster and easier to interface but more expensive. For low-cost, high-volume products, the designer might need to consider dynamic RAM, especially static-column DRAM.

Direct memory access DMA allows data transfers to take place (to or from the processor’s memory) without the involvement of the processor itself. It is typically used to improve the performance of I/O.
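
A generic sketch in C of how a DMA transfer is typically programmed. The register layout, bit assignments, and base address below are hypothetical, not those of any particular DSP; the pattern (set source, destination, and count, start the transfer, then let the CPU continue while the controller signals completion) is what the slide describes.

```c
#include <stdint.h>

/* Hypothetical memory-mapped DMA controller registers. */
typedef struct {
    volatile uint32_t src;    /* source address                     */
    volatile uint32_t dst;    /* destination address                */
    volatile uint32_t count;  /* number of words to transfer        */
    volatile uint32_t ctrl;   /* bit 0: start, bit 1: transfer done */
} dma_regs_t;

#define DMA0 ((dma_regs_t *)0x40001000u)   /* hypothetical base address */

void dma_copy(uint32_t src, uint32_t dst, uint32_t nwords)
{
    DMA0->src   = src;
    DMA0->dst   = dst;
    DMA0->count = nwords;
    DMA0->ctrl  = 1u;          /* start the transfer */

    /* The CPU is free to do useful work here; in a real system completion
     * is usually signalled by an interrupt rather than by polling. */
    while ((DMA0->ctrl & 2u) == 0u) {
        /* wait for the done bit */
    }
}
```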

Customization Some vendors are flexible enough to customize their chip designs for their customers (customizable DSPs).