Memory Memory 10/9 - 2004 INF5060: Multimedia data communication using network processors.


1 Memory Memory 10/9 - 2004 INF5060: Multimedia data communication using network processors

2 Overview (2004 Carsten Griwodz & Pål Halvorsen, INF5060 – multimedia communication using network processors)
- Memory on the IXP cards
  - Kinds of memory
  - Its features
  - Its accessibility
- Microengine assembler
- Memory management

3 Kinds of Memory
Memory | Size | Location
Microengine general purpose registers | 128 registers | On chip
StrongARM instruction cache | 16 Kbytes | On chip
StrongARM data cache | 8 Kbytes | On chip
StrongARM mini cache | 512 bytes | On chip
Scratch(pad) | 4 Kbytes | On chip
Instruction store | 64 Kbytes | On chip
FlashROM | 8 Mbytes |
SRAM | 8 Mbytes |
SDRAM | 256 Mbytes |

4 IXP Functional Units
[Figure: block diagram of the IXP network processor. Labels inside the processor: StrongARM Core, Microengine, SRAM Unit, SDRAM Unit, PCI Bus Unit, IX Bus Unit, various busses. Labels outside: host machine, PCI bus, PCI-to-PCI bridge, SDRAM (up to 256 MB), SRAM (up to 8 MB), Flash ROM (up to 8 MB), memory mapped I/O devices, IX Bus, Ethernet MAC (other IX devices). Bus labels: 64 bit/33 MHz, 64 bit/116 MHz, 32 bit/116 MHz, 64 bit/104 MHz.]

5 Kinds of Memory
- Physical memory on the IXP1200 is contiguous
- Memory in general is not byte-addressable
  - Memory units emulate byte addressing for the StrongARM
- Big endian architecture
  - StrongARM: big endian mode
  - Microengines are big endian
Memory type | Addressable data unit (bytes) | Relative access time (cycles)
Scratch(pad) | 4 | 12-14
SRAM | 4 | 16-20
SDRAM | 8 | 32-40
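A minimal sketch of what the addressable data units and the big endian byte order mean for address arithmetic. The names `mem_kind`, `unit_index`, `byte_in_unit` and `load_be32` are hypothetical helpers for illustration and are not part of any IXP tool or library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the three memory types in the table above. */
typedef enum { MEM_SCRATCH, MEM_SRAM, MEM_SDRAM } mem_kind;

/* Addressable data unit in bytes: 8 for SDRAM, 4 for SRAM and Scratch. */
static inline unsigned unit_bytes(mem_kind k)
{
    return k == MEM_SDRAM ? 8u : 4u;
}

/* Index of the addressable unit that contains a given byte offset. */
static inline uint32_t unit_index(mem_kind k, uint32_t byte_offset)
{
    return byte_offset / unit_bytes(k);
}

/* Position of the byte inside its unit; 0 is the most significant
 * byte because the architecture is big endian. */
static inline unsigned byte_in_unit(mem_kind k, uint32_t byte_offset)
{
    return byte_offset % unit_bytes(k);
}

/* Assemble a 32-bit big endian value from 4 bytes, independent of
 * the byte order of the machine this sketch is compiled on. */
static inline uint32_t load_be32(const uint8_t *p)
{
    assert(p != 0);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Byte offset 17 therefore falls into quadword 2 of SDRAM but into longword 4 of SRAM, which is why the same byte address cannot be handed unchanged to different memory units.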

6 Terms
- Careful! Inconsistencies!
- Wording in Intel IXP manuals
  - Word: 16 bit
  - Longword: 32 bit
  - Quadword: 64 bit
- Wording in StrongARM and other ARM manuals
  - Halfword: 16 bit
  - Word: 32 bit

7 Kinds of Memory
- Memory accessible to StrongARM
  - Mapped into a single address space
- Memory accessible to microengines
  - Individually mapped
  - Separate assembler instructions for each kind
StrongARM address space (from the figure):
Start address | Device
0000 0000 | Device 0: SRAM Unit
4000 0000 | Device 1: PCI Unit
8000 0000 | Device 2: Reserved
9000 0000 | Device 3: StrongARM Core System
A000 0000 | Device 4: Reserved
B000 0000 | Device 5: AMBA Translation Unit
C000 0000 | Device 6: SDRAM Unit (up to the top of the address space)
Memory mapped individually for the microengines (from the figure): SDRAM, Scratchpad, Microengine registers, SRAM

8 Memory: memory, cache memory, registers
- StrongARM core caches
- Microengine registers
- SDRAM
- SRAM
- IX Bus Unit: Scratch(pad) memory

9 StrongARM

10 StrongARM Core Features
- A general purpose processor
  - With MMU
- 16 Kbytes instruction cache
  - Round robin replacement
- 8 Kbytes data cache
  - Round robin replacement
  - Write-back cache, cache replacement on read, not on write
- 512 byte mini-cache for data that is used once and then discarded
  - To reduce flushing of the main data cache
- Instruction code stored in SDRAM

11 StrongARM Core Access
- Full access to
  - SDRAM Unit
  - SRAM Unit
    - incl. FlashROM
  - PCI Bus Unit
- Access to microengine’s
  - Program code
  - Status registers
  - Program counters
- Access to IX bus unit’s
  - Status registers
  - Scratch memory
[Figure labels: StrongARM Core, SRAM Unit, SDRAM Unit, PCI Bus Unit, IX Bus Unit, Microengine]

12 Microengines

13 Microengine Features
- 4 hardware contexts
- 2K x 32 bit instruction control store
  - Every instruction is 32 bits long
  - No instruction cache
  - Instructions downloaded onto the microengine by the StrongARM
  - Not loaded from RAM on demand
- 5-stage instruction pipeline
  - Blocks for reference operations
  - Deferred execution to reduce context switch penalty
- 256 registers
  - 32 bit registers
- Load and store architecture
  - Must bring data into registers, work, write to destination
  - Single cycle access in registers
  - Use “reference command” to fetch into registers
  - Yield/sleep during fetch execution

14 Microengine Access
- Full access to
  - SDRAM Unit
  - SRAM Unit
  - IX Bus Unit
- Access to StrongARM
  - Interrupts
  - Trigger status register reads
- Access to PCI bus unit
  - Initiate DMA with SDRAM
- Access to other microengines
  - None
- Access to self
  - Inter-thread signaling
  - No access to own instruction code
[Figure labels: Microengine, SRAM Unit, SDRAM Unit, PCI Bus Unit, IX Bus Unit, StrongARM Core]

15 Microengine Registers
[Figure: microengine registers. From: IXP1200 Family Hardware Reference Manual]

16 Microengine Registers
- 256 registers
- 128 general purpose registers
  - Arranged in two banks A and B
  - Instructions with 2 input registers
    - From different banks
    - Otherwise assembler warning
- 128 transfer registers
  - Transfer registers are not general purpose registers
  - Ports to their neighboring functional unit
  - 64 SDRAM transfer registers
    - Transfer to and from SDRAM
    - 32 read / 32 write
  - 64 SRAM transfer registers
    - Transfer to and from everything but SDRAM
    - 32 read / 32 write
  - 4 busses can be used in parallel
    - By different threads
- Loading transfer registers
  - 64 bytes at once from one functional unit to another
  - 128 bytes at once from the IX bus

17 SDRAM

18 General features
- Recommended use
  - StrongARM instruction code
  - Large data structures
  - Packets during processing
- 64-bit addressed (8 byte aligned, quadword aligned)
- 256 Mbytes
- 928 Mbytes/s peak bandwidth
  - Higher bandwidth than SRAM
  - Higher latency than SRAM
- Access
  - StrongARM
  - Microengines
  - StrongARM takes precedence
  - PCI DMA on behalf of microengines
  - Direct access to IX Bus Unit’s Transmit and Receive FIFO

19 Special features
- Byte, word, longword access supported through a read-modify-write access to quadwords
  - Speed penalty
- Direct path from SDRAM to IX Bus Transmit and Receive FIFOs
  - Controlled by microengines
  - Up to 64 bytes transferable without microengine involvement
- Byte aligner between SDRAM and IX Bus
  - For sending to the Transmit FIFO
  - Shift bytewise when e.g. header length has changed
  - Can only be used by microengines in the t_fifo_wr command
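The read-modify-write access to quadwords can be illustrated in software. The following is only a sketch of the idea, assuming big endian byte lanes (byte 0 is the most significant byte of a quadword); `sdram_write_byte` is a hypothetical name and the array merely stands in for SDRAM.

```c
#include <assert.h>
#include <stdint.h>

/* Store a single byte into memory that can only be accessed in
 * 64-bit quadwords: read the quadword, replace one byte lane,
 * write the quadword back. */
static inline void sdram_write_byte(uint64_t *quadwords,
                                    uint32_t byte_addr, uint8_t value)
{
    assert(quadwords != 0);
    uint32_t qw_index = byte_addr / 8u;        /* which quadword        */
    unsigned lane     = byte_addr % 8u;        /* which byte inside it  */
    unsigned shift    = (7u - lane) * 8u;      /* big endian: lane 0 is */
                                               /* the top byte          */
    uint64_t qw = quadwords[qw_index];         /* read                  */
    qw &= ~((uint64_t)0xFF << shift);          /* modify: clear lane    */
    qw |= (uint64_t)value << shift;            /* modify: insert byte   */
    quadwords[qw_index] = qw;                  /* write                 */
}
```

Every sub-quadword store costs a read and a write of the whole quadword, which is the speed penalty named on the slide.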

20 SRAM

21 General features
- Recommended use
  - Lookup tables
  - Free buffer lists
  - Data buffer queue lists
- 32-bit addressed (4 byte aligned, word aligned)
- 8 Mbytes
- 464 Mbytes/s peak bandwidth
  - Lower bandwidth than SDRAM
  - Lower latency than SDRAM
- Access
  - StrongARM
  - Microengines
  - StrongARM takes precedence

22 Accessing SRAM
- StrongARM access
  - Byte, word and longword access
  - Bit operations through SRAM Alias Address Space
  - Bit, byte, word write supported through read-modify-write
- Microengine access
  - Bit and longword access only
  - Up to 8 longwords with one command
  - Bit write supported through read-modify-write
  - Bit operations within instructions

23 Special features
- Atomic push/pop operations
  - For maintaining lists
  - 8 entry push/pop register list
  - Microengines
    - Named commands
  - StrongARM
    - Dedicated memory addresses
    - Don’t cache these memory areas
- Atomic bit test, set and clear
  - For synchronized access
  - Microengine
    - Use a write transfer register
    - Specify bits to test, read, or write
    - Reading the bit changes the write transfer register
  - StrongARM
    - Special macros for read-modify-write operations
    - Blocks until operation is completed
    - Don’t cache this memory
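How a test-and-set bit operation supports synchronized access can be sketched in software. This is a model of the semantics only, under the assumption that the operation hands back the longword as it was before modification; on the IXP the SRAM unit performs the step atomically, whereas this plain C code is not atomic. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of test_and_set_bits: set the masked bits and return the
 * previous contents so the caller can test them. */
static inline uint32_t test_and_set_bits(uint32_t *longword, uint32_t mask)
{
    assert(longword != 0);
    uint32_t old = *longword;
    *longword = old | mask;
    return old;
}

/* Model of test_and_clear_bits: clear the masked bits and return
 * the previous contents. */
static inline uint32_t test_and_clear_bits(uint32_t *longword, uint32_t mask)
{
    assert(longword != 0);
    uint32_t old = *longword;
    *longword = old & ~mask;
    return old;
}

/* A lock bit was acquired only if it was clear before we set it. */
static inline int try_acquire(uint32_t *longword, uint32_t lock_bit)
{
    return (test_and_set_bits(longword, lock_bit) & lock_bit) == 0;
}
```

A second caller that runs `try_acquire` on the same bit sees the bit already set in the returned value and knows it did not get the lock.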

24 Special features
- 8 entry CAM (content addressable memory) for read locks
  - For synchronized access
  - 8 concurrent locks on memory
  - Protect from StrongARM and microengines
  - Read, unlock and write_unlock
  - Microengines
    - sram assembler command
    - Waits until the lock is released
  - StrongARM
    - 3 separate 8 MByte mapped memory regions
    - Failed locking is indicated by flags, read always successful
    - Don’t cache these memory areas

25 StrongARM Core Memory Map
Devices by start address: Device 0 SRAM Unit (0000 0000), Device 1 PCI Unit (4000 0000), Device 2 Reserved (8000 0000), Device 3 StrongARM Core System (9000 0000), Device 4 Reserved (A000 0000), Device 5 AMBA Translation Unit (B000 0000), Device 6 SDRAM Unit (C000 0000)
Device 0 (SRAM Unit) in detail:
Function | StrongARM address range
Slow Port | 3840 0000 – 385F FFF
Command FIFO Test | 3800 0080 – 3800 00FF
SRAM CSRs | 3800 0000 – 3800 0028
List 7 Pop operations | 2780 0000 – 27FF FFFF
List 6 Pop operations | 2700 0000 – 277F FFFF
List 5 Pop operations | 2680 0000 – 26FF FFFF
List 4 Pop operations | 2600 0000 – 267F FFFF
List 3 Pop operations | 2580 0000 – 25FF FFFF
List 2 Pop operations | 2500 0000 – 257F FFFF
List 1 Pop operations | 2480 0000 – 24FF FFFF
List 0 Pop operations | 2400 0000 – 247F FFFF
List 7 Push operations | 2380 0000 – 23FF FFFF
List 6 Push operations | 2300 0000 – 237F FFFF
List 5 Push operations | 2280 0000 – 22FF FFFF
List 4 Push operations | 2200 0000 – 227F FFFF
List 3 Push operations | 2180 0000 – 21FF FFFF
List 2 Push operations | 2100 0000 – 217F FFFF
List 1 Push operations | 2080 0000 – 20FF FFFF
List 0 Push operations | 2000 0000 – 207F FFFF
Bit Test & Set | 1980 0000 – 19FF FFFF
Bit Test & Clear | 1900 0000 – 197F FFFF
Bit Write Set | 1880 0000 – 18FF FFFF
Bit Write Clear | 1800 0000 – 187F FFFF
CAM Unlock | 1600 0000 – 167F FFFF
Write Unlock | 1400 0000 – 147F FFFF
Read Lock | 1200 0000 – 127F FFFF
Read/Write | 1000 0000 – 107F FFFF
BootROM | 0000 0000 – 007F FFFF

26 Memory Map for SRAM addresses
Physical device | Function | StrongARM address space (byte addressing) | Microengine SRAM instruction command | Microengine address space (longword addressing)
SlowPort | | 3840 0000 – 385F FFF | read/write | 70 0000 – 7F FFFF
SRAM CSRs | | 3800 0000 – 3800 0013 | read/write | 60 0000 – 60 0080
SRAM | Pop operations | 2400 0000 – 27FF FFFF | pop | 00 0000 – 1F FFFF
SRAM | Push operations | 2000 0000 – 23FF FFFF | push | 00 0000 – 1F FFFF
SRAM | Bit Test & Set | 1980 0000 – 19FF FFFF | bit_wr (test_and_set_bits) | 00 0000 – 1F FFFF
SRAM | Bit Test & Clear | 1900 0000 – 197F FFFF | bit_wr (test_and_clear_bits) | 00 0000 – 1F FFFF
SRAM | Bit Write Set | 1880 0000 – 18FF FFFF | bit_wr (set_bits) | 00 0000 – 1F FFFF
SRAM | Bit Write Clear | 1800 0000 – 187F FFFF | bit_wr (clear_bits) | 00 0000 – 1F FFFF
SRAM | Unlock | 1600 0000 – 167F FFFF | unlock | 00 0000 – 1F FFFF
SRAM | Write Unlock | 1400 0000 – 147F FFFF | write_unlock | 00 0000 – 1F FFFF
SRAM | Read Lock | 1200 0000 – 127F FFFF | read_lock | 00 0000 – 1F FFFF
SRAM | Read/Write | 1000 0000 – 107F FFFF | read/write | 00 0000 – 1F FFFF
BootROM | | 0000 0000 – 007F FFFF | read/write | 20 0000 – 3F FFFF
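The relation between the two address columns can be checked with a little arithmetic. The sketch below covers only the plain Read/Write row (StrongARM 1000 0000 – 107F FFFF, microengine 00 0000 – 1F FFFF) and assumes the mapping is simply "subtract the region base, divide by 4"; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_RW_BASE  0x10000000u  /* StrongARM start of the Read/Write region */
#define SRAM_RW_LIMIT 0x107FFFFFu  /* StrongARM end of the Read/Write region   */

/* StrongARM byte address -> microengine longword address.
 * Returns 0 on success, -1 if the address lies outside the region. */
static inline int sram_arm_to_me(uint32_t arm_addr, uint32_t *me_addr)
{
    assert(me_addr != 0);
    if (arm_addr < SRAM_RW_BASE || arm_addr > SRAM_RW_LIMIT)
        return -1;
    *me_addr = (arm_addr - SRAM_RW_BASE) / 4u;  /* bytes -> longwords */
    return 0;
}

/* Microengine longword address -> StrongARM byte address. */
static inline uint32_t sram_me_to_arm(uint32_t me_addr)
{
    return SRAM_RW_BASE + me_addr * 4u;         /* longwords -> bytes */
}
```

The 8 Mbyte byte-addressed window holds 0x200000 longwords, which matches the microengine range 00 0000 – 1F FFFF in the table.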

27 IX Bus Unit

28 “FBI” Engine Interface
[Figure: the IX Bus Unit within the IXP network processor. Labels inside the IX Bus Unit: Transmit FIFO, Receive FIFO, Hash Units, Status Registers, Scratchpad. Other labels: SDRAM Unit, Microengines, StrongARM, IX Bus, Ethernet MAC (other IX devices).]

29 Scratch Memory: General Features
- Recommended use
  - Passing messages between processors and between threads
  - Semaphores, mailboxes, other IPC
- 32-bit addressed (4 byte aligned, word aligned)
- 4 Kbytes
- Has an atomic autoincrement instruction
  - Only usable by microengines

30 StrongARM Core Memory Map
Devices by start address: Device 0 SRAM Unit (0000 0000), Device 1 PCI Unit (4000 0000), Device 2 Reserved (8000 0000), Device 3 StrongARM Core System (9000 0000), Device 4 Reserved (A000 0000), Device 5 AMBA Translation Unit (B000 0000), Device 6 SDRAM Unit (C000 0000)
Device 5 (AMBA Translation Unit) in detail (ME = microengine):
Region | StrongARM address
Scratchpad Memory | B004 4000 – B004 4FFF
IX Bus Unit CSR | B004 0000
ME5 Transfer Regs | B000 6800
ME4 Transfer Regs | B000 6000
ME3 Transfer Regs | B000 5800
ME2 Transfer Regs | B000 5000
ME1 Transfer Regs | B000 4800
ME0 Transfer Regs | B000 4000
ME5 CSR | B000 2800
ME4 CSR | B000 2000
ME3 CSR | B000 1800
ME2 CSR | B000 1000
ME1 CSR | B000 0800
ME0 CSR | B000 0000
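The per-microengine addresses in this map follow a regular pattern with a stride of 0x800 bytes. A small sketch that reproduces the listed values; the macro and function names are hypothetical and only the constants come from the slide.

```c
#include <assert.h>
#include <stdint.h>

#define ME_COUNT      6u           /* ME0 .. ME5                     */
#define ME_CSR_BASE   0xB0000000u  /* ME0 CSR                        */
#define ME_XFER_BASE  0xB0004000u  /* ME0 transfer registers         */
#define ME_STRIDE     0x800u       /* distance between microengines  */
#define SCRATCH_BASE  0xB0044000u  /* scratchpad, 4 Kbytes           */

/* StrongARM address of the CSR block of microengine `me`. */
static inline uint32_t me_csr_addr(unsigned me)
{
    assert(me < ME_COUNT);
    return ME_CSR_BASE + me * ME_STRIDE;
}

/* StrongARM address of the transfer registers of microengine `me`. */
static inline uint32_t me_xfer_addr(unsigned me)
{
    assert(me < ME_COUNT);
    return ME_XFER_BASE + me * ME_STRIDE;
}

/* StrongARM address of a byte offset inside the scratchpad. */
static inline uint32_t scratch_addr(uint32_t byte_offset)
{
    assert(byte_offset < 0x1000u);
    return SCRATCH_BASE + byte_offset;
}
```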

31 Microengine Assembler

32 Using Microengine Registers
- Programming
  - Context-relative addressing
    - Each thread can have its own window of registers (one fourth of the total), so threads can’t overwrite each other
  - Absolute addressing
    - Register is visible to all threads
  - Context-relative vs. absolute addressing
    - Decided on a per-instruction basis
- Assembler
  - Supports symbolic names
  - Assigns registers from the different kinds
- Programmer
  - must take care concerning the number of registers used
  - can hint the assembler to assign (transfer) registers contiguously
- Context-relative addressing of the registers
  - Threads are only able to address their own register share
  - This is more typically used
  - Assembler notations
    - symbolic_register_name – general purpose register
    - $symbolic_register_name – SRAM transfer register
    - $$symbolic_register_name – SDRAM transfer register
- Absolute addressing
  - Threads can use more than their share of registers
  - Threads can communicate via registers
  - Assembler notations
    - @symbolic_register_name – general purpose register
    - @$symbolic_register_name – SRAM transfer register
    - @$$symbolic_register_name – SDRAM transfer register

33 Microengine Assembler
- ALU
  - alu[dest_reg, A_operand, alu_op, B_operand]
  - Perform addition, subtraction, bit operations
  - dest_reg
    - transfer register (TR), general purpose register (GPR) or nothing
  - A_operand
    - TR, GPR, immediate data, or nothing
  - B_operand
    - TR, GPR, or immediate data
- ALU_SHF
  - alu_shf[dest_reg, A_operand, alu_op, B_operand, B_op_shift_cnt]
  - Like ALU, but shift B_operand before evaluation
  - dest_reg
    - Context-relative TR, GPR, or nothing
  - A_operand
    - TR, GPR, immediate data, or nothing
  - B_operand
    - TR, GPR, or immediate data

34 Microengine Assembler
- BR_BCLR, BR_BSET
  - br_bclr[reg, bit_position, label#]
  - Branch if the given bit (0-31) in register reg is cleared or set, respectively
  - reg
    - Context-relative TR or GPR
- BR=BYTE, BR!=BYTE
  - br=byte[reg, byte_spec, byte_compare_value, label#]
  - Branch if the indicated byte (0-3) of register reg equals the constant value byte_compare_value, or does not, respectively
  - reg
    - Context-relative TR or GPR
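The branch conditions can be written as C predicates to make them concrete. This is a sketch of the tests only, not of the branching; it assumes that bit 0 is the least significant bit and that byte_spec 0 selects the least significant byte of the 32-bit register. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Condition tested by br_bset: is the bit at bit_position set? */
static inline int bit_is_set(uint32_t reg, unsigned bit_position)
{
    assert(bit_position < 32u);
    return (int)((reg >> bit_position) & 1u);
}

/* Condition tested by br_bclr: is the bit at bit_position cleared? */
static inline int bit_is_clear(uint32_t reg, unsigned bit_position)
{
    return !bit_is_set(reg, bit_position);
}

/* Condition tested by br=byte: does the selected byte equal the
 * constant? br!=byte branches on the negation.
 * Assumption: byte_spec 0 is the least significant byte. */
static inline int byte_equals(uint32_t reg, unsigned byte_spec,
                              uint8_t byte_compare_value)
{
    assert(byte_spec < 4u);
    return ((reg >> (byte_spec * 8u)) & 0xFFu) == byte_compare_value;
}
```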

35 Microengine Access to SDRAM
- Read, write, Receive FIFO read, Transmit FIFO write
  - sdram[sdram_cmd, $$sdram_xfer_reg, source_op_1, source_op_2, ref_count], optional_token
- Parameters
  - sdram_cmd
    - read: read from SDRAM to TRs
    - write: write from TRs to SDRAM
    - r_fifo_rd: read from Receive FIFO to SDRAM
    - t_fifo_wr: write to Transmit FIFO from SDRAM
  - $$sdram_xfer_reg
    - The first of a set of contiguous TRs for read and write operations
    - One ref_count requires two TRs
  - source_op_1/2
    - Specifies the address to read from or to write to
  - ref_count
    - Values between 1 and 8 are valid
  - optional_token
    - ctx_arb allows other threads to run until memory operation is complete
    - ctx_swap switches context to the next thread
  - The (complicated) indirect_ref option must be used for r_fifo_rd and t_fifo_wr

36 Microengine Access to SRAM (1/2)
- Read, write, read and lock, write and unlock, unlock, …
- sram[sram_cmd, $sram_xfer_reg, source_op_1, source_op_2, ref_count] optional_token
  - sram_cmd
    - read or write
  - $sram_xfer_reg
    - The first of ref_count contiguous TRs
  - source_op_1+source_op_2
    - Specifies the address to read from or to write to
  - ref_count
    - The number of longwords read or written
- sram[read_lock, $sram_xfer_reg, source_op_1, source_op_2, ref_count] optional_token
  - Like sram[read, …]
  - But lock the address source_op_1+source_op_2
- sram[write_unlock, $sram_xfer_reg, source_op_1, source_op_2, 1] optional_token
  - Write one TR to source_op_1+source_op_2 and unlock the address
- sram[unlock, --, source_op_1, source_op_2, 1] optional_token
  - Unlock the address specified by source_op_1+source_op_2

37 Microengine Access to SRAM (2/2)
- …, bit operations, push, pop
- sram[bit_wr, $bit_mask, source_op_1, source_op_2, bit_op] optional_token
  - As with scratch memory but with the larger address space
  - $bit_mask is a write TR that holds the mask on input and optional results
- sram[push, --, source_op_1, source_op_2, queue_num] optional_token
  - Add source_op_1 and source_op_2 to get an address
  - Push the address onto queue queue_num
- sram[pop, $popped_list, --, --, queue_num] optional_token
  - Pop an address from queue queue_num
  - Store the pointer in the TR $popped_list

38 Microengine Access to Scratch Memory
- Read, write, bit operations, in-place increment
- scratch[bit_wr, $sram_xfer_reg, source_op_1, source_op_2, bit_op], optional_token
  - Bit operations
- scratch[read, $sram_xfer_reg, source_op_1, source_op_2, ref_count], optional_token
  - Read into transfer registers
- scratch[write, $sram_xfer_reg, source_op_1, source_op_2, ref_count], optional_token
  - Write from transfer registers
- scratch[incr, --, source_op_1, source_op_2, 1], optional_token
  - In-place increment by 1
- Parameters
  - source_op_1/2
    - Context-relative transfer registers (TRs) or immediate values
    - Sum between 0 and 1023
  - $sram_xfer_reg
    - For read and write: the first of a set of contiguous TRs to be read or written
    - For bit_wr: a TR containing a bit mask
  - ref_count
    - Number of longwords read or written
    - Between 1 and 8
  - bit_op
    - set_bits, clear_bits, test_and_set_bits, test_and_clear_bits
    - For the test_ operations, the write TR is modified

39 Microengine Assembler
- Ordering problems
  - Example
      immed[$$temp, 0x1234]
      sdram[write, $$temp, base, 0, 1], ctx_swap, defer[1]
      immed[$$temp, 0x5678]
  - The wrong value may be written
    - Writing and context swapping are deferred
    - The register modification may overtake the write
- Address of a register
  - It is possible to determine the address of a register
      .local a_gp_reg
      immed[a_gp_reg, &$an_sram_reg]
      .endlocal

40 Memory Management

41 Resource Manager
- Task
  - Used by StrongARM code
  - For microACEs and microACE applications to interface with microengines
- API
  - Load code into microengines
  - Enable/disable microengines
  - Get/set microengine configuration and resource assignment
  - Send and receive packets to and from microcode blocks
  - Allocate and access uncached SRAM, SDRAM and Scratch memory

42 Resource Manager
- Data structures
  - RmMemoryHandle
    - Opaque handle identifying memory allocated by the resource manager
    - typedef int RmMemoryHandle;

43 Resource Manager
- RmMalloc
  - Allocate a particular kind of memory
    - RM_SRAM
    - RM_SDRAM
    - RM_SCRATCH
  - Some SRAM and SDRAM is already used by the ASL, some SDRAM is used by Linux, the rest can be used freely by microACEs for data structures of their choosing
  - The memory is not cached
  - The memory is not protected by an MMU, and the virtual address is the same for all processes
  - Returned pointers are always aligned (SDRAM to 8 bytes, SRAM and Scratch to 4 bytes)
  - Requested sizes are rounded to alignment
  - This allocation is not efficient
    - microACEs should allocate all memory they need at once and manage it themselves
  - ix_error RmMalloc( RmMemoryType in_memory_type, unsigned char* out_mem_handle_ptr, int in_size_in_bytes );
- RmFree
  - Release memory allocated by RmMalloc
  - ix_error RmFree( unsigned char* ptr );
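The advice to allocate everything at once and manage it locally could look like the following bump allocator. It is a minimal sketch: `pool_t`, `pool_init` and `pool_alloc` are hypothetical helpers, not Resource Manager functions, and the region passed to `pool_init` stands for one block obtained up front (for example from a single RmMalloc call) whose start is assumed to be aligned already.

```c
#include <assert.h>
#include <stddef.h>

/* One pre-allocated region that is handed out piece by piece. */
typedef struct {
    unsigned char *base;   /* start of the region                    */
    size_t         size;   /* total bytes in the region              */
    size_t         used;   /* bytes already handed out               */
    size_t         align;  /* 8 for SDRAM, 4 for SRAM and Scratch    */
} pool_t;

/* Round n up to the next multiple of align. */
static inline size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

static inline void pool_init(pool_t *p, void *base, size_t size, size_t align)
{
    assert(p != NULL && base != NULL && align > 0);
    p->base  = (unsigned char *)base;
    p->size  = size;
    p->used  = 0;
    p->align = align;
}

/* Hand out the next n bytes, rounded to the alignment.
 * Returns NULL when the region is exhausted. */
static inline void *pool_alloc(pool_t *p, size_t n)
{
    size_t need = round_up(n, p->align);
    if (need > p->size - p->used)
        return NULL;
    void *out = p->base + p->used;
    p->used += need;
    return out;
}
```

A bump allocator never frees individual pieces, which suits data structures that live as long as the microACE itself; anything with a shorter lifetime would need a free list on top.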

44 Resource Manager
- Translating between virtual and physical addresses
  - The microengines map memory differently into their address space than the StrongARM
  - StrongARM addresses make no sense to the microengines and have to be translated to offsets from the start of each particular kind of memory (and back)
- RmGetPhysOffset
  - ix_error RmGetPhysOffset( RmMemoryType in_memory_type, unsigned char* in_data_ptr, unsigned int* out_offset );
  - Translate address in_data_ptr in RmMalloc’d memory to its offset from the given memory type
  - The offset is in words (4 byte units) for SRAM and Scratch, and in quadwords (8 byte units) for SDRAM
- RmGetVirtualAddress
  - ix_error RmGetVirtualAddress( RmMemoryType in_memory_type, unsigned char** out_buffer_ptr, unsigned int in_offset);
  - Take the physical offset from the base of the given memory type and translate it into a virtual address valid for the StrongARM
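The unit conversion behind this translation can be sketched as plain pointer arithmetic. The code below does not call or reimplement the Resource Manager; `to_phys_offset`, `to_virtual` and `mem_kind_t` are hypothetical stand-ins, and `region_base` is assumed to be the StrongARM virtual address at which the given kind of memory starts.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KIND_SRAM, KIND_SDRAM, KIND_SCRATCH } mem_kind_t;

/* Offset unit: 8 bytes for SDRAM, 4 bytes for SRAM and Scratch. */
static inline size_t offset_unit(mem_kind_t kind)
{
    return kind == KIND_SDRAM ? 8u : 4u;
}

/* Virtual address -> offset in units from the start of the memory.
 * Returns 0 on success, -1 if ptr is below the base or unaligned. */
static inline int to_phys_offset(mem_kind_t kind,
                                 const unsigned char *region_base,
                                 const unsigned char *ptr,
                                 unsigned int *out_offset)
{
    assert(region_base != NULL && ptr != NULL && out_offset != NULL);
    if (ptr < region_base)
        return -1;
    size_t bytes = (size_t)(ptr - region_base);
    if (bytes % offset_unit(kind) != 0)
        return -1;
    *out_offset = (unsigned int)(bytes / offset_unit(kind));
    return 0;
}

/* Offset in units -> virtual address usable by the StrongARM. */
static inline unsigned char *to_virtual(mem_kind_t kind,
                                        unsigned char *region_base,
                                        unsigned int offset)
{
    assert(region_base != NULL);
    return region_base + (size_t)offset * offset_unit(kind);
}
```

The same byte distance of 16 yields offset 2 for SDRAM and offset 4 for SRAM, so the memory type always has to travel together with the offset.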

45 Summary
- Memory on the IXP cards
  - Kinds of memory
  - Its features
  - Its accessibility
- Microengine assembler
- Resource Manager functions

