High Dynamic Range Emeka Ezekwe M11 Christopher Thayer M12 Shabnam Aggarwal M13 Charles Fan M14 Manager: Matthew Russo 6/26/2015 1.


Agenda 2
• Project Description (Charles)
• Marketing (Shabnam)
• Behavioral Description (Emeka)
• Design Process (Chris)
• Floorplan Evolution (Shabnam)
• Design Specifications (Chris)
• Layout (Charles)
• Conclusion (Emeka)

Charles Fan Project Description 3

4
• High Dynamic Range?
  • Bright colors are BRIGHT
  • Dark colors are DARK
  • Details are seen CLEARLY
• Otherwise…
  • Colors and lights look distorted & bland
• Floating-point (FP) HDR format requires 48 bits per pixel
  • Problem: too much storage space & memory bandwidth!
  • Solution: HDR encoding yields 6:1 compression
• OUR GOAL: Implement efficient HDR decoding in hardware
  • 6:1 pixel compression
  • Increases usable storage space sixfold
  • Decreases memory bandwidth sixfold
  • Effectively increases performance

Shabnam Aggarwal Marketing 6

7
• AMD’s ATI Mobility Radeon X1900
  • 48-bit floating-point HDR
  • HDR compression is currently NOT supported
  • Performance hit deters developers
• Windows Vista also now requires a high-end GPU to realize its full graphics potential.
• Laptops & portable devices are using dedicated processors for graphics
• OLED (Organic Light-Emitting Diode) displays are being developed by Sony
  • Contrast ratio: 1,000,000:1

Marketing 9
Our HDR Decoder!!
• Our decoder is designed to interface between specially encoded textures stored on the GPU’s memory and one of the GPU’s texture caches that feed into the shader processor.
• Each ROP on (**ATI) is capable of processing 4 pixels per clock cycle. We plan for our hardware to decode the texture information for 4 pixels during each clock cycle.
• This decoder will allow smaller textures to be stored in the GPU’s memory, which will allow graphics cards to provide the same functions with less memory.
• Ultimately, this decoder can provide savings in cost, power consumption, heat dissipation, and size in current graphics cards.

Marketing 10
• Our HDR Decoder:
  • Smaller textures stored in GPU’s memory
  • Same functions…less memory
• Savings in:
  • Cost
  • Power consumption
  • Heat dissipation
  • Size
• HDR is the next generation of display technology

Emeka Ezekwe Behavioral & Algorithmic Description 11

Algorithmic Description
• Encoding
  • Break texture into 4×4 pixel blocks.
  • Extract luminance value of each pixel.
  • Normalize red and blue values and average over each 2×2 block. Green can be recalculated while decoding.
  • Allocate more bits to luminance values.
  • After encoding, a 4×4 block of pixels can be compressed from 48 bpp to 8 bpp.
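The storage arithmetic behind the 48 bpp to 8 bpp claim can be sketched as follows. This is a minimal illustration that uses only the figures from the slides; the function names and the 1024×1024 texture size are hypothetical and chosen for the example, not taken from the project.

```python
# Storage arithmetic for the encoding described above.
# Only the 4x4 block size and the 48 bpp / 8 bpp figures come from the slides;
# the helper names and the example texture size are assumptions.

BLOCK_SIDE = 4       # texture is split into 4x4 pixel blocks
RAW_BPP = 48         # floating-point HDR format, bits per pixel
ENCODED_BPP = 8      # bits per pixel after encoding


def block_bits(bits_per_pixel: int) -> int:
    """Bits needed to store one 4x4 block at the given bits per pixel."""
    return BLOCK_SIDE * BLOCK_SIDE * bits_per_pixel


def compression_ratio() -> float:
    """Ratio of raw to encoded storage per pixel."""
    return RAW_BPP / ENCODED_BPP


def texture_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes for a texture whose sides are multiples of the block size."""
    if width % BLOCK_SIDE or height % BLOCK_SIDE:
        raise ValueError("texture sides must be multiples of 4")
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    print("raw block bits:", block_bits(RAW_BPP))
    print("encoded block bits:", block_bits(ENCODED_BPP))
    print("compression ratio:", compression_ratio())
    print("1024x1024 raw bytes:", texture_bytes(1024, 1024, RAW_BPP))
    print("1024x1024 encoded bytes:", texture_bytes(1024, 1024, ENCODED_BPP))
```

One 4×4 block shrinks from 768 bits to 128 bits, which matches the 6:1 compression quoted in the project description.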

Algorithmic Description
• Decoding (luminance values)
  • Reconstruct Lp: 1 logical shift, 1 integer addition
  • Calculate GQ: 1 integer addition
  • Calculate final pixel values: 3 floating-point multiplications
  • Total calculations: 1 logical shift + 2 integer additions + 3 floating-point multiplications

Data Flow 14
[Figure: data-flow diagram. Labeled blocks: Find G, Int to FP, four "Compute 1 pixel" units, Reg stages, and four "Serialize output" units.]

Chris Thayer Design Process 15

Design Process 16
• Goal: Speed
  • 400 MHz
  • 4 pixels per cycle, 4 cycles per block
• Architectural decisions
  • No denormal support in floating-point multiplier
  • Pipelined design
  • Storing input values
  • Integer multiplication
    • Wallace trees
    • Booth encoding
  • Critical adders
    • Carry select
  • Integer-to-floating-point conversion
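The speed goal above implies a target throughput that can be worked out directly. The sketch below uses only the 400 MHz, 4 pixels per cycle, and 4 cycles per block figures from the slide; the function names are hypothetical.

```python
# Throughput implied by the design goal: 400 MHz, 4 pixels per cycle,
# 4 cycles per block. Helper names are illustrative assumptions.

TARGET_CLOCK_HZ = 400_000_000
PIXELS_PER_CYCLE = 4
CYCLES_PER_BLOCK = 4


def pixels_per_block() -> int:
    """Pixels decoded over the cycles spent on one block."""
    return PIXELS_PER_CYCLE * CYCLES_PER_BLOCK


def pixels_per_second(clock_hz: int) -> int:
    """Decoded pixels per second at the given clock frequency."""
    return clock_hz * PIXELS_PER_CYCLE


def blocks_per_second(clock_hz: int) -> int:
    """Decoded blocks per second at the given clock frequency."""
    return clock_hz // CYCLES_PER_BLOCK


if __name__ == "__main__":
    print("pixels per block:", pixels_per_block())
    print("pixels per second:", pixels_per_second(TARGET_CLOCK_HZ))
    print("blocks per second:", blocks_per_second(TARGET_CLOCK_HZ))
```

Four pixels per cycle over four cycles gives 16 pixels, which is consistent with the 4×4 block used by the encoding.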

Design Process
• Circuit-level decisions
  • Mirror FAs to reduce carry-chain delay
  • Two different HAs
  • AOI/OAI gates
  • Gate sizing along critical paths
  • Utilize Q and ~Q outputs from registers
  • Clock buffers built into register blocks
  • Double/triple-strapped VDD and GND
  • Repeaters to break up long wires
  • Balanced clock tree
  • Device folding

Verification Process 18
• C Implementation
• Structural Verilog
• Gate Level Schematic
• Layout
• Major Modules
• Pipeline Stages
• Global Signals

Shabnam Aggarwal Floorplan Evolution 19

Floorplan Evolution

Chris Thayer Design Specifications 21

Design Specifications 22
• Delays
  • Stage one pipeline: 1.8 ns
  • Stage two pipeline: 1.53 ns
  • Stage three pipeline: 2.479 ns
• Skew
  • Stage one: x
  • Stage two: x
  • Stage three: x
• Resulting Clock Speed: 500 MHz
  • 2 BILLION pixels per second
• Size: 442x453 microns
• Aspect Ratio: 1:1.024
• Transistors: 42,772
• Density: 0.21 T/micron^2
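The derived figures on this slide can be checked with a few lines of arithmetic. The sketch below takes the reported numbers as given; the function names are hypothetical, and the 4 pixels per cycle factor is carried over from the design goal. The delay-based bound ignores skew and register overhead, since the slide leaves skew unspecified, and it comes out lower than the reported 500 MHz; the slides do not reconcile the two, so the 500 MHz figure is used unchanged for throughput.

```python
# Cross-checks on the reported design specifications.
# Input numbers come from the slide; helper names are assumptions.

STAGE_DELAYS_NS = (1.8, 1.53, 2.479)   # pipeline stages one to three
REPORTED_CLOCK_HZ = 500_000_000
PIXELS_PER_CYCLE = 4                   # from the design goal
WIDTH_UM, HEIGHT_UM = 442, 453
TRANSISTORS = 42_772


def pixel_throughput(clock_hz: int) -> int:
    """Pixels per second at the given clock."""
    return clock_hz * PIXELS_PER_CYCLE


def area_um2() -> int:
    """Layout area in square microns."""
    return WIDTH_UM * HEIGHT_UM


def density() -> float:
    """Transistors per square micron."""
    return TRANSISTORS / area_um2()


def aspect_ratio() -> float:
    """Longer side divided by shorter side."""
    return max(WIDTH_UM, HEIGHT_UM) / min(WIDTH_UM, HEIGHT_UM)


def delay_bound_mhz() -> float:
    """Clock bound set by the slowest stage alone (no skew or setup time)."""
    return 1000.0 / max(STAGE_DELAYS_NS)


if __name__ == "__main__":
    print("pixels per second:", pixel_throughput(REPORTED_CLOCK_HZ))
    print("area (um^2):", area_um2())
    print("density (T/um^2):", round(density(), 2))
    print("aspect ratio 1:", round(aspect_ratio(), 3))
    print("delay-only clock bound (MHz):", round(delay_bound_mhz(), 1))
```

The throughput and density figures agree with the slide: 500 MHz at 4 pixels per cycle is 2 billion pixels per second, and 42,772 transistors over 200,226 square microns is about 0.21 T/micron^2.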

Charles Fan Layout 23

Floating Point Multiplier Layout 24 Pretty beautiful

Floating Point Multiplier Data Flow

Poly Layer 26

Metal One Layer 27

Metal Two Layer 28

Metal Three Layer 29

Metal Four Layer 30

Questions?
