Next Generation ISA Itanium / IA-64. Operating Environments IA-32 Protected Mode/Real Mode/Virtual Mode - if supported by the OS IA-64 Instruction Set.

Presentation transcript:

Next Generation ISA Itanium / IA-64

Operating Environments IA-32 Protected Mode/Real Mode/Virtual Mode - if supported by the OS IA-64 Instruction Set

Instruction Set Transition Model Overview
The processor can execute either IA-32 or IA-64 code. The active instruction set (IS) is switched by the following instructions:
- jmpe (IA-32 instruction): jump to an IA-64 target instruction and change the IS to IA-64.
- br.ia (IA-64 instruction): branch to an IA-32 target instruction and change the IS to IA-32.
- rfi (IA-64 instruction): "return from interruption", which resumes execution at either an IA-32 or an IA-64 instruction.
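The transition rules above can be read as a small state machine. The following is only a toy sketch: the class `Processor` and its methods are hypothetical names, and the assumption that an interruption is always handled in IA-64 mode (with `rfi` restoring whichever instruction set was interrupted) is a simplification of the real hardware.

```python
# Toy state machine for the instruction-set transitions on the slide.
# Only the mnemonics jmpe, br.ia and rfi come from the source; the
# class, its fields and the interrupt() helper are illustrative.

IA32, IA64 = "IA-32", "IA-64"


class Processor:
    def __init__(self, mode=IA64):
        self.mode = mode
        self.interrupted_mode = None  # instruction set active when interrupted

    def interrupt(self):
        # Assumption: the handler runs as IA-64 code.
        self.interrupted_mode = self.mode
        self.mode = IA64

    def execute(self, mnemonic):
        if mnemonic == "jmpe":
            if self.mode != IA32:
                raise ValueError("jmpe is an IA-32 instruction")
            self.mode = IA64
        elif mnemonic == "br.ia":
            if self.mode != IA64:
                raise ValueError("br.ia is an IA-64 instruction")
            self.mode = IA32
        elif mnemonic == "rfi":
            if self.mode != IA64 or self.interrupted_mode is None:
                raise ValueError("rfi needs a pending interruption in IA-64 mode")
            self.mode = self.interrupted_mode
            self.interrupted_mode = None
        else:
            raise ValueError("unknown mnemonic: " + mnemonic)
        return self.mode
```

For example, a processor that starts in IA-64 mode, executes `br.ia`, takes an interruption and then executes `rfi` ends up back in IA-32 mode.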

IA-64 Instruction Set Features
- Explicit parallelism.
- Features to enhance instruction-level parallelism:
  - Speculation (minimizes the impact of memory latency)
  - Predication
  - Software pipelining of loops
- Improved high-performance floating-point architecture.
- New multimedia instructions.
- A 2-bit cache hint field, encoded by the compiler, that guides the placement of cache lines in the cache hierarchy.
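Predication can be illustrated with a tiny interpreter in which every instruction carries a qualifying predicate and is skipped when that predicate is false. This is a hedged sketch, not IA-64 semantics in full: the tuple encoding, the function `run` and the register names `ra`, `rb`, `rmax` are made up for illustration, and only two operations are modelled. `p0` is treated as always true, as it is on IA-64.

```python
# Minimal model of predicated execution (if-conversion).
# Each instruction is (qualifying_predicate, opcode, operands).

def run(program, regs):
    preds = {"p0": True}  # p0 always reads as true
    for qp, op, args in program:
        if not preds.get(qp, False):
            continue  # predicated off: the instruction has no effect
        if op == "cmp.lt":
            p_true, p_false, a, b = args
            preds[p_true] = regs[a] < regs[b]
            preds[p_false] = not preds[p_true]
        elif op == "mov":
            dest, src = args
            regs[dest] = regs[src]
        else:
            raise ValueError("unsupported opcode: " + op)
    return regs


# Branch-free maximum: both moves are issued, only one takes effect.
MAX_PROGRAM = [
    ("p0", "cmp.lt", ("p1", "p2", "ra", "rb")),
    ("p1", "mov", ("rmax", "rb")),
    ("p2", "mov", ("rmax", "ra")),
]
```

The point of the design is that the conditional branch disappears: the compare writes two complementary predicates and the compiler schedules both arms of the conditional in the same instruction stream.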

IA-64 Instruction Set Features
- Register stack: avoids unnecessary spilling and filling of registers at procedure call and return interfaces through compiler-controlled renaming. The callee executes an 'alloc' instruction specifying the number of registers it expects to use.
- Register rotation: allows concurrent execution of multiple iterations of a loop.
- Multimedia support and Streaming SIMD Extensions: along with the newer MMX instructions, IA-64 multimedia instructions treat the general registers as concatenations of eight 8-bit, four 16-bit or two 32-bit elements.
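Register rotation is easiest to see as a renaming offset applied to register numbers. The sketch below assumes a single rotating region and a rename base that is decremented once per loop iteration; the class `RotatingRegisters` and its method names are hypothetical and the region size is arbitrary.

```python
# Sketch of register rotation: logical register numbers are mapped to
# physical slots through a rotating base (rrb). Decrementing rrb makes
# a value written to logical register 0 appear in logical register 1
# on the next iteration, without any data being copied.

class RotatingRegisters:
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.rrb = 0  # rotating register base

    def _physical(self, logical):
        return (logical + self.rrb) % self.size

    def write(self, logical, value):
        self.slots[self._physical(logical)] = value

    def read(self, logical):
        return self.slots[self._physical(logical)]

    def rotate(self):
        # Performed once per iteration by the loop-closing branch.
        self.rrb = (self.rrb - 1) % self.size
```

In a software-pipelined loop this lets each stage of the pipeline read the value that the previous stage produced one iteration earlier, using fixed register names in the loop body.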

Architectural Overview.

IA-64 Execution Environment
Application register state (registers available to application programs):
- 128 general-purpose 64-bit registers (+1 NaT bit each).
- IA-32 integer and segment registers are contained in GR8-GR31 when in IA-32 mode.
- 128 82-bit floating-point registers (deferred exceptions are marked with the NaTVal encoding); IA-32 floating-point state is mapped onto them in the same way as for the general registers.
- 64 1-bit predicate registers.
- 8 64-bit branch registers (used for IA-64 branching).
- 64-bit instruction pointer (IP).

IA-64 Execution Environment
- 38-bit Current Frame Marker (CFM): state that describes the current general register stack frame.
- 128 64-bit application registers: special-purpose IA-64 and IA-32 application registers.
- 64-bit performance monitor data registers: monitor the performance of the hardware.
- 6-bit user mask: independent single-bit values used for performance monitors, alignment traps and monitoring floating-point register usage.
- Processor identifier (CPUID) registers: describe processor implementation-dependent IA-64 features.

IA-64 Execution Environment
Memory:
- Memory is addressed with 64-bit pointers.
- Memory can be accessed in units of 1, 2, 4, 8 or 16 bytes.
- The user mask controls whether IA-64 loads and stores use little-endian or big-endian byte ordering.
Instruction encoding and sequencing.
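The effect of the byte-ordering control can be shown with a few lines of Python. This is a minimal sketch: the function `load` and its `big_endian` flag are hypothetical stand-ins for a load instruction and the user-mask setting, and the size check simply mirrors the access sizes listed above.

```python
# The same bytes in memory yield different integer values depending on
# the byte order selected for the load.

ALLOWED_SIZES = (1, 2, 4, 8, 16)


def load(memory, address, size, big_endian=False):
    if size not in ALLOWED_SIZES:
        raise ValueError("unsupported access size: %d" % size)
    raw = memory[address:address + size]
    return int.from_bytes(raw, "big" if big_endian else "little")


memory = bytes.fromhex("1122334455667788")
as_little = load(memory, 0, 8)                   # 0x8877665544332211
as_big = load(memory, 0, 8, big_endian=True)     # 0x1122334455667788
```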

Application Programming Model

Using IA-64 Instructions
Format: [qp] mnemonic[.comp] dest=srcs
Expressing parallelism: a stop (;;) ends an instruction group, and the instructions within a group may execute in parallel.
    ld8 r1=[r5] ;;     // first group
    add r4=r5,r6       // second group
    sub r5=r7,r8
    st8 [r6]=r12
Bundles and templates: bundle boundaries are marked with curly braces, and each bundle contains a template specification and three instructions.
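To make the role of the stop concrete, the sketch below splits assembly text into instruction groups at each `;;`. It is a simplified illustration rather than an assembler: the function `instruction_groups` is hypothetical, it only understands `//` comments and stops at the end of a line, and the sample uses the bracketed memory-operand syntax of the IA-64 assembler.

```python
# Split IA-64 assembly text into instruction groups. A group ends at a
# stop (";;"); everything between two stops belongs to one group.

def instruction_groups(source):
    groups, current = [], []
    for raw in source.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        has_stop = line.endswith(";;")
        if has_stop:
            line = line[:-2].strip()
        if line:
            current.append(line)
        if has_stop and current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


SAMPLE = """
ld8 r1=[r5] ;;   // first group
add r4=r5,r6     // second group
sub r5=r7,r8
st8 [r6]=r12
"""
```

Calling `instruction_groups(SAMPLE)` returns two groups: the load on its own, followed by the add, subtract and store together.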

IA-64 Optimizations
Refer to the IA-64 Application Developer's Architecture Guide for details on:
- Memory references
- Predication, control flow and the instruction stream
- Software pipelining and loop support
- Floating-point applications

The Technology Behind Crusoe™ Processors Low-power x86-Compatible Processors Implemented with Code Morphing™ Software

Technology Overview
- First practical demonstration that a microprocessor can be implemented as a hardware/software hybrid.
- A hardware engine is logically surrounded by a software layer.
- The hardware is a VLIW engine executing up to four instructions per clock cycle.
- The software layer surrounding the CPU, called the Code Morphing software, dynamically morphs x86 instructions into native VLIW instructions.
- Support for Code Morphing is built into the underlying hardware.
- This offers the opportunity to improve performance without changing the underlying hardware.

Hardware Support for Code Morphing
[Figure: a 128-bit molecule containing the atoms FADD, ADD, LD and BRCC, which are issued to the floating-point unit, integer ALU #0, load/store unit and branch unit respectively.]

Code Morphing Software
- The Code Morphing software simplifies the chip hardware: less hardware means lower power consumption and lower heat dissipation.
- Given a matching change to the Code Morphing software, the hardware engine's native instruction set can be changed arbitrarily without affecting any x86 software.
- Software is recompiled and optimized transparently.
- Translations are performed once over a group of instructions and then used repeatedly.

Code Morphing Software
- For repeating blocks of code, the Code Morphing software reuses the translations stored in the translation buffer while optimizing the block further.
- Execution modes for x86 code range from interpretation, through translation using very simple code generation, to highly optimized code.
- The translator adds code to collect information about block execution and branch history.
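The progression from interpretation to cached translation can be sketched as a counter plus a lookup table. Everything in this block is illustrative: the class `CodeMorphingSketch`, the `hot_threshold` value and the "translation" (a tuple of tagged strings) are assumptions made for the example and say nothing about how the actual Code Morphing software decides when or how to translate.

```python
# Toy model: a block is interpreted until it has run often enough to
# be considered hot, then translated once and served from the cache.

class CodeMorphingSketch:
    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold  # hypothetical tuning knob
        self.exec_counts = {}               # profiling data per block
        self.translation_cache = {}         # block address -> translation
        self.log = []                       # execution mode per run

    def run_block(self, address, x86_block):
        if address in self.translation_cache:
            self.log.append("cached")
            return self.translation_cache[address]
        count = self.exec_counts.get(address, 0) + 1
        self.exec_counts[address] = count
        if count < self.hot_threshold:
            self.log.append("interpreted")
            return None
        # Stand-in for real code generation.
        translation = tuple("native:" + insn for insn in x86_block)
        self.translation_cache[address] = translation
        self.log.append("translated")
        return translation
```

Running the same block five times with the default threshold yields two interpreted runs, one translation and two cache hits, which is the amortization the slide describes.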

Hardware Support for Code Morphing
Exceptions and speculation:
- Shadow registers.
- Gated store buffer.
- Because of the hardware implementation, commit operations are 'free'.
- Alias hardware for loads and stores: loads use "load-and-protect" and stores use "store-under-alias-mask".
- A "translated" bit feature handles self-modifying code.
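How shadow registers and a gated store buffer work together can be sketched as a commit/rollback pair. This is a simplified model under stated assumptions: the class `SpeculativeMachine` and its methods are hypothetical, registers and memory are plain dictionaries, and the rollback path (discarding uncommitted state when a translated block faults) is the natural counterpart of commit rather than something spelled out on the slide.

```python
# Working registers are updated speculatively; stores are held in a
# buffer. commit() makes both permanent, rollback() discards them.

class SpeculativeMachine:
    def __init__(self):
        self.working = {}       # registers used by translated code
        self.shadow = {}        # last committed register state
        self.memory = {}        # committed memory contents
        self.store_buffer = []  # stores gated until the next commit

    def set_reg(self, name, value):
        self.working[name] = value

    def store(self, address, value):
        self.store_buffer.append((address, value))

    def commit(self):
        self.shadow = dict(self.working)
        for address, value in self.store_buffer:
            self.memory[address] = value
        self.store_buffer.clear()

    def rollback(self):
        self.working = dict(self.shadow)
        self.store_buffer.clear()
```

After a rollback the machine is back at the last committed x86 state, so the offending block can be re-executed more conservatively, for example by interpretation.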