Gameboy to Intel x86 Static Binary Translator

Presentation transcript:

Gameboy to Intel x86 Static Binary Translator Jim Clark David Galos

The Nintendo Gameboy
CPU: 8-bit Sharp LR35902 running at 4.19 MHz, custom for the Gameboy but similar to the Intel 8080 and Zilog Z80
8 kB VRAM and 8 kB working RAM
Game code stored on interchangeable cartridges

To run this program on a different architecture...
Emulation:
Each CPU opcode is translated into a function that affects the “registers” in the same way
CPU registers are emulated using a data structure in a high-level language
The entire area of mappable memory is stored in a data structure in a high-level language
Binary translation:
Each CPU opcode is translated from its original architecture to the target's equivalent
CPU registers are mapped directly from source to target
The entire area of mappable memory is stored in the target's memory

Registers

An emulation approach: map each register into a variable in the high-level language
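As a rough illustration of the emulation approach, the register file could be held in a C struct. The type name `GbRegs` and the helper names are hypothetical and do not come from the presentation; the sketch only shows how the 8-bit registers H and L can be combined into the 16-bit pair HL.

```c
#include <stdint.h>

/* Hypothetical register file for an emulator; names are illustrative only. */
typedef struct {
    uint8_t a, f;      /* accumulator and flags */
    uint8_t b, c;
    uint8_t d, e;
    uint8_t h, l;
    uint16_t sp, pc;   /* stack pointer and program counter */
} GbRegs;

/* Read the 16-bit HL pair: H is the high byte, L is the low byte. */
static uint16_t gb_get_hl(const GbRegs *r) {
    return (uint16_t)((r->h << 8) | r->l);
}

/* Write the 16-bit HL pair back into the two 8-bit registers. */
static void gb_set_hl(GbRegs *r, uint16_t v) {
    r->h = (uint8_t)(v >> 8);
    r->l = (uint8_t)(v & 0xFF);
}
```

Every emulated instruction then reads and writes fields of this struct, which is the indirection the binary translation approach avoids by using real x86 registers.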

Our binary translation approach: map each register from the source architecture into an equivalent register in the target

Memory Access
0000-3FFF  ROM Bank 0
4000-7FFF  Switchable ROM bank
8000-9FFF  Video RAM
A000-BFFF  External RAM
C000-CFFF  Working RAM 0
D000-DFFF  Working RAM 1
E000-FDFF  Same as C000-DDFF
FE00-FE9F  Sprite Attribute Table
FEA0-FEFF  Not Usable
FF00-FF7F  I/O Ports
FF80-FFFE  High RAM
FFFF       Interrupt Enable Register

An emulation approach: access memory through functions
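A minimal sketch of what such accessor functions might look like in C. The names `gb_read8`, `gb_write8`, and `gb_resolve` are assumptions for illustration, and the sketch ignores bank switching and I/O side effects; it only handles the E000-FDFF region that mirrors C000-DDFF and treats the ROM area as read-only.

```c
#include <stdint.h>

/* Hypothetical flat model of the whole 64 KiB address space. */
static uint8_t gb_mem[0x10000];

/* E000-FDFF mirrors C000-DDFF, so fold mirrored addresses back down. */
static uint16_t gb_resolve(uint16_t addr) {
    if (addr >= 0xE000 && addr <= 0xFDFF)
        return (uint16_t)(addr - 0x2000);
    return addr;
}

/* Read one byte through the accessor. */
uint8_t gb_read8(uint16_t addr) {
    return gb_mem[gb_resolve(addr)];
}

/* Write one byte; writes to 0000-7FFF (ROM) are dropped in this sketch.
 * Real cartridges use such writes for bank switching, which is omitted. */
void gb_write8(uint16_t addr, uint8_t value) {
    if (addr < 0x8000)
        return;
    gb_mem[gb_resolve(addr)] = value;
}
```

The cost of this design is a function call and several comparisons on every memory access, which is part of what the label-based approach tries to avoid.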

Our binary translation approach:
Set up the .data section in an x86 asm file
Address memory through labels

CPU Instructions

An emulation approach: translate each opcode into a function
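For comparison, here is a sketch of how an emulator might implement ADD A,E as a C function. The struct `Regs` and the function name `op_add_a_e` are hypothetical; the flag bit positions (Z, N, H, C in the upper four bits of F) follow the LR35902's flag register layout.

```c
#include <stdint.h>

/* Flag bits in the F register. */
enum { FLAG_Z = 0x80, FLAG_N = 0x40, FLAG_H = 0x20, FLAG_C = 0x10 };

/* Only the registers this example needs; a hypothetical layout. */
typedef struct { uint8_t a, f, e; } Regs;

/* ADD A,E: A = A + E, updating zero, half-carry and carry flags. */
void op_add_a_e(Regs *r) {
    unsigned sum = (unsigned)r->a + r->e;
    uint8_t f = 0;                       /* N is always cleared by ADD */
    if ((sum & 0xFF) == 0)
        f |= FLAG_Z;                     /* 8-bit result is zero */
    if (((r->a & 0x0F) + (r->e & 0x0F)) > 0x0F)
        f |= FLAG_H;                     /* carry out of bit 3 */
    if (sum > 0xFF)
        f |= FLAG_C;                     /* carry out of bit 7 */
    r->a = (uint8_t)sum;
    r->f = f;
}
```

The emulator needs a dozen host instructions to do what a single x86 `addb` does, including the flag updates, which is the motivation for translating instead.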

Our binary translation approach: translate each opcode to its equivalent on the target architecture
ADD A,E becomes addb %cl, %ah
CP A,B becomes cmpb %bh, %ah
NOP becomes nop
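The mapping above could be expressed in the translator as a lookup from a Gameboy opcode byte to a line of AT&T-syntax x86 text. This is only a sketch: the function name `translate_opcode` is hypothetical, it covers just the three instructions from the slide, and the register assignment (A to %ah, B to %bh, E to %cl) is inferred from the slide's examples.

```c
#include <stdint.h>
#include <stddef.h>

/* Return x86 assembly text for a one-byte Gameboy opcode,
 * or NULL if this sketch does not cover it. */
const char *translate_opcode(uint8_t op) {
    switch (op) {
    case 0x00: return "nop";             /* NOP     */
    case 0x83: return "addb %cl, %ah";   /* ADD A,E */
    case 0xB8: return "cmpb %bh, %ah";   /* CP B    */
    default:   return NULL;              /* not handled here */
    }
}
```

A full translator would need an entry for every opcode, plus extra x86 instructions wherever the two architectures set flags differently.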

Emulation within binary translation
We need to account for peripherals
The generated .asm file is assembled, then linked with a high-level C program
The translated code calls the “fake_stuff” routine after each instruction, then returns
This is necessary to emulate the effects of the LCD controller, button input, DMA, etc.

How our translator works
A program, written in C, generates an x86 asm file from the given Gameboy ROM file
Object files are generated from this .asm file and from the C program containing the “fake stuff”
These files are linked, resulting in a single Windows .exe file

Generating an x86 asm file
Input is a Gameboy ROM file, a 32K binary
Because this input file is simply a binary, there is no way to distinguish code from data
This was our first hurdle in the project

Code or Data?
Consider the following 3 bytes arbitrarily pulled from Tetris: 21 E8 B0
This could be:
A 1 byte instruction followed by 2 bytes of data
A 2 byte instruction followed by 1 byte of data
A 3 byte instruction
1 byte of data, a 1 byte instruction, and another byte of data
...

If the series of bytes is interpreted as a 3 byte instruction, 21 corresponds to the instruction LD HL,d16, which loads immediate 16 bit data into register HL. Because the operand is stored low byte first, this instruction would load the value 0xB0E8 into register HL. Alternatively, 21 could be a byte of data followed by the 2 byte instruction E8 B0. E8 is also a valid opcode, and translates into ADD SP,r8, which adds a signed 8 bit immediate to the stack pointer.

Further complicating things, this sequence could be interpreted as 2 bytes of data, 21 and E8, followed by the single byte instruction B0. B0 is a valid opcode as well, and translates into OR B. Finally, this may not even be code! It could simply be 3 bytes of data. As you can see, it is very difficult to distinguish code from data as they are intermixed throughout the ROM.

How we solve this problem
When generating the .data section, treat each byte in the entire file as if it is data
.global _data0574
_data0574: .byte 0x21
.global _data0575
_data0575: .byte 0xe8
.global _data0576
_data0576: .byte 0xb0

How we solve the problem
When generating the .code section, assume each byte is the start of a complete instruction, but DON’T skip over the extra bytes we pulled in as operands!
.global _code0574
_code0574:
    movw $0xb0e8, %dx
_code0575:
    addl $0x8838, %esi
    andl $0x0000ffff, %esi
_code0576:
    orb %bh, %ah
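The generation loop this implies might look roughly like the following C sketch. The function names are hypothetical, and instead of real x86 output it writes each label with the Gameboy mnemonic as an assembler comment. What matters is the loop step: it always advances by one byte, never by the decoded instruction's length, so every ROM offset gets its own `_code` label.

```c
#include <stdio.h>
#include <stdint.h>

/* Mnemonics for the three opcodes from the slides; anything else is "??". */
static const char *mnemonic(uint8_t op) {
    switch (op) {
    case 0x21: return "LD HL,d16";
    case 0xE8: return "ADD SP,r8";
    case 0xB0: return "OR B";
    default:   return "??";
    }
}

/* Emit one code label per ROM byte, starting at address base.
 * Returns the number of labels written. */
int emit_code_section(FILE *out, const uint8_t *rom, int len, unsigned base) {
    int labels = 0;
    for (int i = 0; i < len; i++) {      /* step by 1, not by insn length */
        fprintf(out, "_code%04x:  # %s\n", base + (unsigned)i, mnemonic(rom[i]));
        /* A real translator would print the x86 equivalent here. */
        labels++;
    }
    return labels;
}
```

Calling `emit_code_section(stdout, rom, 3, 0x0574)` on the bytes 21 E8 B0 writes three labels, `_code0574` through `_code0576`. The trade-off is output size: most of the emitted instructions are never reached, but any jump target in the ROM is guaranteed to have a label.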

Why this works
If we are writing to an address, we know it’s data, so we append an offset to _data0000
If we are jumping to an address, we know it’s code, so we append an offset to _code0000
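One way to picture this rule is a helper that turns a Gameboy address into an assembler operand, choosing the base label by how the address is used. This is a sketch under assumptions: the names `RefKind` and `format_target` are invented here, and the `label+0xNNNN` operand form is one plausible way to express "an offset appended to `_data0000`", not necessarily what the original translator emitted.

```c
#include <stdio.h>
#include <stdint.h>

/* How the translated instruction uses the address. */
typedef enum { REF_DATA, REF_CODE } RefKind;

/* Write "<base label>+0x<addr>" into buf.
 * Returns the number of characters snprintf would have written. */
int format_target(char *buf, size_t size, RefKind kind, uint16_t addr) {
    const char *base = (kind == REF_CODE) ? "_code0000" : "_data0000";
    return snprintf(buf, size, "%s+0x%04x", base, (unsigned)addr);
}
```

A store to address 0x0574 would then reference `_data0000+0x0574`, while a jump to the same address would reference `_code0000+0x0574`, so the same Gameboy address resolves to the data copy or the code copy depending on the instruction.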