Prelude to Multiprocessing Detecting cpu and system-board capabilities with CPUID and the MP Configuration Table.

1 Prelude to Multiprocessing Detecting cpu and system-board capabilities with CPUID and the MP Configuration Table

2 CPUID
Recent Intel processors provide a ‘cpuid’ instruction (opcode 0x0F, 0xA2) to assist software in detecting a CPU’s capabilities.
If it’s implemented, this instruction can be executed in any of the processor modes, and at any of its four privilege levels.
But this ‘cpuid’ instruction might not be implemented (e.g., 8086, 80286, 80386).

3 Intel x86 EFLAGS register
Bit layout (high to low): 31-22 = 0, 21 = ID, 20 = VIP, 19 = VIF, 18 = AC, 17 = VM, 16 = RF, 15 = 0, 14 = NT, 13-12 = IOPL, 11 = OF, 10 = DF, 9 = IF, 8 = TF, 7 = SF, 6 = ZF, 5 = 0, 4 = AF, 3 = 0, 2 = PF, 1 = 1, 0 = CF
Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register if the processor is capable of executing the ‘cpuid’ instruction.

4 But what if there’s no EFLAGS?
The early Intel processors (8086, 80286) did not implement any 32-bit registers.
The FLAGS register was only 16 bits wide.
So there was no ID-bit that software could try to ‘toggle’ (to see if ‘cpuid’ existed).
How can software be sure that the 32-bit EFLAGS register exists within the CPU?

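5 Detecting 32-bit processors
There’s a subtle difference in the way the shift/rotate instructions work when register CL contains the ‘shift-factor’.
The 8086 uses the full value in CL, but later processors truncate it to 5 bits. Intel’s documentation describes this masking as starting with the 80286, so a shift-count test separates the 8086 from its successors rather than the 80286 from the 80386.
Software can exploit this distinction as a first step in telling whether EFLAGS is implemented; a 16-bit 80286 still has to be ruled out by a further check.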

6 Detecting EFLAGS
# Here’s a test for the presence of EFLAGS
    mov   $-1, %ax        # a nonzero value
    mov   $32, %cl        # shift-factor of 32
    shl   %cl, %ax        # do logical shift
    or    %ax, %ax        # test result in AX
    jnz   is32bit         # EFLAGS present
    jmp   is16bit         # EFLAGS absent
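
To see why a shift-factor of 32 gives two different answers, the following C sketch models a 16-bit SHL both with and without truncation of the count. It simulates the instruction rather than executing it, and the names `shl16_model` and `mask_count` are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Model of a 16-bit SHL by CL (an illustration, not real hardware access).
   mask_count != 0 mimics CPUs that truncate the count to 5 bits;
   mask_count == 0 mimics a CPU that uses the full count. */
uint16_t shl16_model(uint16_t value, uint8_t count, int mask_count)
{
    if (mask_count)
        count &= 0x1F;                    /* keep only the low 5 bits */
    while (count-- > 0)
        value = (uint16_t)(value << 1);   /* shift one bit at a time */
    return value;
}
```

With a count of 32, the truncating model shifts by 0 and leaves AX nonzero, while the non-truncating model shifts every bit out and leaves AX equal to zero, which is what the `jnz` in the assembly test distinguishes.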

7 Testing for ID-bit ‘toggle’
# Here’s a test for the presence of the CPUID instruction
    pushfl                # copy EFLAGS contents
    pop   %eax            # to accumulator register
    mov   %eax, %edx      # save a duplicate image
    btc   $21, %eax       # toggle the ID-bit (bit 21)
    push  %eax            # copy revised contents
    popfl                 # back into EFLAGS
    pushfl                # copy EFLAGS contents
    pop   %eax            # back into accumulator
    xor   %edx, %eax      # do XOR with prior value
    bt    $21, %eax       # did ID-bit get toggled?
    jc    y_cpuid         # yes, can execute ‘cpuid’
    jmp   n_cpuid         # else ‘cpuid’ unimplemented

8 How does CPUID work?
Step 1: load value 0 into register EAX
Step 2: execute ‘cpuid’ instruction
Step 3: Verify ‘GenuineIntel’ character-string in registers (EBX,EDX,ECX)
Step 4: Find maximum CPUID input-value in the EAX register
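
Step 3 can be sketched in C as follows. The sketch takes the three register values as plain arguments, so it does not execute ‘cpuid’ itself, and the function names `cpuid_vendor_string` and `is_genuine_intel` are hypothetical. It assumes the usual layout in which each register holds four characters with the first character in the low byte.

```c
#include <stdint.h>
#include <string.h>

/* Assemble the 12-character vendor string from the CPUID input-value 0
   outputs. Bytes are taken in the order EBX, EDX, ECX, low byte first. */
void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}

/* Returns 1 if the three registers spell 'GenuineIntel', else 0. */
int is_genuine_intel(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    char vendor[13];
    cpuid_vendor_string(ebx, edx, ecx, vendor);
    return strcmp(vendor, "GenuineIntel") == 0;
}
```

For example, the values EBX = 0x756E6547, EDX = 0x49656E69 and ECX = 0x6C65746E decode to ‘Genu’, ‘ineI’ and ‘ntel’.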

9 load 1 into EAX and execute CPUID
Processor model and stepping information is returned in register EAX.
Version information in EAX:
  bits 27-20: Extended Family ID
  bits 19-16: Extended Model ID
  bits 13-12: Type
  bits 11-8: Family ID
  bits 7-4: Model
  bits 3-0: Stepping ID
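
The field boundaries above translate directly into shifts and masks. The following is a minimal sketch; the struct and function names are hypothetical, and it only splits the raw fields without combining the extended fields with the base ones.

```c
#include <stdint.h>

/* Raw fields of the EAX value returned for CPUID input-value 1. */
struct cpu_signature {
    unsigned stepping, model, family, type, ext_model, ext_family;
};

struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;
    s.stepping   =  eax        & 0xF;    /* bits 3-0   */
    s.model      = (eax >> 4)  & 0xF;    /* bits 7-4   */
    s.family     = (eax >> 8)  & 0xF;    /* bits 11-8  */
    s.type       = (eax >> 12) & 0x3;    /* bits 13-12 */
    s.ext_model  = (eax >> 16) & 0xF;    /* bits 19-16 */
    s.ext_family = (eax >> 20) & 0xFF;   /* bits 27-20 */
    return s;
}
```

As an illustrative input (not a value taken from the slides), 0x000306A9 splits into stepping 9, model 0xA, family 6, type 0, extended model 3 and extended family 0.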

10 Some Feature Flags in EDX
  bit 28: HTT = HyperThreading Technology (1 = yes, 0 = no)
  bit 13: PGE = Page Global Entries (1 = yes, 0 = no)
  bit 9: APIC = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
  bit 3: PSE = Page-Size Extensions (1 = yes, 0 = no)
  bit 2: DE = Debugging Extensions (1 = yes, 0 = no)
  bit 1: VME = Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
  bit 0: FPU = Floating-Point Unit on-chip (1 = yes, 0 = no)

11 Some Feature Flags in ECX
  bit 5: VMX = Virtual Machine Extensions (1 = yes, 0 = no)
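
Testing one of these flags is a single shift and mask. The sketch below names the bit positions listed on the two feature-flag slides; the enum names and the helper `feature_present` are hypothetical, and the register value is assumed to have been obtained already from ‘cpuid’ with EAX = 1.

```c
#include <stdint.h>

/* Bit positions of some CPUID.1 feature flags. */
enum {
    EDX_FPU  = 0,  EDX_VME = 1,  EDX_DE  = 2,  EDX_PSE = 3,
    EDX_APIC = 9,  EDX_PGE = 13, EDX_HTT = 28,
    ECX_VMX  = 5
};

/* Returns 1 if the given bit of a feature register is set, else 0. */
int feature_present(uint32_t reg, unsigned bit)
{
    return (int)((reg >> bit) & 1u);
}
```

A caller would write, for example, `feature_present(edx, EDX_HTT)` to ask whether HyperThreading is reported.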

12 Multiprocessor Specification
It’s an industry standard, allowing OS software to use multiple processors in a uniform way.
OS software searches in three regions of the physical address-space below 1-megabyte for a “paragraph-aligned” data-structure of length 16 bytes called the MP Floating Pointer Structure:
–Search in lowest KB of Extended BIOS Data Area
–Search in topmost KB of conventional 640K RAM
–Search in the 128KB ROM-BIOS (0xE0000-0xFFFFF)
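
The search itself can be sketched as a scan over a copy of one of those regions. Two details here come from the MP Specification 1.4 rather than from the slide: the structure begins with the four characters "_MP_", and its 16 bytes must sum to zero modulo 256. The function name is hypothetical, and the sketch assumes the buffer starts at a paragraph-aligned physical address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan a copy of a low-memory region for the MP Floating Pointer Structure.
   Returns the offset of the structure within buf, or -1 if none is found. */
long find_mp_floating_pointer(const uint8_t *buf, size_t len)
{
    for (size_t off = 0; off + 16 <= len; off += 16) {  /* paragraph steps */
        if (memcmp(buf + off, "_MP_", 4) != 0)
            continue;                      /* no signature at this paragraph */
        uint8_t sum = 0;
        for (int i = 0; i < 16; i++)
            sum = (uint8_t)(sum + buf[off + i]);
        if (sum == 0)                      /* checksum over all 16 bytes */
            return (long)off;
    }
    return -1;
}
```

The checksum test matters because the four signature characters can occur by chance in BIOS data; a candidate is accepted only when both checks pass.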

13 MP Floating Pointer Structure
This structure may contain an ID-number for one of a small number of standard SMP system architectures, or may contain the memory address for a more extensive MP Configuration Table having entries that specify a “customized” system architecture.
The machines in our classroom employ the latter of these two options.

14 An example record
The MP Configuration Table will contain a record for each logical processor.
Fields of a processor record, in order of increasing offset:
  Entry Type (= 0)
  Local-APIC ID
  Local-APIC version
  CPU Flags: BP (bit 1), EN (bit 0)
  CPU signature (stepping, model, family)
  Feature Flags
  reserved (=0)
BP = Bootstrap Processor (1=yes, 0=no), EN = Enabled (1=yes, 0=no)
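
A processor record can be decoded from its raw bytes as sketched below. The byte offsets (type at 0, Local-APIC ID at 1, version at 2, CPU flags at 3, signature at 4, feature flags at 8, within a 20-byte record) follow the MP Specification 1.4; the struct and function names are hypothetical and are not taken from ‘mpinfo.cpp’.

```c
#include <stdint.h>

struct mp_processor_entry {
    uint8_t  apic_id, apic_version;
    int      enabled, bootstrap;
    uint32_t signature, features;
};

/* Read a little-endian 32-bit value from four consecutive bytes. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 20-byte processor record. Returns 0 on success, or -1 if
   the entry-type byte is not 0 (so this is not a processor record). */
int parse_processor_entry(const uint8_t e[20], struct mp_processor_entry *out)
{
    if (e[0] != 0)
        return -1;
    out->apic_id      = e[1];
    out->apic_version = e[2];
    out->enabled      = e[3] & 1;          /* EN, bit 0 */
    out->bootstrap    = (e[3] >> 1) & 1;   /* BP, bit 1 */
    out->signature    = le32(e + 4);
    out->features     = le32(e + 8);
    return 0;
}
```

Counting the records for which `enabled` is 1 gives the number of usable processors described by the table.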

15 Our ‘mpinfo.cpp’ utility
We created a Linux utility that will display the system-information contained in the MP Configuration Table (in hex format).
You can refer to the ‘MP Specification 1.4’ document (online) to interpret this display.
This utility needs a device-driver ‘dram.c’ to be pre-installed (in order that it be able to directly access the system’s memory).

16 A processor’s Local-APIC
The purpose of each processor’s APIC is to allow the CPUs in a multiprocessor system to send messages to one another and to manage the delivery of the interrupt-requests from the various peripheral devices to one (or more) of the CPUs in a dynamically programmable way.
Each processor’s Local-APIC has a variety of registers, all ‘memory mapped’ to paragraph-aligned addresses within the 4KB page at physical-address 0xFEE00000.

17 Local-APIC’s register-space
[Diagram: the 4GB physical address-space, with RAM starting at 0x00000000 and the Local-APIC registers at 0xFEE00000]

18 Analogies with the PIC
Among the registers in a Local-APIC are these (which had analogues in the older 8259 PIC’s design):
–IRR: Interrupt Request Register (256 bits)
–ISR: In-Service Register (256 bits)
–TMR: Trigger-Mode Register (256 bits)
For each of these, its 256 bits are divided among eight 32-bit register addresses.
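
Because each 256-bit register is split into eight paragraph-aligned 32-bit pieces, finding the bit for a given interrupt vector takes one division and one remainder. The sketch below uses the starting offsets given in Intel’s manuals (ISR at 0x100, TMR at 0x180, IRR at 0x200), which are not stated on the slide; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define LAPIC_BASE 0xFEE00000u
#define LAPIC_ISR  0x100u    /* In-Service Register, first piece        */
#define LAPIC_TMR  0x180u    /* Trigger-Mode Register, first piece      */
#define LAPIC_IRR  0x200u    /* Interrupt Request Register, first piece */

/* Physical address of the 32-bit piece holding the bit for 'vector'. */
uint32_t lapic_vector_reg(uint32_t first_offset, uint8_t vector)
{
    return LAPIC_BASE + first_offset + (uint32_t)(vector / 32) * 0x10u;
}

/* Bit position of 'vector' within that 32-bit piece. */
unsigned lapic_vector_bit(uint8_t vector)
{
    return vector % 32u;
}
```

For vector 0x41, the IRR bit is bit 1 of the piece at 0xFEE00220.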

19 New way to do ‘EOI’
Instead of using a special End-Of-Interrupt command-byte, the Local-APIC contains a dedicated ‘write-only’ register (named the EOI Register) which an Interrupt Handler writes to when it is ready to signal an EOI.
# issuing EOI to the Local-APIC
    mov   $0xFEE00000, %ebx        # address of the cpu’s Local-APIC
    movl  $0, %fs:0xB0(%ebx)       # write any value into EOI register
# Here we assume segment-register FS holds the selector for a segment-descriptor
# for a ‘writable’ 4GB-size expand-up data-segment whose base-address equals 0

20 Each CPU has its own timer!
Four of the Local-APIC registers are used to implement a programmable timer.
It can privately deliver a periodic interrupt (or one-shot interrupt) just to its own CPU.
–0xFEE00320: Timer Vector register
–0xFEE00380: Initial Count register
–0xFEE00390: Current Count register
–0xFEE003E0: Divider Configuration register

21 Timer’s Local Vector Table
Register at 0xFEE00320:
  bit 17: MODE (0=one-shot, 1=periodic)
  bit 16: MASK (0=unmasked, 1=masked)
  bit 12: BUSY (0=not busy, 1=busy)
  bits 7-0: Interrupt ID-number
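
Composing a value for this register is a matter of OR-ing the fields into place. A minimal sketch, with hypothetical function names, that builds the value but leaves the actual write to 0xFEE00320 to the caller:

```c
#include <stdint.h>

/* Build a Timer LVT value from an interrupt ID-number and two options. */
uint32_t make_timer_lvt(uint8_t vector, int periodic, int masked)
{
    uint32_t lvt = vector;        /* bits 7-0: interrupt ID-number  */
    if (masked)
        lvt |= 1u << 16;          /* MASK: 1 = masked               */
    if (periodic)
        lvt |= 1u << 17;          /* MODE: 1 = periodic             */
    return lvt;
}

/* Returns 1 if the BUSY bit (bit 12) is set in a value read back. */
int timer_lvt_busy(uint32_t lvt)
{
    return (int)((lvt >> 12) & 1u);
}
```

For instance, interrupt ID-number 0x20 in periodic mode with the mask cleared gives 0x00020020.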

22 Timer’s ‘Divide-Configuration’
Register at 0xFEE003E0: bits 31-4 are reserved (=0), bit 2 is 0, and the Divider-Value field occupies bits 3, 1, and 0.
Divider-Value field (bits 3, 1, and 0):
  000 = divide by 2
  001 = divide by 4
  010 = divide by 8
  011 = divide by 16
  100 = divide by 32
  101 = divide by 64
  110 = divide by 128
  111 = divide by 1
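
Since the three field bits are not adjacent in the register, the 3-bit code has to be spread around bit 2. The following sketch maps a divisor to the register value using the table above; the function name is hypothetical.

```c
#include <stdint.h>

/* Returns the Divide Configuration register value for a divisor of
   1, 2, 4, 8, 16, 32, 64 or 128, or -1 for any other divisor. */
int divide_config_value(unsigned divisor)
{
    /* index = 3-bit Divider-Value code, entry = divisor it selects */
    static const unsigned divisors[8] = { 2, 4, 8, 16, 32, 64, 128, 1 };

    for (unsigned code = 0; code < 8; code++)
        if (divisors[code] == divisor)
            /* code bit 2 goes to register bit 3; code bits 1-0 stay put */
            return (int)(((code & 4u) << 1) | (code & 3u));
    return -1;
}
```

Divide by 16 therefore yields 0x3, divide by 32 yields 0x8, and divide by 1 yields 0xB.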

23 Initial and Current Counts
Initial Count Register (read/write): 0xFEE00380
Current Count Register (read-only): 0xFEE00390
When the timer is programmed for ‘periodic’ mode, the Current Count is automatically reloaded from the Initial Count register, then counts down with each CPU bus-cycle, generating an interrupt when it reaches zero.

24 Using the timer’s interrupts
Setup your desired Initial Count value.
Select your desired Divide Configuration.
Setup the APIC-timer’s LVT register with your desired interrupt-ID number and counting mode (‘periodic’ or ‘one-shot’), and clear the LVT register’s ‘Mask’ bit to initiate the automatic countdown operation.
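
Choosing the Initial Count comes down to one division once the bus clock is known. The sketch below is only an arithmetic illustration: the bus frequency is an assumed input that would have to be measured on a real machine, and the function name is hypothetical.

```c
#include <stdint.h>

/* Initial Count giving roughly target_hz interrupts per second in
   'periodic' mode, for a timer clocked at bus_hz divided by divisor.
   Returns 0 if an input is zero or the count does not fit in 32 bits. */
uint32_t timer_initial_count(uint64_t bus_hz, unsigned divisor,
                             unsigned target_hz)
{
    if (divisor == 0 || target_hz == 0)
        return 0;
    uint64_t count = bus_hz / ((uint64_t)divisor * target_hz);
    return count > 0xFFFFFFFFu ? 0 : (uint32_t)count;
}
```

Assuming a 100 MHz bus clock, a divisor of 16 and a target of 100 interrupts per second give an Initial Count of 62500; keeping that count and doubling the divisor to 32 would halve the interrupt frequency.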

25 In-class exercise #1
Run the ‘cpuid.cpp’ Linux application (on our course website) to see if the CPUs in our classroom implement HyperThreading (i.e., multiple logical processors in a cpu).
Then run the ‘mpinfo.cpp’ application, to see if the MP Base Configuration Table has entries for more than one processor.
If both results hold true, then we can write our own multiprocessing software in H235!

26 In-class exercise #2
Run the ‘apictick.s’ demo (on our CS 630 website) to observe the APIC’s ‘periodic’ interrupt-handler drawing ‘T’s onscreen.
It executes for ten-milliseconds (the 8254 is used here to create that timed delay).
Try reprogramming the APIC’s Divider Configuration register, to cut the interrupt frequency in half (or perhaps to double it).

