WTM’13, Prague, April 14
Post-Silicon Debugging of Transactional Memory Tests
Carla Ferreira, João Lourenço {carla.ferreira,
Ophir Friedler, Wisam Kadry, Amir Nahir, Vitali Sokhin {ophirf, wisamk, nahir,
IBM Research, Universidade Nova de Lisboa
Post-Silicon
Post-silicon validation elements:
1. Stimulating the design under test
2. Detecting erroneous behavior
3. Localizing the root cause of the problem
4. Providing a fix
Stimulation
1. Test generation
2. Execution
3. Consistency checking
4. Repeat… Forever!
[Figure: Exerciser Image (Threadmill). Labels: Test Template, Topology, Architectural Model; Generation, Execution, Checking, OS services; Silicon, Accelerator]
Detection: consistency checking
Run the same test-case from the same initial architectural state. Expect the same final architectural state.
Initial State: R0 = 0x1, R1 = 0x2 …
Test-case: ori r10,r0,170; stb r10,0(r6); lbz r11,0(r6) ...
Final State: R0 = 0xA, R1 = 0xB …
Micro-architectural state varies: caches, page misses, pre-fetching, thread priorities.
Detection
And what if two different final states are manifested?
Run 1: Initial State R0 = 0x1, R1 = 0x2 …; ori r10,r0,170; stb r10,0(r6); lbz r11,0(r6) ...; Final State R0 = 0xA, R1 = 0xB …
Run 2: Initial State R0 = 0x1, R1 = 0x2 …; ori r10,r0,170; stb r10,0(r6); lbz r11,0(r6) ...; Final State R0 = 0xC, R1 = 0xB …
MIS-COMPARE: Final State R0 = 0xA, R1 = 0xB … versus Final State R0 = 0xC, R1 = 0xB …
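The consistency check above can be sketched in a few lines of Python. This is a minimal illustration, not the exerciser's actual checker: `find_miscompares`, `run_test`, and `make_flaky_run` are hypothetical names, and the "hardware" is a stand-in function that replays the two final states shown on the slide.

```python
from typing import Callable, Dict, List

State = Dict[str, int]  # architectural resource name -> value


def find_miscompares(run_test: Callable[[State], State],
                     initial: State, passes: int) -> List[str]:
    """Run the same test several times from the same initial state and
    return the sorted names of resources whose final values differ."""
    # Each pass gets its own copy so no run can disturb the initial state.
    finals = [run_test(dict(initial)) for _ in range(passes)]
    reference = finals[0]
    differing = set()
    for final in finals[1:]:
        for name in reference.keys() | final.keys():
            if reference.get(name) != final.get(name):
                differing.add(name)
    return sorted(differing)


def make_flaky_run() -> Callable[[State], State]:
    """Hypothetical stand-in for the design under test: replays the two
    final states from the slide, one per pass."""
    outcomes = iter([{"R0": 0xA, "R1": 0xB}, {"R0": 0xC, "R1": 0xB}])

    def run(_initial: State) -> State:
        return next(outcomes)

    return run


print(find_miscompares(make_flaky_run(), {"R0": 0x1, "R1": 0x2}, passes=2))
# prints ['R0']
```

Only architectural state is compared; micro-architectural differences between passes are expected and are not part of the check.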
Localization approach
1. A test-case that produces a mis-compare is found
2. Fast-forward to that test-case on a software simulator (a.k.a. reference model)
3. Execute the test-case on the reference model instruction by instruction and extract information
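As a rough illustration of step 3, the sketch below single-steps a toy model and records what each instruction reads and writes. Everything here is an assumption made for the example: `Instr`, `TraceEntry`, and `step_and_trace` are hypothetical names, the real reference model is a full simulator, and the byte at 0(r6) is modeled as a single resource called `mem`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Instr:
    text: str                 # mnemonic, for display only
    dest: str                 # resource written
    srcs: Tuple[str, ...]     # resources read
    op: Callable[..., int]    # computes the written value from the reads


@dataclass
class TraceEntry:
    index: int
    text: str
    reads: Dict[str, int]
    write: Tuple[str, int]


def step_and_trace(program: List[Instr], state: Dict[str, int]) -> List[TraceEntry]:
    """Execute one instruction at a time, logging reads and the write."""
    trace = []
    for index, ins in enumerate(program):
        reads = {src: state[src] for src in ins.srcs}
        value = ins.op(*(reads[src] for src in ins.srcs))
        state[ins.dest] = value
        trace.append(TraceEntry(index, ins.text, reads, (ins.dest, value)))
    return trace


# Toy encoding of the three instructions shown on the Detection slides.
program = [
    Instr("ori r10,r0,170", "r10", ("r0",), lambda r0: r0 | 170),
    Instr("stb r10,0(r6)", "mem", ("r10",), lambda r10: r10 & 0xFF),
    Instr("lbz r11,0(r6)", "r11", ("mem",), lambda mem: mem),
]

trace = step_and_trace(program, {"r0": 0x1})
for entry in trace:
    print(entry.index, entry.text, entry.reads, entry.write)
```

The recorded reads and writes are the kind of per-instruction information a later analysis can use to relate instructions to the resources that mis-compared.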
Localization
Reduce the number of resources and instructions that might be the root cause of the mis-compare.
Study the effect of the transactions in the test-case on the final state.
Justification: force the erroneous behaviour on the reference model and re-create the mis-compare results.
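One simple way to narrow down suspects is a backward slice over register and memory dependencies, sketched below. This is an assumed, generic technique chosen for illustration, not necessarily the analysis used by the authors; it ignores transactions entirely, and `suspicious_subset` and the `(dest, srcs)` encoding are hypothetical.

```python
from typing import List, Sequence, Tuple

# Each instruction is encoded as (resource written, resources read).
Instruction = Tuple[str, Tuple[str, ...]]


def suspicious_subset(program: Sequence[Instruction],
                      miscompared: str) -> Tuple[List[int], List[str]]:
    """Walk the program backwards from the mis-compared resource and
    collect the instructions and resources that can influence it."""
    live = {miscompared}        # resources whose producers are still sought
    resources = {miscompared}   # every resource on a dependency path
    suspects = []
    for index in range(len(program) - 1, -1, -1):
        dest, srcs = program[index]
        if dest in live:
            suspects.append(index)
            live.discard(dest)  # this write defines dest; look further back
            live.update(srcs)   # ...for whatever fed the write
            resources.update(srcs)
    return sorted(suspects), sorted(resources)


program = [
    ("r10", ("r0",)),   # ori r10,r0,170
    ("mem", ("r10",)),  # stb r10,0(r6), memory byte modeled as "mem"
    ("r11", ("mem",)),  # lbz r11,0(r6)
    ("r4", ("r3",)),    # hypothetical unrelated instruction
]

print(suspicious_subset(program, "r11"))
# prints ([0, 1, 2], ['mem', 'r0', 'r10', 'r11'])
```

The unrelated instruction at index 3 is excluded, which is the reduction the slide asks for: fewer instructions and resources to inspect by hand.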
Localization
[Figure: graph over resources R1, R2, R3, R4; the marked part is the suspicious instruction subset]
Concluding remarks
Debug automation effectively reduces the debugging effort.
Graph analysis has the potential to automate the localization of suspicious resources and instructions.
Future work:
- Study the impact of escaped stores in transaction aborts
- Experiment with larger (real-world) cases
Questions