Combining Pre-Silicon Verification Brains with Post-Silicon Platform Muscle
Reviewer: Shuo-Ren, Lin, 2012/5/11, ALCom


1 Combining Pre-Silicon Verification Brains with Post-Silicon Platform Muscle
Reviewer: Shuo-Ren, Lin, 2012/5/11, ALCom

2 Abstract
- Post-silicon functional validation challenges
  - Light overhead associated with the testing procedures
  - High-quality validation
- Post-silicon functional exerciser
  - An application
  - Load it onto the system; it generates test-cases, executes them, and checks the results
- Novel solution
  - Prepare partial data that can be incorporated into the exerciser
  - Generate this useful data in advance, during pre-silicon verification

3 Outline
- Introduction
- Solution scheme
- Floating-point
- Address translation
- Memory management
- Conclusions

4 Introduction
- Post-silicon validation
  - High-quality validation
  - Minimize overhead
- Post-silicon exerciser
  - Non-OS
  - Fast and light
  - High validation quality
  - Two conflicting requirements
- Bridge between the pre-silicon and post-silicon worlds
  - Based on a common verification plan and similar languages for test-templates [4]

5 Introduction
- Contributions
  - Enable a post-silicon exerciser (Threadmill) to generate sophisticated stimuli while keeping the exerciser efficient and simple
  - Prepare the input data off the platform and integrate it into the exerciser image
- Three domains
  - Floating-point data operands: FPgen
  - Address translation paths: DeepTrans
  - Memory access management: CSP solver, Genesys-Pro, and X-Gen

6 Solution Scheme
- Off-platform data generation
  - Use well-established pre-silicon techniques and tools to ensure that the generated data is of high quality
- Static build
  - Generate large amounts of data with pre-silicon tools
  - Data can be reused across many different exerciser images
- Dynamic build
  - Must be efficient enough to construct a new exerciser image in a reasonable amount of time
  - Three main roles:
    - Filter data
    - Generate new data based on the available inputs (test template and configuration)
    - Create optimized data structures for efficient retrieval by the exerciser at execution time

7 Floating-Point
- Static build
  - FPgen achieves a high level of coverage for all supported inputs and outputs
  - Takes 24 hours to generate a set of 1000 inputs for 390 floating-point instructions on a single Pentium 6 machine
- Dynamic build
  - For each instruction intended to appear in the test-cases, randomly and uniformly pick a subset of the entries and create a smaller table
- Execution
  - Randomly select operands from the table prepared in the dynamic-build stage
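The dynamic-build and execution steps on this slide can be sketched as follows. This is a minimal illustration, assuming the static table is a dictionary that maps an instruction mnemonic to a list of pre-generated operand tuples. The names `build_dynamic_table` and `pick_operands` and the sample data are hypothetical and are not part of FPgen or Threadmill.

```python
import random

def build_dynamic_table(static_table, instructions, entries_per_instr, rng):
    """Dynamic build: keep a uniform random subset of entries per requested instruction."""
    table = {}
    for instr in instructions:
        entries = static_table[instr]
        k = min(entries_per_instr, len(entries))
        table[instr] = rng.sample(entries, k)  # uniform, without replacement
    return table

def pick_operands(dynamic_table, instr, rng):
    """Execution: randomly select one pre-generated operand set from the small table."""
    return rng.choice(dynamic_table[instr])

# Hypothetical stand-in for the statically generated data: (input_a, input_b) pairs.
static_table = {
    "fadd": [(float(i), float(i) / 2) for i in range(1000)],
    "fmul": [(float(i), -float(i)) for i in range(1000)],
    "fdiv": [(float(i), float(i) + 1) for i in range(1000)],
}

rng = random.Random(0)
dynamic_table = build_dynamic_table(static_table, ["fadd", "fdiv"], 16, rng)
operands = pick_operands(dynamic_table, "fadd", rng)
```

Only the instructions requested by the test template end up in the smaller table, so the run-time selection is a single random index into a short list.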

8 Floating-Point Experiment I
- 10 FP instructions: fadd, fsub, fmul, fdiv, and fsqrt, in both single and double variations
- 12 types of FP number forms: normalized, infinity, …
- 3 operands: 2 inputs and 1 output
- 10 × 12 × 3 = 360 events; 360 − (unreachable events) = 280 different events
- FPgen generates only legal, interesting operand values
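The event count on this slide is a cross product, which the short sketch below enumerates. The twelve number forms are represented only by indices because the slide does not list them all, and the operand labels are hypothetical names chosen for illustration.

```python
from itertools import product

# Five base operations, each in single and double precision: 10 instructions.
base_ops = ["fadd", "fsub", "fmul", "fdiv", "fsqrt"]
instructions = [(op, precision) for op in base_ops for precision in ("single", "double")]

number_forms = range(12)                       # 12 FP number forms (names not listed on the slide)
operands = ("input_1", "input_2", "output")   # hypothetical labels for the 3 operands

# One coverage event per (instruction, number form, operand) combination.
events = list(product(instructions, number_forms, operands))
total_events = len(events)

reachable_events = 280                         # figure quoted on the slide
unreachable_events = total_events - reachable_events
```

With these inputs, `total_events` is 360, so the slide's figure of 280 reachable events implies 80 unreachable combinations.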

9 Floating-Point Experiments II and III
- Trailing zeroes for divide and square root
  - Extending the output format to 52 bits (2*52 events)
- Almost equal (32 events)
- Mixed and FPgen have the same coverage, since the defined coverage models include only legal events

10 Address Translation
- Address translation
  - Support virtual spaces
  - Memory protection
  - Caching mechanisms
- Under Threadmill
  - Invoke the corresponding micro-architectural mechanisms
  - Enable the generation of memory management stimuli

11 Address Translation
- Coverage requirements
  - Produce a set of valid translation paths that cover a large physical memory region
  - Generate translation paths that activate all possible inter-thread and inter-processor memory sharing scenarios
  - Generate all the types of address translation possible for any given translation mode (e.g. different segment and page sizes)
- Requirements related to the representation of translation data
  - Compact and easy to extract
  - Contain translation paths for all the physical memory blocks
  - Possess the required properties (e.g. allow accesses to the same cache line from different threads)

12 Address Translation
- Static build stage
  - Use DeepTrans to construct address translation sets for the entire physical memory
  - Divide the entire physical memory space into primary blocks, and randomly split each primary memory block into one or more secondary blocks
  - DeepTrans generates the address translations that cover an entire primary block, for each primary partition and all its sub-partitions
  - Generating a single set of address translations takes 20 hours on a single Pentium 6 Linux machine
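The partitioning step of the static build can be sketched as below. This is an illustrative sketch only, not DeepTrans itself: the function names, the block sizes, and the rule that secondary blocks are multiples of a minimum size are all assumptions made for the example. It also assumes the memory size is a multiple of the primary block size, and the primary block size a multiple of the minimum secondary size.

```python
import random

def split_block(start, size, min_size, rng):
    """Randomly split [start, start + size) into one or more contiguous secondary blocks."""
    blocks = []
    offset = start
    end = start + size
    while offset < end:
        remaining_units = (end - offset) // min_size
        length = rng.randint(1, remaining_units) * min_size  # inclusive bounds
        blocks.append((offset, length))
        offset += length
    return blocks

def partition_memory(mem_size, primary_size, min_secondary, rng):
    """Map each primary block's start address to its list of (start, length) secondary blocks."""
    return {
        primary_start: split_block(primary_start, primary_size, min_secondary, rng)
        for primary_start in range(0, mem_size, primary_size)
    }

# Hypothetical sizes: 1 MiB of memory, 64 KiB primary blocks, 4 KiB minimum secondary block.
partitions = partition_memory(1 << 20, 1 << 16, 1 << 12, random.Random(0))
```

Each primary block is covered exactly by its secondary blocks, so a translation set built per primary partition and its sub-partitions covers the whole memory.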

13 Address Translation
- Dynamic build stage
  - Needs additional information (test template and configuration)
  - Filter out the partitions that do not possess any of the desired memory properties
  - Reduce the remaining partitions
  - Build the address translation data structure
    - Contains the memory mapping data for every secondary memory block
  - Build look-up tables so that Threadmill can easily retrieve the desired memory blocks at execution time
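The filter, reduce, and look-up steps of the dynamic build can be sketched as follows. The partition records, property names, and the `dynamic_build` function are hypothetical. The reduction policy (uniform random sampling down to a maximum count) is an assumption, since the slide does not say how partitions are reduced.

```python
import random
from collections import defaultdict

def dynamic_build(partitions, desired, max_partitions, rng):
    """Filter partitions, reduce them, and index their secondary blocks by property."""
    # 1. Filter: drop partitions that have none of the desired memory properties.
    kept = [p for p in partitions if p["properties"] & desired]
    # 2. Reduce: cap the number of partitions (assumed policy: uniform sampling).
    if len(kept) > max_partitions:
        kept = rng.sample(kept, max_partitions)
    # 3. Look-up table: property -> secondary blocks that provide it.
    lookup = defaultdict(list)
    for p in kept:
        for prop in p["properties"] & desired:
            lookup[prop].extend(p["secondary_blocks"])
    return kept, dict(lookup)

# Hypothetical pre-generated partitions with (start, length) secondary blocks.
partitions = [
    {"id": 0, "properties": {"cacheable", "shared"},
     "secondary_blocks": [(0x0000, 0x1000), (0x1000, 0x3000)]},
    {"id": 1, "properties": {"non_cacheable"},
     "secondary_blocks": [(0x4000, 0x4000)]},
    {"id": 2, "properties": {"cacheable"},
     "secondary_blocks": [(0x8000, 0x8000)]},
]

kept, lookup = dynamic_build(partitions, {"cacheable", "shared"}, 8, random.Random(0))
```

At execution time, retrieving a block with a given property is then a dictionary lookup followed by a random choice from a list.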

14 Address Translation Experiment
- Threadmill prefers
  - Cacheable pages, since they can stress the HW caches
  - Small pages, because of memory management flexibility
- Page sizes: small: 4k and 6k, medium: 1M, large: 8M, huge: 8G
- Collisions: a physical page covered by a translation path with a larger page size contains a physical page covered by another translation path with a smaller page size
[Figure: Pre-generated translation collisions (no bias)]
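The collision definition on this slide reduces to a containment test between two physical pages of different sizes. The sketch below assumes each translation path is summarized as a `(physical_base, page_size)` pair, a representation chosen for this example rather than taken from DeepTrans or Threadmill.

```python
def collides(path_a, path_b):
    """Return True if the page with the larger size fully contains the smaller page."""
    (base_a, size_a), (base_b, size_b) = path_a, path_b
    if size_a == size_b:
        return False  # the definition requires two different page sizes
    if size_a > size_b:
        big, small = path_a, path_b
    else:
        big, small = path_b, path_a
    big_base, big_size = big
    small_base, small_size = small
    return big_base <= small_base and small_base + small_size <= big_base + big_size

medium_page = (0x000000, 1 << 20)   # 1M page at physical address 0
small_inside = (0x003000, 1 << 12)  # 4k page inside the 1M page
small_outside = (0x100000, 1 << 12) # 4k page just past the 1M page

inside_result = collides(medium_page, small_inside)
outside_result = collides(small_outside, medium_page)
```

Here `inside_result` is `True` and `outside_result` is `False`. The argument order does not matter, because the function picks the larger page itself.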

15 Memory Management
- Targets multi-threaded systems
- Collision: different threads access the same memory location
  - Tend to generate test-cases that can trigger memory access collisions → increases the bug detection potential
- Threadmill uses multi-pass consistency checking
  - Run a test-case multiple times with the same resources and ensure the outputs are the same each time
  - Write-write and write-read collisions are not checked by Threadmill but can still be generated
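Multi-pass consistency checking, as described on this slide, can be sketched in a few lines. This is a simplified model in which a test-case is a Python function from an initial state to a final state. The function names and the "flaky" test-case that imitates a hardware bug are hypothetical.

```python
def multi_pass_check(run_test_case, initial_state, passes=3):
    """Run the same test-case several times from the same state; all results must match."""
    reference = run_test_case(dict(initial_state))  # first pass is the reference
    return all(
        run_test_case(dict(initial_state)) == reference
        for _ in range(passes - 1)
    )

def deterministic_case(state):
    """A well-behaved test-case: the result depends only on the initial state."""
    state["r1"] = state["r0"] + 1
    return state

def make_flaky_case():
    """A test-case whose result changes between passes, standing in for a design bug."""
    calls = {"n": 0}
    def flaky(state):
        calls["n"] += 1
        state["r1"] = state["r0"] + calls["n"]
        return state
    return flaky

consistent = multi_pass_check(deterministic_case, {"r0": 1})
inconsistent = multi_pass_check(make_flaky_case(), {"r0": 1})
```

No reference model is needed: the check only compares the passes against each other, which is what keeps the exerciser light.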

16 Memory Management
- To stress the design → place the intervals at interesting locations
  - Cache-line or page/segment crossing
  - Memory having certain attributes, e.g. non-cacheable memory or memory obeying different consistency rules
  - Various memory affinities: memory located on a different chip
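Whether an interval sits at one of the "crossing" locations listed above can be tested with integer division. The cache-line and page sizes below are assumed values for illustration and are not taken from the slides.

```python
CACHE_LINE = 128   # assumed cache-line size in bytes
PAGE = 4096        # assumed page size in bytes

def crosses_boundary(start, length, boundary):
    """True if [start, start + length) spans two or more boundary-aligned units."""
    first_unit = start // boundary
    last_unit = (start + length - 1) // boundary
    return first_unit != last_unit

line_crossing = crosses_boundary(120, 16, CACHE_LINE)     # bytes 120..135 span two lines
line_aligned = crosses_boundary(0, 128, CACHE_LINE)       # exactly one full line
page_crossing = crosses_boundary(PAGE - 4, 8, PAGE)       # straddles a page boundary
```

A predicate like this could serve as one of the preferences that steer interval placement toward interesting areas.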

17 Memory Management
[Figure: Threadmill’s high-level architecture, showing the dynamic build stage and run time]

18 Memory Management
- CSP solver
  - Runs in the dynamic-build stage → should be efficient
  - Input: requirements, test-template, and system configuration
  - A pair of CSP variables: interval start and interval length
- Generates
  - Memory for the code and data areas
  - Memory for test-cases
  - Memory accessed by the generated load/store instructions
- Hard constraints
  - All intervals are disjoint
  - All intervals reside in the available memory space
  - Others (e.g. user-defined memory allocation requests)
- Soft constraints → for high-quality test-case generation
  - Direct interval allocation to interesting areas
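The interplay of hard and soft constraints can be illustrated with a small randomized allocator. This is not the CSP solver used by Threadmill: it swaps in simple rejection sampling, and the function name, the `bias` parameter, and the boundary size are assumptions made for the example. Hard constraints are always enforced. The soft constraint is applied with probability `bias` by moving the interval so that it straddles a boundary.

```python
import random

def allocate_intervals(mem_size, lengths, rng, boundary=128, bias=0.7, max_tries=10_000):
    """Place one (start, length) interval per requested length."""
    placed = []
    for length in lengths:
        for _ in range(max_tries):
            start = rng.randrange(0, mem_size - length + 1)
            if rng.random() < bias:
                # Soft constraint: shift the interval so it straddles the next boundary.
                next_line = (start // boundary + 1) * boundary
                start = next_line - max(1, length // 2)
            # Hard constraint: the interval resides in the available memory space.
            if start < 0 or start + length > mem_size:
                continue
            # Hard constraint: the interval is disjoint from all placed intervals.
            if all(start + length <= s or s + l <= start for s, l in placed):
                placed.append((start, length))
                break
        else:
            raise RuntimeError("no placement found for interval of length %d" % length)
    return placed

intervals = allocate_intervals(1 << 16, [64, 256, 16, 1024], random.Random(1))
```

A candidate that violates a hard constraint is discarded and retried, whereas a candidate that merely misses the soft preference is still accepted.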

19 Memory Management
- Three ownership types
  - Owned interval: only the owner thread can write
  - Read-only interval: all threads can read
  - Unowned interval: all threads can access
- Memory access
  - Cacheable
  - Non-cacheable
- Construct a primary look-up table for fast and simple random choice
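A look-up table of the kind mentioned on this slide could be keyed by ownership type and cacheability, as sketched below. The interval records and the function names are hypothetical. The write rule follows the slide: a thread may write to intervals it owns and to unowned intervals, but not to read-only ones.

```python
import random
from collections import defaultdict

def build_lookup(intervals):
    """Index intervals by (ownership, cacheable) so that selection is a list lookup."""
    table = defaultdict(list)
    for interval in intervals:
        table[(interval["ownership"], interval["cacheable"])].append(interval)
    return table

def pick_writable(table, thread, cacheable, rng):
    """Randomly choose an interval that the given thread is allowed to write."""
    candidates = [iv for iv in table[("owned", cacheable)] if iv["owner"] == thread]
    candidates += table[("unowned", cacheable)]
    return rng.choice(candidates) if candidates else None

# Hypothetical intervals produced by the dynamic build.
intervals = [
    {"start": 0x1000, "ownership": "owned", "owner": 0, "cacheable": True},
    {"start": 0x2000, "ownership": "owned", "owner": 1, "cacheable": True},
    {"start": 0x3000, "ownership": "read-only", "owner": None, "cacheable": True},
    {"start": 0x4000, "ownership": "unowned", "owner": None, "cacheable": False},
]

rng = random.Random(0)
table = build_lookup(intervals)
choice_thread0 = pick_writable(table, 0, True, rng)
choice_thread1 = pick_writable(table, 1, False, rng)
choice_thread2 = pick_writable(table, 2, True, rng)
```

In this data set, thread 0 can only get its own cacheable interval, thread 1 gets the unowned non-cacheable interval, and thread 2 has no writable cacheable interval, so its result is `None`.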

20 Memory Management Experiment
- 3 configurations, 8G memory, 10 intervals per ownership type, 50 load/store instructions
- First experiment
  - Randomly allocated and uniformly distributed intervals
- Second experiment
  - Apply the CSP solver


