Background
- Brand naming (WB GL 2.20, Guidance Note No. 4)
- AMD complaints
- The EU opened an inquiry (October 2004) into whether France, the Netherlands, Sweden, and Finland illegally favor the processor maker Intel
- The U.S. Office of Management and Budget reiterated (April 2005) to all federal purchasers that brand-name specifications are forbidden and that specifications should not be tied to a single manufacturer
Background
- Five EU member states have issued guidelines suggesting the use of benchmarks instead of trademarks and technical features such as clock rate; more guidelines are being developed
- The U.S. OMB recommended that, rather than issue brand-name specifications for microprocessors, agencies should either: 1) articulate a benchmark for performance; or 2) specify the requirements for applications and interoperability
- Benchmarks for microprocessors can be specific to functions such as Internet content creation, office applications, or mail servers
- Benchmarks may also measure the overall performance of computers
Benchmarks
Each benchmark tries to answer the question: “What computer should I buy?” Clearly, the answer is “The system that does the job with the lowest cost-of-ownership.” Cost-of-ownership includes project risks, programming costs, operations costs, hardware costs, and software costs. Project risks, programming costs, and operations costs are difficult to quantify. In contrast, computer performance can be quantified and compared.
Benchmarks
- Domain specific
- No single metric possible
- The more general the benchmark, the less useful it is for anything in particular
- A benchmark is a distillation of the essential attributes of a workload
BAPCo Consortium
BAPCo (Business Applications Performance Corporation) is a non-profit consortium whose charter is to develop and distribute a set of objective performance benchmarks based on popular computer applications and industry-standard operating systems.
BAPCo members as of September 29, 2005 (source accessed on September 29, 2005)
SYSmark 2004 SE concept
BAPCo first identifies the business usage categories of personal computers, then determines the types and characteristics of the output that users in those categories create. The user interactions that produce this output are converted into instructions (or “scripts”) and integrated into BAPCo’s automated benchmarking environment. The resulting candidate workloads are then selected for final placement in the benchmark suite so as to arrive at a balanced workload. A key participant in the development process is the application expert provided by member companies. These application experts have at least five years of professional experience working with the applications.
Identifying Usage Categories
For SYSmark 2004 SE, BAPCo identified two distinct business usage categories:
- Internet Content Creation: tasks for creating content for a website with an enhanced user experience: web pages with text, images, video, and animations
- Office Productivity: tasks common to business users: e-mail processing, preparing documents and presentations, data management, and data analysis
User Output
For SYSmark 2004 SE, the following types of output were identified:
- Internet Content Creation: digital images, digital video, animation, encoded media, web pages, and 3-D rendered images
- Office Productivity: text documents, spreadsheets, presentations, e-mails, databases, transcribed documents, virus-free documents, compressed files, browsed files, and portable documents
SYSmark 2004 SE
The fundamental performance unit in SYSmark 2004 SE is “response time”. In the context of SYSmark 2004 SE, response time is the time the computer takes to complete a task that has been initiated by the automated script. SYSmark 2004 SE adds up the individual response times of all operations within a group (e.g. 2D creation) and uses the total response time to compare that group across two systems, with the calibration system as the base.
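The summation and comparison described above can be sketched in a few lines of Python. This is only an illustration: the exact rating formula used by SYSmark 2004 SE is not given here, so the ratio-to-calibration scaling, the function names, and the sample timings are all assumptions.

```python
# Illustrative sketch only: the real SYSmark 2004 SE rating formula is not
# stated in this deck, so the scaling below is an assumption.

def group_response_time(response_times):
    """Total response time (seconds) of all operations in one group."""
    return sum(response_times)


def relative_performance(test_times, calibration_times, base=100.0):
    """Compare one group on two systems, with the calibration system as base.

    A result above `base` means the tested system completed the group faster
    than the calibration system (hypothetical scaling, not BAPCo's formula).
    """
    test_total = group_response_time(test_times)
    calibration_total = group_response_time(calibration_times)
    if test_total <= 0:
        raise ValueError("total response time must be positive")
    return base * calibration_total / test_total


# Hypothetical response times (seconds) for the operations of one group.
calibration = [4.0, 6.0, 10.0]  # calibration system: 20 s in total
candidate = [2.0, 3.0, 5.0]     # tested system: 10 s in total

print(relative_performance(candidate, calibration))  # 200.0
```

Under this assumed scaling, a system that needs half the total response time of the calibration system scores twice the base value.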
Benchmarking examples
1. Before: Intel Pentium IV, 3.0 GHz, 800 MHz FSB, 1 MB Cache
   After: x86 microprocessor with a performance giving a minimum score of 193 under the SYSmark 2004 rating benchmark
2. Before: Intel Pentium 4, 3 GHz or equivalent
   After: x86 microprocessor with the following performance scores:
   - between 165 and 205 under the SYSmark 2004 overall office productivity benchmark
   - between 200 and 235 under the SYSmark 2004 overall internet content creation benchmark
   - between 180 and 220 under the SYSmark 2004 rating
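A benchmark-based specification like the two examples above can be checked mechanically against the scores a bidder submits. The following Python sketch shows one way to do that. The class and function names are hypothetical, the offered scores are made up, and treating "between" as an inclusive range is an assumption.

```python
# Illustrative sketch: names are hypothetical and not part of any Bank
# document or BAPCo tool. Thresholds come from the two examples above;
# "between X and Y" is assumed to include both end points.

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScoreRequirement:
    benchmark: str
    minimum: float
    maximum: Optional[float] = None  # None means "no upper bound"

    def is_met_by(self, score):
        if score < self.minimum:
            return False
        return self.maximum is None or score <= self.maximum


EXAMPLE_1 = [ScoreRequirement("SYSmark 2004 rating", 193)]
EXAMPLE_2 = [
    ScoreRequirement("SYSmark 2004 office productivity", 165, 205),
    ScoreRequirement("SYSmark 2004 internet content creation", 200, 235),
    ScoreRequirement("SYSmark 2004 rating", 180, 220),
]


def unmet_requirements(requirements, offered_scores):
    """Return the benchmarks whose requirement the offer fails or omits."""
    unmet = []
    for req in requirements:
        score = offered_scores.get(req.benchmark)
        if score is None or not req.is_met_by(score):
            unmet.append(req.benchmark)
    return unmet


# Made-up scores for a hypothetical offer.
offer = {
    "SYSmark 2004 office productivity": 180,
    "SYSmark 2004 internet content creation": 190,  # below the 200 minimum
    "SYSmark 2004 rating": 195,
}

print(unmet_requirements(EXAMPLE_1, offer))  # []
print(unmet_requirements(EXAMPLE_2, offer))  # ['SYSmark 2004 internet content creation']
```

The check never looks at the processor's brand, only at the published scores, which is the point of replacing a brand name with a benchmark requirement.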
Client Benchmark Summary
- Choose benchmarks that measure relevant usage models
  - Desktop: SYSmark 2004 SE (productivity & content creation), SYSmark 2007 Preview
  - Mobile: MobileMark 2005 (mobile performance & battery life), MobileMark 2007
- Consider cost of implementation
- Use benchmarks from industry consortia
- Use of benchmarks doesn’t necessarily solve all problems
Benchmarking servers
- More complex issue
- Selection of benchmarking entities: SPEC, TPC, SAP, ORACLE
SPEC
The Standard Performance Evaluation Corporation (SPEC) is a non-profit corporation formed to establish, maintain, and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers. SPEC develops benchmark suites and also reviews and publishes submitted results from its member organizations and other benchmark licensees.
SPEC members and associates
SPEC Members: 3DLabs * Acer Inc. * Advanced Micro Devices * Apple Computer, Inc. * ATI Research * Azul Systems, Inc. * BEA Systems * Borland * Bull S.A. * CommuniGate Systems * Dell * EMC * Exanet * Fabric7 Systems, Inc. * Freescale Semiconductor, Inc. * Fujitsu Limited * Fujitsu Siemens * Hewlett-Packard * Hitachi Data Systems * Hitachi Ltd. * IBM * Intel * ION Computer Systems * JBoss * Microsoft * Mirapoint * NEC - Japan * Network Appliance * Novell * NVIDIA * Openwave Systems * Oracle * P.A. Semi * Panasas * PathScale * The Portland Group * S3 Graphics Co., Ltd. * SAP AG * SGI * Sun Microsystems * Super Micro Computer, Inc. * Sybase * Symantec Corporation * Unisys * Verisign * Zeus Technology
SPEC Associates: California Institute of Technology * Center for Scientific Computing (CSC) * Defence Science and Technology Organisation - Stirling * Duke University * JAIST * Kyushu University * Leibniz Rechenzentrum - Germany * National University of Singapore * New South Wales Department of Education and Training * Purdue University * Queen's University * Rightmark * Stanford University * Technical University of Darmstadt * Texas A&M University * Tsinghua University * University of Aizu - Japan * University of California - Berkeley * University of Central Florida * University of Illinois - NCSA * University of Maryland * University of Modena * University of Nebraska, Lincoln * University of New Mexico * University of Pavia * University of Stuttgart * University of Texas at Austin * University of Texas at El Paso * University of Tsukuba * University of Waterloo * VA Austin Automation Center
SPEC Supporting Members: EP Network Storage Performance Lab * SuSE Linux AG
SPEC benchmarks
- CPU
- Graphics/Applications
- HPC/OMP
- Java Client/Server
- Mail Servers
- Network File System
- Web Servers
SPEC CPU benchmark
SPEC CPU2000 V1.3
Technology evolves at a breakneck pace. SPEC CPU2000 is the next-generation industry-standardized CPU-intensive benchmark suite. SPEC designed CPU2000 to provide a comparative measure of compute-intensive performance across the widest practical range of hardware. The implementation resulted in source code benchmarks developed from real user applications. These benchmarks measure the performance of the processor, memory, and compiler on the tested system.
Typical excuses from PIUs
- We do not know what it is
- It is too complex to use
- It is technically unjustified: our experts have different views
- We use the clause “or equivalent performance” (… but then reject anything but Intel)
- Intel-based computers are more stable (whatever that means)
- And finally: where is it written in Bank documents that benchmarks must be used?
Bank activities
- AMD presentation (June 2005)
- Intel presentation (September 2005)
- White Paper, November 2007: Technical Specifications in the Public Procurement of Computers
- IT thematic group discussions
- Suppliers’ comments
- WB Technical guidance (?)
Benchmarking in IT
Brand names
Fiduciary Forum 2008
Washington DC, March 2008