Presentation on theme: "SAS Performance on SPARC T4 + Solaris: Customer experience performance study from the U.S. Bureau of Labor Statistics Edmond Cheng, Economist, Bureau of."— Presentation transcript:
2 Bureau of Labor Statistics
The Bureau of Labor Statistics of the U.S. Department of Labor is the principal Federal agency responsible for measuring labor market activity, working conditions, and price changes in the economy. Its mission is to collect, analyze, and disseminate essential economic information to support public and private decision-making. As an independent statistical agency, BLS serves its diverse user communities by providing products and services that are objective, timely, accurate, and relevant.
History
The BLS has provided essential economic information to support public and private decision-making since its founding. That was long before there were calculators and printers; statisticians had to process surveys and calculate statistics by hand.
Vision
The Bureau of Labor Statistics will meet the information needs of a rapidly changing U.S. and global economy by continuously improving its products and services, investing in its work force, and modernizing its business processes.
3 Industry Employment
Edmond and Steven are both members of the Division of Industry Data Development in BLS, which is tasked with developing and maintaining the IT system and producing several major industry employment statistics products.
Notably:
- The monthly payroll number you hear on the first Friday of each month
- The monthly real earnings report
- The monthly job openings and labor turnover statistics
- The annual green jobs number
Why are these statistics so important? They are a measure of current economic conditions. The change in employment is a key indicator of the state of the economy.
Congress uses CES data to help make policy decisions. The Fed, the Bureau of Economic Analysis, and other statistical agencies use BLS employment, hours, and earnings data as inputs into their models. State and local governments use these data to measure the economic health of states and areas and to guide monetary policy decisions. Businesses may use CES data to negotiate contracts, select building sites, forecast market demand for their products, and develop marketing strategies. There are also other uses in academia, labor organizations, and research.
4 Operation and Business Process
- Survey Frame & Sample Design
- Questionnaire Design & Testing
- Data Collection & Cycle Management
- Data Processing & Validation / Micro Editing
- Estimation, Data Tabulation & Macro Editing
- Macro modeling, seasonal adjustment
- Data Dissemination / Publication
Maintaining the operation that produces these economic statistics requires tremendous planning, coordination, budgeting, human effort, and IT resources.
Take the monthly payroll survey as an example. Data collection sends, collects, and processes surveys from over 400,000 establishments each month, from all over the U.S. and through different collection modes. In a relatively short amount of time, the data have to be validated, edited, and reconciled before they can go into estimation. The data then feed into macro modeling, tabulation, editing, and adjustment before they become relevant and reliable estimates. Once all that is complete, the data are reported to the program analysts for review and verification. Finally, the official statistics are passed to the BLS publication office for public press release.
5 SAS Solutions and Others
SAS software and solutions for data processing, statistical analysis, reporting, and data warehousing.
- SAS Base 9.2
- SAS AppDev Studio
- SAS/ACCESS
- SAS/Connect
- SAS/ETS
- SAS/Graph
- SAS/IML
- SAS/IntrNet
- SAS/Share
- SAS/STAT
SAS® Business Intelligence
- SAS Enterprise Guide 4.3
- SAS Enterprise Guide
- BI Server
- Data Integration Server
- Metadata Server
- Microsoft Office Integration
Others
For example:
- SAS Base: ETL, customized statistical models, functions, reporting
- SAS AppDev Studio: Java-based application
- SAS/ACCESS, SAS/Connect: access to different databases and platforms
- SAS/IntrNet: web-client application
- SAS/ETS, SAS/IML, SAS/STAT: statistical needs
6 Oracle Servers
SPARC T4-2 SERVER
Processor:
- Eight-core 2.85 GHz SPARC T4 processor
- Two processors per system, maximum 128 threads
- Eight floating-point units
- Dual multithreaded 10 GbE PCI integrated onto chip
Server platforms chosen to run SAS:
- Long history of using UNIX servers and the Solaris OS for the production system
- Multi-user, multi-tasking, resource-sharing
- Secure, expandable, manageable, performant
- Compatible with the software and other needs of our office
Sun Fire V, Sun Fire V, Sun Fire V, Sun E3500, Sun Fire T, SPARC M, SPARC M4000, SPARC T4-2 (certification)
7 Performance Test Servers
Baselines
Server Model: Linux Lab | Linux HP Blade | Sun Fire T5240 | SPARC Enterprise M3000 | SPARC T4-2
Operating System: Red Hat Enterprise Linux Server release 6.3 (Santiago) | Solaris 10
Processor: Intel Xeon E5430 CPU | Intel Xeon X5550 | UltraSPARC T2+ | SPARC64 VII | SPARC T4
Specs: 2 CPUs, 2.66 GHz, quad-core | 2 CPUs, 1.2 GHz, 6-core | 1 CPU, 2.75 GHz, quad-core | 2 CPUs, 2.85 GHz, 8-core
Thread: 896128
RAM: 14GB | 16GB | 32GB | 128GB
SAS Version: 126.96.36.199.3
A summary of the hardware configurations and SAS installations.
We want to know how SPARC T4-2 performance compares to our existing UNIX SPARC servers, as well as to the Linux servers we have in the lab.
All the SAS software and servers were configured similarly. We then ran benchmarking tests using identical production jobs selected from the current SAS system.
The key points are:
- Comparable results between different configurations
- How performance might positively or negatively affect SAS users and the production of timely, accurate statistics
[Extra Information]
Other miscellaneous test setup information:
- All SAS jobs were restricted to 1024 MB of memory using the sasv9.cfg file.
- No solid-state drives were used on any server for this testing.
- Solaris server file systems are ZFS and Linux server file systems are ext4.
8 SAS DATA and PROC Steps
Test #1: A quick performance check
The first performance test is a self-contained program using the ZIPCODE database available with all SAS/BASE installations. The program runs common PROC procedures used in most offices. The CPU time and real time are recorded. This gives a quick look at how the test servers measure up against results recorded from previous testing.
… Relevant result highlights…
Database attributes
Name: SAS ZIPCODE
Size: GB
Number of obs: millions
Number of cols: 19
Test Setup
Duplicate the SASHELP ZIPCODE database 500 times. Run the sort procedure, calculate summary means, and run a regression.
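The distinction between CPU time and real (elapsed) time that Test #1 records can be illustrated with a small timing harness. This is a minimal sketch in Python, not the SAS program used in the study: the function name `time_job` and the stand-in job are hypothetical, and a short CPU-bound Python command takes the place of a SAS batch job so the example runs anywhere Python is installed.

```python
import os
import subprocess
import sys
import time


def time_job(cmd):
    """Run cmd to completion and return (real_seconds, cpu_seconds) for the child."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    real = time.perf_counter() - start
    after = os.times()
    # CPU time consumed by finished child processes: user time plus system time.
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return real, cpu


# Hypothetical stand-in for a SAS batch job: a short CPU-bound Python loop.
STAND_IN_JOB = [sys.executable, "-c", "sum(i * i for i in range(2_000_000))"]

if __name__ == "__main__":
    real, cpu = time_job(STAND_IN_JOB)
    print(f"real time {real:.2f}s, CPU time {cpu:.2f}s")
```

Real time that is much larger than CPU time usually points to waiting on I/O or on other jobs rather than to computation, which is why both figures are worth recording. Child CPU times from `os.times()` are reported on Unix-like systems; on Windows they are zero.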
9 Single-Threading Processing
Test #2: Single production job (single-thread)
The second test performs record linkage using the BLS Establishment Longitude Database. The program runs through a series of complex logic steps over 13 consecutive quarters for about 8.5 million establishments. This test is taken from a production job. It is a close simulation of the typical SAS production work that runs in the office.
…Highlight relevant results…
Database attributes
Name: Longitude Database
Size: x 1.0 GB
Number of obs: millions
Number of cols: 41
Test Setup
Merge 13 databases by specified linkage rules. Run logic procedures and mathematical computations to produce a final database table.
10 Multi-Threading Processing
Test #3: Four production jobs (multi-threads)
The setup for Test #3 prepares four identical copies of the Test #2 program and then starts all four SAS jobs simultaneously. The results show how each server performs when running multiple concurrent threads.
…Highlight relevant results…
Database attributes
Name: Longitude Database
Size: x 1.0 GB
Number of obs: millions
Number of cols: 41
Test Setup
Running four Test #2 programs at the same time.
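The Test #3 pattern of starting several identical jobs at once and timing each one can be sketched as follows. This is an illustrative Python harness under stated assumptions, not the launcher used in the study: `timed_run`, `run_concurrently`, and the stand-in job are hypothetical names, and a short Python command again stands in for the SAS job.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for one SAS batch job: a short CPU-bound Python loop.
STAND_IN_JOB = [sys.executable, "-c", "sum(i * i for i in range(1_000_000))"]


def timed_run(cmd):
    """Run one job to completion and return its elapsed (real) time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def run_concurrently(cmd, copies=4):
    """Start `copies` identical jobs together; return (per-job times, total time)."""
    wall_start = time.perf_counter()
    # One worker thread per job, so all jobs are launched without queueing.
    with ThreadPoolExecutor(max_workers=copies) as pool:
        per_job = list(pool.map(timed_run, [cmd] * copies))
    total = time.perf_counter() - wall_start
    return per_job, total


if __name__ == "__main__":
    per_job, total = run_concurrently(STAND_IN_JOB, copies=4)
    for i, seconds in enumerate(per_job, start=1):
        print(f"job {i}: {seconds:.2f}s")
    print(f"all jobs finished in {total:.2f}s")
```

Comparing the per-job times here with the time of a single job run alone gives a rough measure of how much each job slows down under concurrency, which is the kind of comparison Test #3 is designed to make across servers.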
11 PROC IML Statistical Modeling
Test #4: Single and multiple statistical procedures
The final test runs a set of SAS IML statistical procedures that perform a combination of matrix algebra, statistical modeling, and sample and estimate replications. We know these processes are both CPU- and memory-intensive. Here are the results. Remark: the 'eight threads' test was not performed on the M3000.
…Highlight relevant results…
Database attributes
Name: None
Size: N/A
Number of obs: N/A
Number of cols: N/A
Test Setup
Running a SAS/IML PROC IML program in a single thread and in eight concurrent threads.
12 Contacts
Edmond Cheng
U.S. Bureau of Labor Statistics
2 Massachusetts Avenue, NE
Washington, DC 20212

Steven Holmes
U.S. Bureau of Labor Statistics
2 Massachusetts Avenue, NE
Washington, DC 20212

Any opinions expressed in this paper are those of the author and do not constitute policy of the Bureau of Labor Statistics.