Presentation on theme: "1 Architectural Musings Rethinking Computer Systems Architecture Christopher Vick June 3, 2012."— Presentation transcript:
1 Architectural Musings Rethinking Computer Systems Architecture Christopher Vick June 3, 2012
2 Vision Talk Mobile computing and current technologies fundamentally change key parameters and constraints for computer system architecture Vast new opportunities for research of great interest to and great relevance for industry Introduction
3 Outline Computer System Architecture Then (Circa 1970) Scarce Resources & Bottlenecks Optimizations Now (Mobile Computing Platforms) Scarce Resources & Bottlenecks Optimizations? Qualcomm Research Questions?
5 Computer System Architecture Hardware The 5 classic components (Patterson & Hennessy) Input, Output, Memory, Datapath, Control Software System Virtual Machine (Hypervisor, VM, or VMM) Operating System Compilers & Tools Definitions The way components fit together The arrangement of the various devices in a complete computer system or network The instruction set plus a model of the execution of the instruction set (Amdahl et al) Computer System Architecture The selection and combination of hardware and software components to assemble an effective computer system
7 Effective An optimization problem Many variables Selection of hardware/software components Selection of interfaces/interconnects Many constraints Physical, sociological, technical & cost constraints Scarce Resources and Bottlenecks Maximize utilization of scarce resources Minimize impact of bottlenecks
9 Scarce Resources CPU Cycles CPUs expensive Slow clock rates Memory Locations Random Access Memory expensive Address/Data paths into CPU expensive Skilled Programmers Relatively new discipline Poor language and tools support
10 Bottlenecks Programmer Productivity Software development slow and expensive Low level programming paradigms Memory Latency RAM latency gated overall speed (~2-3 MHz) Small RAM backed by vastly slower storage I/O Bandwidth Limited CPU connectivity Crude communication mechanisms
11 Optimizations Time Sharing Effective sharing of limited resource Virtual Memory Effective sharing, and backing with cheaper alternative Hardware Improvements Smaller features provide more resource and faster clock Large Scale Integration Better signaling to improve bandwidth High Level Programming Languages Broadens productive programmer community Abstracts away some hardware complexity
12 Examples Digital PDP-11 16-bit address space Orthogonal instruction set Memory mapped I/O Unix, DOS, many others IBM System/370 24-bit address space Virtual Memory MVS, VM/370, DOS/VS Backward compatibility with System/360
14 Scarce Resources Energy Fixed Energy Budget for mobile devices Thermal issues at all scales Tradeoff between performance and energy Shrinks no longer significantly improving consumption Memory Bandwidth Providing bandwidth is expensive Memory interconnect consumes significant energy
15 Bottlenecks Memory Latency Increasing gap between CPU speed and DRAM latency Physical distance to DRAM devices a factor Concurrency Shortage of programmers who can handle this Inadequate language/tools support I/O Bandwidth/Latency Wireless bandwidth lower than wired Consumes large amounts of energy
16 Example HTC One Processor: 1.5 GHz Dual Core Qualcomm MSM8960 OS: Android 4.0 (ICS) Memory RAM: 1 GB DDR2 Memory Storage: 16 GB onboard storage Display: 4.7" HD super LCD 1280 x 720 Network: LTE CAT3 - DL 100 Mbps / UL 50 Mbps LTE: 700/AWS WCDMA: 2100/1900/AWS/850 EDGE: 850/900/1800/1900 Battery: 1800 mAh Camera (Main): 8 MP, f/2.0, BSI, 1080p HD Video (Front): 1.3 MP with 720p video Dimensions: 134.8 x 69.9 x 8.9 mm This is a General Purpose Computer!
17 Optimizations? Multi-core Aggressive addition of cores and threads Hardware concurrency outstripping software New Concurrent Programming Models/Tools? Memory Subsystem Significant contributor to total energy consumption Adding bandwidth is expensive New technologies addressing some energy issues Wireless bandwidth enhancements (LTE Advanced, etc.) Solutions from desktop/server or embedded worlds may not directly apply in mobile space!
18 Memory System Energy Retaining data (one second) DRAM: ~1-10 pJ/bit self-refresh SRAM: 1200+ pJ/bit, and rising over time [ITRS 2009] 4 pJ/bit (45 nm LP, standby) [Barasinski et al., ESSCIRC 08] Flash, PCM, STT RAM…: Zero! Moving Data 32-bit value: Recompute: 60 pJ (Razor) Send 1 mm: 10 pJ Retain in cache for 1 ms: 38 pJ Retain in DRAM for 1 second: 32+ pJ
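The 32-bit figures on this slide appear consistent with the per-bit retention numbers: 1200 pJ/bit per second over 32 bits for 1 ms gives about 38 pJ, and the low end of the DRAM range (1 pJ/bit) over 32 bits for one second gives 32 pJ. The sketch below reproduces that arithmetic. The constant and function names are illustrative, and treating the SRAM figure as a constant per-second rate is an assumption, not something the slide states.

```python
# Figures quoted on the slide (pJ = picojoules).
SRAM_RETAIN_PJ_PER_BIT_S = 1200.0  # SRAM retention, per bit per second
DRAM_RETAIN_PJ_PER_BIT_S = 1.0     # low end of the ~1-10 pJ/bit self-refresh range
RECOMPUTE_PJ = 60.0                # recompute a 32-bit value
WORD_BITS = 32


def retain_pj(pj_per_bit_s, seconds, bits=WORD_BITS):
    """Energy to hold `bits` bits for `seconds`, assuming a constant rate."""
    return pj_per_bit_s * bits * seconds


def sram_break_even_seconds():
    """Holding time after which SRAM retention costs more than recomputing."""
    return RECOMPUTE_PJ / (SRAM_RETAIN_PJ_PER_BIT_S * WORD_BITS)


cache_1ms = retain_pj(SRAM_RETAIN_PJ_PER_BIT_S, 1e-3)  # about 38.4 pJ
dram_1s = retain_pj(DRAM_RETAIN_PJ_PER_BIT_S, 1.0)     # 32 pJ
```

Under these assumptions, keeping a 32-bit value in cache costs more than recomputing it once it has sat unused for roughly 1.6 ms.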
19 Move less! Caches physically close to CPU Locality, locality, locality (the first rule of chip real estate) Retain less! Power off unused cache lines [Kaxiras et al., ISCA 01] Drowsy caches [Flautner et al., ISCA 02] … with compiler analysis [Zhang et al., Trans. Emb. Comp. Sys. 4(3) 2005] Don't refresh unused DRAM … e.g. with garbage collection [Chen et al., CODES+ISSS 03] Reducing Memory System Energy
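The "retain less" idea can be illustrated with a toy idle-timeout policy: a line that has not been touched for some interval is powered off, trading leakage for a possible extra miss. This is a minimal sketch in the spirit of the cited work, not the mechanism from any of those papers. The class name `DecayCache`, the `decay_interval` parameter, and the unbounded fully associative organization are all assumptions made for illustration.

```python
class DecayCache:
    """Toy fully associative cache whose idle lines are switched off."""

    def __init__(self, decay_interval):
        self.decay_interval = decay_interval
        self.last_access = {}  # line address -> time of last access (powered lines)
        self.misses = 0

    def _decay(self, now):
        # Power off every line idle for longer than the decay interval.
        idle = [line for line, t in self.last_access.items()
                if now - t > self.decay_interval]
        for line in idle:
            del self.last_access[line]

    def access(self, line, now):
        self._decay(now)
        if line not in self.last_access:
            self.misses += 1  # cold miss, or a miss caused by decay
        self.last_access[line] = now

    def powered_lines(self, now):
        self._decay(now)
        return len(self.last_access)


cache = DecayCache(decay_interval=100)
cache.access(0xA, now=0)
cache.access(0xB, now=10)
cache.access(0xA, now=50)
powered_at_120 = cache.powered_lines(now=120)  # line 0xB has decayed by now
cache.access(0xB, now=130)                     # re-fetch costs an extra miss
```

A shorter interval saves more retention energy but raises the number of decay-induced misses, so the interval is the tuning knob in a policy of this kind.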
20 Maintaining the illusion of a single flat memory address space is too expensive On-chip caches can be major consumers of area and energy Coherence protocols are expensive and difficult to scale Alternative: software-managed memory hierarchies – Tightly-coupled memory (TCM), scratchpads – Do not require tag memory, address comparison logic – More area- and energy-efficient – Help bridge gap between bandwidth and throughput Extending the Memory Model
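To give a rough sense of what "no tag memory, no address comparison logic" saves, the sketch below counts the tag bits a conventional set-associative cache carries alongside its data. The cache geometry (32 KiB, 32-byte lines, 4 ways, 32-bit addresses) is an assumed example, not a figure from the talk, and the model counts only the tag plus one valid bit per line.

```python
def cache_tag_overhead(capacity_bytes, line_bytes, ways, address_bits=32):
    """Return (tag bits per line, total tag storage in bits, tag/data ratio).

    Assumes power-of-two sizes and one valid bit per line; dirty and
    replacement-state bits are ignored.
    """
    lines = capacity_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1  # log2 for powers of two
    index_bits = sets.bit_length() - 1
    tag_bits = address_bits - index_bits - offset_bits
    tag_storage_bits = lines * (tag_bits + 1)  # +1 for the valid bit
    data_bits = capacity_bytes * 8
    return tag_bits, tag_storage_bits, tag_storage_bits / data_bits


tag_bits, tag_storage, ratio = cache_tag_overhead(32 * 1024, 32, 4)
```

For this assumed geometry the tags add about 8% to the data array, and every lookup also drives one comparator per way. A scratchpad of the same capacity needs neither, because software decides what lives where.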
21 Different programming paradigm: software explicitly orchestrates all transfers between on-chip and off-chip memory areas Major implications on memory management Scratchpad allocation strategies Data partitioning strategies Dynamic relocation between scratchpad and DRAM to track the program's locality characteristics Opportunities for compile-time and runtime optimization Challenges in both Hardware and Software! New Challenges and Opportunities
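One simple static scratchpad allocation strategy is a greedy knapsack heuristic: rank data objects by accesses per byte and place them until the scratchpad is full. The sketch below is an illustration of that idea only. The function name, the object names, and the access counts are hypothetical, and it does not model the dynamic relocation the slide mentions.

```python
def allocate_scratchpad(objects, capacity_bytes):
    """Greedily place objects with the highest access density first.

    objects: list of (name, size_bytes, access_count) tuples.
    Returns (names placed in the scratchpad, bytes left free).
    """
    ranked = sorted(objects, key=lambda o: o[2] / o[1], reverse=True)
    placed, free = [], capacity_bytes
    for name, size, _accesses in ranked:
        if size <= free:
            placed.append(name)
            free -= size
    return placed, free


# Hypothetical profile: (name, size in bytes, profiled access count).
profile = [
    ("coeffs", 1024, 50000),
    ("frame", 65536, 200000),
    ("lut", 4096, 80000),
    ("state", 512, 12000),
]
placed, free = allocate_scratchpad(profile, capacity_bytes=8192)
```

Here the large `frame` buffer stays in DRAM even though it has the most accesses in total, because its accesses per byte are the lowest. A compiler or runtime could feed a heuristic like this from profile data.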
Qualcomm Research Excellence in Wireless MAY | 2012 WWW.QUALCOMM.COM/RESEARCH
23 State of the Art Capabilities Fostering Innovation Prototype Development Facilities CPU Simulation Clusters Antenna Ranges Outdoor Field Systems 30% of engineers with PhD, 50% Masters Systems, HW, SW, Standards, Test Engineering Ventures, Bus Dev, Technical Marketing, Program Mgmt. Complete Development Labs Human Resources
24 Global Research and Development Organization UNITED STATES EUROPE ASIA San Diego, CA Santa Clara, CA Bridgewater, NJ Cambridge, UK Nuremberg, Germany Vienna, Austria Beijing, China Bangalore and Hyderabad, India Seoul, S. Korea
25 Qualcomm Research & University Relations ACADEMIC COLLABORATION TO FOSTER ADVANCED RESEARCH Ongoing relations with more than 30 US and 25 International Universities Current funding includes MIT, UC Berkeley, Stanford, UCSD, UT Austin, ASU, UIUC, Univ. of Michigan, EPFL, IISc Bangalore, KAIST, Tsinghua Research collaboration spans a variety of technical areas Computer vision, multicore processing, context aware computing, machine learning, low power devices, wireless networks and signal processing, etc. Qualcomm Innovation Fellowship (QInF) invests in innovative ideas Close interactions between Qualcomm Research engineers, graduate students and professors
26 INNOVATE BEYOND WAN EXCELLING IN ALL FORMS OF WIRELESS TAKE WWAN TO THE NEXT LEVEL IMPROVING WWAN TECHNOLOGY RE-ARCHITECTING NEXT-GEN MOBILE DEVICES BREAKTHROUGH PERFORMANCE TRANSFORMING THE MOBILE USER EXPERIENCE ENABLE SMART APPLICATIONS Qualcomm Research For The Wireless Future
27 Innovate Beyond WAN WIRELESS LOCAL AREA PEANUT WIFI ADVANCED LTE D2D (FLASHLINQ) INNAV Next gen short range ultra-low power radio Multi Gbps WLAN using 5 GHz and 60 GHz band. Next Gen low-power WiFi for Internet of Things Proximal Wireless First Gen device-to-device wireless network Autonomous discovery Direct communications Indoor positioning for indoor location based applications Map tools for Mobile Devices
28 AUGMENTED REALITY LOOK LISTEN DASH AWARE Mobile user interface Computer vision for mobile devices Multiple language text detection and recognition With Mobile phone camera view finder Background Audio processing Augmented user experience Efficient video delivery over HTTP for mobile devices Build awareness in mobile devices For enhanced daily life situations Enable Smart Applications ELEVATE THE WIRELESS USER EXPERIENCE
29 Breakthrough Device Performance RE-ARCHITECTING NEXT-GEN DEVICES ADVANCED RADIO TECHNOLOGIES MANTICORE GRYPHON New RF front-end and baseband technologies RF/antenna and systems/protocol techniques Concurrent multi-radio operation Advanced mobile device SW platforms Improved user experience Virtual machine design for SoC architecture Enabling higher power efficiency