- Subproject 1 - Research and Development of a Data Management Broker for Heterogeneous Cloud Platforms (Network and Computing Lab.)

Research Goals
- Mobile cloud metadata definition techniques
- Resource management and migration techniques based on mobile cloud metadata
- Techniques for improving service performance and guaranteeing user SLAs through resource and service profiling that accounts for the performance of heterogeneous cloud infrastructures
- Research on data provisioning that caches the data required for service execution to improve service speed and performance, and on real-time data supply techniques for big-data processing
- Research on data consolidation and provisioning techniques suited to data-usage characteristics
- Research on data virtualization techniques for the unified management of distributed, heterogeneous data in big-data processing

Resource Management and Migration Techniques Based on Mobile Cloud Metadata

Service and Application Profiling
Cloud metadata
- Service and application profiles
- Resource profiles
Service and application profiling
- Basic approach: profile the expected execution time by VM type (from historical data) and application performance by resource usage (a minimal sketch of such a profile table follows below)
- Advanced approach: additionally analyze resource contention among co-located applications
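As an illustration only (the class and method names below are ours, not from the slides), a historical profile keyed by (application, VM type) might store observed runtimes and answer expected-execution-time queries:

```python
from collections import defaultdict
from statistics import mean

class ExecutionProfile:
    """Hypothetical historical profile: (app, vm_type) -> observed runtimes."""

    def __init__(self):
        self._runs = defaultdict(list)

    def record(self, app: str, vm_type: str, seconds: float) -> None:
        self._runs[(app, vm_type)].append(seconds)

    def expected_runtime(self, app: str, vm_type: str) -> float:
        """Average of historical runs; raises KeyError if never observed."""
        runs = self._runs[(app, vm_type)]
        if not runs:
            raise KeyError(f"no history for {app} on {vm_type}")
        return mean(runs)

profile = ExecutionProfile()
profile.record("wordcount", "m1.large", 320.0)
profile.record("wordcount", "m1.large", 340.0)
print(profile.expected_runtime("wordcount", "m1.large"))  # 330.0
```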

Today's topic: Classification Schemes [Zhuravlev et al., SIGARCH, 2010]
- A classification scheme identifies which applications should and should not be scheduled together.
- It enables the scheduler to predict the performance effect of co-scheduling any group of threads on a shared cache.
- A VM placement and allocation algorithm consists of two components: the classification scheme and the policy that acts on it.

Classification Scheme 1: Stack Distance Competition (SDC) [Chandra et al., HPCA, 2005] (1)
- Assumption: an LRU-managed L2 cache
- Stack distance profile: captures the temporal reuse behavior of an application in a fully associative or set-associative cache; a counter h(i) records how many accesses hit at LRU stack position i
- Basic prediction approach: the same profile predicts the miss rate for a smaller cache, since hits at stack distances beyond the reduced associativity become misses

Classification Scheme 1: Stack Distance Competition (SDC) (2)
- Objective: model how two applications compete for LRU stack positions in the shared cache, and estimate the extra misses each application incurs as a result of this contention
- Main idea: construct a new stack distance profile that merges the individual stack distance profiles of the threads that run together

Classification Scheme 1: Stack Distance Competition (SDC) (3)
SDC algorithm (a sketch follows below):
1) Each individual profile is assigned a current pointer, initialized to the first stack distance position.
2) The algorithm iterates A times (A being the cache associativity), at each step selecting the co-runner with the highest counter at its current position as the "winner" of that stack-distance position and advancing the winner's pointer.
3) After the A-th iteration, the effective cache space of each thread is computed in proportion to the number of its stack distance counters included in the merged profile, and the cache miss rate under that effective space is estimated for each co-runner.
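A minimal Python sketch of this merge, under the paper's assumptions (per-thread LRU stack distance profiles, shared cache of associativity A); the variable names are ours, not from the paper:

```python
def sdc_merge(profiles, assoc):
    """Stack Distance Competition: merge per-thread LRU stack distance
    profiles for a shared cache of associativity `assoc`.

    profiles: list of lists, profiles[t][i] = hits of thread t at stack
    position i (i = 0 is MRU). Returns the number of merged-profile
    positions (effective cache ways) won by each thread.
    """
    pointers = [0] * len(profiles)   # current position per thread
    won = [0] * len(profiles)        # positions won per thread
    for _ in range(assoc):           # one merged slot per cache way
        # Winner: the thread with the most hits at its current position.
        winner = max(
            range(len(profiles)),
            key=lambda t: profiles[t][pointers[t]] if pointers[t] < len(profiles[t]) else -1,
        )
        won[winner] += 1
        pointers[winner] += 1        # advance only the winner's pointer
    return won

# Two co-runners on an 8-way cache: a reuse-heavy thread vs. a flat one.
a = [900, 700, 500, 300, 100, 50, 20, 10]
b = [400, 380, 360, 340, 320, 300, 280, 260]
print(sdc_merge([a, b], assoc=8))  # [3, 5]: thread a wins 3 ways, thread b wins 5
```

The effective ways won by each thread then bound its hits: accesses whose stack distance falls beyond that bound are counted as the extra misses caused by contention.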

Classification Scheme 2: Animal Classes [Xie et al., CMP-MSI, 2008] (1)
This scheme classifies applications by their influence on one another when co-scheduled on the same cache. Four application classes:
- Turtle: low use of the shared cache
- Sheep: low miss rate, insensitive to the number of cache ways allocated to it
- Rabbit: low miss rate, sensitive to the number of allocated cache ways
- Devil: high miss rate, accesses the L2 cache very rapidly

Classification Scheme 2: Animal Classes (2)
Application classification algorithm: the scheme uses stack distance profiles to assign each application to a class.
Symbiosis table
- Approximates the relative performance degradation for applications that fall into different animal classes
- Provides estimates of how well the various classes coexist on the same shared cache (see the sketch below)
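A toy illustration of a symbiosis-table lookup; the class names come from the paper, but the numeric degradation values here are invented placeholders, not measurements:

```python
# Hypothetical symbiosis table: estimated % slowdown the row class suffers
# when it shares a cache with the column class. Values are made up.
SYMBIOSIS = {
    "turtle": {"turtle": 1, "sheep": 1, "rabbit": 2,  "devil": 4},
    "sheep":  {"turtle": 1, "sheep": 2, "rabbit": 3,  "devil": 8},
    "rabbit": {"turtle": 2, "sheep": 4, "rabbit": 6,  "devil": 15},
    "devil":  {"turtle": 3, "sheep": 6, "rabbit": 10, "devil": 20},
}

def estimated_degradation(cls_a: str, cls_b: str) -> int:
    """Total estimated slowdown (%) when two classes share a cache."""
    return SYMBIOSIS[cls_a][cls_b] + SYMBIOSIS[cls_b][cls_a]

# Rabbits suffer next to devils, so the scheduler avoids this pairing.
print(estimated_degradation("rabbit", "devil"))  # 25
```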

Classification Scheme 3: Miss Rate [Zhuravlev et al., SIGARCH, 2010] [Knauerhase et al., IEEE Micro, 2008] (1)
- Identifying applications with high miss rates is valuable to the scheduler, because these applications exacerbate performance degradation from memory controller contention, memory bus contention, and prefetching-hardware contention.
- To approximate the best schedule with the miss-rate heuristic, the scheduler identifies high-miss-rate applications and separates them into different caches, so that no cache ends up with a much higher total miss rate than any other (see the sketch below).
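A minimal sketch of that heuristic, assuming per-application miss rates are already measured; the greedy sorted assignment below is our simplification of the balancing idea, not the papers' exact algorithm:

```python
import heapq

def balance_by_miss_rate(miss_rates: dict, n_caches: int):
    """Greedy miss-rate balancing: assign the highest-miss-rate app to the
    cache with the lowest accumulated miss rate so far.

    miss_rates: app name -> misses per 1000 instructions.
    Returns: a list of app sets, one per shared cache.
    """
    caches = [(0.0, i) for i in range(n_caches)]  # (total miss rate, cache)
    heapq.heapify(caches)
    assignment = [set() for _ in range(n_caches)]
    for app in sorted(miss_rates, key=miss_rates.get, reverse=True):
        total, idx = heapq.heappop(caches)        # least-loaded cache
        assignment[idx].add(app)
        heapq.heappush(caches, (total + miss_rates[app], idx))
    return assignment

apps = {"mcf": 38.0, "milc": 30.0, "gcc": 6.0, "namd": 1.0}
print(balance_by_miss_rate(apps, n_caches=2))
# e.g. [{'mcf'}, {'milc', 'gcc', 'namd'}]  (totals 38.0 vs. 37.0)
```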

Classification Scheme 4: Pain [Zhuravlev et al., SIGARCH, 2010] (1)
Cache sensitivity
- A measure of how much an application will suffer when cache space is taken away from it due to contention.
- Calculated by, first, examining the number of cache hits that will most likely turn into misses when the cache is shared, and second, assigning to each position in the stack distance profile a loss probability describing the likelihood that hits at that position will be lost.
- The loss probability for position i is i/(n+1) in this paper, giving the cache sensitivity formula S = sum_{i=1}^{n} (i/(n+1)) * h(i), where h(i) is the number of hits to the i-th stack position, i=1 is the MRU position, and i=n is the LRU position of an n-way set-associative cache (a worked example follows below).
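A worked example of the sensitivity formula, using a made-up profile for a 4-way cache:

```python
def cache_sensitivity(hits):
    """S = sum_{i=1..n} i/(n+1) * h(i), where hits = [h(1), ..., h(n)]
    for an n-way set-associative cache (h(1) = MRU hits, h(n) = LRU hits)."""
    n = len(hits)
    return sum(i / (n + 1) * h for i, h in enumerate(hits, start=1))

# Made-up 4-way profile: most hits near the MRU position.
print(cache_sensitivity([500, 200, 100, 50]))
# 280.0: of the 850 hits, 280 are expected to become misses under sharing.
```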

Classification Scheme 4: Pain (2)
Cache intensity
- A measure of how much an application will hurt others by taking away their space in a shared cache.
- Measured as the number of last-level cache accesses per one million instructions.
The Pain metric
- The pain of co-scheduling A with B combines A's sensitivity with B's intensity: Pain(A|B) = Sensitivity(A) * Intensity(B), and the pain of the pair is Pain(A|B) + Pain(B|A) (see the sketch below).
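A sketch of using Pain to compare candidate pairings across two shared caches, following the combination above; the sensitivity and intensity numbers are invented:

```python
def pair_pain(app_a, app_b):
    """Total pain of a co-schedule: Pain(A|B) + Pain(B|A),
    where Pain(X|Y) = Sensitivity(X) * Intensity(Y)."""
    return app_a["sens"] * app_b["intens"] + app_b["sens"] * app_a["intens"]

# Invented values (intensity: LLC accesses per 1M instructions).
apps = {
    "mcf":  {"sens": 280.0, "intens": 45000},
    "milc": {"sens": 150.0, "intens": 60000},
    "gcc":  {"sens": 90.0,  "intens": 8000},
    "namd": {"sens": 20.0,  "intens": 2000},
}

# Two caches, two apps each: compare two candidate pairings.
for pairing in ((("mcf", "milc"), ("gcc", "namd")),
                (("mcf", "namd"), ("gcc", "milc"))):
    total = sum(pair_pain(apps[a], apps[b]) for a, b in pairing)
    print(pairing, total)
# The scheduler chooses the pairing with the lower total pain
# (here, separating mcf from milc wins by a wide margin).
```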

Classification Schemes: Evaluation [Zhuravlev et al., SIGARCH, 2010]

Research on Data Consolidation and Provisioning Techniques Based on Data-Usage Characteristics

Today's topic: Data Placement and VM Placement in Big-Data Processing
- Importance of data placement: input-data-intensive workloads such as the Map phase; centralized vs. distributed file systems
- Importance of VM placement: intermediate-data-intensive workloads such as the Reduce phase; performance issues such as SLA violations and resource contention

Data Placement and VM Placement: Purlieus [Palanisamy et al., SC, 2011] (1)
Job classification
- Map-input heavy jobs (input-data-intensive workloads)
- Reduce-input heavy jobs (intermediate-data-intensive workloads)
- Map-and-Reduce-input heavy jobs (both input- and intermediate-data-intensive workloads)

Data Placement and VM Placement: Purlieus (2)
Map-input heavy jobs (input-data-intensive workloads)
- Input data placement: choose physical machines based only on storage utilization and the expected load
- VM placement: data locality; choose the physical machines that hold the corresponding input data

Data Placement and VM Placement: Purlieus (3)
Reduce-input heavy jobs (intermediate-data-intensive workloads)
- Input data placement: choose physical machines with the most free storage
- VM placement: choose physical machines that are close to each other
Map-and-Reduce-input heavy jobs (both input- and intermediate-data-intensive workloads)
- Consider both criteria (a combined sketch follows below)
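A simplified sketch of these per-class placement rules; the machine model and scoring below are our assumptions, while Purlieus itself uses more detailed cost models:

```python
def place_input_data(machines, job_class):
    """Pick a machine for a job's input data, per Purlieus job class.
    machines: dicts with 'id', 'free_storage', 'expected_load', 'rack'."""
    if job_class == "map-heavy":
        # Low expected load first, then plenty of free storage.
        return min(machines, key=lambda m: (m["expected_load"], -m["free_storage"]))
    # Reduce-heavy: maximum free storage; input locality matters less.
    return max(machines, key=lambda m: m["free_storage"])

def place_vms(machines, job_class, data_machine, n_vms):
    """Pick machines for a job's VMs, per Purlieus job class."""
    if job_class == "map-heavy":
        # Data locality: prefer the machine holding the input data.
        ordered = sorted(machines, key=lambda m: m["id"] != data_machine["id"])
    else:
        # Reduce-heavy: pack VMs close together (same rack) for the shuffle.
        ordered = sorted(machines, key=lambda m: m["rack"] != data_machine["rack"])
    return ordered[:n_vms]

machines = [
    {"id": "pm1", "free_storage": 500, "expected_load": 0.2, "rack": "r1"},
    {"id": "pm2", "free_storage": 900, "expected_load": 0.6, "rack": "r1"},
    {"id": "pm3", "free_storage": 300, "expected_load": 0.1, "rack": "r2"},
]
data_pm = place_input_data(machines, "map-heavy")
print(data_pm["id"], [m["id"] for m in place_vms(machines, "map-heavy", data_pm, 2)])
# pm3 ['pm3', 'pm1']
```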

Delay Scheduling [Zaharia et al., EuroSys, 2010]
- If the node holding the data for the first job in the queue is not available, the job is delayed, for up to a bounded period, until a suitable node frees up, instead of being launched without data locality (see the sketch below).
- Improves data locality in streaming situations.
[Slide diagram: a processing PM and a job queue, with the head job delayed in favor of data locality]
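A minimal delay-scheduling sketch, assuming each job lists its preferred (data-local) nodes; `max_delay`, counted in scheduling attempts here, stands in for the paper's wait threshold:

```python
def schedule(jobs: list, free_node: str, max_delay: int):
    """Offer `free_node` to the FIFO job list. Launch the first job that is
    data-local to the node; a job that has already been skipped more than
    `max_delay` times is launched non-locally rather than waiting forever.
    """
    for i, job in enumerate(jobs):
        if free_node in job["local_nodes"] or job["skips"] >= max_delay:
            return jobs.pop(i)   # launch: local, or it has waited long enough
        job["skips"] += 1        # delay this job and consider the next one
    return None                  # no job launched on this offer

jobs = [
    {"name": "job1", "local_nodes": {"pm2"}, "skips": 0},
    {"name": "job2", "local_nodes": {"pm1"}, "skips": 0},
]
print(schedule(jobs, "pm1", max_delay=3)["name"])  # job2: job1 waits for pm2
```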