Data Parallel Application Development and Performance with Windows Azure. Advisor: Professor Gagan Agrawal. Presented by: Yu Zhang.


1 Data Parallel Application Development and Performance with Windows Azure. Advisor: Professor Gagan Agrawal. Presented by: Yu Zhang

2 Agenda

3 Motivation

4 Goals

5 The same facilities that a desktop OS provides, but on a set of connected servers:
• Abstract execution environment
• Shared file system
• Resource allocation
• Programming environments
• Utility computing
• 24/7 operation
• Pay for what you use
• Simpler, transparent administration

6

7 Windows Azure PaaS
Applications: Windows Azure Service Model
Runtimes: .NET 3.5/4, ASP.NET, PHP
Operating System: Windows Server 2008/R2-Compatible OS
Virtualization: Windows Azure Hypervisor
Server: Microsoft Blades
Database: SQL Azure
Storage: Windows Azure Storage (Blob, Queue, Table)
Networking: Windows Azure-Configured Networking

8 A Windows Azure application is called a "service"
• Definition information
• Configuration information
• At least one "role"
Service definition is in ServiceDefinition.csdef
Defines aspects of a service that cannot be changed without redeployment
• Types of roles and static role configuration
• Set of configuration settings for a role
• Contract with the environment in which the code runs

9 Service configuration is in ServiceConfiguration.cscfg
Defines values for properties that can be dynamically updated for a running deployment
• Values of configuration parameters
• Number of running instances

10 Definition:
• Role name
• Role type
• VM size (e.g. small, medium, etc.)
• Network endpoints
Code:
• Web/Worker Role: hosted DLL and other executables
• VM Role: VHD
Configuration:
• Number of instances
• Number of update and fault domains

11 Desktop And Related Azure Concepts

12 [Figure: Public Internet, Load Balancer, Web Role, Storage Services]

13 [Figure: Web Role, Worker Role, Storage Service]

14 Windows Azure Storage Abstractions

15

16 Queue Usage Example [Figure: producers P1 and P2 add numbered messages to a queue; consumers C1 and C2 retrieve them]

17 Communicating sequential processes
• Each process runs in its own local address space.
• Processes exchange data and synchronize via message passing. (Usually, but not always, the same code is executed by all processes.)
• Locality must be managed to achieve performance; message passing does this explicitly.

18 Azure Parallel Programming Model [Figure: load balancer (LB), web role VMs running IIS, and worker role VMs; roles communicate via Queue or WCF]

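19 MPI_Reduce(inbuf, outbuf, count, type, op, root, comm)
• inbuf: address of input buffer
• outbuf: address of output buffer
• count: number of elements in input buffer
• type: datatype of input buffer elements
• op: operation
• root: process id of root process
• comm: communicator

Azure queue counterpart (each worker role sends its result to a queue):

    using Microsoft.WindowsAzure.ServiceRuntime;
    using Microsoft.WindowsAzure.StorageClient;

    public class WorkerRole : RoleEntryPoint
    {
        // "queue" is assumed to be a CloudQueue initialized elsewhere (e.g. in OnStart).
        public override void Run()
        {
            DoWork();
            // The slide omits the message payload; the string here is a placeholder.
            var msg = new CloudQueueMessage("partial result");
            queue.AddMessage(msg);
        }
    }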

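20 MPI_Allreduce(inbuf, outbuf, count, type, op, comm)
• inbuf: address of input buffer
• outbuf: address of output buffer
• count: number of elements in input buffer
• type: datatype of input buffer elements
• op: operation
• comm: communicator

Azure queue counterpart (each worker role both receives and sends through a queue):

    using Microsoft.WindowsAzure.ServiceRuntime;
    using Microsoft.WindowsAzure.StorageClient;

    public class WorkerRole : RoleEntryPoint
    {
        // "queue" is assumed to be a CloudQueue initialized elsewhere (e.g. in OnStart).
        public override void Run()
        {
            // Receive: process an incoming message, then delete it from the
            // same queue it was read from.
            if (queue.Exists())
            {
                var inMsg = queue.GetMessage();
                if (inMsg != null)
                {
                    DoWork();
                    queue.DeleteMessage(inMsg);
                }
            }

            // Send: publish this role's result.
            DoWork();
            // The slide omits the message payload; the string here is a placeholder.
            var outMsg = new CloudQueueMessage("partial result");
            queue.AddMessage(outMsg);
        }
    }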
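
The difference between the two collectives can be sketched without MPI or Azure. The following is a minimal single-process illustration in Python, assuming an in-memory `queue.Queue` stands in for the Azure queue; the function names `queue_reduce` and `queue_allreduce` are hypothetical and are not part of any SDK.

```python
import operator
import queue
from functools import reduce as fold


def queue_reduce(local_values, op):
    """Reduce pattern: every worker enqueues its value, the root combines them."""
    q = queue.Queue()
    for value in local_values:          # each worker role adds one message
        q.put(value)
    # The root (web role) takes one message per worker off the queue.
    received = [q.get() for _ in range(len(local_values))]
    return fold(op, received)


def queue_allreduce(local_values, op):
    """Allreduce pattern: reduce at the root, then send the result to every worker."""
    combined = queue_reduce(local_values, op)
    out = queue.Queue()
    for _ in local_values:              # one result message per worker
        out.put(combined)
    return [out.get() for _ in local_values]


if __name__ == "__main__":
    print(queue_reduce([1, 2, 3, 4], operator.add))
    print(queue_allreduce([1, 2, 3], operator.add))
```

In the reduce case only the root ends up with the combined value; in the allreduce case every participant receives it, which is why the worker role on slide 20 both reads from and writes to a queue.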

21 Matrix Multiplication
• Each worker role reads matrix B.
• Matrix A is partitioned into n parts, where n is the number of worker roles.
• Each worker role gets one part of matrix A. For an N×N matrix, each worker role therefore holds two data sets: matrix B and one part of matrix A, say A_k (1 ≤ k ≤ n).
• Each worker role computes A_k × B and adds the result to its queue.
• The web role performs the reduce operation to obtain the final result.
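
The partitioning scheme above can be sketched as follows. This is a minimal single-process illustration in Python, assuming A is split into contiguous row blocks (the slides do not say how A is partitioned) and that the reduce step concatenates the partial products in block order; all function names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def split_rows(a, n):
    """Split the rows of a into n contiguous, nearly equal blocks."""
    base, extra = divmod(len(a), n)
    blocks, start = [], 0
    for k in range(n):
        size = base + (1 if k < extra else 0)
        blocks.append(a[start:start + size])
        start += size
    return blocks


def distributed_matmul(a, b, n_workers):
    """Each 'worker' multiplies its block A_k by B; the 'web role' reassembles."""
    # Worker step: every worker holds all of B and one block of A.
    partial = [(k, matmul(block, b)) for k, block in enumerate(split_rows(a, n_workers))]
    # Reduce step: concatenate the partial results in block order.
    result = []
    for _, rows in sorted(partial, key=lambda item: item[0]):
        result.extend(rows)
    return result


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6]]
    b = [[7, 8], [9, 10]]
    print(distributed_matmul(a, b, 2))
```

Because every block needs all of B, the amount of data each worker reads grows with the size of B regardless of how many workers there are; only the share of A shrinks.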

22 Kmeans
1. The web role calculates the initial means.
2. The web role broadcasts the k centroids to all worker roles.
3. Each worker role computes the distance of each local document vector to the centroids.
4. Each worker role assigns its points to the closest centroid and computes the local MSE (Mean Squared Error).
5. A reduction is performed to obtain the global centroids and the global MSE value.
6. The web role broadcasts the new centroids to all worker roles; steps 3 to 6 repeat until no points move.
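
The steps above can be sketched as a single-process Python illustration. It assumes squared Euclidean distance, non-empty data partitions, and a stopping test of "centroids unchanged", which holds exactly when no point changes cluster; the names `local_step` and `kmeans` are hypothetical.

```python
def local_step(points, centroids):
    """Worker role: assign local points and return partial sums, counts, squared error."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    sq_err = 0.0
    for p in points:
        dists = [sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centroids]
        j = dists.index(min(dists))          # index of the closest centroid
        counts[j] += 1
        sums[j] = [s + pi for s, pi in zip(sums[j], p)]
        sq_err += dists[j]
    return sums, counts, sq_err


def kmeans(partitions, centroids, max_iters=100):
    """Web role: broadcast centroids, reduce partial results, repeat until stable."""
    n_points = sum(len(part) for part in partitions)
    k, dim = len(centroids), len(centroids[0])
    mse = float("inf")
    for _ in range(max_iters):
        results = [local_step(part, centroids) for part in partitions]  # worker step
        new_centroids = []
        for j in range(k):                                              # reduction
            count = sum(r[1][j] for r in results)
            if count == 0:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
            else:
                total = [sum(r[0][j][d] for r in results) for d in range(dim)]
                new_centroids.append([t / count for t in total])
        # Global MSE of this iteration's assignment, measured against the
        # centroids that were broadcast at the start of the iteration.
        mse = sum(r[2] for r in results) / n_points
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, mse


if __name__ == "__main__":
    parts = [[[0.0, 0.0], [0.0, 1.0]], [[10.0, 10.0], [10.0, 11.0]]]
    print(kmeans(parts, [[0.0, 0.0], [10.0, 10.0]]))
```

Workers send back only per-centroid sums and counts rather than their points, so the reduction traffic depends on k and the vector dimension, not on the size of the data set.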

23 KNN
1. The web role is the master; the N worker roles are slaves.
2. The master divides the training samples into N subsets and distributes one subset to each worker role.
3. Each worker role computes the distance measures independently and stores the computed measures in a local array.
4. When a worker role finishes its distance calculation, it sends a message to the web role indicating the end of processing.
5. The web role notes the end of processing for the sender and acquires the computed measures by reduction.
6. After the web role has collected the distance measures from all worker roles, the following steps are performed:
• Sort all distance measures in ascending order
• Select the top k measures
• Count the number of occurrences of each class among the top k measures
• The input element is assigned to the class with the highest count among the top k measures
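
The master/worker flow above can be sketched as a single-process Python illustration. It assumes squared Euclidean distance (the slides do not name the distance measure) and represents each training sample as a `(vector, label)` pair; the function names are hypothetical.

```python
from collections import Counter


def local_distances(samples, query):
    """Worker role: distance from the query to every local training sample."""
    return [
        (sum((a - b) ** 2 for a, b in zip(vector, query)), label)
        for vector, label in samples
    ]


def knn_classify(partitions, query, k):
    """Web role: gather all distance measures, then vote among the k nearest."""
    measures = []
    for part in partitions:                 # reduction over all worker roles
        measures.extend(local_distances(part, query))
    measures.sort(key=lambda m: m[0])       # ascending by distance
    top_k = measures[:k]
    votes = Counter(label for _, label in top_k)
    return votes.most_common(1)[0][0]       # class with the highest count


if __name__ == "__main__":
    partitions = [
        [([0, 0], "a"), ([0, 1], "a")],
        [([5, 5], "b"), ([1, 0], "a"), ([6, 5], "b")],
    ]
    print(knn_classify(partitions, [0, 0], 3))
```

In this sketch each worker returns every distance it computed. Returning only its k smallest would give the same answer with less data sent to the web role, though the slides describe collecting all measures.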

24 What is Windows Communication Foundation (WCF)?
• WCF is Microsoft's implementation of industry standards to provide a communication subsystem that enables applications on one machine (across process boundaries) or across multiple machines to communicate.
• WCF is a core component of the .NET Framework 3.0 and later versions, which are included with the Windows 7 and Vista platforms as well as the future version of Windows Server.
• The WCF API unifies ASMX Web Services, .NET Remoting, distributed transactions and messaging into a single service-oriented programming model.
• Fundamental to the .NET Framework.
[Figure: ASMX, WSE, .NET Remoting, COM+ (Enterprise Services) and MSMQ converge into WCF]

25 WCF: Address, Binding, Contract
[Figure: a client and a service exchange messages through endpoints; each endpoint is an "ABC"]
• Address: Where?
• Binding: How?
• Contract: What?
WCF services are deployed, discovered and consumed as endpoints.

26 WCF : Endpoint

27 WCF in Azure
Binding settings shown on the slide: maxBufferSize="10485760" maxReceivedMessageSize="10485760"

28
Era    | Approach         | Characteristics                                        | Used in this work
1980s  | Object-Oriented  | Polymorphism, Encapsulation, Subclassing               | C & C++ with MPI
1990s  | Component-Based  | Interface-based, Dynamic Loading, Runtime Metadata     | Queue with Azure
2000s  | Service-Oriented | Message-based, Schema + Contract, Binding via Policy   | WCF with Azure

29 Experimental Evaluation
Three result tables appear on the slide, with panel titles Matrix Multiplication, Kmeans and KNN. All values are times in seconds.

Table 1
               | MPI    | Queue   | WCF
8 processors   | 0.0993 | 8.8726  | 4.4533
4 processors   | 0.1656 | 13.9872 | 6.349
2 processors   | 0.4723 | 20.6536 | 11.5783

Table 2
               | MPI    | Queue   | WCF
8 processors   | 0.1023 | 2.8902  | 1.9234
4 processors   | 0.2512 | 4.1224  | 3.4267
2 processors   | 0.5420 | 7.6238  | 5.5263

Table 3
               | MPI    | Queue   | WCF
8 processors   | 0.4272 | 1.0623  | 0.8976
4 processors   | 1.2567 | 2.3457  | 1.5214
2 processors   | 2.0233 | 5.2356  | 4.1218

Queue performance
• Fastest read: 31 ms; slowest read: 203 ms
• Fastest write: 31 ms; slowest write: 234 ms
• Fastest delete: 0 ms; slowest delete: 593 ms
• The queue is simply a reliable method of delivering messages between processes.

30 Azure VS Traditional Cluster
       | CPU     | RAM  | Bandwidth
Glenn  | 2.7 GHz | 8 GB | 20 Gbps
Azure  | 1.6 GHz | 2 GB | 10 Gbps

31 Conclusion

