Blazingly Fast Machine Learning Inference


1 Blazingly Fast Machine Learning Inference
Vish Abrams, Architect, Cloud Development, Machine Learning Team, Oracle Cloud Infrastructure. October 22, 2018

2 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.

3 Program Agenda 1. Machine Learning Model Serving 2. What is GraphPipe? 3. Advantages 4. Protocol Deep Dive 5. Real World Demo 6. More Info

4 Program Agenda 1. Machine Learning Inference 2. What is GraphPipe? 3. Performance 4. Protocol Deep Dive 5. Real World Demo 6. More Info

5 Machine Learning Inference (Model Serving)
Building machine learning models has become much easier thanks to open source frameworks like TensorFlow and PyTorch. Serving a machine learning model means putting your trained model onto a server so that client applications can access it. This involves two components: the ML client and the ML server. The client talks to the server using some kind of communication protocol, often JSON over HTTP.
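As a rough illustration of the client/server split described above, the sketch below encodes a request as JSON, runs it through a server-side handler, and decodes the reply. The names `toy_model`, `build_request`, `handle_request`, and `parse_response`, and the `inputs`/`outputs` field names, are assumptions made for this example; the HTTP transport itself is left out so that the snippet stays self-contained.

```python
import json

def toy_model(rows):
    """Hypothetical stand-in for a trained model: sums each input row."""
    return [sum(row) for row in rows]

def build_request(rows):
    """Client side: encode the inputs as a JSON request body."""
    return json.dumps({"inputs": rows}).encode("utf-8")

def handle_request(body):
    """Server side: decode the body, run the model, encode the reply."""
    payload = json.loads(body.decode("utf-8"))
    outputs = toy_model(payload["inputs"])
    return json.dumps({"outputs": outputs}).encode("utf-8")

def parse_response(body):
    """Client side: decode the JSON reply into Python values."""
    return json.loads(body.decode("utf-8"))["outputs"]

request_body = build_request([[1.0, 2.0], [3.0, 4.5]])
response_body = handle_request(request_body)  # in practice this hop is an HTTP POST
predictions = parse_response(response_body)
```

Every number crosses the wire as text and is parsed again on the other side, which is the overhead a binary protocol aims to avoid.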

6 ML Client

7 ML Server

8 Program Agenda 1. Machine Learning Model Serving 2. What is GraphPipe? 3. Advantages 4. Protocol Deep Dive 5. Real World Demo 6. More Info

9 What is GraphPipe? GraphPipe is an open source protocol and collection of software designed to simplify machine learning model deployment and decouple it from framework-specific model implementations.

10 What is GraphPipe? In other words, it turns this:
Diagram labels: mxnet server, tensorflow serving, custom server; standard json, custom protocol, protocol buffers; custom client, autogenerated client

11 What is GraphPipe? Into this: graphpipe-onnx graphpipe-tf

12 GraphPipe Features A minimalist machine learning transport specification based on flatbuffers. Simple reference model servers for TensorFlow, Caffe2, and ONNX. Efficient client implementations in Go, Python, and Java.

13 Why Did We Make It? Production deployments of AI agents are around the corner. Model serving is an important part of production solutions. Existing solutions suffer from various problems: inconsistent, inefficient, custom clients. A standard along with simple implementations moves the industry forward.

14 Program Agenda 1. Machine Learning Model Serving 2. What is GraphPipe? 3. Advantages 4. Protocol Deep Dive 5. Real World Demo 6. More Info

15 Ease of Development Model servers are written in Go, a very accessible language. Flatbuffer code generation makes it easy to produce new clients. Open spec makes it possible to integrate with existing servers.

16 Protocol Performance

17 Serving Performance

18 Program Agenda 1. Machine Learning Model Serving 2. What is GraphPipe? 3. Advantages 4. Protocol Deep Dive 5. Real World Demo 6. More Info

19 Flatbuffers
Extensible protocol. Small code footprint. Near-zero deserialization overhead.
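To make "near-zero deserialization overhead" concrete, the sketch below contrasts a text encoding, where every value has to be parsed, with a binary buffer that can be read in place. It does not use flatbuffers and is not the GraphPipe wire format: the layout (a 4-byte little-endian count followed by native float32 values) and the helper `read_floats` are assumptions chosen only to illustrate the idea with the Python standard library.

```python
import json
import struct
from array import array

values = [0.5, 1.25, -2.0]

# Text encoding: the receiver must parse every number out of the string.
json_bytes = json.dumps(values).encode("utf-8")
parsed = json.loads(json_bytes)

# Binary encoding (hypothetical layout): 4-byte count, then native float32s.
payload = array("f", values).tobytes()
binary_bytes = struct.pack("<I", len(values)) + payload

def read_floats(buf):
    """Return a float32 view over buf without copying or parsing the payload."""
    (count,) = struct.unpack_from("<I", buf, 0)
    return memoryview(buf)[4:4 + 4 * count].cast("f")

view = read_floats(binary_bytes)
first = view[0]  # read directly from the underlying bytes on access
```

The receiver only reads the small header up front; individual values are decoded when they are accessed, which is the property that makes large tensors cheap to hand to a model.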

20 Protocol Summary

21 Program Agenda 1. Machine Learning Model Serving 2. What is GraphPipe? 3. Advantages 4. Protocol Deep Dive 5. Real World Demo 6. More Info

22

23 AlphaZero Timeline
AlphaGo beats Fan Hui (Oct-15); paper published in Nature (Jan-16); AlphaGo beats Lee Sedol (Mar-16); AlphaGo beats Ke Jie (May-17); AlphaGo Zero published (Oct-17); AlphaZero published (Dec-17)

24 AlphaZero Algorithm for training a machine to play any game*
*Any game that can be represented with a Markov process. Trained without human information through self-play. Needs a structured representation of the game state. Needs rules for transitioning from one state to the next.
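The two requirements above, a structured game state and rules for moving between states, can be sketched for Connect Four as follows. This is an illustrative encoding, not the one used in the talk: the board-as-tuple-of-columns layout and the function names `new_state`, `legal_moves`, `apply_move`, and `winner` are all assumptions.

```python
ROWS, COLS = 6, 7
EMPTY, PLAYER_1, PLAYER_2 = 0, 1, 2

def new_state():
    """State is (board, player to move); each column is a tuple filled bottom-up."""
    return (tuple(() for _ in range(COLS)), PLAYER_1)

def legal_moves(state):
    """Columns that still have room for another piece."""
    board, _ = state
    return [c for c in range(COLS) if len(board[c]) < ROWS]

def apply_move(state, col):
    """Transition rule: drop the current player's piece into col."""
    board, player = state
    if len(board[col]) >= ROWS:
        raise ValueError("column is full")
    new_board = board[:col] + (board[col] + (player,),) + board[col + 1:]
    next_player = PLAYER_2 if player == PLAYER_1 else PLAYER_1
    return (new_board, next_player)

def cell(board, col, row):
    """Piece at (col, row), or EMPTY if the cell is off-board or unfilled."""
    if 0 <= col < COLS and 0 <= row < len(board[col]):
        return board[col][row]
    return EMPTY

def winner(state):
    """Return the player with four in a row, or EMPTY if there is none."""
    board, _ = state
    directions = ((1, 0), (0, 1), (1, 1), (1, -1))
    for col in range(COLS):
        for row in range(len(board[col])):
            player = board[col][row]
            for dc, dr in directions:
                if all(cell(board, col + dc * k, row + dr * k) == player
                       for k in range(1, 4)):
                    return player
    return EMPTY

state = new_state()
for col in (0, 1, 0, 1, 0, 1, 0):  # player 1 stacks four pieces in column 0
    state = apply_move(state, col)
```

Because states are immutable tuples, they can be hashed and reused safely, which is convenient for a tree search that revisits positions.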

25 The Game Playing Black Box
Diagram: Position -> Neural Network -> Move

26 Training the Network Diagram: Labeled Data -> Training -> Neural Network

27 Generating Data Diagram: Neural Network -> Self-Play (MCTS) -> Labeled Data

28 AlphaZero in a Nutshell
Diagram (cycle): Neural Network -> Self-Play -> Labeled Data -> Training -> Neural Network
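The cycle on this slide (the network plays itself, the games become labeled data, and training updates the network) can be sketched on a toy scale. The code below is not AlphaZero: a lookup table stands in for the neural network, a greedy choice with random exploration stands in for MCTS, and the game is a small hypothetical Nim variant (take 1 or 2 stones, whoever takes the last stone wins). All names and parameters are assumptions made for illustration.

```python
import random

def estimate(values, stones):
    """Estimated value of a position for the player about to move."""
    if stones == 0:
        return -1.0  # no stones left: the player to move has already lost
    return values.get(stones, 0.0)

def self_play(values, start=7, explore=0.2, rng=random):
    """Play one game; return (position, outcome) pairs, last position first."""
    stones, history = start, []
    while stones > 0:
        moves = [m for m in (1, 2) if m <= stones]
        if rng.random() < explore:
            move = rng.choice(moves)
        else:
            # Prefer the move that leaves the opponent in the worst position.
            move = min(moves, key=lambda m: estimate(values, stones - m))
        history.append(stones)
        stones -= move
    # The player who moved last won; labels alternate going backwards.
    labeled, outcome = [], 1.0
    for position in reversed(history):
        labeled.append((position, outcome))
        outcome = -outcome
    return labeled

def train(values, labeled, rate=0.1):
    """Nudge each position's estimate toward the observed outcome."""
    for position, outcome in labeled:
        old = values.get(position, 0.0)
        values[position] = old + rate * (outcome - old)

def run_cycles(cycles=200, games_per_cycle=20, seed=0):
    """Repeat self-play then training, as in the diagram's loop."""
    rng = random.Random(seed)
    values = {}
    for _ in range(cycles):
        data = []
        for _ in range(games_per_cycle):
            data.extend(self_play(values, rng=rng))
        train(values, data)
    return values

learned = run_cycles()
```

The structure is the point here: each pass generates fresh data with the current estimates and then improves those estimates, so later games are played by a stronger player than earlier ones.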

29 AlphaZero for Connect Four
We trained a network to play Connect Four using 150 cycles of this process, playing about 1,000,000 games during self-play. The network finds the correct move in about 99% of positions. We used GraphPipe as part of the training process because we were generating games across a cluster of 5 machines with GPUs. But GraphPipe is even more useful for deploying this model so that people can use it. How do we deploy our model for use in an application? GraphPipe!

30 Serving the AlphaZero Trained Network
Position Web Frontend Neural Network GraphPipe GraphPipe Move

31 Live Demo!

32 Live Demo

33 Actual Architecture

34 Program Agenda 1. Machine Learning Model Serving 2. What is GraphPipe? 3. Advantages 4. Protocol Deep Dive 5. Real World Demo 6. More Info

35 GraphPipe https://oracle.github.io/graphpipe/
e138b7a7c1ef

36 AlphaZero https://azfour.com/
7e36e powered-by-the-alphazero-algorithm-d0c82d6f3ae9 diagram-365f5abf67e0

37 Questions and Answers

