How Microsoft Had Made Deep Learning Red-Hot in IT Industry


How Microsoft Had Made Deep Learning Red-Hot in IT Industry
Zhijie Yan, Microsoft Research Asia
USTC visit, May 6, 2014

Self Introduction
@MSRA鄢志杰 (Zhijie Yan)
996: studied at USTC from 1999 to 2008
Graduate student: studied in the iFlytek speech lab from 2003 to 2008, supervised by Prof. Renhua Wang
Intern: worked at MSR Asia from 2005 to 2006
Visiting scholar: visited Georgia Tech in 2007
FTE: has worked at MSR Asia since 2008
Research interests: speech, deep learning, large-scale machine learning

In Today’s Talk
Deep learning has become very hot in the past few years
How Microsoft made deep learning hot in the IT industry
Deep learning basics
Why Microsoft can turn all these ideas into reality
Further reading materials

How Hot is Deep Learning
“This announcement comes on the heels of a $600,000 gift Google awarded Professor Hinton’s research group to support further work in the area of neural nets.” – U. of T. website

Microsoft Had Made Deep Learning Hot in IT Industry
Initial attempts by the University of Toronto showed promising results using DL for speech recognition on the TIMIT phone recognition task
A student of Prof. Hinton visited MSR as an intern, and good results were obtained on the Microsoft Bing voice search task
MSR Asia and Redmond collaborated and got amazing results on the Switchboard task, which shocked the whole industry

Microsoft Had Made Deep Learning Hot in IT Industry
*Figure borrowed from MSR principal researcher Li Deng

Microsoft Had Made Deep Learning Hot in IT Industry
Followed by others, and the results were confirmed on a variety of speech recognition tasks
    Google / IBM / Apple / Nuance / Baidu / iFlytek
Continuously advanced by MSR and others
Expanded to solve more and more problems
    Image processing
    Natural language processing
    Search
    …

Deep Learning From Speech to Image
ILSVRC-2012 competition on ImageNet
Classification task: classify an image into 1 of the 1,000 classes, with 5 guesses allowed per image
Example classes: lifeboat, airliner, school bus
    Institution                Error rate (%)
    University of Amsterdam    29.6
    XRCE/INRIA                 27.1
    Oxford                     27.0
    ISI                        26.2
    SuperVision                16.4
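
The error rate on this slide counts an image as correct when the true class is among the five guesses. A minimal sketch of that metric follows; the function name `top5_error` and the toy scores are made up for illustration and are not from the talk or the competition toolkit.

```python
def top5_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true label is NOT among the k highest-scoring classes."""
    misses = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Class indices sorted by descending score, truncated to the k best guesses.
        top_k = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(true_labels)


# Toy example with 8 classes instead of 1,000 (hypothetical scores).
scores = [
    [0.1, 0.9, 0.3, 0.2, 0.8, 0.7, 0.6, 0.0],   # true class 0 is ranked 7th: a miss
    [0.5, 0.1, 0.4, 0.3, 0.2, 0.0, 0.05, 0.6],  # true class 2 is ranked 3rd: a hit
]
labels = [0, 2]
top5_rate = top5_error(scores, labels)
```

With one miss out of two images the toy error rate is 0.5; passing `k=1` gives the stricter top-1 error.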

Deep Learning Basics
Deep learning → deep neural networks → multi-layer perceptron (MLP) with a deep structure (many hidden layers)
[Figure: a shallow MLP (input layer, one hidden layer, output layer; weights W0, W1) next to a deep MLP (input layer, three hidden layers, output layer; weights W0 to W3)]
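
To make the "MLP with many hidden layers" idea concrete, here is a minimal forward-pass sketch in plain Python. The layer sizes, the sigmoid hidden units, the softmax output, and the random initialisation are all assumptions chosen for illustration; the slide only names the layers and the weight matrices W0 to W3.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def affine(weights, bias, inputs):
    # weights holds one row per output unit.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(layers, inputs):
    """Hidden layers use sigmoid units; the output layer is a softmax."""
    activation = inputs
    for weights, bias in layers[:-1]:
        activation = [sigmoid(z) for z in affine(weights, bias, activation)]
    weights, bias = layers[-1]
    return softmax(affine(weights, bias, activation))


rng = random.Random(0)
sizes = [4, 8, 8, 8, 3]  # input, three hidden layers, output (hypothetical sizes)
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
probs = forward(layers, [0.2, -0.1, 0.4, 0.7])
```

The four entries of `layers` play the role of W0 to W3 in the deep network; dropping two of the hidden sizes from `sizes` gives the shallow network with only W0 and W1.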

Deep Learning Basics
Sounds not new at all? Sounds like what you learned in class?
Things that have not changed over the years
    Network topology / activation functions / …
    Backpropagation (BP)
Things that changed recently
    Data → big data
    General-purpose computing on graphics processing units (GPGPU)
    “A bag of tricks” accumulated over the years
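
Backpropagation, listed above as one of the unchanged ingredients, is the chain rule applied layer by layer. The sketch below works through it for a tiny network with one tanh hidden layer, a linear output, and a squared-error loss. The architecture, the weights, and the learning rate are illustrative assumptions, not anything specified in the talk.

```python
import math


def forward(w1, w2, x):
    """One hidden layer of tanh units followed by a single linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    output = sum(w * h for w, h in zip(w2, hidden))
    return hidden, output


def loss(w1, w2, x, target):
    _, output = forward(w1, w2, x)
    return 0.5 * (output - target) ** 2


def backprop(w1, w2, x, target):
    """Gradients of the loss with respect to w1 and w2 via the chain rule."""
    hidden, output = forward(w1, w2, x)
    d_output = output - target                  # dL/d(output)
    grad_w2 = [d_output * h for h in hidden]    # dL/dw2[j]
    # Back through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    d_hidden = [d_output * w * (1.0 - h * h) for w, h in zip(w2, hidden)]
    grad_w1 = [[d * xi for xi in x] for d in d_hidden]
    return grad_w1, grad_w2


# Hypothetical weights and a single training example.
w1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
w2 = [0.3, -0.1, 0.2]
x, target = [1.0, 2.0], 0.5

grad_w1, grad_w2 = backprop(w1, w2, x, target)

# One gradient-descent step with an arbitrary learning rate.
lr = 0.1
new_w1 = [[w - lr * g for w, g in zip(row, g_row)] for row, g_row in zip(w1, grad_w1)]
new_w2 = [w - lr * g for w, g in zip(w2, grad_w2)]
```

Comparing `loss(new_w1, new_w2, x, target)` with `loss(w1, w2, x, target)` shows the step reducing the error on this example.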

E.g. Deep Neural Network for Speech Recognition
Three key components that make DNN-HMM work
    Tied tri-phones as the basic units for HMM states
    Many layers of nonlinear feature transformation
    Long window of frames
*Figure borrowed from MSR senior researcher Dong Yu
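
The "long window of frames" component means the network sees each frame together with its neighbours. A minimal sketch of such frame splicing is below; the function name, the default context of 5 frames on each side, and the edge padding by repetition are assumptions for illustration, since the slide does not give the window size.

```python
def splice_frames(frames, left=5, right=5):
    """Concatenate each frame with its neighbours.

    Frames before the start or after the end are replaced by the first or
    last frame, so every output vector has the same length.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to the valid range
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Four toy frames with two features each (hypothetical values).
frames = [[float(t), float(t) + 0.5] for t in range(4)]
spliced = splice_frames(frames, left=1, right=1)
```

Each spliced vector has `(left + 1 + right)` times the original dimension, which is the input the first DNN layer would see under these assumptions.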

E.g. Deep Neural Network for Image Classification
The ILSVRC-2012 winning solution
*Figure copied from Krizhevsky, et al., “ImageNet Classification with Deep Convolutional Neural Networks”

Scale Out Deep Learning
Training speed was a major problem for DL
    A speech recognition model trained on 1,800 hours of data (~650,000,000 vector frames) takes 2 weeks using 1 GPU
    An image classification model trained on ~1,000,000 images takes 1 week using 2 GPUs*
How to scale out if 10x or 100x more training data becomes available?
*Krizhevsky, et al., “ImageNet Classification with Deep Convolutional Neural Networks”
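
The frame count on this slide can be checked with simple arithmetic, and the same arithmetic shows why the 10x and 100x question matters. The sketch assumes one feature vector every 10 ms (a common frame shift that the slide does not state) and assumes training time grows linearly with the amount of data on fixed hardware.

```python
HOURS_OF_SPEECH = 1800
FRAMES_PER_SECOND = 100  # assumption: one feature vector every 10 ms

total_frames = HOURS_OF_SPEECH * 3600 * FRAMES_PER_SECOND  # 648,000,000


def projected_weeks(baseline_weeks, data_multiplier):
    """Assumes training time scales linearly with data on the same hardware."""
    return baseline_weeks * data_multiplier


weeks_10x = projected_weeks(2, 10)    # 20 weeks
weeks_100x = projected_weeks(2, 100)  # 200 weeks
```

Under these assumptions the count comes to 648 million frames, consistent with the ~650,000,000 quoted, and a 100x larger corpus on a single GPU would take roughly four years.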

DNN-GMM-HMM
Joint work with USTC-MSRA Ph.D. program student Jian Xu (许健, 0510)
The “DNN-GMM-HMM” approach for speech recognition*
    DNN as a hierarchical nonlinear feature extractor, trained using a subset of the training data
    GMM-HMM as the acoustic model, trained using the full data
*Z.-J. Yan, Q. Huo, and J. Xu, “A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR”

DNN-GMM-HMM
[Figure: processing pipeline with the stages DNN-derived features, PCA, HLDA, tied-state WE-RDLT, MMI sequence training, CMLLR unsupervised adaptation]
GMM-HMM modeling of DNN-derived features: combine the best of both worlds

Experimental Results
300hr DNN (18k states, 7 hidden layers) + 2,000hr GMM-HMM (18k states)*
Training time reduced from 2 weeks to 3-5 days
*Z.-J. Yan, Q. Huo, and J. Xu, “A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR”

A New Optimization Method
Joint work with USTC-MSRA Ph.D. program student Kai Chen (陈凯, 0700)
Using 20 GPUs, the time needed to train a 1,800-hour acoustic model is cut from 2 weeks to 12 hours, without accuracy loss
The magic is to be published
We believe the scalability issue in DNN training for speech recognition is now solved!
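
The numbers on this slide translate into a speedup figure with a few lines of arithmetic. The sketch assumes that the 2-week baseline is the single-GPU training time quoted earlier in the talk; the slide itself does not say which baseline is meant.

```python
HOURS_PER_WEEK = 7 * 24

baseline_hours = 2 * HOURS_PER_WEEK  # 2 weeks = 336 hours (assumed 1-GPU baseline)
new_hours = 12
num_gpus = 20

speedup = baseline_hours / new_hours  # 28.0
efficiency = speedup / num_gpus       # 1.4 (speedup per GPU)
```

If the assumption holds, the speedup per GPU is above 1.0, so adding GPUs alone would not account for the whole gain. The slides do not explain where the rest comes from, since the method was still unpublished at the time of the talk.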

Why Microsoft Can Do All These Good Things
Research
    Bridge the gap between academia and industry via our intern and visiting scholar programs
    Scale out from toy problems to real-world industry-scale applications
Product team
    Solve practical issues and deploy technologies to serve users worldwide via our services
All together
    We continuously improve our work towards larger scale, higher accuracy, and more challenging tasks
Finally
    We have big data + world-leading computational infrastructure

If You Want to Know More About Deep Learning
Neural networks for machine learning: https://class.coursera.org/neuralnets-2012-001
Prof. Hinton’s homepage: http://www.cs.toronto.edu/~hinton/
DeepLearning.net: http://deeplearning.net/
Open-source
    Kaldi (speech): http://kaldi.sourceforge.net/
    cuda-convnet (image): http://code.google.com/p/cuda-convnet/

Thanks!