Big Data, Big Analytics and Informed Actions

1 Big Data, Big Analytics and Informed Actions
Dr. Wanli Min, Director for Data Science, Alibaba Group, 12/18/2014

2 Agenda
Introduction
Data Revolution
Use Cases of Big Data in E-Commerce
Beyond E-Commerce

3 Introduction
Dr. Wanli Min 闵万里 (山景)
PhD in Statistics from the University of Chicago, 2004
IBM T. J. Watson Research Center, New York
IBM Singapore, Singapore
Google, Mountain View, California
Alibaba, Hangzhou, China

4 Technology Drives Innovation
+ Intelligent Instrumented Interconnected

5 Booming Ecommerce
China has the largest online population
Online shopping is gaining popularity among a wide range of social groups

6 Ecommerce Leader: Alibaba
Notes: (1) Through minority investments or joint ventures. (2) Through contractual arrangements with Ant Financial Services Group, our related company that operates Alipay.

7 Ecommerce Reaches Offline Shopping
12.12 shoppers got 50% off with Alipay Wallet at offline retail stores

8 Ecommerce Big Data
People-centric: Internet enables ecommerce and drives big data
[Diagram labels: Consumer, Ecom, Offline retail, Internet, Internet of Things]

9 Ecommerce Big Data Source: New York Times, August 5, 2009

10 Data Revolution is Ongoing
Social evolution leads to an inevitable Data Revolution
[Timeline figure. Eras: Agricultural, Industrial Revolution, Modern Physics, Semiconductor, IT tech, Big data. Year markers: 1730, 1844, 1905, 1937, 1947, 1967, 1980, Y2K, 2004. Annotations: IBM sold PC business 2005, 2006: Smarter Planet, Smarter City]

11 Information Explosion Drives Data Revolution
Internet drives big data: Information Explosion
Early days of the Internet: “Copyright. Do not redistribute.”
Nowadays (2014), information sharing is pervasive: “Click to share”

12 Rising Power of Data
Data Influence Exploded in the Past Decades
New Product, New Company, New Politician, New Geo-Politics

13 Data Processing Product
In 1995: processing megabytes (MB) of data

14 Data-driven New Business
In 2004: processing gigabytes (GB) of data; Google went public

15 Data Enables Politician
In 2013: processing terabytes (TB) of data; the U.S. President invited you to town hall meetings to discuss hot issues

16 Big Data Reshaped GeoPolitics
Democracy is a perfect case for big data usage
Obama campaign in 2008 and 2012
Facebook, Twitter, blogs, polls…
Where are the persuadable voters?
Source: CNN, November 8, 2012
Source: UChicago News, April 17, 2013

17 IT to DT
IT -> DT (Data Technology)
August 2009, New York Times:
“The sexy job in the next 10 years will be statisticians.” - Hal Varian, Google Chief Economist
“What’s ubiquitous and cheap?” “Data.” “What’s scarce is the analytical ability to utilize that data.”
In October 2008, Alibaba set its long-term strategy:
Alibaba is a data company
Alibaba empowers itself to make cloud computing available to the public as a utility
In February 2014, Alibaba announced its latest strategy: Cloud + Terminal (云+端), Data Technology

18 Outlook of Big Data
How big is “Big Data”?
- If sample data covers nearly everywhere in the entire probability space, then inference from such sample data is less dependent on a specific model
Volume, Velocity, Variety, Veracity
Sufficient Statistics vs. Big Data
Ergodicity

19 Outlook of Big Data
Internet -> Internet of Things -> Big Data
Data Application vs. Data Storage/Data Warehouse
Unstructured data vs. Structured data
User behavior data, personalized offering
Online advertisement: mass display -> user targeting
Real-time bidding (RTB): pay for traffic -> pay for audience

20 Value Add by Variety of Data
[Chart: Advertisement CTR lift by feature type. Values shown: 1,235%, 550%, 132%. Feature types shown: Environmental feature, Demographics feature, Behavioral feature]
Source: Acxiom Chief Analytics Officer, Dr. Jie Cheng

21 Connect & Combine Data
Connect multiple data sources to generate collective & collaborative value
Mutually beneficial
Waze: crowdsourcing
Airline companies: code share / alliance

22 Alibaba Embraces Big Data Revolution
Data scientists are very popular in Silicon Valley and worldwide
Alibaba Data Science Team Mission:
Improve business efficiency by enabling data-driven operation in lines of business (LOBs)
Create new data products to support multiple LOBs
Utilize data to drive business innovation

23 Data Science in E-Commerce
Product Planning: reduce JuHuaSuan manual workload; increase Tmall “瞄一眼” (“Take a Glance”) revenue per slot
User Targeting: Sina Weibo targeting; O2O; Xiami Music
Product-Buyers matching

24 Apps Supported by Data Science

25 Apps Supported by Data Science
Walle Model

26 A Graph-based Framework
Construct dynamic activity graph
[Diagram: buyer nodes i and j linked by weight w_ij; product nodes u and v linked by P_uv; sellers]
w_ij is a summary statistic of the links between buyers i and j, conditioning on a given basket of products. w_ij could be a multi-dimensional vector, and could be defined at clustering level.
P_uv is a summary statistic of the joint activity of products u and v, conditioning on a given group of buyers.

27 A Graph-based Framework
Discover potential buyers of a particular product
[Diagram: buyer nodes i and j linked by weight w_ij; product nodes u and v linked by P_uv; sellers]
Product search: for target product u, search its top KNN on {P_uv}, and construct the activity graph conditioning on the top KNN
Transform edge distance as 1 / w_ij
Starting from the nodes of known buyers, vote by shortest path
Approximate Dijkstra's algorithm
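The shortest-path step on this slide can be sketched in a few lines. This is a minimal illustration, not the production method: it assumes each link weight is a single positive number (the slides allow vector weights), it runs exact Dijkstra rather than the approximate variant the slide mentions, and it replaces the "vote" with a plain ranking by distance. The function name `rank_potential_buyers` and the input layout are hypothetical.

```python
import heapq

def rank_potential_buyers(weights, known_buyers):
    """Rank non-seed buyers by shortest-path distance to any known buyer.

    weights: dict mapping a buyer pair (i, j) to a positive scalar weight
             (assumption: links are undirected and scalar-valued).
    known_buyers: buyer ids already known to have bought the target product.
    """
    seeds = list(known_buyers)

    # Build an undirected adjacency list with edge distance 1 / w_ij,
    # so strongly linked buyers are "close" to each other.
    graph = {}
    for (i, j), w in weights.items():
        if w <= 0:
            continue  # no usable link
        d = 1.0 / w
        graph.setdefault(i, []).append((j, d))
        graph.setdefault(j, []).append((i, d))

    # Multi-source Dijkstra: every known buyer starts at distance 0.
    dist = {b: 0.0 for b in seeds}
    heap = [(0.0, b) for b in seeds]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, step in graph.get(node, []):
            nd = d + step
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))

    # Closest buyers first; the seeds themselves are excluded.
    seed_set = set(seeds)
    candidates = [(d, b) for b, d in dist.items() if b not in seed_set]
    return [b for d, b in sorted(candidates)]
```

For example, with made-up weights `{("a", "b"): 4.0, ("b", "c"): 2.0, ("a", "d"): 0.5, ("c", "d"): 1.0}` and known buyer `"a"`, buyer `"b"` sits at distance 0.25, `"c"` at 0.75, and `"d"` at 1.75 (via `c`, which beats the direct link of 2.0), so the ranking is `b`, `c`, `d`.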

28 Use Case: sales planning
Predict product sales
Lots of features (DSR, user, price, etc.)
Large-scale model training
Productized (app)

29 Predict Best Selling Product
Business Flow:
Sellers register for the sales event: discount & quantity
Operators choose who will participate in the sales event
Model: Sales prediction for each proposed product
Objective: Help operators select best-selling products

30 Predict Best Selling Product
Gradient Boosting Decision Tree
Parallelized multiple decision trees, sample size of a billion
[Decision-tree figure: “Who will buy this perfume?” with split questions age? occupation? rating? and a leaf labeled “Won't buy” (不买)]
Low information requirement: we need to ask at most two questions to reach a conclusion
Will he/she buy perfume?
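To make the gradient boosting idea concrete, here is a toy sketch for squared-error sales regression. It is an assumption-laden illustration, not the model described in the talk: each tree is a one-question stump, training is single-threaded, and the helper names (`fit_stump`, `fit_gbdt`, `predict`) and hyperparameters are hypothetical. A real system at the scale mentioned on the slide would use a parallel library.

```python
def fit_stump(xs, residuals):
    """Find the one-feature threshold split that best fits the residuals."""
    best = None
    for f in range(len(xs[0])):
        for t in sorted({x[f] for x in xs}):
            left = [r for x, r in zip(xs, residuals) if x[f] <= t]
            right = [r for x, r in zip(xs, residuals) if x[f] > t]
            if not left or not right:
                continue  # split must put samples on both sides
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, f, t, lm, rm)
    return best[1:] if best else None

def fit_gbdt(xs, ys, n_trees=20, lr=0.5):
    """Boosting for squared loss: each stump fits the current residuals."""
    base = sum(ys) / len(ys)          # start from the mean prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        if stump is None:
            break                     # no valid split left
        f, t, lm, rm = stump
        stumps.append(stump)
        preds = [p + lr * (lm if x[f] <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, lr, stumps

def predict(model, x):
    """Sum the shrunken stump outputs on top of the base prediction."""
    base, lr, stumps = model
    return base + sum(lr * (lm if x[f] <= t else rm)
                      for f, t, lm, rm in stumps)
```

A call such as `model = fit_gbdt([(0.1, 4.5), (0.3, 4.8), (0.5, 4.2)], [120, 340, 400])` followed by `predict(model, (0.4, 4.6))` returns a sales estimate; the feature pairs here (discount, seller rating) are invented for illustration only.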

31 Use Case: User Targeting
Operators; Offline sellers
What is my target users’ profile?
Where are my hidden customers?
How to prioritize dissemination to different groups?
Is my brand image well aligned with targeted customers?
Do I need to reach potential homebuyers?
Do I need to reach lottery players?
Who are Nike fans?
What is the best channel to reach my targeted customers?
Where can I find the hidden elite customers?

32 Use Case: User Targeting
Traditional User Profiling (user tags):
KYC (Know Your Customer): demographics profiling
Post-event
Too many user tags, often confusing to users
Users’ intent is hard to infer
Cannot differentiate importance/relevance of different tags

33 Use Case: User Targeting
Big Data enables user targeting
CYC (Catch Your Customer)
Fuse data from multiple sources in the ecosystem
Propensity model, semi-supervised / supervised
Gives answers to: who, when, where, what
Discriminates between different tags

34 Use Case: User Targeting
Objective: show one product’s promotion ads to buyers
Take a typical Internet ads campaign record as an example
- Post-event analysis: calculate targeted users’ likelihood and assign to groups
Note: Categorize users by likelihood into 6 groups corresponding to descending scores 5, 4, 3, 2, 1, 0
Source: Alibaba Group’s project in 2013
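The grouping step in the note can be sketched as follows. The slide does not say how the six groups are cut, so this sketch assumes equal-sized groups by rank; the function name `assign_groups` and the input layout (user id mapped to a model likelihood) are hypothetical.

```python
def assign_groups(likelihoods, n_groups=6):
    """Map each user id to a group score; the highest score is most likely.

    likelihoods: dict of user id -> predicted purchase likelihood.
    Assumption: groups are equal-sized by rank, not fixed score cutoffs.
    """
    # Rank users from most to least likely.
    ranked = sorted(likelihoods, key=likelihoods.get, reverse=True)
    size = -(-len(ranked) // n_groups)  # ceiling division: users per group
    # The top block gets score n_groups - 1, the next block one less, etc.
    return {user: n_groups - 1 - idx // size
            for idx, user in enumerate(ranked)}
```

With twelve users, the two most likely receive score 5 and the two least likely receive score 0, matching the descending 5 to 0 scale on the slide.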

35 Many Challenges for Data Scientist
Help people discover products of distinctive characteristics

36 Personalization on Mobile Taobao
High-end and classy (高大上); Female (女性)

37 Personalized Recommendation

38 Beyond E-Commerce: HealthCare
Source: news.alibaba.com, March 18, 2014 + Intelligent Instrumented Interconnected

39 Pandemic Risk Map
Background: Pandemic disease in a highly populated city requires early detection, prompt action, and mass awareness
Problems & Challenges:
Reported cases are in isolation, no predictive view
Medical treatment cost, loss of productivity and workforce
Solution:
Aggregate silo data in optimal resolution
Predict risk in future
Extrapolation to city-wide area
Visualization on map, cross-platform
Business Case: Singapore project of X-Dengue

40 Mobile: Connected HealthCare
Disease management Medical Home Health Wellness Medication Adherence Mobile Care Management – e-prescribing, doctor/hospital directory Remote Video Coaching Real-time Biometric Display Wellness, Medical Devices Integration Data Capture Trend / Chart Storage, Store and Forward Questionnaire Alert ( , SMS) Alarm Video Chat DATA Analytics Server Interface Reminders

41 Prevention, Prediction, Participation, Personalization
Patient Info Patient matching 4P Applications Use Diagnosis Support assessment analytics Practice Management Resource Allocation (Patient-Physician Matching) EMR Patient characterization Predictive Analytics Care management Patient Segmentation (Utilization Patterns)

42 Big Data Has Its Limitations!
Leinweber, David J.: “Stupid data miner tricks: overfitting the S&P 500.” The Journal of Investing 16.1 (2007)
The S&P 500 has a 99% correlation with:
1. Bangladesh butter production
2. American cheese production
3. Total number of sheep in the USA & Bangladesh
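The warning on this slide can be reproduced with purely synthetic data. The sketch below is a hedged illustration, not Leinweber's analysis: it draws one short random walk as a stand-in "index", then searches many unrelated random walks for the one that correlates best with it. All names (`pearson`, `random_walk`, `best_spurious_match`) and parameter values are assumptions chosen for the demo.

```python
import random

def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def random_walk(rng, n):
    """Cumulative sum of n standard normal steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

def best_spurious_match(n_candidates=1000, n_points=12, seed=0):
    """Return (|r| of the first candidate, best |r| over all candidates).

    Every candidate is generated independently of the target, so any
    strong correlation found by the search is spurious by construction.
    """
    rng = random.Random(seed)
    target = random_walk(rng, n_points)
    correlations = [abs(pearson(target, random_walk(rng, n_points)))
                    for _ in range(n_candidates)]
    return correlations[0], max(correlations)
```

Comparing the two returned values shows the effect of data mining: a single arbitrary series may correlate weakly with the target, while the best of many candidates typically looks impressively strong even though no relationship exists.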

43 Gracias (Spanish), Grazie (Italian), Merci (French), Danke (German), Obrigado (Brazilian Portuguese)
Also shown in Traditional Chinese, Simplified Chinese, Thai, Russian, Arabic, and Japanese

