1
The role of the data scientist
Why a car parts manufacturer needs data experts
ICT Innovation
Andrea Condorelli, Data Science Manager
Statale di Milano, Italy, 5 June 2017
2
MAGNETI MARELLI: Quick overview of MM, with a special focus on its most innovative products
WHY BIG DATA: An overview of some hot topics, opportunities and ongoing projects in manufacturing regarding Big Data
FROM PRODUCT TO SERVICES PARADIGM: Some companies whose business model moved from selling pieces of hardware to selling services
WHO WORKS WITH DATA: Data Scientist vs Data Engineer, identikit of the perfect data expert
ENTERPRISE DATA SCIENTIST TOOLBOX: Technologies and frameworks, security and privacy in companies
OPPORTUNITIES IN MAGNETI MARELLI
Q&A
4
HR Leadership Development Team
HR ORGANIZATION
Giovanni Quaglia, CHRO
Stefano Facchetti, Head of Leadership Development and Process & Systems
Donatella Callerio, Staffing and Recruitment Manager / Leadership Development Finance & ICT
Marta Ragazzi, Staffing and Recruitment / Leadership Development ICT
5
Data Science team
Dario Castello, CIO
ICT GOVERNANCE
ICT INNOVATION: Luca Demarchi, Head of ICT Innovation
Andrea Condorelli, Data Science Manager
Data Science team: Valentina Arrigoni, Alberto Catena
6
MAGNETI MARELLI
7
Company Overview
Magneti Marelli is an international company committed to the design and production of hi-tech systems and components for the automotive sector.
AUTOMOTIVE LIGHTING (Headlamp, Rearlamp, Lighting and Body Electronics)
POWERTRAIN (Gasoline and Diesel engine control, Electric Motor, Inverter and Transmission)
ELECTRONICS (Instrument Clusters, Infotainment & Telematics)
SUSPENSION SYSTEMS AND SHOCK ABSORBERS (Suspension Systems, Shock Absorbers and Dynamic Systems)
EXHAUST SYSTEMS (Manifolds, Catalytic converter, Diesel Particulate Filter and Mufflers)
PLASTIC COMPONENTS AND MODULES (Bumper, Dashboard, Central Console, Pedals, Hand Brake Levers and Fuel System)
AFTERMARKET PARTS & SERVICES (Mechanical, Body Work, Electrics and Electronic and Consumables)
MOTORSPORT (Injection Systems, Electronic Control Units, Hybrid Systems, Telemetry Systems, Electric Actuators)
8
Magneti Marelli Worldwide Presence
Legend: PP = Production Plant, R&D = R&D Center, AC = Application Center
FRANCE: PP, R&D, AC
GERMANY: PP, R&D, AC
UK
POLAND: PP, AC
CZECH REP.: PP, AC
USA: PP, R&D, AC
SLOVAKIA: PP
MEXICO: PP, AC
RUSSIA: PP, AC
SERBIA: PP
BRAZIL: PP, R&D, AC
CHINA: PP, R&D, AC
ARGENTINA: PP
JAPAN: AC
MALAYSIA: PP, AC
ITALY: PP, R&D, AC
INDIA: PP, R&D, AC
KOREA
SPAIN: PP, AC
TURKEY: PP, AC
Key figures: 7.9 bn € Sales 2016; 42,830 Employees; 86 Production units; 12 R&D Centers; 30 Application Centers; 5.9% R&D (of sales); 5.8% Investments (of sales)
[Chart: Sales (€ bn), actuals 2009 to 2016]
9
Organization
COUNTRY/REGION REPRESENTATIVES: LATAM, NAFTA, INDIA, CHINA, JAPAN
GLOBAL KEY ACCOUNT
CENTRAL FUNCTIONS: INFORMATION & COMMUNICATION TECHNOLOGY, HUMAN RESOURCES, MANUFACTURING, TECHNOLOGY INNOVATION, QUALITY, MARKETING COMMUNICATION, PROJECT MANAGEMENT OFFICE, RISK GOVERNANCE, PURCHASING, GENERAL AFFAIRS, BUSINESS DEVELOPMENT, FINANCE
BUSINESS AREAS: MOTORSPORT, AUTOMOTIVE LIGHTING, SHOCK ABSORBERS, AFTER MARKET PARTS & SERVICES, ELECTRONICS, EXHAUST SYSTEMS, PLASTIC COMPONENTS & MODULES, POWERTRAIN, SUSPENSION SYSTEMS
10
WHY BIG DATA
11
How to turn data into money
Piece cost reduction: fewer scraps, lower stocks, higher productivity
Making new business: selling new services
12
Complexity behind a “simple” product
[Diagram of the material flow between: Raw materials, External Logistic, Factory (Material Warehouse, Internal Logistic, Pre Production Lines, WIP, WIP Warehouse, Assembly Lines, Finished Product Warehouse), Customer]
13
Industry 4.0
14
Industry 4.0 – Traceability and IOT
[Diagram of the factory flow: External Logistic, Material Warehouse, Pre Production Lines (Step 1, Step 2, Step 3), Internal Logistic, WIP Warehouse, Assembly Lines (Step 1, Step 2, Step 3, Step 4), Finished Product Warehouse]
Data fields shown along the flow: Raw material, Material ID, Lot ID, Supplier, Material info, Item ID, Machine 1 …, Arrive TS, Leave TS, Step 1 data, Step 1…
15
Industry 4.0 – Traceability and IOT
Enhancing recall campaigns
Deep understanding of each process
Computing the real cost of each piece
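To make the "real cost of each piece" point concrete, here is a minimal sketch of how per-item traceability events could be turned into a processing cost. The record layout (item ID, step, machine, arrive and leave timestamps) loosely follows the fields on the traceability slide; the hourly machine rates, the machine names and all values are invented for the example and are not from the presentation.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hourly running cost per machine (EUR); illustrative values only.
MACHINE_RATE_EUR_PER_HOUR = {"machine_1": 60.0, "machine_2": 90.0}

# One traceability event per (item, step): when the item arrived at and left a machine.
events = [
    {"item_id": "A1", "step": 1, "machine": "machine_1",
     "arrive_ts": "2017-06-05T08:00:00", "leave_ts": "2017-06-05T08:10:00"},
    {"item_id": "A1", "step": 2, "machine": "machine_2",
     "arrive_ts": "2017-06-05T08:15:00", "leave_ts": "2017-06-05T08:35:00"},
    {"item_id": "B2", "step": 1, "machine": "machine_1",
     "arrive_ts": "2017-06-05T08:10:00", "leave_ts": "2017-06-05T08:40:00"},
]


def processing_cost_per_item(events, rates):
    """Sum (time spent on a machine) x (machine hourly rate) for every item."""
    cost = defaultdict(float)
    for event in events:
        arrive = datetime.fromisoformat(event["arrive_ts"])
        leave = datetime.fromisoformat(event["leave_ts"])
        hours = (leave - arrive).total_seconds() / 3600.0
        cost[event["item_id"]] += hours * rates[event["machine"]]
    return dict(cost)


costs = processing_cost_per_item(events, MACHINE_RATE_EUR_PER_HOUR)
```

A real cost model would also include material, labour, logistics and waiting time; the sketch only shows that item-level timestamps make such a breakdown possible.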
16
Industry 4.0 – Predictive Quality
[Diagram: Material Warehouse -> Step 1 -> Step 2 -> Step 3 -> WIP Warehouse. Each step has its own Worker, Machine Parameters and Machine Sensors, and each step can send pieces to SCRAP.]
17
Industry 4.0 – Predictive Quality
[Diagram: the same line (Material Warehouse -> Step 1 -> Step 2 -> Step 3), each step with its Worker, Machine Parameters and Machine Sensors, and question marks on what leads a piece to SCRAP.]
18
Industry 4.0 – Predictive Quality
Classification/Prediction problem: “Given a context, predict the probability that a specific item will reach the following station or will be discarded”
A scrap can occur for several reasons:
Human error
A HW/SW machine failure
Material problem
Wrong process or issues in the line design
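The problem statement above can be sketched as a binary classifier that outputs a scrap probability for a given context. The example below is a minimal illustration, not the model used in the project: it trains a plain logistic regression with gradient descent on synthetic data, and the two features (a temperature deviation and the fraction of the shift already worked) are hypothetical stand-ins for a real context.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp never overflows.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def scrap_probability(w, b, context):
    """Predicted probability that an item with this context is scrapped."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, context)) + b)


# Synthetic data: scraps become more likely with a high temperature deviation
# and late in the shift. This generating rule is an assumption for the demo.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    temp_dev = rng.uniform(-1.0, 1.0)    # hypothetical, normalised
    shift_frac = rng.uniform(0.0, 1.0)   # hypothetical, 0 = shift start
    p_true = sigmoid(3.0 * temp_dev + 2.0 * shift_frac - 2.0)
    X.append([temp_dev, shift_frac])
    y.append(1 if rng.random() < p_true else 0)

w, b = train_logistic(X, y)
p_risky = scrap_probability(w, b, [0.9, 0.9])
p_safe = scrap_probability(w, b, [-0.9, 0.1])
```

Since the slides describe the real problem as non-linear and with memory, a linear model like this one is only a baseline to compare richer models against.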
19
Industry 4.0 – Predictive Quality
The context is pretty hard to describe (feature engineering):
Each piece worked before a scrap has a very similar context
Tasks are complex and different
People are involved, and their state is hard to quantify: fatigue, experience in a given task, mood, stress
Many machines have no sensors
The data change over time
The problem is non-linear and has “memory”
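One common way to give a classifier some "memory" is to add features computed over the recent history of a station. The sketch below is an assumption about how this could be done, not the feature set used in the project; the function name and feature names are hypothetical.

```python
from collections import deque


def add_memory_features(outcomes, window=3):
    """Build history-based features for each piece at one station.

    `outcomes` is a chronological list of 0 (good) / 1 (scrap).
    For every piece we emit:
      - recent_scrap_rate: share of scraps among the previous `window` pieces
      - pieces_since_scrap: good pieces produced since the last scrap
        (None if no scrap has been seen yet)
    Only past pieces are used, so the current label never leaks into its features.
    """
    recent = deque(maxlen=window)
    since_last_scrap = None
    features = []
    for outcome in outcomes:
        rate = sum(recent) / len(recent) if recent else 0.0
        features.append({"recent_scrap_rate": rate,
                         "pieces_since_scrap": since_last_scrap})
        # Update the history only after the features for this piece are emitted.
        recent.append(outcome)
        if outcome == 1:
            since_last_scrap = 0
        elif since_last_scrap is not None:
            since_last_scrap += 1
    return features


feats = add_memory_features([0, 1, 0, 0, 1], window=3)
```

The same pattern extends to other slowly changing signals, for example time since the start of the shift or since the last machine stop, when those are recorded.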
20
Industry 4.0 – Predictive Quality
Pipeline: Data Extraction -> Feature Engineering -> Classification -> Show Results
Precision: very high, zero
21
Industry 4.0 – Predictive Quality
Pipeline: Data Extraction -> Feature Engineering -> Classification -> Show Results, with the direct path from Data Extraction to Feature Engineering crossed out (X) and replaced by intermediate steps: Descriptive Statistics, Visual Exploration, Cleaning Data
Precision: medium, low
22
Industry 4.0 – Predictive Quality
Pipeline: Data Extraction -> Descriptive Statistics, Visual Exploration, Cleaning Data -> Feature Engineering -> Classification -> Show Results
Precision: >medium, >low
Lesson learned: some shifts must be filtered out; we must add additional pieces of information
23
Industry 4.0 – Predictive Quality
Pipeline: Data Extraction -> Hard Cleaning -> CLEAN DATA -> Feature Engineering -> Classification -> Show Results
Precision: high, high
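The iterations above are scored by precision. As a reminder of what that measures for a scrap classifier, here is a minimal sketch that computes precision together with recall (which is not named on the slides, but is the usual companion metric when scraps are rare). The labels and predictions are invented for the example.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (1 = scrap)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # By convention here, an undefined ratio (no predictions or no positives) is 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative labels: 1 = scrapped, 0 = good piece.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1]
precision, recall = precision_recall(y_true, y_pred)

# A model that never predicts a scrap scores 0.0 on both metrics,
# even though most of its individual predictions are "right".
never_precision, never_recall = precision_recall(y_true, [0] * len(y_true))
```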
24
Industry 4.0 – Predictive Quality
Reducing scraps by working on “critical” contexts
Simulating different contexts to “explore” new configurations (e.g., a multi-armed bandit over team configurations)
Reducing the cost of each scrap
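The bandit idea mentioned above can be sketched with an epsilon-greedy strategy: most of the time use the team configuration with the best observed yield, and occasionally try another one. The slides do not say which bandit algorithm was used, so this choice is an assumption; the three configurations and their scrap rates are simulated values, not project data.

```python
import random


def epsilon_greedy(true_scrap_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate choosing among configurations with unknown scrap rates.

    Returns how many times each configuration was selected.
    """
    rng = random.Random(seed)
    n_arms = len(true_scrap_rates)
    pulls = [0] * n_arms   # pieces produced with each configuration
    good = [0] * n_arms    # non-scrapped pieces per configuration
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in pulls:
            # Explore: pick a random configuration (also used until
            # every configuration has been tried at least once).
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the configuration with the best observed yield.
            arm = max(range(n_arms), key=lambda a: good[a] / pulls[a])
        reward = 0 if rng.random() < true_scrap_rates[arm] else 1
        pulls[arm] += 1
        good[arm] += reward
    return pulls


# Simulated scrap rates for three hypothetical team configurations.
simulated_scrap_rates = [0.30, 0.05, 0.20]
pulls = epsilon_greedy(simulated_scrap_rates)
most_used = max(range(len(pulls)), key=lambda a: pulls[a])
```

On a real line every "pull" costs real pieces, which is why the slide frames this as simulating contexts first.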
25
FROM PRODUCT TO SERVICES PARADIGM
26
Rolls-Royce
1904: Charles Stewart Rolls and Frederick Henry Royce begin the partnership that becomes Rolls-Royce.
1915: The Rolls-Royce Eagle is the first aero engine developed by Rolls-Royce Limited.
1987: In April 1987 the government offers all Rolls-Royce plc shares for sale.
1996: Birth of TotalCare® as an engine repair service for American Airlines.
2013: 47% of total aircraft engine revenue (£7.3bn) comes from services.
2016: 80% of Rolls-Royce engines are not sold but rented out on an hourly basis.
27
WHO WORKS WITH DATA
28
Data team members
Data Scientist
Data Engineer
Data Architect
29
Data Scientist
Definition: “Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.” Josh Wills, Slack Director of Data Engineering
Must know: Python, SQL, supervised/unsupervised models, linear algebra, statistics
Main everyday tasks:
Formalizing any given problem into specific research questions and looking for state-of-the-art solutions to them
Designing and developing proofs of concept and prototypes to show the real value behind data and algorithms
Translating proofs of concept into something business people can understand, and creating stunning presentations
30
Data Engineer
Definition: A Data Engineer is a Data Scientist who prefers talking about infrastructures and design patterns over Bayesian statistics and XGBoost classifiers.
Must know: Java/Scala, Python, SQL and NoSQL DBs, design patterns
Main everyday tasks:
Moving proofs of concept from the data scientist's playground to production
Designing, constructing, installing, testing and maintaining highly scalable data management systems
Employing a variety of languages and tools (e.g. scripting languages) to marry systems together
31
Data Architect
Definition: A Data Architect is a seasoned Data Scientist/Engineer with a lot of experience in enterprise infrastructures.
Must know: Hadoop, SQL and NoSQL DBs, design patterns, enterprise infrastructures, security, lambda/kappa architecture, Docker
Main everyday tasks:
Building data products from scratch
Designing and deploying complex data processing workflows
Deploying Big Data ready environments to support the work of Data Engineers and Data Scientists
32
The team
Ideal world:
1 Data Architect/Data team manager
2-3 Data Engineers to support development and go-lives
3-5 Data Scientists for fast prototyping and complex models/analysis
Real world:
1 Data Superman
38
Data Superman
Learn how to use the whole data toolbox: you must be able to face most IT challenges by yourself, from bash to Dockerfile.
Learn how to explain your work: a good story with average results is better than a boring story with good results (stunning results always win).
Learn how to write “good code”: work as if the person you will one day show your code to is a psychotic killer.
Learn how to understand business needs: well-posed questions are very rare.
Learn how to make models work: Data Superman needs to be comfortable with mathematics and statistics.
39
ENTERPRISE DATA SCIENTIST TOOLBOX
40
2-3 tools…
41
A short list basket
Programming Language
42
A short list basket – Programming Language
Languages considered, placed on a high-level to lower-level axis: Python, R, C++, Scala, Java
43
A short list basket – Programming Language
Python
Why Python:
Fast prototyping language
Easy to install and manage
Lambda functions
TensorFlow + scikit-learn
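As a small illustration of the "fast prototyping" and "lambda functions" points, the snippet below ranks and filters production lines in a few lines of code. The line names, scrap counts and threshold are made up for the example.

```python
# Hypothetical scrap counts per production line; the numbers are illustrative.
scraps_per_line = {"line_a": 12, "line_b": 3, "line_c": 7}

# Sort lines from worst to best using an inline key function.
worst_first = sorted(scraps_per_line.items(), key=lambda kv: kv[1], reverse=True)

# Keep only the lines above a threshold, then extract their names.
flagged = list(map(lambda kv: kv[0], filter(lambda kv: kv[1] > 5, worst_first)))
```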
44
A short list basket
Programming Language: Python
Big Data framework
45
A short list basket – Big Data Framework
Apache Flink, Apache Spark, MapReduce, Apache Storm
46
A short list basket – Big Data Framework
Apache Spark
Why Spark:
In-memory computation
Great community
Stable and easy to manage
MLlib is great for fast prototyping
Offered “as-a-service” by different players (Amazon EMR, Google Dataproc, Cloudera, Databricks, …)
47
A short list basket
Programming Language: Python
Big Data framework: Apache Spark
DB
48
A short list basket – DB
MySQL, MariaDB, Elasticsearch + Kibana, Cassandra, MongoDB
49
A short list basket – DB
Why Elasticsearch + Kibana:
Big Data ready
Fast full-text search (based on Lucene)
Ultra-fast dashboarding (Kibana)
Why MongoDB:
Big Data ready
Document-based queries
BSON + arrays
50
A short list basket
Programming Language: Python
Big Data framework: Apache Spark
DB: Elasticsearch + Kibana or MongoDB
Other
51
A short list basket – Other
Plotly, scikit-learn, TensorFlow, Jupyter
52
A short list basket
Programming Language: Python
Big Data framework: Apache Spark
DB: Elasticsearch + Kibana or MongoDB
Other: Python libraries (Plotly, scikit-learn, TensorFlow); “IDE”: Jupyter
53
A data product
[Architecture diagram with three layers]
Backend: Spark (ML Algorithm, ETL), MongoDB
Frontend: Python (Plotly, Django)
GUI: HTML+JS (HTML5, Bootstrap)
54
A short list basket
Programming Language: Python
Big Data framework: Apache Spark
DB: Elasticsearch + Kibana or MongoDB
Other: Python libraries (Plotly, scikit-learn, TensorFlow); “IDE”: Jupyter
Container engine: Docker
55
OPPORTUNITIES IN MAGNETI MARELLI
56
Opportunities in MAGNETI MARELLI
Marta Ragazzi: Andrea Condorelli: Big Data Workshop October 31st, 2015