Analytics as a First-Class Concern


1 Analytics as a First-Class Concern
June 3, 2016. Calum Murray, Small Business Data Chief Architect, Intuit

2

3 Who we serve
Small Businesses, Accounting Professionals, Consumers

4 Our mission: To improve our customers’ financial lives so profoundly… they can’t imagine going back to the old way

5 Transformation to a cloud ecosystem
As Intuit evolved QuickBooks, QuickBooks Payroll, QuickBooks Payments, and other product offerings into a SaaS business and an open cloud platform, business analytics could no longer be treated as an afterthought; it had to be part of the platform architecture as a first-class concern.
[Diagram labels: Desktop Business, SaaS Business, Portfolio of Products, Ecosystem]

6 Intuit analytics problem space
Solve for data lifecycle: all stages are needed
Solve for internal users: data runs the business
Solve for external users: data enables customer delight

7 Internal stakeholders
Marketing: level of campaign success
Product: first-time use, driving attach
Care: 360 view of customer, understanding product usage
Sales: success against sales targets
Finance: financial reports, how the business is doing

8 Top platform analytics data concerns
Key data sources: Clickstream; Transactional user-entered data; Back office data and insights
Key cross-cutting concerns: Traceability (customer ID, transaction ID); REACTive platform architecture; Analytics infrastructure; Model congruity; Sources of truth
[Architecture diagram. Labels: Applications, Micro services (Write, Read), OLTP product DBs, PUB, REACT, KAFKA, SPARK Streaming, Analyst tools, Back office systems, Enterprise (Marketing, Care), Ingest, Ingest (batch), EL, Consume, Data lake (Hadoop/Hive), Data warehouse (Vertica)]

9 Key data sources
Categories: Clickstream, Transactional, Enterprise Insights
Items: Entry points, Product usage, Product data, Billing, Customer contacts, Campaign metrics, Life-time value, Propensity scores

10 Key cross-cutting concerns
Analytics Infrastructure: Designed as part of SaaS platform; Batch and near-realtime
Congruent models: Single sources of truth
Reactive pattern: Clickstream; Transactional
Traceability: One ID(s) to bind them; Customer ID; Transaction ID
Consume and feed back: Enterprise Insights
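The traceability concern on this slide, shared customer and transaction IDs that bind clickstream and transactional data, can be illustrated with a minimal sketch. The `Event` class, its field names, the source labels, and the `trace` function are hypothetical and chosen only for illustration; nothing here is taken from Intuit's actual platform.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # All field names are assumptions made for this sketch.
    source: str          # e.g. "clickstream" or "transactional"
    customer_id: str
    transaction_id: str
    name: str


def trace(events):
    """Group events from every source by (customer_id, transaction_id)."""
    grouped = defaultdict(list)
    for event in events:
        grouped[(event.customer_id, event.transaction_id)].append(event)
    return dict(grouped)


events = [
    Event("clickstream", "cust-1", "txn-9", "clicked_create_invoice"),
    Event("transactional", "cust-1", "txn-9", "invoice_saved"),
    Event("clickstream", "cust-2", "txn-3", "opened_payroll"),
]
traced = trace(events)
```

Because both sources carry the same two IDs, a click and the record it produced land in the same group without any after-the-fact matching logic.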

11 Where we started (in the cloud)
Monolithic, siloed applications, inconsistent clickstream collection
Monolithic data stores with disparate models, multiple sources of truth
Siloed enterprise data
Fragmented IDs, no traceability across applications
Batch transactional data ingestion. No real time.
Enterprise data/insights not going into lake. Enterprise systems pulling data into their own data warehouses.
[Architecture diagram. Labels: Applications, Analyst tools, Enterprise (Marketing, Care), Ingest, Consume, EL, Data lake (Hadoop/Hive), Data warehouse (Netezza)]

12 Not big but complex given the many sources and shapes
Size of data
OLTP: Customer data ~70 TB across 10+ schemas
Data warehouses: Analytics ~100 TB; Risk ~31 TB
Click stream: ~50 TB

13 The journey we are on ...
1. Decomposition and re-decomposition of platform
   Break up monoliths and reassemble as decomposed services
   Define single sources of truth
   Data encapsulation and model alignment: data storage and APIs
2. Asynchronous near real-time architecture
   Move platform to REACT pattern
   Make analytics part of the platform
3. One data lake and analytics system
   Kill the clones and centralize
4. Back office integration: virtuous cycle
   Kill the clones and centralize
[Architecture diagram. Labels: Micro services (Write, Read), Single sources of truth, PUB, REACT, KAFKA, SPARK Streaming, Analyst tools, Back office systems, Enterprise (Marketing, Care), Ingest, Ingest (batch), EL, Consume, Data lake (Hadoop/Hive), Data warehouse (Vertica)]
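The publish-and-react flow in this slide (a service publishes a change, and several consumers, including analytics ingestion, react to it) can be sketched roughly as follows. `InMemoryBus` is a toy stand-in written for this example, not a Kafka client, and the topic name, handlers, and message fields are all hypothetical.

```python
from collections import defaultdict


class InMemoryBus:
    """Toy stand-in for a message broker; delivers synchronously, in process."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # A real broker would deliver asynchronously; here we just call handlers.
        for handler in self._subscribers[topic]:
            handler(message)


bus = InMemoryBus()
search_index = {}      # hypothetical read model kept by another service
analytics_rows = []    # hypothetical analytics ingestion buffer


def update_search_index(msg):
    search_index[msg["invoice_id"]] = msg["amount"]


def ingest_for_analytics(msg):
    analytics_rows.append(("invoice_created", msg["customer_id"], msg["amount"]))


bus.subscribe("invoice.created", update_search_index)
bus.subscribe("invoice.created", ingest_for_analytics)

# The owning service writes once and publishes; every subscriber reacts.
bus.publish(
    "invoice.created",
    {"invoice_id": "inv-1", "customer_id": "cust-1", "amount": 120.0},
)
```

The point of the sketch is that analytics is just another subscriber to the same event, so it sees the same data as the rest of the platform rather than a later batch copy.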

14 The journey we are on – people
Insufficient investment in people for data
   Concentration on application/services engineers
   Congruent horizontal data not viewed as necessity
   Analytics was an afterthought
Investment after the fact was even bigger
   Cleaning up the mess
   All layers are impacted to get to good state
Invest in data early or pay the price later

15 Key takeaways so far
Analytics needs to be part of your platform, not an adjunct
Data models in application have big impact on ability to get insights
Lack of traceability in application will torpedo you; hard to add after the fact
Analytics pipeline needs to be treated as first-class, deployable software
   You need engineers as well as data scientists
   You need CI/CD, unit testing, the right environments
REACTive platform architecture makes it easier
   To decompose your models
   To do near real-time analytics
Tooling is very important
   Dashboards
   Automated reporting
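As one illustration of treating the analytics pipeline as tested, deployable software, a pipeline step can be written as a plain function with unit tests that run in CI. The function below computes a "first time use" value per customer; its name, the input shape, and the integer timestamps are assumptions made for this sketch, not details from the talk.

```python
import unittest


def first_time_use_by_customer(events):
    """Return the earliest timestamp per customer from (customer_id, timestamp) pairs."""
    first_seen = {}
    for customer_id, timestamp in events:
        if customer_id not in first_seen or timestamp < first_seen[customer_id]:
            first_seen[customer_id] = timestamp
    return first_seen


class FirstTimeUseTest(unittest.TestCase):
    def test_keeps_earliest_timestamp(self):
        events = [("cust-1", 30), ("cust-2", 10), ("cust-1", 20)]
        self.assertEqual(
            first_time_use_by_customer(events),
            {"cust-1": 20, "cust-2": 10},
        )

    def test_empty_input(self):
        self.assertEqual(first_time_use_by_customer([]), {})
```

Saved as a module, the tests can be run with `python -m unittest`, which is the kind of check a CI/CD stage would execute before the pipeline step is deployed.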

16 Q&A

