Published by Jonathan Delisle. Modified over 6 years ago.
2
Real-time Analytics with TIBCO ComputeDB and Spotfire®
Jags Ramnarayan, Ramraj Raghuvanshi
4
TIBCO® Connected Intelligence helps businesses connect to their data wherever it is, augment their ability to gain insight from it, and take faster action.
5
Objectives
Key Drivers in Big Data
Introducing ComputeDB
Enhancing TIBCO Analytics with Data Management
Product Architecture
Sample Use Cases
Performance
Next Steps
6
Big Data Drivers Today
Data volumes: Data volumes are growing and processing times are increasing. Streaming has become a first-class analytics workload.
ML/AI key to analytics: ML workloads have become mainstream. Analytics on live data is a mainstream pursuit.
Unified analytics adoption growing: Use of a unified API and infrastructure for batch, stream, and interactive analytics is growing rapidly. Platforms must support business users and citizen data scientists.
Cloud readiness: Cloud solutions, often stitched together from multiple products, are increasing in complexity and cost. Expectations are growing with respect to elasticity, security, multi-tenancy, and robustness.
7
Mixed Workloads The Norm
Stream processing
Transactions (point lookups, small updates)
Interactive analytics
Analytics on mutating data
Correlating and joining streams with large histories
Maintaining state or counters while ingesting streams
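A minimal sketch of the last two items, in plain Python rather than any ComputeDB or Spark API: micro-batches of events are ingested, a per-key counter is maintained as state, and each event is joined against a historical baseline. The names `HISTORY`, `ingest`, and `threshold`, and the sample readings, are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical historical table: sensor id -> long-run average reading.
HISTORY = {"s1": 20.0, "s2": 35.0}


def ingest(batches, history, threshold=1.5):
    """Consume micro-batches, keep per-sensor counters, and flag readings
    that exceed `threshold` times the sensor's historical average."""
    counts = Counter()
    alerts = []
    for batch in batches:
        for sensor, value in batch:
            counts[sensor] += 1             # state maintained during ingestion
            baseline = history.get(sensor)  # join the event with its history
            if baseline is not None and value > threshold * baseline:
                alerts.append((sensor, value))
    return counts, alerts


# Two illustrative micro-batches of (sensor, reading) events.
batches = [
    [("s1", 21.0), ("s2", 34.0)],
    [("s1", 45.0), ("s3", 10.0)],
]
counts, alerts = ingest(batches, HISTORY)
```

In a real deployment the history would be a large table and the stream would be unbounded; the point here is only that ingestion, state updates, and the history lookup happen in the same pass over the data.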
8
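ComputeDB: A Memory-Optimized Big Data Platform
[Architecture diagram: a Spark cluster in which each long-running Spark execution (worker) JVM hosts the framework for streaming, SQL, ML… alongside a mutable, transactional in-memory store with shared-nothing persistence. A Spark driver coordinates the cluster. Clients connect through JDBC, ODBC, or Spark jobs. History resides in HDFS, SQL, and NoSQL stores.]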
9
Working With ComputeDB
Standards-based access: Use SQL (both ANSI SQL 92 and Spark SQL) over ODBC or JDBC to ingest data, and visualize it with TIBCO Spotfire at extremely high levels of performance.
Interact with ComputeDB using the Spark API: Every table is exposed as a DataFrame. Run streaming jobs, ML applications, or batch analytics on the same data. The data store is also a Spark distribution. Supports Java, Python, Scala, and R APIs.
Run on-premises or in the cloud: Supports Kubernetes-based container orchestration of cluster elements. Supports high concurrency, security, multi-tenancy, and real-time HA through replication.
10
TIBCO Analytics With ComputeDB: Better Together
Spotfire: Blazing-fast visualizations with in-cluster transformations. Deep integration with the ComputeDB AQP engine planned for 2019. Access external data sources through ComputeDB using external tables.
Data Science: ComputeDB as the cloud-ready, scalable compute and data backbone for TIBCO Data Science (future state).
Spotfire Streams: Support for ComputeDB as an ingest platform for Spotfire Streams (future state). Native support for Apache Kafka in ComputeDB (today).
11
ComputeDB: A Spark-Based Big Data Platform
[Architecture diagram, top to bottom.
Access: Spark jobs; Scala/Java/Python/R API; JDBC/ODBC; object API (RDD, DataSets).
Capabilities: Spark API (Streaming, ML, Graph); transactions and indexing; full SQL; HA.
In-memory: DataFrame, RDD, DataSets; rows and columnar storage; Spark cache; synopses (samples).
Unified data access (virtual tables) and unified catalog over the ComputeDB native store.
External sources: HDFS/HBase; S3; JSON, CSV, XML; SQL databases; Cassandra; MPP databases; GemFire; stream sources.]
12
ComputeDB Under The Hood
[Architecture diagram. Components shown: cluster manager and scheduler; SnappyData server (Spark executor + store); parser; query optimizer; OLAP, TXN, and Synopsis Data Engine; distributed membership service (HA, add/remove server); stream processing; DataFrame and RDD interfaces; hybrid store (probabilistic, rows, columns, index); tables; ODBC/JDBC access. Other labels: "Low Latency", "High".]
13
Tax Reconciliation At J&J
This capability is worth 60m in savings to J&J and is in production.
[Pipeline diagram: CDC streams and NoSQL sources (via NoSQL connectors) → raw data ingestion and prep (Spark transform, window; rich Spark APIs) → in-memory row-column tables in ComputeDB → SQL → Spotfire dashboards.]
14
ComputeDB Typical Usage Patterns
Native Data Store for Spark Applications: Up to 20× faster than Spark. Supports virtual external tables. Bundled Spark connectors.
Interactive Analytics for Visualization-Driven Analytics: High-performance query engine with support for slice-and-dice analytics.
Stream Analytics: Combine real-time stream analytics with in-cluster analytics on big data.
Model Scoring at Scale: Deploy models and score incoming events against them.
15
Use Case Examples
IoT analytics for a smart city: Ingest large data volumes, analyze micro-batches, and join with historical data to spot patterns.
RFQ analytics for a large investment bank: Complex analytics requiring multiple joins on large data volumes, along with low-latency lookups on specific instruments.
Real-time marketing campaigns for a telco: Target, in real time, subscribers who have opted in to receive notifications for marketing promotions, based on their location and other factors.
16
Enterprise Ready At Scale
Enterprise-Grade Security: Support for LDAP-based authentication and authorization, row-level security, and TLS-based data encryption across the cluster.
High Concurrency, Performance & Scale: Intelligent workload management and support for resource pools. Standard CRUD operations bypass all scheduling mechanisms. High availability is built into the product.
Management & Monitoring: Web-based console that shows all cluster activity, membership, memory and CPU usage, and query-level performance in the cluster.
17
ComputeDB: Performance Benchmark
1.5-2× faster ingestion, faster transactions
7-142× faster analytics (at 300M records)
18
ComputeDB: Blazing Speeds For Approx. Queries
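The deck lists "Synopses (Samples)" among the in-memory structures, which suggests that approximate queries are answered from samples rather than from the full data. The sketch below illustrates that general idea only, using a uniform random sample in plain Python; it is not ComputeDB's engine or API, and `approx_sum`, `fraction`, and the synthetic data are hypothetical.

```python
import random


def approx_sum(values, fraction, seed=0):
    """Estimate sum(values) from a uniform random sample of the given fraction."""
    rng = random.Random(seed)
    k = max(1, int(len(values) * fraction))
    sample = rng.sample(values, k)
    # Scale the sample total up by the inverse of the sampling rate.
    return sum(sample) * len(values) / k


# Synthetic column of 100,000 values cycling through 0..99.
values = [float(i % 100) for i in range(100_000)]
exact = sum(values)
estimate = approx_sum(values, fraction=0.01)
relative_error = abs(estimate - exact) / exact
```

The estimate touches 1% of the rows, which is where the speedup of a sample-based answer comes from; the price is a sampling error that shrinks as the sample grows.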
19
ComputeDB: TPC-H Benchmark
Avg latency (100 GB)
SnappyData   5.7s
MemSQL       12.0s
Spark        66.9s
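The relative speedups implied by these figures can be worked out directly. The sketch assumes the three latencies map to the systems in the order listed (SnappyData 5.7s, MemSQL 12.0s, Spark 66.9s, all at 100 GB); that mapping is read off the flattened slide and is an assumption.

```python
# Average latencies in seconds, under the assumed system-to-value mapping.
latency_s = {"SnappyData": 5.7, "MemSQL": 12.0, "Spark": 66.9}

# Speedup of SnappyData relative to each system: other latency / SnappyData latency.
baseline = latency_s["SnappyData"]
speedup = {name: round(t / baseline, 1) for name, t in latency_s.items()}
```

Under that assumption, SnappyData's average latency is about 2.1 times lower than MemSQL's and about 11.7 times lower than Spark's.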
20
ComputeDB: Next Steps
Download the developer edition and get familiar with TIBCO ComputeDB.
Reach out to our global architects for consultation on your analytics use cases.
Learn about ComputeDB best practices with our detailed how-to guides.
21
Lab Overview
Launching a TIBCO ComputeDB cluster on pre-configured AWS instances.
Getting an overview of the cluster using its web-based monitoring UI.
Connecting to the cluster via snappy shell and loading the test data into the cluster.
Visually interacting with the data from TIBCO Spotfire Analyst.