Big Data Technologies for InfoSec Dive Deeper. See Further. Ram Sripracha UCLA / Sift Security.

1 Big Data Technologies for InfoSec Dive Deeper. See Further. Ram Sripracha (rsriprac@ucla.edu) UCLA / Sift Security

2 Experiences RR Systems

3 What are “Big Data” systems?
- XXL in Size
- Data Volume: TBs - PBs
- Computation Scalability
- Horizontally Scalable
- Multi-host Deployment
- Commodity Hardware

4 Why now?
- Rich Ecosystem
- Well Supported
- Open Source Software
- High Adoption Rate
- Commercial Backing
- “Red Hat” Model
- Heavily Invested

5 Platform Providers

6 Technologies

7 Is it a “Big Data” problem?
- Many moving parts
- May be overwhelming initially
- 100s of configuration settings
- Requires some level of expertise
- Overkill for some problems
- Larger resource footprint

8 Big Data Stack

9

10 DFS

11 NoSQL
- Columnar
- Sits on HDFS
- Million Rows x Million Columns
- Cell-level Security

12 Titan
- Graph-based Datastore
- Optimized for (E, V)
- Key/Value attributes for vertices and edges
- 100s of millions of vertices x 100s of billions of edges
- Capturing relationships
- Sits on top of HBase, Cassandra, …
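
The property-graph model on this slide (key/value attributes on both vertices and edges) can be illustrated with a small in-memory sketch. This is plain Python, not Titan's API; the class name, method names, and sample data are all hypothetical.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy in-memory property graph. NOT Titan's API; names are hypothetical."""

    def __init__(self):
        self.vertices = {}                   # vertex id -> dict of properties
        self.out_edges = defaultdict(list)   # vertex id -> [(label, dst id, properties)]

    def add_vertex(self, vid, **props):
        self.vertices[vid] = props

    def add_edge(self, src, label, dst, **props):
        self.out_edges[src].append((label, dst, props))

    def neighbors(self, vid, label):
        """Follow outgoing edges that carry the given label."""
        return [dst for (lbl, dst, _) in self.out_edges[vid] if lbl == label]


# Made-up sample data: a user vertex, a host vertex, and a login edge.
g = PropertyGraph()
g.add_vertex("alice", kind="user")
g.add_vertex("10.0.0.5", kind="host")
g.add_edge("alice", "logged_in_to", "10.0.0.5", ts="2015-01-01T08:00:00Z")
```

Storing relationships as first-class edges is what makes questions such as "which hosts did this user log in to?" a traversal rather than a join.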

13 Map-Reduce
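
As a rough illustration of the Map-Reduce pattern named on this slide, the sketch below runs the three phases (map, shuffle, reduce) in a single Python process. It is not Hadoop code; the flow-record layout and function names are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(records):
    """Emit (key, value) pairs: one (src_ip, bytes) pair per flow record."""
    for src_ip, _dst_ip, nbytes in records:
        yield src_ip, nbytes


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine each key's values; here, total bytes per source IP."""
    return {key: sum(values) for key, values in groups.items()}


# Made-up flow records: (source IP, destination IP, bytes).
flows = [
    ("10.0.0.1", "192.0.2.10", 100),
    ("10.0.0.2", "192.0.2.10", 40),
    ("10.0.0.1", "192.0.2.20", 60),
]
totals = reduce_phase(shuffle(map_phase(flows)))
```

In a distributed deployment the map and reduce functions run on many hosts and the shuffle moves data between them; the logic per record stays the same.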

14 Resilient Distributed Dataset (RDD)
- In-Memory RDD
- Iterative Algorithms
- Machine Learning

15

16 Impala
- Near-real-time analysis
- Micro-batch processing
- Pipelining of micro-batches
- Stream annotations
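
Micro-batch processing and pipelining can be sketched with plain Python generators. This is an illustration only, not any streaming framework's API. It assumes batches are cut by event count, whereas real systems typically cut them by time interval, and the annotate step is a hypothetical stand-in for stream annotation.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Group a (possibly unbounded) iterator of events into small lists."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def annotate(batch):
    """Hypothetical annotation step: tag each event with its batch size."""
    return [{"event": e, "batch_size": len(batch)} for e in batch]


# Generators are lazy, so each batch is annotated as soon as it is cut.
events = range(7)
results = [annotate(b) for b in micro_batches(events, batch_size=3)]
```

Smaller batches lower latency toward "near-real-time" at the cost of more per-batch overhead.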

17 Sits on top of
- Distributed indexing and search
- Indexes:
  - Raw text files from HDFS
  - HBase content
  - Titan properties
  - Other replicated data streams

18 Application Log Search
- Full Text Indexes
- Flexible Faceting
- Automatic field extraction
- Dashboard-able search interface
- Low-cost alternative to Splunk and other search solutions
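
Full-text indexing and faceting can be illustrated with a toy inverted index. This sketch is plain Python, not the API of any search engine; the log records and field names are made up for the example.

```python
from collections import Counter, defaultdict

# Made-up log records with already-extracted fields.
logs = [
    {"host": "web1", "level": "ERROR", "msg": "login failed for admin"},
    {"host": "web2", "level": "INFO", "msg": "login ok for alice"},
    {"host": "web1", "level": "ERROR", "msg": "disk quota exceeded"},
]

# Inverted index: token -> set of ids of the records whose message contains it.
index = defaultdict(set)
for doc_id, doc in enumerate(logs):
    for token in doc["msg"].lower().split():
        index[token].add(doc_id)


def search(term):
    """Return ids of records whose message contains the term."""
    return sorted(index.get(term.lower(), set()))


def facet(doc_ids, field):
    """Count the values of one field across the matching records."""
    return Counter(logs[i][field] for i in doc_ids)


hits = search("login")
by_level = facet(hits, "level")
```

A facet is just a count of field values over the current result set, which is why it can drive dashboard-style drill-downs.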

19 Real-time Blacklist Alerting
- Fault tolerance
- Netflow annotation
- Match alerting
- Application access alerting
- Authentication alerting
- Network metrics
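
Netflow annotation and match alerting can be sketched as follows. This is a minimal single-process illustration, assuming flows arrive as dictionaries with a dst_ip field; the blacklist entries (taken from documentation-only address ranges) and function names are hypothetical.

```python
import ipaddress

# Hypothetical blacklist; a real deployment would load this from a feed.
BLACKLIST = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def annotate_flow(flow):
    """Return a copy of the flow with a 'blacklisted' flag based on dst_ip."""
    dst = ipaddress.ip_address(flow["dst_ip"])
    hit = any(dst in net for net in BLACKLIST)
    return {**flow, "blacklisted": hit}


def alerts(flows):
    """Yield annotated flows whose destination matched the blacklist."""
    for flow in flows:
        annotated = annotate_flow(flow)
        if annotated["blacklisted"]:
            yield annotated


# Made-up flows for illustration.
flows = [
    {"src_ip": "10.0.0.1", "dst_ip": "203.0.113.9"},
    {"src_ip": "10.0.0.2", "dst_ip": "192.0.2.1"},
]
matched = list(alerts(flows))
```

Because each flow is annotated independently, the same function could be applied per record inside a streaming job.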

20 Netflow Data Warehouse
- 3x Nodes
- 2x 8-Core Intel E5-2450 per node
- 16 GB RAM per node
- 72 TB Storage Total
- ~5B Netflow records/day
- >1 year retention
- Supports complex SQL-like queries
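
The figures on this slide imply a tight per-record storage budget, which the back-of-the-envelope calculation below makes explicit. The assumptions are mine, not the slide's: decimal terabytes, exactly 365 days of retention (the slide's lower bound), and an optional replication factor of 3, which is the HDFS default but is not stated in the deck.

```python
TB = 10**12                      # assumption: decimal terabytes
total_storage_bytes = 72 * TB    # "72TB Storage Total"
records_per_day = 5 * 10**9      # "~5B Netflow records/day"
retention_days = 365             # ">1 year retention"; one year is the lower bound

records_retained = records_per_day * retention_days
bytes_per_record = total_storage_bytes / records_retained

# If every block is stored three times (assumed replication factor),
# the usable budget per record shrinks accordingly.
assumed_replication = 3
bytes_per_record_replicated = bytes_per_record / assumed_replication
```

Under these assumptions the budget is about 39 bytes per record, or about 13 bytes with threefold replication, which suggests compact encoding and compression were needed to reach the stated retention.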

21 Netflow Data Warehouse
- Continuous scanning
- Direct querying of delimited files
- Perform metrics and diffs
- Compute trending
- Firewall rule validations
- Long retention DFS

22 EMR Access Anomalies
- Category of insider threat
- Relational networks of Users/Groups
- Department
- Document Access
- Community structure-based anomaly detection
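
One simple way to picture community-based anomaly detection is sketched below. It is not the presenter's method: it uses each user's department as a stand-in for a detected community, and flags accesses to a record by users outside the department that dominates that record's accesses. The data, the threshold, and the function name are all hypothetical.

```python
from collections import Counter, defaultdict

# Made-up access log: (user, department, record id).
accesses = [
    ("u1", "cardiology", "r1"),
    ("u2", "cardiology", "r1"),
    ("u3", "cardiology", "r1"),
    ("u4", "billing", "r1"),
    ("u4", "billing", "r2"),
    ("u5", "billing", "r2"),
]


def flag_outliers(accesses, min_share=0.7):
    """Flag accesses by users outside a record's dominant department."""
    by_record = defaultdict(list)
    for user, dept, record in accesses:
        by_record[record].append((user, dept))

    flagged = []
    for record, rows in by_record.items():
        dept, count = Counter(d for _, d in rows).most_common(1)[0]
        # Only judge records that clearly "belong" to one department.
        if count / len(rows) >= min_share:
            flagged += [(u, record) for u, d in rows if d != dept]
    return flagged


anomalies = flag_outliers(accesses)
```

A graph-based approach would derive the communities from the access network itself instead of taking departments as given; the flagging step afterwards is similar.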

