AllegroGraph as a Graph Database
Jans Aasman, Ph.D.
CEO - Franz Inc
Contents
AllegroGraph as a
–QuintupleStore (well, OctupleStore in 2011)
–RDF store
–Graph Database
Agraph architecture
Extreme use cases
–AMDOCS … CRM on top of a trillion triples
–Pharmaceutical … explore connections in graph space
–Demo
Agraph as a quintuple store
S, P, O, G + unique ID + transaction #
S, P, O and G can be any data type
Jans loves pizza file112
NoOne believes 12
And include very efficient geospatial and temporal representations and indices
6 default indices, 24 user-controlled indices
Range indexing, free-text indexing
Neighborhood matrices & UPI maps (for 1 ms access)
2011: time, security
Agraph as an RDF store
RDF store when you adhere to the RDF conventions.
Full SPARQL 1.0, most of SPARQL 1.1
RDFS++ reasoner
Geospatial and temporal representations.
Prolog for rules
Soon: Common Logic (CLIF+)
–As a usability layer on top of Prolog
–Easier to combine rules and queries
Agraph as a Graph Database
If you want a Property Graph:
–use the graph argument
Jans loves pizza gr1
gr1 weight 90
gr1 author Sophia
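The property-graph trick on this slide can be sketched in a few lines of Python. This is only an in-memory illustration, not the AllegroGraph API: the `Quad` tuple, the `quads` list and the `edge_attributes` helper are hypothetical names introduced here. The fourth field (the graph) names the edge, and further statements about that name carry the edge attributes.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for a quad store (not the AllegroGraph API).
Quad = namedtuple("Quad", "s p o g")

quads = [
    Quad("Jans", "loves", "pizza", "gr1"),  # the edge, identified by graph gr1
    Quad("gr1", "weight", 90, None),        # attribute attached to the edge
    Quad("gr1", "author", "Sophia", None),  # another attribute on the edge
]

def edge_attributes(quads, s, p, o):
    """Collect the attributes stated about the graph name of edge (s, p, o)."""
    attrs = {}
    for edge in quads:
        if (edge.s, edge.p, edge.o) == (s, p, o) and edge.g is not None:
            for q in quads:
                if q.s == edge.g:
                    attrs[q.p] = q.o
    return attrs

print(edge_attributes(quads, "Jans", "loves", "pizza"))
# {'weight': 90, 'author': 'Sophia'}
```

The sketch scans the whole list for every lookup; a real store would answer both steps from its indices.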
Schema
Node typing, Edge typing, Attributes (nodes): Yes
Attributes (edges): Yes: A trusts B gr1, gr1 certainty 80.
Directed edges: Yes: A trusts B
Undirected edges: Yes: if using RDFS symmetric property or generators
Restricted edges: Yes, if it means there can be islands.
Loop edges: Yes, A loves A
Attribute indexing: Yes
Starting node: No, although, is that a DB property?
Schema: Yes and No: on demand you can use an ontology, and validation is straightforward
Database
Transactional, ACID, Fully Indexed: Yes
Distributed: Federation (in-machine, between machines), AG5
Cache: Yes, adjacency vectors (neighbourhood matrices)
Embeddable: Yes: 3.3, No: 4.2.x
Store-engine: Custom
Migration framework: From RDB to Graph DB? Various
Object mapping: Only in Lisp, not in clients.
Languages
Java, Python, Ruby, C#, Scala, Clojure, Perl, PHP
Many graph algorithms using generator model
Because of Social Network Analysis requirements we implement many graph algorithms.
–Using generators
–A first-class function that takes one node as input and returns all children
And neighbourhood matrices (or adjacency hash-tables) for speed.
How far is Actor1 from Actor2?
Degrees of separation
–How far is P1 from P2
Connection strength
–How many shortest paths from P1 to P2 through a series of predicates and rules
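Both questions on this slide can be answered by a single breadth-first search. The following is a minimal sketch, assuming a small undirected graph held in a plain Python dict; the `GRAPH` data and the `separation_and_strength` function are hypothetical illustrations, not AllegroGraph functions.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency dict.
GRAPH = {
    "P1": ["A", "B"],
    "A": ["P1", "P2"],
    "B": ["P1", "P2"],
    "P2": ["A", "B", "C"],
    "C": ["P2"],
    "Z": [],  # isolated node, unreachable from everyone
}

def separation_and_strength(graph, start, goal):
    """Return (degrees of separation, number of shortest paths).

    Returns (None, 0) when goal cannot be reached from start.
    """
    dist = {start: 0}    # shortest distance from start
    count = {start: 1}   # number of shortest paths from start
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                # First visit: one more hop than the node we came from.
                dist[nxt] = dist[node] + 1
                count[nxt] = count[node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another equally short route into nxt.
                count[nxt] += count[node]
    return dist.get(goal), count.get(goal, 0)

print(separation_and_strength(GRAPH, "P1", "P2"))  # (2, 2)
print(separation_and_strength(GRAPH, "P1", "Z"))   # (None, 0)
```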
In what groups is this actor?
Find the ego-network around a person or thing
–Friends, friends of friends, etc.
Find all the fully connected graphs around a person or thing
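An ego-network to a given depth is a bounded expansion from one node. The sketch below is illustrative only: `FRIENDS`, `friends_of` and `ego_group` are hypothetical names, and including the actor in its own group is an assumption made here. It does not cover the clique search.

```python
# Hypothetical friendship data as an undirected adjacency dict.
FRIENDS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob", "eve"],
    "eve": ["dan"],
}

def friends_of(node):
    """Generator in the slide's sense: one node in, list of nodes out."""
    return FRIENDS.get(node, [])

def ego_group(actor, generator, depth):
    """Nodes reachable from actor in at most `depth` steps (actor included)."""
    group = {actor}
    frontier = {actor}
    for _ in range(depth):
        # Expand one level and keep only nodes not seen before.
        frontier = {n for node in frontier for n in generator(node)} - group
        if not frontier:
            break
        group |= frontier
    return group

print(sorted(ego_group("ann", friends_of, 1)))  # ['ann', 'bob', 'cat']
print(sorted(ego_group("ann", friends_of, 2)))  # ['ann', 'bob', 'cat', 'dan']
```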
Questions in SNA: How important is an actor?
In-degree, out-degree
Actor degree centrality
–I have the most connections in a group so I am more important
Actor closeness centrality
–I have shorter paths to everyone else in the group so I am more important
Actor betweenness centrality
–I am more often on the shortest path between other people in the group so I am more important. I can control flow of information better than other people
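The three actor measures can be sketched with textbook definitions. This is an assumption about the definitions, not a statement of how AllegroGraph computes them: degree centrality is degree divided by n-1, closeness is n-1 divided by the sum of distances, and betweenness is left unnormalized. The `GROUP` data and all function names are hypothetical, and the group is assumed to be connected and undirected.

```python
from collections import deque
from itertools import combinations

# Hypothetical connected, undirected group: a hub with three contacts and a tail.
GROUP = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
}

def bfs(graph, start):
    """Return (distances, shortest-path counts) from start to every node."""
    dist, paths = {start: 0}, {start: 1}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                paths[nxt] = paths[node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                paths[nxt] += paths[node]
    return dist, paths

def degree_centrality(graph, actor):
    """Share of the other group members the actor is directly connected to."""
    return len(graph[actor]) / (len(graph) - 1)

def closeness_centrality(graph, actor):
    """(n - 1) divided by the sum of distances to all other members."""
    dist, _ = bfs(graph, actor)
    return (len(graph) - 1) / sum(dist.values())

def betweenness_centrality(graph, actor):
    """Sum over pairs of the fraction of shortest paths that pass through actor."""
    dist_a, paths_a = bfs(graph, actor)
    total = 0.0
    for s, t in combinations([n for n in graph if n != actor], 2):
        dist_s, paths_s = bfs(graph, s)
        if dist_s[actor] + dist_a[t] == dist_s[t]:
            total += paths_s[actor] * paths_a[t] / paths_s[t]
    return total

print(degree_centrality(GROUP, "hub"))       # 0.75
print(closeness_centrality(GROUP, "hub"))    # 0.8
print(betweenness_centrality(GROUP, "hub"))  # 5.0
```

Recomputing a search per pair keeps the sketch short; Brandes' algorithm is the usual choice for larger groups.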
Does the group have a leader, is the group cohesive?
Group centralization
–How centralized is this group?
–Does this group have a leader?
–Is there someone controlling the information flow?
Group cohesiveness
–How strong and well connected is this group?
–Are most people connected?
–What is the density?
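Two of these group questions have simple numeric answers. The sketch below uses common textbook definitions, which are assumptions here and may differ from what AllegroGraph computes: density as the share of possible undirected edges that exist, and Freeman's degree centralization, which is 1 for a star and 0 when everyone has the same degree. `STAR`, `RING` and the function names are hypothetical.

```python
# Hypothetical undirected groups as adjacency dicts.
STAR = {
    "leader": ["a", "b", "c", "d"],
    "a": ["leader"], "b": ["leader"], "c": ["leader"], "d": ["leader"],
}
RING = {
    "a": ["b", "e"], "b": ["a", "c"], "c": ["b", "d"],
    "d": ["c", "e"], "e": ["d", "a"],
}

def density(graph):
    """Existing edges divided by the n*(n-1)/2 possible undirected edges."""
    n = len(graph)
    edges = sum(len(nbrs) for nbrs in graph.values()) / 2
    return edges / (n * (n - 1) / 2)

def degree_centralization(graph):
    """Freeman degree centralization for an undirected group of 3+ nodes."""
    n = len(graph)
    degrees = [len(nbrs) for nbrs in graph.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

print(density(STAR), degree_centralization(STAR))  # 0.4 1.0
print(density(RING), degree_centralization(RING))  # 0.5 0.0
```

The star has an obvious leader but low density; the ring is denser yet has no leader at all, which is why the slide asks the two questions separately.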
All search and SNA functions use Generators
Generator
–Input: one node
–Output: list of nodes
–Fully functional, can be complex SPARQL or Prolog queries
–Or just predicates and indication of direction
How to get from A to E?
subj pred obj
a dinner-with b
a kissed-with c
c movie-with e
b kissed-with d
d movie-with e
e dinner-with a
(defgenerator knows (node)
  (objects-of :p dinner-with))
(defgenerator knows (node)
  (objects-of :p dinner-with)
  (subjects-of :p dinner-with))
How to get from A to E?
(defgenerator knows ()
  (object-of :p dinner-with)
  (subject-of :p dinner-with)
  (object-of :p movie-with)
  (subject-of :p movie-with)
  (object-of :p kissed-with)
  (subject-of :p kissed-with))
(defgenerator knows ()
  (undirected (dinner-with movie-with kissed-with)))
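To see how the choice of generator changes the answer to "how to get from A to E", here is a minimal Python sketch over the six triples from the slide. It mimics the idea of the generator definitions and is not AllegroGraph code: `objects_of`, `subjects_of`, `make_generator`, `undirected` and `shortest_path` are hypothetical helpers written for this illustration.

```python
from collections import deque

# The triples from the slide: (subject, predicate, object).
TRIPLES = [
    ("a", "dinner-with", "b"),
    ("a", "kissed-with", "c"),
    ("c", "movie-with", "e"),
    ("b", "kissed-with", "d"),
    ("d", "movie-with", "e"),
    ("e", "dinner-with", "a"),
]

def objects_of(pred):
    """Follow pred forwards: from a subject to its objects."""
    return lambda node: [o for s, p, o in TRIPLES if s == node and p == pred]

def subjects_of(pred):
    """Follow pred backwards: from an object to its subjects."""
    return lambda node: [s for s, p, o in TRIPLES if o == node and p == pred]

def make_generator(*parts):
    """Combine parts into one function: node -> list of neighbouring nodes."""
    def generator(node):
        found = []
        for part in parts:
            for n in part(node):
                if n not in found:
                    found.append(n)
        return found
    return generator

def undirected(*preds):
    """Generator that follows every listed predicate in both directions."""
    parts = []
    for pred in preds:
        parts += [objects_of(pred), subjects_of(pred)]
    return make_generator(*parts)

def shortest_path(start, goal, generator):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in generator(node):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None

knows_forward = make_generator(objects_of("dinner-with"))
knows_both = make_generator(objects_of("dinner-with"), subjects_of("dinner-with"))
knows_any = undirected("dinner-with", "movie-with", "kissed-with")

print(shortest_path("a", "e", knows_forward))  # None
print(shortest_path("a", "e", knows_both))     # ['a', 'e']
print(shortest_path("a", "e", knows_any))      # ['a', 'e']
```

Following dinner-with forwards only, A reaches B and stops; once the generator also follows the predicate backwards, the triple "e dinner-with a" connects A to E in one step.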
Sample SNA functions
(Ego-group actor generator depth ?group) - binds ?group to the group of nodes
(Ego-group-members actor generator depth ?a) - binds ?a to every member in the group
(Cliques actor generator min-depth ?cl) - binds ?cl to all cliques
(Clique-members actor generator min-depth ?cl ?a) - binds ?cl to cliques and then iterates over every member ?a in ?cl
(Actor-centrality actor group generator ?num) - binds ?num to the actor centrality
(Actor-centrality-members group ?actor ?num) - binds ?actor to every actor in the group and ?num to the centrality of that actor, starting with the actor with the highest centrality
(Group-centrality group generator ?num)
Actor = single node
Group = list of nodes
Depth = number
Generator = generator
Where do we use this?
Amdocs: know everything about every customer
–Partitioned on customer
–Most graph search centered in client
Pfizer: help me find connections between drugs, diseases, genes, side effects in a sea of clinical trials
–Just a mess of data
–All graph search in server
Traditional Business Intelligence
Can tell you ALL about the average customer but NOTHING about the individual.
Can you in < 1 second with one push of a button
Predict the three most likely reasons why Joe Smith from Kansas is calling the call center?
–Bill unexpectedly high, losing connection too often, doesn’t know how to use new subscription service?
The ten last events that happened for JS?
–Phone calls, SMS, downloads of movie, device stopped working, payment of bill, looking at map, search for local store.
What is the likelihood that he will change from T-Mobile to Sprint or AT&T?
Who are his ten most important friends and what devices do they have? And who is the first to change and who follows?
Can you in < 1 second with one push of a button
What are the usual daily locations for this person? What kind of shops?
What kind of services does he download, what kind of movies/music/games does he like, what products does he buy?
Is his plan the right plan for him?
Is he in a good mood?
Is he a valuable customer, is he a good payer, what is your margin on him, how many times per month does he call a call center, does he look up help for mail on the internet?
Can you predict if he is going to pay the bill?
[Architecture diagram. Components: Event Data Sources; Operational Systems (CRM, RM, OMS, NW, Web 2.0); Amdocs Event Collector; Amdocs Integration Framework; Events; Event Ingestion; Scheduled Events; Decision Engine; Container; Inference Engine (Business Rules); Bayesian Belief Network; Actions; SBA Application Server; “Sesame”; AllegroGraph Triple Store DB]