Aules d’Empresa 2011: Hands-on course


1 Aules d’Empresa 2011: Hands-on course

2 Contents: Introduction, DEX API, Running example, Database construction, Validate database construction, Script loaders, Query database, Graph algorithms.

3 DEX, a graph database
Graph databases focus on the structure of the model:
  Nodes and edges instead of tables.
  Relationships are implicit in the model.
DEX is a programming library for managing a graph database:
  Very large datasets.
  High-performance query processing.

4 Basic concepts
A programming library for managing persistent and temporary graphs.
Data model: typed and attributed directed multigraph.
  Node and edge instances belong to a type (label).
  Node and edge instances may have attribute values.
  Edges can be directed or undirected.
  Multiple edges are allowed between two nodes.
Types of edges:
  Materialized: directed and undirected.
  Virtual: constrained by the values of two attributes (foreign keys). Just for navigation.

5 A graph model

6 API
Java library: jdex.jar → public API.
Native library:
  Linux: libjdex.so
  Windows: jdex.dll
System requirements:
  Java Runtime Environment, v1.5 or higher.
  Operating system: Windows (32 bits), Linux (32 and 64 bits).

7 Core API – class diagram
[Class diagram. Classes: DEX, GraphPool, Session, DbGraph, RGraph, Graph, Objects. Annotations: graph factory, persistent DB, temporary, set of OIDs. Associations are labeled 1:N or 1:1.]

8 Core API – main methods
DEX
  open(filename) → GraphPool
  create(filename) → GraphPool
  close()
GraphPool
  newSession() → Session
Session
  getDbGraph() → DbGraph
  newGraph() → RGraph
  close()
Graph
  newNodeType(name) → int
  newEdgeType(name) → int
  newNode(type) → long
  newEdge(type) → long
  newAttribute(type, name) → long
  setAttribute(oid, attr, value)
  getAttribute(oid, attr) → value
  select(type) → Objects
  select(attr, op, value) → Objects
  explode(oid, type) → Objects
Objects.Iterator
  hasNext() → boolean
  next() → long
Objects
  add(long)
  exists(long)
  copy(objs)
  union(objs)
  intersection(objs)
  difference(objs)

9 Running example
    DEX dex = new DEX();
    GraphPool gpool = dex.create("C:/image.dex");
    Session s = gpool.newSession();
    …
    s.close();
    gpool.close();
    dex.close();

10 Running example
    …
    s.beginTx();
    DbGraph dbg = s.getDbGraph();
    // Node type PERSON with two attributes.
    int person = dbg.newNodeType("PERSON");
    long name = dbg.newAttribute(person, "NAME", STRING);
    long age = dbg.newAttribute(person, "AGE", INT);
    // Three PERSON nodes; only JOHN has an age.
    long p1 = dbg.newNode(person);
    dbg.setAttribute(p1, name, "JOHN");
    dbg.setAttribute(p1, age, 18);
    long p2 = dbg.newNode(person);
    dbg.setAttribute(p2, name, "KELLY");
    long p3 = dbg.newNode(person);
    dbg.setAttribute(p3, name, "MARY");
    s.commitTx();
    …
[Figure: three PERSON nodes, JOHN (age 18), KELLY and MARY.]

11 Running example
    …
    s.beginTx();
    DbGraph dbg = s.getDbGraph();
    // Undirected edge type FRIEND with a SINCE attribute.
    int friend = dbg.newUndirectedEdgeType("FRIEND");
    long since = dbg.newAttribute(friend, "SINCE", INT); // attribute identifiers are long
    long e1 = dbg.newEdge(p1, p2, friend);
    dbg.setAttribute(e1, since, 2000);
    long e2 = dbg.newEdge(p2, p3, friend);
    dbg.setAttribute(e2, since, 1995);
    …
    // Directed edge type LOVES.
    int loves = dbg.newEdgeType("LOVES");
    long e3 = dbg.newEdge(p1, p3, loves);
    s.commitTx();
    …
[Figure: FRIEND edges JOHN–KELLY (since 2000) and KELLY–MARY (since 1995), plus a LOVES edge from JOHN to MARY.]

12 Running example
    …
    s.beginTx();
    DbGraph dbg = s.getDbGraph();
    // Directed edge type PHONES with a WHEN timestamp.
    int phones = dbg.newEdgeType("PHONES");
    long when = dbg.newAttribute(phones, "WHEN", TIMESTAMP); // attribute identifiers are long
    // Two calls from JOHN to MARY: a multigraph allows parallel edges.
    long e4 = dbg.newEdge(p1, p3, phones);
    dbg.setAttribute(e4, when, 4pm);
    long e5 = dbg.newEdge(p1, p3, phones);
    dbg.setAttribute(e5, when, 5pm);
    long e6 = dbg.newEdge(p3, p2, phones);
    dbg.setAttribute(e6, when, 6pm);
    s.commitTx();
    …
[Figure: the graph with added PHONES edges JOHN→MARY (4pm), JOHN→MARY (5pm) and MARY→KELLY (6pm).]

13 Running example
    …
    s.beginTx();
    DbGraph dbg = s.getDbGraph();
    // Iterate over all PERSON nodes and read their NAME attribute.
    Objects persons = dbg.select(person);
    Objects.Iterator it = persons.iterator();
    while (it.hasNext()) {
        long p = it.next();
        // "name" is the attribute identifier, so the value needs its own variable.
        String personName = dbg.getAttribute(p, name);
    }
    it.close();
    persons.close();
    s.commitTx();
    …
[Figure: the running-example graph.]

14 Running example
    …
    Objects objs1 = dbg.select(when, >=, 5pm);
    // objs1 = { e5, e6 }
    Objects objs2 = dbg.explode(p1, phones, OUT);
    // objs2 = { e4, e5 }
    Objects objs = objs1.intersection(objs2);
    // objs = { e5, e6 } ∩ { e4, e5 } = { e5 }
    …
    objs.close();
    objs1.close();
    objs2.close();
    …
[Figure: the running-example graph.]

15 Database construction
DEX basics:
  Node and edge types: the public identifier is a String; the DEX identifier is an Integer.
  Attributes: the public identifier is a String; the DEX identifier is a Long.
  Object instances: the DEX identifier (OID) is a Long.

16 Database construction
Nodes:
  int Graph#newNodeType(String name)
    Creates a new node type with the given unique name. Returns the DEX node type identifier.
  long Graph#newNode(int nodeType)
    Creates a new object belonging to the given node type. Returns the DEX object identifier.

17 Database construction
Edges:
  int Graph#newEdgeType(String name, boolean directed)
    Creates a new edge type, directed or undirected, with the given unique name. Returns the DEX edge type identifier.
  int Graph#newRestrictedEdgeType(String name, int srcNodeType, int dstNodeType)
    Creates a new directed edge type with the given unique name. Returns the DEX edge type identifier. Integrity restriction: the source and destination of each edge are restricted to the given node types.
  long Graph#newEdge(long tail, long head, int edgeType)
    Creates a new edge belonging to the given edge type. If the edge type is directed, tail is the source and head is the target. Returns the DEX object identifier.

18 Database construction
Attributes:
  long Graph#newAttribute(int type, String name, short dataType, short kind)
    Creates a new attribute with the given unique name for the given node or edge type. Returns the DEX attribute identifier.
    “dataType” can be: Value#STRING, Value#INT, Value#LONG, Value#DOUBLE, Value#BOOL, Value#TIMESTAMP.
    “kind” can be:
      Graph#ATTR_KIND_BASIC. Basic attribute (just set and get values).
      Graph#ATTR_KIND_INDEXED. Indexed attribute (set and get values as well as select operations).
      Graph#ATTR_KIND_UNIQUE. Indexed attribute whose values are unique (primary key).

19 Database construction
Attributes:
  Class Value encapsulates the different data types: String, Integer, Long, Double, Boolean, Timestamp. Use it to set and get attribute values for objects.
  Graph#setAttribute(long oid, long attr, Value v)
    Sets the given Value for the given attribute of the given object identifier. The attribute must be defined for the object’s type. The Value’s data type must match the attribute’s data type, or the Value must be NULL.
  Graph#getAttribute(long oid, long attr, Value v)
    Gets the Value of the given attribute for the given object identifier. The attribute must be defined for the object’s type.

20 Exercises
All exercises are in the NetBeans project. Open the IDE and the project.
Required data sets are stored in the “data” directory.
Required libraries are stored in the “libs” directory.
Each exercise has a main method to be executed.

21 Exercise 1
Create a synthetic DEX database:
  Create the following schema.
    User (nickname string, …)
    Tweet (body string, …)
    tweets (…) // from User to Tweet
  Add some data.
APIs to be used:
  Graph#newNodeType / Graph#newEdgeType
  Graph#newNode / Graph#newEdge
  Graph#newAttribute / Graph#setAttribute
  Value

22 Validate database construction
APIs:
  GraphPool#dumpData(File f)
    Dumps a summary of the logical content of the graph database.
  GraphPool#dumpStorage(File f)
    Dumps internal information about the storage content of the graph database.
  Graph#export(PrintWriter pw, short kind, Export e)
    Exports the graph to an external format. “kind” can be: GRAPHVIZ or YGRAPHML. The Export implementation defines the visualization (if null, the default export is used).
Command-line shell:
  edu.upc.dama.dex.shell.Shell
  See the edu.upc.dama.dex.shell package description.

23 Exercise 2
Validate your database construction:
  Dump data summary.
  Dump storage summary.
  Default export: yEd.
  (Optional) Shell.
APIs to be used:
  Graph#dumpData
  Graph#dumpStorage
  Graph#export
  Shell

24 Script loaders
Schema definition:
    CREATE DBGRAPH alias INTO filename

    CREATE NODE node_type_name "("
        [attribute_name (INT|LONG|DOUBLE|STRING|BOOLEAN|TIMESTAMP|TEXT) [INDEXED|UNIQUE|BASIC], ...]
    ")"

    CREATE [UNDIRECTED|VIRTUAL] EDGE edge_type_name
        [FROM node_type_name[.attribute_name] TO node_type_name[.attribute_name]]
    "("
        [attribute_name (INT|LONG|DOUBLE|STRING|BOOLEAN|TIMESTAMP|TEXT) [INDEXED|UNIQUE|BASIC], ...]
    ")" [MATERIALIZE NEIGHBORS]

25 Script loaders
Load nodes:
    LOAD NODES file_name
        COLUMNS attribute_name [alias_name], …
        INTO node_type_name
        [IGNORE (attribute_name|alias_name), …]
        [FIELDS [TERMINATED char] [ENCLOSED char] [ALLOW_MULTILINE]]
        [FROM num] [MAX num]
        [MODE (ROWS|COLUMNS [SPLIT [PARTITIONS num]])]

26 Script loaders
Load edges:
    LOAD EDGES file_name
        COLUMNS attribute_name [alias_name], …
        INTO edge_type_name
        [IGNORE (attribute_name|alias_name), …]
        WHERE TAIL (attribute_name|alias_name) = node_type_name.attribute_name
              HEAD (attribute_name|alias_name) = node_type_name.attribute_name
        [FIELDS [TERMINATED char] [ENCLOSED char] [ALLOW_MULTILINE]]
        [FROM num] [MAX num]
        [MODE (ROWS|COLUMNS [SPLIT [PARTITIONS num]])]

27 Script loaders
APIs:
  edu.upc.dama.dex.script.ScriptParser
Command-line tool:
  edu.upc.dama.dex.script.ScriptParser
  See the edu.upc.dama.dex.script package description.

28 Twitter data model
This is the data model, based on Twitter, to be used during the exercises.

29 Exercise 3
Create the Twitter database:
  Complete the schema definition script.
  Complete the loader script.
APIs to be used:
  ScriptParser
Resources:
  CSV files in the “data/twitter” directory.
  Script files in the “data/twitter/scripts” directory (*.des).

30 Exercise 4
Once again, validate your database construction:
  Dump data summary.
  Dump storage summary.
  Default export: yEd.
  (Optional) Shell.
APIs to be used:
  Graph#dumpData
  Graph#dumpStorage
  Graph#export
  Shell

31 Query database
Retrieve data:
  Class Objects (a Set and an Iterable of object identifiers)
    Stores large sets of object identifiers.
    No order.
    Combine operations: union, intersection, difference.
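
The three combine operations can be illustrated with plain java.util sets standing in for Objects. This is a minimal sketch, not DEX code: the class OidSets and its methods are hypothetical names, and it assumes Java 7 or later.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the combine operations of Objects,
// built on java.util.Set<Long> rather than on the DEX class.
final class OidSets {
    private OidSets() {}

    // Identifiers that appear in a, in b, or in both.
    static Set<Long> union(Set<Long> a, Set<Long> b) {
        Set<Long> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Identifiers that appear in both a and b.
    static Set<Long> intersection(Set<Long> a, Set<Long> b) {
        Set<Long> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Identifiers that appear in a but not in b.
    static Set<Long> difference(Set<Long> a, Set<Long> b) {
        Set<Long> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }
}
```

Each method copies its first argument, so the inputs are left untouched. Whether the DEX operations modify the receiver or return a new set is not stated on the slides.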

32 Query database
Retrieve data:
  Objects Graph#select(int t)
    Retrieves the object identifiers belonging to the given node or edge type.
  Objects Graph#select(long attr, short op, Value v)
    Retrieves the object identifiers which satisfy the query. “op” can be: Graph#OPERATION_{EQ|NE|GT|GE|LT|LE|LIKE|ERE}.
  long Graph#findObj(long attr, Value v)
    Retrieves the object identifier which has the given value for the given attribute (or INVALID_OID if not found).

33 Query database
Navigation:
  Objects Graph#explode(long oid, int edgeType, short direction)
    Retrieves the outgoing or incoming edges (or both) of the given object for the given edge type.
    “direction” can be: Graph#EDGES_IN, Graph#EDGES_OUT, Graph#EDGES_BOTH.
  Objects Graph#neighbors(long oid, int edgeType, short direction)
    Retrieves the neighbor nodes of the given object which can be reached through the given edge type and direction.
    “direction” can be: Graph#EDGES_IN, Graph#EDGES_OUT, Graph#EDGES_BOTH.
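
The difference between the two navigation calls is easiest to see on a multigraph: explode returns edge identifiers, neighbors returns node identifiers. The following is a minimal in-memory sketch of that idea, not the DEX implementation; TinyMultigraph, its Direction enum and all method bodies are hypothetical, and it models directed edges only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory directed multigraph used only to illustrate
// explode (edges) versus neighbors (nodes). Not the DEX implementation.
final class TinyMultigraph {
    enum Direction { IN, OUT, BOTH }

    private static final class Edge {
        final long oid;
        final long tail;
        final long head;
        final int type;

        Edge(long oid, long tail, long head, int type) {
            this.oid = oid;
            this.tail = tail;
            this.head = head;
            this.type = type;
        }
    }

    private final List<Edge> edges = new ArrayList<>();
    private long nextOid = 1;

    // Nodes and edges share one identifier space.
    long newNode() {
        return nextOid++;
    }

    long newEdge(long tail, long head, int type) {
        long oid = nextOid++;
        edges.add(new Edge(oid, tail, head, type));
        return oid;
    }

    // Edge identifiers of the given type that leave (OUT), enter (IN)
    // or touch (BOTH) the given node.
    Set<Long> explode(long node, int type, Direction dir) {
        Set<Long> result = new HashSet<>();
        for (Edge e : edges) {
            if (e.type != type) continue;
            boolean outgoing = e.tail == node && dir != Direction.IN;
            boolean incoming = e.head == node && dir != Direction.OUT;
            if (outgoing || incoming) result.add(e.oid);
        }
        return result;
    }

    // Node identifiers at the other end of those same edges.
    Set<Long> neighbors(long node, int type, Direction dir) {
        Set<Long> result = new HashSet<>();
        for (Edge e : edges) {
            if (e.type != type) continue;
            if (e.tail == node && dir != Direction.IN) result.add(e.head);
            if (e.head == node && dir != Direction.OUT) result.add(e.tail);
        }
        return result;
    }
}
```

With the PHONES edges of the running example (two calls from JOHN to MARY, one from MARY to KELLY), exploding JOHN's outgoing edges yields two edge identifiers, while his outgoing neighbors contain a single node, because both edges lead to MARY.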

34 Graph algorithms
Package edu.upc.dama.dex.algorithms.
Traversals (each is an Iterator that returns node identifiers):
  TraversalBFS: breadth-first search.
  TraversalDFS: depth-first search.
Shortest path:
  SinglePairShortestPathBFS: unweighted graph.
  SinglePairShortestPathDijkstra: weighted graph.
The user can specify which node or edge types can be used for the navigation.
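
To show what a single-pair shortest path on an unweighted graph computes, here is a minimal breadth-first search over a plain adjacency map. It is a sketch of the general technique, not the SinglePairShortestPathBFS class: BfsShortestPath and its find method are hypothetical names, and the code assumes Java 8 or later.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical BFS shortest path over an adjacency map; not the DEX class.
final class BfsShortestPath {
    private BfsShortestPath() {}

    // Returns the node sequence from src to dst with the fewest hops,
    // or an empty list if dst cannot be reached from src.
    static List<Long> find(Map<Long, List<Long>> adjacency, long src, long dst) {
        Map<Long, Long> parent = new HashMap<>(); // also marks visited nodes
        Deque<Long> queue = new ArrayDeque<>();
        parent.put(src, src);
        queue.add(src);

        while (!queue.isEmpty()) {
            long current = queue.remove();
            if (current == dst) break;
            List<Long> next = adjacency.getOrDefault(current, Collections.<Long>emptyList());
            for (long n : next) {
                if (!parent.containsKey(n)) {
                    parent.put(n, current);
                    queue.add(n);
                }
            }
        }

        if (!parent.containsKey(dst)) return new ArrayList<>();

        // Walk the parent links back from dst, then reverse.
        List<Long> path = new ArrayList<>();
        for (long n = dst; n != src; n = parent.get(n)) {
            path.add(n);
        }
        path.add(src);
        Collections.reverse(path);
        return path;
    }
}
```

Restricting navigation to certain edge types, as the slide mentions, would correspond here to building the adjacency map from only those edges.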

35 Attribute values
Class Values:
  The different values of an attribute.
  Iterable; the iterator returns the values in ascending or descending order.
Retrieve Values:
  Values Graph#getValues(long attr, short order)
    Retrieves the Values for the given attribute. “order” can be: Graph#ORDER_ASCENDENT, Graph#ORDER_DESCENDENT.

36 Exercise 5
Basic queries:
  Get the “Tweet”s from a “User”. 1-hop navigation.
  Get the “Tweet”s which share 2 (or more) given “Hastag”s. Objects combination.
  Shortest distance between two given “User”s. Just navigate through the “follows” relationship.
Use the database created in Exercise 3.
APIs to be used:
  Graph#findObj / Graph#select
  Graph#neighbors
  Objects
  SinglePairShortestPath

37 Exercise 6
Updates:
  Create an attribute for each “User” to store the number of references (“depicts”) to the “User”.
  Compute and store the value for each “User”.
  Find the most popular “User”, that is, the most referenced one.
Use the database created in Exercise 3.
APIs to be used:
  Graph#degree
  Graph#newAttribute / Graph#setAttribute
  Values
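
The counting logic behind this exercise can be sketched independently of DEX: count how many edges point at each node, then pick the node with the largest count. The code below is only an illustration with plain Java collections. Popularity, inDegrees and mostReferenced are hypothetical names, edges are given as {tail, head} pairs, and Java 8 or later is assumed. In the exercise itself the counts would come from Graph#degree and be stored with Graph#setAttribute.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that mirrors the exercise's logic without DEX.
final class Popularity {
    private Popularity() {}

    // For each node, counts how many edges point at it.
    // Each edge is an array {tail, head}.
    static Map<Long, Integer> inDegrees(long[][] edges) {
        Map<Long, Integer> degree = new HashMap<>();
        for (long[] e : edges) {
            degree.merge(e[1], 1, Integer::sum);
        }
        return degree;
    }

    // Node with the highest count, or -1 if the map is empty.
    // Ties are broken arbitrarily.
    static long mostReferenced(Map<Long, Integer> degree) {
        long best = -1;
        int bestCount = -1;
        for (Map.Entry<Long, Integer> entry : degree.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }
}
```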

38 Export
Graph#export(PrintWriter pw, short kind, Export e)
  “kind” can be: GRAPHVIZ or YGRAPHML.
Implement the Export interface to define the visualization:
  NodeExport getNode(long oid)
    Called for each existing node identifier. Returns a NodeExport instance which defines the visualization of the given node identifier.
  EdgeExport getEdge(long oid)
    Called for each existing edge identifier. Returns an EdgeExport instance which defines the visualization of the given edge identifier.

39 Exercise 7
Visualization:
  Update the given Export implementation.
  Check how it changes the resulting visualization (yEd).
APIs to be used:
  Export
  GraphExport
  NodeExport
  EdgeExport
  Graph#export

40 Any questions?
DAMA Group Web Site: www.dama.upc.edu
Sparsity Web Site: www.sparsity-technologies.com

