Preprocessing, Compute, Post Proc.: XML, Raw Data, ETL, Slice, Compute, Repeat, Subgraph, PageRank, Initial Graph, Analyze, Top Users.

Presentation transcript:

1 Preprocessing, Compute, Post Proc.: XML, Raw Data, ETL, Slice, Compute, Repeat, Subgraph, PageRank, Initial Graph, Analyze, Top Users

2 GraphX

3 HDFS, Compute, Spark Preprocess, Spark Post. Raw Wikipedia XML, Hyperlinks, PageRank, Top 20 Pages. 605 375
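
Reading slide 3 as a pipeline from hyperlinks to PageRank to a top-pages list, the compute step can be sketched in plain Python. This is a minimal illustration, not GraphX's implementation: the tiny `hyperlinks` dictionary, the function names, the damping factor of 0.85, and the fixed iteration count are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterative PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {dst for targets in links.values() for dst in targets}
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(ranks[p] for p in pages if not links.get(p))
        new_ranks = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for src, targets in links.items():
            if targets:
                share = damping * ranks[src] / len(targets)
                for dst in targets:
                    new_ranks[dst] += share
        ranks = new_ranks
    return ranks


def top_pages(ranks, k=20):
    """Return the k highest-ranked (page, rank) pairs, best first."""
    return sorted(ranks.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical link structure standing in for the extracted hyperlinks.
hyperlinks = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(hyperlinks)
best = top_pages(ranks, k=20)
```

With the default `k=20`, `top_pages` mirrors the "Top 20 Pages" step; on this four-page toy graph it simply returns all pages in rank order.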

4 Property Graph

Vertex Table
Id  Property (V)
3   (rxin, student)
7   (jgonzal, postdoc)
5   (franklin, professor)
2   (istoica, professor)

Edge Table
SrcId  DstId  Property (E)
3      7      Collaborator
5      3      Advisor
2      5      Colleague
5      7      PI
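
The vertex and edge tables on slide 4 can be written down directly as two collections. The sketch below uses plain Python rather than Spark; the helper names `triplets` and `out_degrees` are chosen for illustration and are assumptions, not taken from the slides.

```python
# Vertex table: id -> property (name, role), as listed on the slide.
vertices = {
    3: ("rxin", "student"),
    7: ("jgonzal", "postdoc"),
    5: ("franklin", "professor"),
    2: ("istoica", "professor"),
}

# Edge table: (source id, destination id, edge property).
edges = [
    (3, 7, "Collaborator"),
    (5, 3, "Advisor"),
    (2, 5, "Colleague"),
    (5, 7, "PI"),
]


def triplets(vertices, edges):
    """Join each edge with the properties of its source and destination."""
    return [(vertices[src], prop, vertices[dst]) for src, dst, prop in edges]


def out_degrees(vertices, edges):
    """Count outgoing edges per vertex id."""
    degrees = {vid: 0 for vid in vertices}
    for src, _dst, _prop in edges:
        degrees[src] += 1
    return degrees


for src_prop, relation, dst_prop in triplets(vertices, edges):
    print(f"{src_prop[0]} is the {relation} of {dst_prop[0]}")
```

Keeping vertices and edges as separate tables is what lets the same data be treated as rows for table-style operations and as a graph for operations such as the join above.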

5 Data-Parallel, Graph-Parallel: Property Graph, Table, Result, Row

6 Raw Wikipedia XML, Hyperlinks, PageRank, Top 20 Pages (Title, PR); Text Table (Title, Body), Topic Model (LDA), Word Topics (Word, Topic); Editor Graph, Community Detection, User Community (User, Com.); Term-Doc Graph; Discussion Table (User, Disc.); Community Topic (Topic, Com.)

7 Property Graph with vertices A to F, stored as a Vertex Table (RDD), a Routing Table (RDD), and an Edge Table (RDD), each split across Part. 1 and Part. 2 using a 2D Vertex Cut Heuristic
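
One way to picture the 2D vertex cut heuristic and the routing table from slide 7 is the simplified sketch below. It assumes integer vertex ids and a plain modulo in place of a real hash, so it illustrates the idea rather than reproducing GraphX's partitioner; `partition_2d`, `build_routing_table`, and the example edges are hypothetical.

```python
import math


def partition_2d(src, dst, num_parts):
    """Place an edge in a sqrt(P) x sqrt(P) grid: column from src, row from dst."""
    side = math.ceil(math.sqrt(num_parts))
    col = src % side
    row = dst % side
    return (col * side + row) % num_parts


def build_routing_table(edges, num_parts):
    """Map each vertex id to the set of edge partitions that reference it."""
    routing = {}
    for src, dst in edges:
        part = partition_2d(src, dst, num_parts)
        routing.setdefault(src, set()).add(part)
        routing.setdefault(dst, set()).add(part)
    return routing


# Hypothetical example: every directed edge among six vertices, four partitions.
num_parts = 4
all_edges = [(s, d) for s in range(1, 7) for d in range(1, 7) if s != d]
routing = build_routing_table(all_edges, num_parts)
max_replication = max(len(parts) for parts in routing.values())
```

Because a vertex only ever appears in one grid column as a source and one grid row as a destination, it is copied to at most `2 * side - 1` partitions when the partition count is a perfect square, which is the point of the 2D layout.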

8 Vertex Cut, Edge Cut
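
The difference between the two cuts on slide 8 can be made concrete by counting what each one splits. In the sketch below, an edge cut assigns vertices to partitions and pays for every edge that crosses partitions, while a vertex cut assigns edges to partitions and pays for every extra copy of a vertex. The star graph, the assignments, and the function names are assumptions chosen for illustration.

```python
def edge_cut_cost(edges, vertex_owner):
    """Edge cut: count edges whose endpoints live in different partitions."""
    return sum(1 for src, dst in edges if vertex_owner[src] != vertex_owner[dst])


def vertex_cut_cost(edges, edge_owner):
    """Vertex cut: count vertex copies beyond the first across edge partitions."""
    homes = {}
    for (src, dst), part in zip(edges, edge_owner):
        homes.setdefault(src, set()).add(part)
        homes.setdefault(dst, set()).add(part)
    return sum(len(parts) - 1 for parts in homes.values())


# Hypothetical star graph: hub 0 linked to leaves 1..4, two partitions.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
cut_edges = edge_cut_cost(star, {0: 0, 1: 0, 2: 0, 3: 1, 4: 1})
extra_copies = vertex_cut_cost(star, [0, 0, 1, 1])
```

On this star graph the edge cut splits two edges, while the vertex cut only replicates the hub once.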

