A Generative Interpretation of RDF Dataset and its Application in Summarization. Qingxia Liu, qxliu.nju@gmail.com, 2019/4/6.


1 A Generative Interpretation of RDF Dataset and its Application in Summarization
Qingxia Liu (qxliu.nju@gmail.com), Websoft Research Group, 2019/4/6

2 Motivation & Goal
Motivation: To understand a dataset
Goal: Build an abstract model that best fits the actual data
Applications: RDF dataset summarization; query generation

3 Traditional Perspectives
Triple set: a set of triples (?s, ?p, ?o)
Entity graph: a node-link graph of entity nodes
Vertex clustering: type, attribute [1][2][4]
Pattern extraction [5]
Sentence graph: a graph of RDF sentences
Salient sentence extraction: centrality measurements [3]
Topic graph? Nodes are equivalent if they have the same set of outgoing and incoming paths.
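As a rough illustration of the "vertex clustering by attribute" perspective listed above, the sketch below groups subjects that share exactly the same set of outgoing predicates. It is a deliberate simplification and not the algorithm of any cited paper; the `ex:` identifiers and the helper name `cluster_by_attributes` are hypothetical.

```python
from collections import defaultdict

# A toy triple set (?s, ?p, ?o); all identifiers are made up for illustration.
triples = [
    ("ex:alice", "ex:name", '"Alice"'),
    ("ex:alice", "ex:takenCourse", "ex:course1"),
    ("ex:bob", "ex:name", '"Bob"'),
    ("ex:bob", "ex:takenCourse", "ex:course2"),
    ("ex:carol", "ex:name", '"Carol"'),
    ("ex:carol", "ex:publication", "ex:paper1"),
]


def cluster_by_attributes(triples):
    """Group subjects by the exact set of predicates they use."""
    attrs = defaultdict(set)
    for s, p, _ in triples:
        attrs[s].add(p)

    clusters = defaultdict(set)
    for s, predicates in attrs.items():
        # frozenset makes the predicate set hashable so it can serve as a key.
        clusters[frozenset(predicates)].add(s)
    return dict(clusters)


clusters = cluster_by_attributes(triples)
for predicates, members in clusters.items():
    print(sorted(predicates), "->", sorted(members))
```

Under this assumption, two entities land in the same cluster only when their predicate sets match exactly, so a single extra property splits a cluster. That rigidity is one reason a softer, probabilistic grouping can be attractive.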

4 What is Topic?
A topic is …
A distribution over a bag of words
A facet of data that describes an entity, e.g. a person's basic information, publication information, study information

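5 What is Topic? A Generative Story
The distribution over topic distributions is a Dirichlet distribution.
A topic is an ordinary die; each topic corresponds to a word distribution.
For each entity, choose a topic distribution (that is, a distribution over dice). From this topic distribution (a pile of dice), pick K dice (word distributions) and roll each die once, obtaining one word per roll.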
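The dice story above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, assuming that the "words" of an entity are its property names and that the topic-word distributions are already given. The vocabulary, the three topic distributions, `alpha`, and `k` are all hypothetical values chosen for the example, not results from the talk.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_entity(topic_word_dists, vocab, alpha, k, rng):
    """Generate k words for one entity: pick a die (topic), then roll it once."""
    theta = sample_dirichlet(alpha, rng)  # this entity's distribution over topics
    topic_ids = list(range(len(topic_word_dists)))
    topics, words = [], []
    for _ in range(k):
        z = rng.choices(topic_ids, weights=theta)[0]             # choose a die
        w = rng.choices(vocab, weights=topic_word_dists[z])[0]   # roll it once
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Hypothetical vocabulary of properties and three hand-made topics.
vocab = ["name", "gender", "takenCourse", "grade", "advisor", "publication"]
topic_word_dists = [
    [0.45, 0.45, 0.025, 0.025, 0.025, 0.025],  # "basic info" die
    [0.025, 0.025, 0.45, 0.45, 0.025, 0.025],  # "study" die
    [0.025, 0.025, 0.025, 0.025, 0.45, 0.45],  # "research" die
]

rng = random.Random(0)
theta, topics, words = generate_entity(
    topic_word_dists, vocab, alpha=[1.0, 1.0, 1.0], k=5, rng=rng
)
print(theta)
print(list(zip(topics, words)))
```

Fitting a topic model runs this story in reverse: given only the observed words of each entity, it estimates the dice and each entity's mixture over them.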

6 Why topic? Why not Classes?
Classes give the conceptual meaning of the entity, not the features of the data
Concepts from different ontologies are used
Labeled concepts may be wrong, superfluous, or insufficient
Example: Danyun Xu
Classes: graduateStudent, Person
Properties (actual data):
as a person: name, gender, address, homepage, telephone
as a student: takenCourse, memberOf
as a researcher: undergraduateDegreeFrom, advisor, publication

7 Why topic? What's the dataset talking about?
Basic info of people in a city? name, gender, occupation, weight, height, …
Course taken data in a university? takenCourse, grade
Info about researchers? degree, undergraduateDegreeFrom, publication

8 Preliminary Results
Topics on LUBM

9 To Do
More investigation
Experimental effects
Evaluate methods

10 References
[1] Campinas, Stéphane, Renaud Delbru, and Giovanni Tummarello. "Efficiency and precision trade-offs in graph summary algorithms." Proceedings of the 17th International Database Engineering & Applications Symposium. ACM, 2013.
[2] Tran, Thanh, Lei Zhang, and Rudi Studer. "Summary models for routing keywords to Linked Data sources." The Semantic Web–ISWC. Springer Berlin Heidelberg.
[3] Zhang, Xiang, Gong Cheng, and Yuzhong Qu. "Ontology summarization based on RDF sentence graph." Proceedings of the 16th International Conference on World Wide Web. ACM, 2007.
[4] Tian, Yuanyuan, Richard A. Hankins, and Jignesh M. Patel. "Efficient aggregation for graph summarization." Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. ACM, 2008.
[5] Basse, Adrien, et al. "DFS-based frequent graph pattern extraction to characterize the content of RDF Triple Stores." (2010).

11 Thank you! Any Questions?

