Impact of different relation extraction methods on network analysis results Jana Diesner.


1 Impact of different relation extraction methods on network analysis results Jana Diesner

2 Motivation
Text Data -> Network Data -> Applications
Need: scalable, reliable, robust methods & tools
Text data:
– Unstructured
– At any scale
Applications:
– Network analysis: answer substantive and graph-theoretic questions; develop and test hypotheses and theories
– Visualizations
– Populate databases
– Input to further computations, e.g. simulations, machine learning

3 Research Questions and Relevance
How do network data and analysis results obtained by using different relation extraction methods compare to each other?
Why does it matter?
– Increased comparability, generalizability, transparency of methods and tools
– Increased control and power for developers and users
– Supports drawing of reasonable and valid conclusions

4 Relation Extraction Methods
Methods compared:
– Text, manual (TextM)
– Text, automated (TextA)
– Meta-data (META)
– Subject Matter Experts (SME)
[Diagram labels: Codebook; Proximity-based linkage of nodes (shown twice); Database query; Meta-Data]
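The slide names proximity-based linkage of nodes but does not spell out the procedure. A common reading is windowed co-occurrence: two entities are linked when they appear within a fixed number of tokens of each other. The sketch below illustrates that reading only. The function name `proximity_links`, the `window` parameter, and the toy sentence are assumptions for illustration and are not taken from the talk.

```python
from collections import Counter
from itertools import combinations


def proximity_links(tokens, entities, window=5):
    """Count links between distinct entities that occur fewer than
    `window` token positions apart (hypothetical helper, not the
    author's implementation)."""
    # Positions of every token that is a known entity.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in entities]
    links = Counter()
    for (i, a), (j, b) in combinations(positions, 2):
        # combinations() preserves order, so j > i always holds here.
        if a != b and j - i < window:
            # Sort the pair so that links are undirected.
            links[tuple(sorted((a, b)))] += 1
    return links


# Toy input, made up for this example.
tokens = "Alice met Bob in Paris while Carol stayed in Rome".split()
entities = {"Alice", "Bob", "Paris", "Carol", "Rome"}
links = proximity_links(tokens, entities, window=4)

for pair, weight in sorted(links.items()):
    print(pair, weight)
```

The window size controls network density: a larger window links more distant entities and yields more (and weaker) ties, so results from different window sizes are not directly comparable.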

5 Data
                     Sudan Corpus     Funding Corpus                 Enron Corpus
Genre                Newswire         Scientific writing             Emails
Size                 80,000 articles  56,000 proposals               53,000 emails
Source               LexisNexis       Cordis                         FERC/SEC
Time span            8 years          22 years                       4 years
Text-based networks  Article bodies   Project description            Email bodies
Meta-data network    Index terms      Index terms and collaborators  Email headers
Large-scale, over-time, open source data from different domains

6 Results I
1. Text, automated vs. manual: the total number of nodes of sub-type "generic" is far higher than of sub-type "specific"
– Rethink focus of network analysis: collectives vs. individuals
– Importance of detecting unnamed entities
2. Ground truth data (SME) is hardly reproduced by analyzing text bodies, and not at all by meta-data networks
– In the best case, 50% of nodes and 20% of links are recovered
3. Agreement in structure and key entities depends on the type of network
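The node and link percentages above are overlap measures between a reference network and an extracted one. The talk does not give the exact formula, so the following is a minimal sketch assuming overlap is the share of reference (SME) items that also appear in the candidate network, with links treated as undirected. All names and the toy networks are made up for illustration and are not the study's data.

```python
def overlap(reference, candidate):
    """Share of reference items that also occur in candidate
    (assumed definition, not confirmed by the source)."""
    reference, candidate = set(reference), set(candidate)
    if not reference:
        return 0.0
    return len(reference & candidate) / len(reference)


def undirected(edges):
    """Represent each edge as a frozenset so (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}


# Toy reference network (standing in for SME ground truth).
sme_nodes = {"A", "B", "C", "D"}
sme_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D"), ("A", "C")]

# Toy extracted network (standing in for a text-based network).
text_nodes = {"A", "B", "E"}
text_edges = [("B", "A"), ("A", "E")]

node_overlap = overlap(sme_nodes, text_nodes)
edge_overlap = overlap(undirected(sme_edges), undirected(text_edges))

print(f"node overlap: {node_overlap:.0%}")
print(f"link overlap: {edge_overlap:.0%}")
```

Normalizing by the reference set makes the measure asymmetric: it answers how much of the ground truth was recovered, not how much of the extracted network is correct. A symmetric alternative such as the Jaccard index would divide by the union instead.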

7 Results II
3. Agreement among text-based networks, and between text-based and meta-data networks, depends on the type of network
Social networks
  Text-based networks:
  – Substantial overlap between manual and automated, esp. w.r.t. key players
  – Localized view on geo-political entities and culture
  Meta-data network:
  – Major international key players
  – Small overlap in key entities with text-based networks
Knowledge networks
  Text-based networks:
  – Gist of information in terms of common sense entities
  – Minimal overlap between manual and automated
  Meta-data network:
  – Seem more informative (mini-summaries)
  – Fewer coreference resolution issues
  – Minimal overlap with text-based
For a more complete view, combine automated text-based with meta-data networks

8 Acknowledgements
This work was supported by the National Science Foundation (NSF) IGERT 9972762, the Army Research Institute (ARI) W91WAW07C0063, the Army Research Laboratory (ARL/CTA) DAAD19-01-2-0009, the Air Force Office of Scientific Research (AFOSR) MURI FA9550-05-1-0388, the Office of Naval Research (ONR) MURI N00014-08-11186, and a Siebel Scholarship. Additional support was provided by the CASOS Center at Carnegie Mellon University. The views and conclusions contained in this talk are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of the NSF, ARI, ARL, AFOSR, ONR, or the United States Government.
Thank You!
Questions, Comments, Feedback: jdiesner@illinois.edu

