Heterogeneous Data Analysis: Tools, Methods, Applications Andrei Mogoutov | AGUIDEL www.aguidel.com.


1 Heterogeneous Data Analysis: Tools, Methods, Applications Andrei Mogoutov | AGUIDEL www.aguidel.com

2 Scientific Controversy « Spaces »: News Streams; Blog Sphere; Web Sites; E-Communication (Mailing Lists, Forums); Scientific Information Data Bases (Publications, Patents); Offline “Literature”; Surveys / Interviews; Traditional Media (TV, Radio); Specific Data Bases; Etc ?

3 Heterogeneous Data Sets ?
Analytical methods and software tools for the treatment of heterogeneous data within a unified framework.
- Heterogeneity by source: heterogeneous data means diverse types of data from different sources, for example databases, surveys, questionnaires with open questions and codified variables, interviews and text collections.
- Heterogeneity by constitution: heterogeneous data not only comes from various sources, it is also varied internally. Different kinds of variables are represented within the same data set, for example geographic location, personal profiles, institutional affiliation, or semantic and lexical units. Software helps users and analysts to understand multivariate relations between entities; the analysis makes evident the ‘hidden’ relations and dependencies within the data.
- Heterogeneity by structure: the complexity and diversity of today’s data is unparalleled. Software works across this range, from highly codified and detailed databases, through survey data with numerical variables, to ‘raw’ data.
- Heterogeneity by scale: analytical solutions help to manage heterogeneity by source, constitution, structure and scale. From the global level of analysis and interconnection down to the institutional, specific and individual level, software makes data visible.

4 Traditions / Methods / Solutions: Statistics / Data Mining; Textual Analysis Tools / Text Mining; Web Cartography; Scientometrix; Tools for Qualitative Data Analysis; Social / Socio-Technical Networks; GIS; Etc ?

5 Heterogeneous Data Sets: Back Office. Diagram labels: “Offline” Questionnaires; Bibliographical databases; Existing Databases; Email Templates; Online Data Collection Tools; Actor Location; Actor Identity; Contents; Classification Schema; Parsing & Matching System; NETWORK ORIENTED DATABASE; Web Crawler
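The slide names a "network oriented database" fed by a parsing and matching step, but gives no implementation details. The following is only a minimal sketch of the idea, assuming typed nodes (actor, location, content) and weighted links between them. The class `NetworkStore`, its methods, and the name-normalization rule used as a stand-in for matching are all hypothetical and are not taken from the presentation.

```python
from collections import defaultdict


class NetworkStore:
    """Toy network-oriented store: typed nodes plus undirected weighted links."""

    def __init__(self):
        self.nodes = {}                # (type, normalized name) -> first display name seen
        self.links = defaultdict(int)  # frozenset of two node keys -> link weight

    def add_node(self, node_type, name):
        # Stand-in for a matching step (assumption): names that differ only
        # in case or spacing are treated as the same entity.
        key = (node_type, " ".join(name.lower().split()))
        self.nodes.setdefault(key, name.strip())
        return key

    def link(self, a, b):
        # Each call adds 1 to the weight of the undirected link a-b.
        if a != b:
            self.links[frozenset((a, b))] += 1

    def neighbors(self, key, node_type=None):
        # Return (neighbor key, weight) pairs, optionally filtered by node type.
        found = []
        for pair, weight in self.links.items():
            if key in pair:
                (other,) = pair - {key}
                if node_type is None or other[0] == node_type:
                    found.append((other, weight))
        return sorted(found)


# Hypothetical records, for illustration only.
store = NetworkStore()
lab = store.add_node("actor", "Lab A")
lab_again = store.add_node("actor", "  lab  a ")
paris = store.add_node("location", "Paris")
topic = store.add_node("content", "biosafety")
store.link(lab, paris)
store.link(lab, topic)
store.link(lab_again, topic)
```

Because actors, locations and contents live in the same structure, a query such as `store.neighbors(lab, "location")` can cross from one kind of entity to another, which is the property the "heterogeneous network" slides rely on.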

6 Design of Analytical Solution
Back Office
- data tables
- web crawler
- matching tools
- tools for textual analysis
- tools for data update and control
Front Office / Middle Office
- a layer of analytical queries
- pre-defined queries for multilevel
- data aggregation and synthetic analysis and indicators
- graphical/analytical interfaces (GIS, Relational Mappings, Statistical Charts)
- statistical tables, indicators and textual synthesis
- integrated querying tools
Diagram labels: ON-LINE; OFF-LINE; DATA UPDATE; FEED-BACK

7 “Online” Data Collection Tool

8 Front Office “Desktop”

9 Front Office “Online”

10 Scientometrix PubMed (Medline) ISI Derwent

11 Scientometrix [Chart: vertical axis ticks 1, 10, 100, 1000; horizontal axis years 1984 to 2003]

12 Scientometrix


14 Scientometrix / Numbers [Chart: vertical axis ticks 1, 10, 100, 1000; horizontal axis years 1984 to 2003] Exploring the dynamics of biosafety research using relational data analysis. Christophe Bonneuil (Centre Koyré d’Histoire des Sciences, CNRS, Paris), Andrei Mogoutov (Aguidel Consulting), Etienne Klein (INRA, Avignon), Fabien Moll-François (Centre Koyré)

15 Scientometrix / Ranking-Listing

16 Scientometrix: Early Warning: Strategic Diagrams of Research Community Evolution: biosafety research

17 Scientometrix / Mapping

18 Scientometrix / Adds: Heterogeneous Networks, Companies & Technologies

19 Scientometrix / Adds: Actor/Networks
- Pharma Group I: central, star-like hierarchical networks
- Pharma Group II: less central, less hierarchical
- Platform Tech. Companies: clique-like, complex networks
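The slide contrasts star-like and clique-like networks without saying how the shapes were measured. One standard way to put a number on that contrast is Freeman's degree centralization, which equals 1 for a pure star and 0 for a complete clique. The sketch below is an illustration under that assumption, not the method used in the presentation; the function name and the toy graphs are hypothetical.

```python
from itertools import combinations


def degree_centralization(edges):
    """Freeman degree centralization of an undirected simple graph.

    edges: iterable of (node, node) pairs; self-loops and repeated
    pairs are ignored. Returns a value between 0.0 and 1.0.
    """
    degree = {}
    seen = set()
    for a, b in edges:
        pair = frozenset((a, b))
        if a == b or pair in seen:
            continue
        seen.add(pair)
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    n = len(degree)
    if n < 3:
        return 0.0  # the measure is not defined for fewer than three nodes
    top = max(degree.values())
    # Sum of gaps to the best-connected node, divided by the gap sum of a star.
    return sum(top - d for d in degree.values()) / ((n - 1) * (n - 2))


# Toy graphs on five nodes: a hub with four partners, and a complete clique.
star = [("hub", partner) for partner in "abcd"]
clique = list(combinations("abcde", 2))

star_score = degree_centralization(star)      # 1.0
clique_score = degree_centralization(clique)  # 0.0
```

Real collaboration networks fall between the two extremes, so the score is best read as a position on a scale running from clique-like to star-like.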

20 Scientometrix / GIS: Space Biotech Clusters, Boston Region

21 Mapping of Collaborative Networks. Web Tool Box for Heterogeneous Data Analysis. Andrei Mogoutov | AGUIDEL Paris - www.aguidel.com - andrei@aguidel.com
Sources: Bibliographical Databases: PubMed (Medline), ISI etc
Heterogeneous Network Output
Analysis & Mapping:
- Co-Authorship Networks
- Content Analysis – Keyword Mapping
- Heterogeneous Networks: Authors vs Keywords
Statistical tables and raw data downloadable for desktop tools
SVG Mapping Output
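The two network types listed on this slide can be derived from the same bibliographic records. The sketch below assumes each record has already been parsed into a list of authors and a list of keywords; the record layout, the function names and the sample records are hypothetical, and parsing of PubMed or ISI exports is deliberately left out.

```python
from collections import Counter
from itertools import combinations


def coauthorship(records):
    """Count, for every pair of authors, the records they share."""
    weights = Counter()
    for record in records:
        authors = sorted(set(record["authors"]))
        for pair in combinations(authors, 2):
            weights[pair] += 1
    return weights


def author_keyword(records):
    """Count, for every (author, keyword) pair, the records linking them."""
    weights = Counter()
    for record in records:
        for author in set(record["authors"]):
            for keyword in set(record["keywords"]):
                weights[(author, keyword)] += 1
    return weights


# Placeholder records, for illustration only.
records = [
    {"authors": ["Author A", "Author B"], "keywords": ["biosafety", "gene flow"]},
    {"authors": ["Author B", "Author A", "Author C"], "keywords": ["biosafety"]},
]

coauthor_links = coauthorship(records)
mixed_links = author_keyword(records)
```

The first result is a homogeneous network (authors linked to authors); the second is heterogeneous (authors linked to keywords). Either can be written out as a weighted edge list for a mapping or desktop tool.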

22 Scientometrix / Data & Web Mining / Practice with RéseauLu Software

23 Relational Data Analysis

24 Textual Analysis

25 Scientometrix / Data & Web Mining / Practice

26 PubMed Mapping I

27 PubMed Data II

28 Scientometrix Online Demo: Aguidel Web Toolbox

29 Web Cartography I: WebMap, a visual directory that maps more than 2 million web sites

30 Web Cartography II: Conversation Map (Warren Sack), analysis of virtual conversations

31 Web Cartography III: IssueCrawler Project, GovCom.Org, Amsterdam (R. Rogers et al.)

32 Web Cartography Online Demo: Aguidel Web Toolbox

33 Network in Time

34 News: The North Korean English News Space, Sept. 15 – Nov. 15, 2003.
Findings:
- Whitehouse.gov (press release) couches North Korea in terms of regime change and human rights. The only other outlet that does so is Frontpagemag.org, a site which, at the time of this map, exhorted surfers to sign an e-petition and help “Stop the Left’s Anti-American Agenda… Help expose terrorists in our midst”.
- Regime change is connected to war by Fox News, Newsweek, and Asia Times online. These outlets thus frame regime change in terms of military conflict.
- Regime change and reunification are essentially disconnected. There is thus little talk of achieving regime change on the German model.
- The Financial Times subscription service, a strong example of the corporate angle on the issue, presented North Korea only in terms of regime change, notably isolating the issue from conflict, reunification, famine and other issues.
- Only CNN connects famine and reunification, one of the more practical and meaningful associations between the issues of import on the peninsula. This finding defies conventional wisdom, which would have CNN less informed by the stance of regionally located English media outlets.

35 News / offline demo (RéseauLu)

36 Text Mining & Web Mining Tools. Web Tool Box for Heterogeneous Data Analysis. Andrei Mogoutov | AGUIDEL Paris - www.aguidel.com - andrei@aguidel.com
Sources: Textual data, Web Based Data, Bibliographical Databases, Abstracts, Articles, Titles
- Data collection tools
- Lexical tables
- Visualization of Heterogeneous Networks
- Actor/Lexical/Semantic Networks
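A lexical table, as listed above, is commonly understood as a documents-by-terms table of word counts. The sketch below builds one under simplifying assumptions that are not from the presentation: lowercase ASCII tokens only, a minimum word length in place of a stop-word list, and no stemming. The function name and the two sample texts are hypothetical.

```python
import re
from collections import Counter


def lexical_table(documents, min_length=4):
    """Build a documents-by-terms count table.

    documents: dict mapping a document id to its text.
    Returns (vocabulary, rows); rows maps each document id to a list
    of counts aligned with the alphabetically sorted vocabulary.
    """
    counts = {}
    for doc_id, text in documents.items():
        words = re.findall(r"[a-z]+", text.lower())
        counts[doc_id] = Counter(w for w in words if len(w) >= min_length)
    vocabulary = sorted(set().union(*counts.values()))
    rows = {doc_id: [c[term] for term in vocabulary] for doc_id, c in counts.items()}
    return vocabulary, rows


# Placeholder texts, for illustration only.
documents = {
    "d1": "Regime change and famine",
    "d2": "Famine, reunification, famine",
}
vocabulary, rows = lexical_table(documents)
```

Each row can then be treated as a set of weighted links from a document (or its author or outlet) to terms, which is one way such a table could feed the actor/lexical networks named on the slide.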
