12/03/2013 1 Second International Workshop on New Generation Enterprise and Business Innovation NGEBIS 2013 Cross Domain Crawling for Innovation Pieruigi.


1 Second International Workshop on New Generation Enterprise and Business Innovation (NGEBIS 2013), 12/03/2013. Cross Domain Crawling for Innovation. Pierluigi Assogna, Francesco Taglino, CNR-IASI (Italy)

2 Outline: Motivations & Objectives; Methodological approach; Technological approach; Conclusions

3 Motivations and Objectives. In any kind of organization, creativity and innovation come from people. Tools that aim to support creativity need to be based on the most accredited theories of how people use their knowledge to act on the environment, adapt to new situations, and invent. The method proposed here aims to provide knowledge “raw material” capable of triggering out-of-the-box ideas.

4 Constructivism. According to Constructivism, a person’s culture is an integrated network of concepts and models. This network guides the person’s activity and is consolidated, enriched, and modified by each new experience. Apart from pathological situations (schizophrenia), each person’s structure is in any case connected.

5 New Paths. The connections between concepts create paths that, with time, our mind travels more or less automatically. In new situations we have to “take the lead” and try new paths, possibly linking different and distant clusters. This is, for instance, what “lateral thinking” methods favor.

6 Knowledge Base. In general, a domain Knowledge Base (KB) is a tool for maintaining and enriching its users’ focused knowledge. In particular, the KB’s ontology mimics their focused conceptual structure. When users are confronted with new issues, a search on the KB or on the Net (on the basis of the domain ontology) typically keeps them within this focused ground.

7 The Methodology. We propose a way to extend a focused knowledge domain to support diversions from usual thinking paths. We use the domain ontology to search the Net for documents that address key topics of the domain together with topics belonging to different domains. These documents have a good probability of containing considerations, theories, and metaphors that link the person’s knowledge clusters with “exotic” ones, and so can trigger out-of-the-box ideas.

8 Semantics-based cross-domain crawling

9 Documental Resources. The space where we search for interesting documents: websites (e.g., MIT website on innovations), RSS feeds, and public document repositories (e.g., BBC news). In our example we focus on the Robotics and Machine Vision (R&MV) domain.

10 Linked Data. A set of principles to allow: a standard description of data (RDF-based); a standard way of accessing data (HTTP); linking resources/data to one another. Linking Open Data is a project for publishing datasets (e.g., DBpedia) in a Linked Data fashion.

11 The Linking Open Data cloud [figure: LOD cloud diagram, DBpedia labeled]

12 Reference ontology and bridge to the LOD cloud. Within the BIVEE project we have built a glossary of 600 concepts on R&MV. We enriched these concepts with DBpedia entries (owl:sameAs). Example links from the R&MV reference ontology to DBpedia: Photodiodes owl:sameAs http://dbpedia.org/page/Photodiode; Camera owl:sameAs http://dbpedia.org/page/Camera
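
A minimal sketch of how such owl:sameAs bridges could be written out as N-Triples statements. The two DBpedia URLs are the ones given on the slide; the `http://example.org/rmv#` namespace for the reference ontology is a hypothetical placeholder, since the slides do not give the real one.

```python
# Hypothetical namespace for the R&MV reference ontology (not from the slides).
RMV_NS = "http://example.org/rmv#"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Links shown on the slide: reference concept -> DBpedia entry.
BRIDGES = {
    "Photodiodes": "http://dbpedia.org/page/Photodiode",
    "Camera": "http://dbpedia.org/page/Camera",
}


def to_ntriples(bridges):
    """Return one N-Triples statement per owl:sameAs link, sorted by concept name."""
    return [
        f"<{RMV_NS}{concept}> <{OWL_SAME_AS}> <{uri}> ."
        for concept, uri in sorted(bridges.items())
    ]


for statement in to_ntriples(BRIDGES):
    print(statement)
```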

13 Terms extraction from analyzed document. Extracted terms/concepts are representative of the document’s content and to some extent synthesize it. We analyzed different tools for extracting knowledge from documents: Zemanta, Alchemy, OpenCalais, FISE. AlchemyAPI extracts concepts from a text, each with a relevance value and a link to DBpedia and other LOD datasets.

14 Semantic Filter over a doc. Two steps: (1) identify the extracted concepts related to our domain of interest; (2) identify good candidates and discard uninteresting documents.

15 Semantic Filter over a doc: step 1. Identify the extracted concepts related to our domain of interest (e.g., R&MV). An extracted concept ec, with r1 its reference to a DBpedia entry, is related to the domain if there exists at least one reference ontology concept rc, with r2 its reference to a DBpedia entry, such that r1 = r2 OR ((r1 dc:subject r) AND (r2 dc:subject r)), where r is a resource.
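
The step 1 condition can be sketched as a small membership test. This is an illustration under stated assumptions, not the authors' implementation: the `is_related` function, the `dbpedia:` and `category:` shorthand identifiers, and the sample dc:subject data are all made up for the example; a real system would presumably retrieve the dc:subject values from DBpedia.

```python
def is_related(r1, reference_refs, subjects):
    """Return True if r1 equals some reference entry r2, or shares a dc:subject with one.

    r1             -- DBpedia reference of the extracted concept (ec)
    reference_refs -- DBpedia references (r2) of the reference ontology concepts (rc)
    subjects       -- dict mapping a DBpedia reference to its set of dc:subject resources
    """
    r1_subjects = subjects.get(r1, set())
    for r2 in reference_refs:
        if r1 == r2:
            return True  # same DBpedia entry
        if r1_subjects & subjects.get(r2, set()):
            return True  # at least one shared dc:subject resource r
    return False


# Illustrative data only; these category names are hypothetical.
REFERENCE_REFS = {"dbpedia:Photodiode", "dbpedia:Camera"}
SUBJECTS = {
    "dbpedia:Photodiode": {"category:Optical_devices"},
    "dbpedia:Camera": {"category:Cameras"},
    "dbpedia:Phototransistor": {"category:Optical_devices"},
    "dbpedia:Insect": {"category:Insects"},
}

for ref in ("dbpedia:Camera", "dbpedia:Phototransistor", "dbpedia:Insect"):
    print(ref, is_related(ref, REFERENCE_REFS, SUBJECTS))
```

Concepts that pass this test form the set S1 used in step 2; all others form S2.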

16 Semantic Filter over a doc: step 2. Let S1 be the set of extracted concepts related to our domain, and S2 the set of extracted concepts NOT related to our domain. A document is a good candidate if (a) t1 < Sum(relVal(S1)) < t2, with t1 = 0.1 and t2 = 0.4, AND (b) Sum(relVal(S2)) > t3, with t3 = 0.4. Constraint (a) ensures that the analyzed document deals with our reference domain, but only to a small extent; constraint (b) ensures that it deals with other topics to a considerable extent.
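
The two constraints translate directly into a threshold check. The sketch below uses the thresholds stated on the slide (t1 = 0.1, t2 = 0.4, t3 = 0.4); the function name `is_good_candidate` and the relevance values in the usage lines are hypothetical, because the relevance tables of the original examples are not preserved in this transcript.

```python
T1, T2, T3 = 0.1, 0.4, 0.4  # thresholds given on the slide


def is_good_candidate(related, unrelated, t1=T1, t2=T2, t3=T3):
    """Apply constraints (a) and (b) to the relevance values of S1 and S2.

    related   -- relevance values of extracted concepts in S1 (in-domain)
    unrelated -- relevance values of extracted concepts in S2 (out-of-domain)
    """
    in_domain = sum(related)
    out_of_domain = sum(unrelated)
    # (a) the domain is present, but only marginally; (b) other topics dominate.
    return (t1 < in_domain < t2) and (out_of_domain > t3)


# Hypothetical relevance values, for illustration only.
print(is_good_candidate([0.2], [0.3, 0.3]))      # small in-domain share, large out-of-domain share
print(is_good_candidate([], [0.9]))              # no in-domain concept at all
print(is_good_candidate([0.5, 0.3], [0.5]))      # too domain-oriented
```

The three usage lines mirror the three outcomes in the filtering examples: a document suggested as interesting, one that ignores the domain, and one that is too focused on it.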

17 Filtering: example 1. [Table: Extracted Concepts and Relevance] The document is about extracting energy from insects. SUGGESTED AS INTERESTING.

18 Filtering: example 2. [Table: Extracted Concepts and Relevance] The document is about helping shoppers get the right fit when buying clothes online. SUGGESTED AS INTERESTING.

19 Filtering: example 3. [Table: Extracted Concepts and Relevance] The document does not consider Robotics and Machine Vision at all. NOT INTERESTING document.

20 Filtering: example 4. [Table: Extracted Concepts and Relevance] The document is too Robotics-oriented: it can surely be useful for experts in the Robotics field, but it does not appear inspiring for lateral thinking. NOT INTERESTING document.

21 Conclusions and Outlook. This is very preliminary work on supporting lateral thinking activities. More experimentation is needed. The LOD cloud should be used as much as possible.

22 Questions & Answers

