
Automatic Discovery of Technology Trends from Patent Text Youngho Kim, Yingshi Tian, Yoonjae Jeong, Ryu Jihee, Sung-Hyon Myaeng School of Engineering Information.


1 Automatic Discovery of Technology Trends from Patent Text Youngho Kim, Yingshi Tian, Yoonjae Jeong, Ryu Jihee, Sung-Hyon Myaeng School of Engineering Information and Communication University, South Korea

2 Introduction
Motive: Patent text is a good source for discovering technological progress.
Problem: Previous solutions for the patent domain (citation analysis, network-based patent analysis) have some drawbacks:
– They need domain expertise
– They do not make it easy to recognize salient concepts
– They hamper wide application of the proposed methods

3 Introduction
In this paper, the authors aim to avoid the limitations mentioned previously.
Method
1. Semantic key-phrase extraction (no experts needed)
2. Technological trend discovery (unsupervised)
Semantic key-phrase types:
– Problem, such as “recognizing spoken language”
– Solution, such as “language model”
– Domain, such as “speech recognition”

4 Introduction
Application: help users explore large numbers of technical documents efficiently to identify technological trends.
[Figure: example application]

5 Overall procedure
1. Technology identification through semantic key-phrase extraction
– A probabilistic framework combined with linguistic clues
– The probabilistic features are weighted
– The linguistic clues are weighted
– Finally, a statistical learner (LIBSVM) learns from both
2. Technological trend discovery by
– Selecting important technologies during a time span
– Linking them according to semantic relatedness

6 Problem Formulation
Definition
– Domain: a field of technology given by a user query, which is then used to generate a related collection
– Problem: what a patent or a method attempts to solve
– Solution: a method, a model, or an approach that is associated with a particular problem
– Technology: a combination of a problem, a solution, and the given domain
– Time Span:
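The definitions above can be modelled as a small record type. This is only a sketch: the class name `Technology` and its field names are illustrative choices, not taken from the paper, and the example values reuse the key-phrase examples given on the introduction slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technology:
    """A technology: a problem and a solution within a user-given domain."""
    domain: str    # field of technology given by the user query
    problem: str   # what the patent or method attempts to solve
    solution: str  # method, model, or approach associated with the problem


# Example values taken from the key-phrase examples in the introduction.
tech = Technology(
    domain="speech recognition",
    problem="recognizing spoken language",
    solution="language model",
)
```

Making the record frozen keeps it hashable, so technologies can be counted or used as dictionary keys when they are later grouped by time span.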

7 Problem Formulation
Definition
– Technological Trend: a main stream of technologies during a time span l.
Example:

8 Technological Trend Discovery System
Structure of Patent Documents
Semantic Key-phrase Extraction
– Problem Extraction
– Solution Extraction
Technological Trend Discovery

9 Structure of Patent Documents
Database: USPTO (United States Patent and Trademark Office)
Time span
Citation information
Linguistic features

10 Semantic Key-phrase Extraction
Step 1
– Parse a patent to get the smallest noun phrases as key-phrase candidates (e.g. signal patterns)
– Expand NP to V+NP using dependencies (e.g. recognizing signal patterns)
Step 2
– Identify problem key-phrases by classification
Step 3
– Among the remaining candidates, extract solution key-phrases
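Step 1 can be sketched on already POS-tagged text. This is a simplification under stated assumptions: runs of noun tags stand in for the parser's smallest noun phrases, and "a verb immediately before the noun phrase" stands in for the dependency relation. The function name `candidates` and the Penn-style tags are illustrative, not from the paper.

```python
def candidates(tagged):
    """Return key-phrase candidates from a list of (word, tag) pairs.

    Assumption: a run of NN* tags approximates a smallest noun phrase,
    and a VB* token directly before it approximates the V+NP dependency.
    """
    out = []
    i = 0
    while i < len(tagged):
        if tagged[i][1].startswith("NN"):
            j = i
            while j < len(tagged) and tagged[j][1].startswith("NN"):
                j += 1
            noun_phrase = [word for word, _ in tagged[i:j]]
            out.append(" ".join(noun_phrase))
            # Expand NP to V+NP when a verb directly precedes the noun phrase.
            if i > 0 and tagged[i - 1][1].startswith("VB"):
                out.append(" ".join([tagged[i - 1][0]] + noun_phrase))
            i = j
        else:
            i += 1
    return out


sentence = [
    ("a", "DT"), ("method", "NN"), ("for", "IN"),
    ("recognizing", "VBG"), ("signal", "NN"), ("patterns", "NNS"),
]
print(candidates(sentence))
```

A real pipeline would take the noun phrases and verb-object links from a syntactic parser rather than from tag adjacency.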

11 Problem Extraction
Features
– Topical language model (unigram)
– Dependency between words (bigram model)
– Special smoothing: relevance and background language models
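The unigram feature can be illustrated by scoring a candidate phrase under a relevance model smoothed with a background model. The slide does not say which smoothing is used, so linear interpolation with weight `lam` is an assumption here, as are the function names, the `floor` value for unseen words, and the toy token lists.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram probabilities for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def phrase_log_prob(phrase, relevance, background, lam=0.7, floor=1e-9):
    """Log-probability of a phrase under an interpolated unigram model.

    Assumption: P(w) = lam * P_relevance(w) + (1 - lam) * P_background(w).
    Words unseen in both models are floored to avoid log(0).
    """
    log_prob = 0.0
    for word in phrase:
        p = lam * relevance.get(word, 0.0) + (1 - lam) * background.get(word, 0.0)
        log_prob += math.log(max(p, floor))
    return log_prob


relevance = unigram_model(["speech", "recognition", "speech", "model"])
background = unigram_model(["the", "speech", "of", "model", "the", "data"])
topical = phrase_log_prob(["speech"], relevance, background)
generic = phrase_log_prob(["data"], relevance, background)
```

In this toy data the topical word "speech" scores higher than the generic word "data", which is the behaviour the topical feature is meant to capture. The bigram variant mentioned on the slide is not sketched here.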

12 Problem Extraction
Issue: the probabilistic model is biased toward topicality, so another mechanism is needed to correct it
Method: linguistic clues
– Gather all distinct patterns from the annotation
– Generalize the patterns into a grammar
– E.g. (method/NN + in/PP) and (system/NN + in/PP) ==> (method|system)/NN + in/PP
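The generalization example above can be reproduced with a short sketch. It assumes that patterns sharing the same tag sequence are merged and that differing words at a position become alternatives; the paper's actual generalization procedure is not given on the slide, and the function name `generalize` is illustrative.

```python
from collections import defaultdict


def generalize(patterns):
    """Merge patterns with identical tag sequences into one generalized pattern.

    Each pattern is a list of (word, tag) pairs. Assumption: merging by tag
    sequence; this can over-generalize when several positions differ.
    """
    groups = defaultdict(list)
    for pattern in patterns:
        tags = tuple(tag for _, tag in pattern)
        groups[tags].append(pattern)

    result = []
    for tags, members in groups.items():
        parts = []
        for position, tag in enumerate(tags):
            words = sorted({member[position][0] for member in members})
            alternatives = words[0] if len(words) == 1 else "(" + "|".join(words) + ")"
            parts.append(f"{alternatives}/{tag}")
        result.append("+".join(parts))
    return result


observed = [
    [("method", "NN"), ("in", "PP")],
    [("system", "NN"), ("in", "PP")],
]
print(generalize(observed))
```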

13 Problem Extraction
Feature: 342 generalized patterns

14 Problem Extraction
– The generalized patterns need a confidence value
– A statistical machine learner (LIBSVM) is applied to the linguistic clues and the language models
– LIBSVM classifies each candidate as problem or non-problem using the above features
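To hand such features to LIBSVM, each candidate is typically written as one line in LIBSVM's sparse text format, `<label> <index>:<value> ...`, with 1-based ascending indices. The sketch below only shows that serialization; the feature order (a language-model score, then pattern indicators) and the helper name `to_libsvm_line` are hypothetical.

```python
def to_libsvm_line(label, values):
    """Serialize one example as a LIBSVM sparse-format line.

    label: +1 for problem, -1 for non-problem (a labelling convention
    assumed here). Zero-valued features are omitted, as the format allows.
    """
    pairs = [f"{index}:{value:g}" for index, value in enumerate(values, start=1) if value != 0]
    return " ".join([f"{label:+d}"] + pairs)


# Hypothetical feature vector: [language-model score, pattern A fired, pattern B fired]
line = to_libsvm_line(1, [-2.5, 0, 1])
print(line)
```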

15 Solution Extraction
Probabilistic features are not useful here
– Solution phrases are rarely shared among cited documents
A “head word” feature is added (e.g. model, approach, method, methodology)
The other feature categories are the same as in Problem Extraction
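The head-word feature can be sketched as a simple indicator. Two assumptions are made: the head of an English noun phrase is approximated by its last token, and the word list contains only the four examples named on the slide (the full list is not given).

```python
# Only the head words named on the slide; the paper's full list is not shown.
HEAD_WORDS = {"model", "approach", "method", "methodology"}


def has_solution_head(phrase):
    """Return True if the phrase's last token is a known solution head word."""
    tokens = phrase.lower().split()
    return bool(tokens) and tokens[-1] in HEAD_WORDS
```

For example, `has_solution_head("language model")` is true, while `has_solution_head("speech recognition")` is false.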

16 Technology Trend Discovery
Reduction: select several salient technologies and associate semantic relations between them
How to find a good time span that reveals effective technological trends
– KL-divergence is used to compare two language models
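For reference, the standard KL-divergence between the language models of two time spans can be written as follows. The symbols $P_a$, $P_b$ (unigram models of spans $a$ and $b$) and $V$ (the shared vocabulary) are notation introduced here, and the slide does not state which span plays which role.

```latex
D_{\mathrm{KL}}(P_a \,\|\, P_b) = \sum_{w \in V} P_a(w) \, \log \frac{P_a(w)}{P_b(w)}
```

The measure is zero when the two models are identical and grows as they diverge. It is asymmetric, and it requires $P_b(w) > 0$ wherever $P_a(w) > 0$, which is one reason the language models need smoothing.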

17 Technology Trend Discovery
How to find salient technologies within time spans
– If a technology is important, many patents will refer to it
– Based on the concept of mutual information
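The slide names mutual information without giving the formula. One common way to express "this technology is mentioned unusually often in this time span" is pointwise mutual information over patent counts; the sketch below uses that as an assumption and should not be read as the paper's exact salience measure. The function and parameter names are illustrative.

```python
import math


def pmi(n_both, n_tech, n_span, n_total):
    """Pointwise mutual information between a technology and a time span.

    n_both:  patents in the span that refer to the technology
    n_tech:  patents (any span) that refer to the technology
    n_span:  patents in the span
    n_total: patents in the whole collection
    Computes log( P(tech, span) / (P(tech) * P(span)) ).
    """
    if n_both == 0:
        return float("-inf")
    return math.log((n_both * n_total) / (n_tech * n_span))
```

A value of 0 means the technology is mentioned in the span exactly as often as chance would predict; positive values mean it is over-represented there.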

18 Technology Trend Discovery Algorithm
Step 1 – Define an initial time span (based on the density of the data)
Step 2 – Generate all possible combinations of time spans (e.g. )
Step 3 – Calculate the KL-divergences of all pairs from step 2 and rank them
Step 4 – Select the most important technology among the top n pairs
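Steps 2 and 3 can be sketched as follows. Several assumptions are made: step 2 is read as enumerating all pairs of the initial spans, the language models are add-alpha smoothed unigram models, the divergence is computed from the earlier span to the later one, and pairs are ranked with the largest divergence first. Step 4 is left out because the salience measure is not specified. All names and the toy data are illustrative.

```python
import math
from collections import Counter
from itertools import combinations


def kl(p_tokens, q_tokens, alpha=1.0):
    """KL(P || Q) between add-alpha smoothed unigram models of two token lists."""
    vocab = set(p_tokens) | set(q_tokens)
    p_counts, q_counts = Counter(p_tokens), Counter(q_tokens)
    p_total = len(p_tokens) + alpha * len(vocab)
    q_total = len(q_tokens) + alpha * len(vocab)
    total = 0.0
    for word in vocab:
        p = (p_counts[word] + alpha) / p_total
        q = (q_counts[word] + alpha) / q_total
        total += p * math.log(p / q)
    return total


def rank_span_pairs(span_tokens, top_n=3):
    """Score every pair of spans by KL-divergence and return the top_n pairs."""
    scored = [
        (kl(span_tokens[a], span_tokens[b]), a, b)
        for a, b in combinations(sorted(span_tokens), 2)
    ]
    scored.sort(reverse=True)  # largest divergence first (an assumption)
    return scored[:top_n]


# Toy data: tokens of the key-phrases found in each span (invented for illustration).
spans = {
    "1976-1985": ["template", "matching", "template"],
    "1986-1995": ["template", "hmm", "hmm"],
    "1996-2003": ["hmm", "hmm", "language", "model"],
}
ranked = rank_span_pairs(spans)
```

In the toy data the first and last spans share no vocabulary, so that pair ranks first.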

19 Experiment
Database: USPTO
Domain: Speech recognition
Data size: 1420 US patent documents
Time: 1976 - 2003
Annotators: three computer science graduate students
Annotated set: 400 documents (selected uniformly over the time span)

20 Experiment
Annotation work
– Acronyms are handled using Wiki and simple parenthetical patterns
– WordNet is used to normalize nouns and verbs
Technology phrases (the answers) form a gold standard produced by majority vote
Annotators agreed on 78% of the sample (about 300)
Technology Trend Discovery has no standard evaluation set: building one is too hard (too many time spans), so there is no good evaluation
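The majority-vote step for building the gold standard can be sketched as below. The label strings and the rule of returning `None` when no label has a strict majority are assumptions; the slide does not describe how ties were handled.

```python
from collections import Counter


def majority_label(labels):
    """Return the label chosen by a strict majority of annotators, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None
```

With three annotators, `majority_label(["problem", "problem", "solution"])` yields `"problem"`, while three different labels yield `None`.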

21 Experiment
– Set the background language model
– Used LIBSVM as the machine learner with 5-fold cross-validation
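The 5-fold cross-validation protocol can be sketched as an index splitter: each fold is held out once for testing while the other four are used for training. The interleaved fold assignment and the absence of shuffling or stratification are simplifications assumed here, and the function name is illustrative.

```python
def k_fold_indices(n_items, k=5):
    """Yield (train_indices, test_indices) for each of k folds."""
    folds = [list(range(start, n_items, k)) for start in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


splits = list(k_fold_indices(10))
```

The classifier would be trained and evaluated once per split, and the five scores averaged.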

22 Experiment
All features proved effective

23 Experiment
From the above steps, many meaningful problems and solutions can be discovered
Open issue: synonymy (even when synonyms from WordNet are used)

24 Experiment
Technological trends are discovered with the Technology Trend Discovery Algorithm

25 Conclusion & future work
Discovering such trends can reveal latent technologies
It can also assist exploration by alleviating the information overload caused by search results
Future work
– The synonymy issue in semantic extraction
– A standardized evaluation for TTD needs to be investigated

