Applying Key Phrase Extraction to aid Invalidity Search


1 Applying Key Phrase Extraction to aid Invalidity Search
Manisha Verma, Vasudeva Varma SIEL, LTRC, IIIT Hyderabad

2 Outline
Introduction; Related Work; Motivation and Contribution; Approaches; Experiments and Results; Future Work; Questions

3 INTRODUCTION

4 Invalidity Search
The task is to uncover patents or other published prior art that may render a granted patent invalid. The goal is to find prior art that the patent examiner overlooked, so that the patent can be declared invalid.

5 Input and Process
INPUT: A patent application.
PROCESS: Use existing search engines to find similar work. Manually create queries, then go through several documents (articles, granted patents, etc.) to find similar documents.

6 Related Work

7 Related Work
There are two ways of approaching the problem: (1) create a query from a patent and try different retrieval models; (2) use different models to create a query from a patent, then use an existing retrieval model. Our work employs the second approach.

8 Approach 1
Use the claim text or abstract to create a query from the patent. The following have been used to improve recall and precision: re-ranking using several features; cluster-based pseudo-relevance feedback; scoring based on subtopics, etc.

9 Approach 2
Select words/phrases from different sections of a patent, and find out which section results in the best queries. Select words from a patent using tf-idf. Assign a weight to each word to mark its importance; the common weighting methods explored are tf and tf-idf. Identify the optimal length of the query, i.e. the number of words to keep in a query generated from a patent; this value is determined empirically.
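The word selection described on this slide can be sketched roughly as follows. This is a minimal illustration, not the implementation used in any of the cited work: the function name `tfidf_query`, the smoothed idf formula, and the toy corpus are all assumptions made for the example, and the query length `k` stands in for the empirically tuned value.

```python
import math
from collections import Counter

def tfidf_query(doc_tokens, corpus, k=3):
    """Return the k highest-scoring (term, weight) pairs for one document.

    doc_tokens: list of tokens from the patent (or one of its sections).
    corpus: list of token lists, used only to count document frequencies.
    k: query length (hypothetical default; the slides tune it empirically).
    """
    n_docs = len(corpus)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for doc in corpus:
        df.update(set(doc))

    tf = Counter(doc_tokens)
    scores = {}
    for term, count in tf.items():
        # Smoothed idf (an assumption; other idf variants are common).
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
        scores[term] = count * idf

    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]

corpus = [
    ["rotor", "blade", "wind", "turbine", "rotor"],
    ["engine", "blade", "fuel"],
    ["wind", "fuel", "engine"],
]
query = tfidf_query(corpus[0], corpus, k=2)
print([term for term, _ in query])
```

The returned weights can be passed to a retrieval engine that supports weighted query terms; replacing `count * idf` with `count` gives the plain tf weighting.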

10 Motivation and Contribution

11 Motivation and Contribution
Explore and evaluate different ways of selecting phrases to build queries from patents. Though several key phrase extraction approaches have been proposed in the literature, they have not been used to create queries for the invalidity search task. Evaluate and analyze the performance of queries created using state-of-the-art unsupervised and supervised key phrase extraction techniques.

12 Approaches

13 Key Phrase Extraction Techniques
Unsupervised: TextRank (R. Mihalcea et al.), SingleRank (X. Wan et al.), tf-idf, tf.
Supervised: RankPhrase (X. Jiang et al.), KEA (I. H. Witten et al.).

14 Unsupervised Approaches
TextRank: Represent the text as a graph using co-occurrence statistics. Run an iterative algorithm to find the dominant nodes (words) in the graph.
SingleRank: Same approach as TextRank. However, while in TextRank only phrases containing the top-ranked words are selected, in SingleRank no low-scoring words are filtered out.
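The graph-ranking step can be sketched as below. This is a simplified illustration under stated assumptions: it ranks single words on an unweighted, undirected co-occurrence graph with a PageRank-style update, and it omits part-of-speech filtering and the phrase-forming step. The function name `textrank`, the `window` definition (maximum token distance), the damping factor 0.85, and the fixed iteration count are choices made for this example.

```python
from collections import defaultdict

def textrank(tokens, window=2, damping=0.85, iterations=50):
    """Score words by a PageRank-style walk over a co-occurrence graph.

    Two distinct words are linked if they occur within `window`
    positions of each other anywhere in the token sequence.
    """
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        updated = {}
        for word in neighbors:
            # Each neighbor shares its score equally among its own links.
            incoming = sum(scores[v] / len(neighbors[v]) for v in neighbors[word])
            updated[word] = (1 - damping) + damping * incoming
        scores = updated
    return scores

tokens = "fuel cell membrane fuel cell electrode fuel cell stack".split()
scores = textrank(tokens, window=1)
print(sorted(scores, key=scores.get, reverse=True))
```

A SingleRank-style variant would typically add edge weights based on co-occurrence counts and score each candidate phrase by summing the scores of its words, rather than keeping only phrases built from the top-ranked words.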

15 Supervised Approaches
KEA: Use features to represent key phrases. Train a classifier on manually annotated data.
RankPhrase: Treat key phrase extraction as a ranking problem. The same features as in KEA are used.
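The slides do not list the features. As published, KEA represents each candidate phrase by two features, its TF×IDF score and the relative position of its first occurrence, and feeds them to a Naive Bayes classifier. The sketch below computes only those two features; the function name `kea_features` and its arguments are hypothetical, and the classifier (or, for RankPhrase, the ranking model) is left out.

```python
import math

def kea_features(phrase, doc_tokens, doc_freq, n_docs):
    """Compute (tf_idf, first_occurrence) for one candidate phrase.

    phrase: tuple of tokens, e.g. ("fuel", "cell").
    doc_tokens: token list of the document being annotated.
    doc_freq: number of corpus documents containing the phrase (>= 1 assumed).
    n_docs: total number of documents in the corpus.
    Returns None if the phrase does not occur in the document.
    """
    n = len(phrase)
    positions = [
        i for i in range(len(doc_tokens) - n + 1)
        if tuple(doc_tokens[i:i + n]) == tuple(phrase)
    ]
    if not positions:
        return None

    # Term frequency normalized by document length.
    tf = len(positions) / len(doc_tokens)
    # Inverse document frequency as -log2 of the phrase's corpus rate.
    idf = -math.log2(doc_freq / n_docs)
    # Fraction of the document that precedes the first occurrence.
    first_occurrence = positions[0] / len(doc_tokens)
    return tf * idf, first_occurrence

doc = ["a", "fuel", "cell", "b", "fuel", "cell", "c", "d"]
print(kea_features(("fuel", "cell"), doc, doc_freq=2, n_docs=8))
```

Each labeled phrase then becomes one training example: the feature pair plus a key-phrase / not-key-phrase label.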

16 Training Supervised Approaches?
To annotate patents with key phrases, take some applications with relevance judgments. For every phrase in the document: fire it as a query, then calculate the MAP and recall of that phrase (using the relevance judgments). Select phrases with high MAP and recall, prune them based on tf-idf scores, and use the remaining phrases as key phrases for the document. Use sample documents annotated in this way to train the supervised approaches.
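The annotation loop above might look roughly like the following sketch. Everything named here is an assumption for illustration: `search` is a hypothetical callable that returns a ranked list of document ids for a phrase, the thresholds `min_ap` and `min_recall` are placeholders for whatever "high" means in practice, and the tf-idf pruning step is not shown. For a single phrase query, MAP reduces to the average precision of that one ranked list.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list against a relevant set."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant hit
    return total / len(relevant)

def recall(ranked, relevant):
    """Fraction of the relevant documents that appear in the result list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & set(relevant)) / len(relevant)

def label_phrases(candidates, search, relevant, min_ap=0.3, min_recall=0.5):
    """Keep candidate phrases whose single-phrase query retrieves well."""
    kept = []
    for phrase in candidates:
        ranked = search(phrase)
        if (average_precision(ranked, relevant) >= min_ap
                and recall(ranked, relevant) >= min_recall):
            kept.append(phrase)
    return kept

# Toy stand-in for a search engine: fixed rankings per phrase.
fake_index = {
    "fuel cell": ["p1", "p9", "p2"],
    "method": ["p7", "p8"],
}
relevant = {"p1", "p2"}
print(label_phrases(["fuel cell", "method"], lambda q: fake_index[q], relevant))
```

Phrases that pass the filter serve as positive training examples for the document; the remaining candidates can serve as negatives.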

17 Experiments And Results

18 Our Data
1.3 million patents (NTCIR). 1000 patent applications. For each application, a list of patents which claim the same invention is provided.

19 Unsupervised vs Supervised

20 Performance on different sections

21 Results
The experiments indicate that key phrase extraction techniques indeed improve invalidity search results. Queries created using unsupervised and supervised approaches perform better than those formed by tf or tf-idf. Among the supervised approaches, queries created from phrases extracted by KEA show 29% and 37% improvement in MAP over TextRank and tf-idf respectively.

22 Future Work
Weigh queries generated by using both the approaches. Try the approaches on different patent collections. Explore combination of the two approaches for query construction.

23 References
X. Xue and W. B. Croft. Automatic query generation for patent search. In CIKM '09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 2037–2040, NY, USA, ACM.
R. Mihalcea and P. Tarau. TextRank: Bringing order into texts. In Proc. of EMNLP, 2004.
X. Xue and W. B. Croft. Transforming patents into prior-art queries. In SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 808–809, NY, USA, ACM.
X. Jiang, Y. Hu, and H. Li. A ranking approach to key phrase extraction. In SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 756–757, NY, USA, ACM.

24 Questions ???

