
1 Project Description 2: Inverted List Database

2 Create an Inverted File: Tokenize a text document and attach to each token a list of the locations where that token appears. Sort the results and store them in an Oracle database.

3 Tokenizer –Define the admissible symbols for a token; we will not use delimiters to capture tokens. –Keep a record of the position of each token.
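A minimal sketch of such a tokenizer is given here. It matches runs of admissible symbols instead of splitting on delimiters, and it records each token's position as a 1-based ordinal. The slides do not list the admissible symbols, so the pattern below (letters, digits, apostrophes, plus single punctuation marks) and the names `TOKEN_PATTERN` and `tokenize` are assumptions made for illustration.

```python
import re

# Assumed token alphabet: a run of letters/digits/apostrophes, or one
# punctuation mark. The slides do not specify the admissible symbols.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9']+|[!?.,;:]")


def tokenize(text):
    """Return (token, position) pairs; positions are 1-based token ordinals."""
    return [
        (match.group(), position)
        for position, match in enumerate(TOKEN_PATTERN.finditer(text), start=1)
    ]


tokens = tokenize("He is a dumb teacher Dumb! Dumb! and Dumb!")
for token, position in tokens:
    print(token, position)
```

Because the pattern describes what a token is, whitespace and any symbol outside the assumed alphabet are skipped without a separate delimiter list.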

4 Tokenizer Example: Document 1: He is a dumb teacher Dumb! Dumb! and Dumb! Document 2: “He is a great council. His advices are really great. He truly helps.”

5 Tokenizer Inverted File for document 1 (continued): dumb 4; Dumb 6; Dumb 8; Dumb 11; He 1; is 2; teacher 5

6 Tokenizer Example: Inverted File for document 1: ! 12; ! 7; ! 9; a 3; and 10

7 Tokenizer Inverted File for document 1: ! 7, 9, 12; a 3; and 10; Dumb 4, 6, 8, 11; He 1; is 2; teacher 5
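The merged inverted file for document 1 can be reproduced with a short sketch like the one below. It assumes that tokens differing only in case are merged, as the entry "Dumb 4, 6, 8, 11" suggests. For simplicity the keys are lowercased, so they read `dumb` and `he` instead of the capitalised forms on the slide. The token pattern and the name `build_inverted_file` are illustrative assumptions.

```python
import re
from collections import defaultdict

# Assumed token alphabet; the slides do not specify it.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9']+|[!?.,;:]")


def build_inverted_file(text):
    """Map each case-folded token to the sorted list of its positions."""
    postings = defaultdict(list)
    for position, match in enumerate(TOKEN_PATTERN.finditer(text), start=1):
        # Positions arrive in increasing order, so each list stays sorted.
        postings[match.group().lower()].append(position)
    # Sort by token so the result can be stored in token order.
    return dict(sorted(postings.items()))


inverted = build_inverted_file("He is a dumb teacher Dumb! Dumb! and Dumb!")
for token, positions in inverted.items():
    print(token, ", ".join(str(p) for p in positions))
```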

8 Tokenizer Inverted File for document 2: . (period) 6, 12; a 3; advices 8; are 9; council 5; great 4, 11; He 1; His 7; is 2; really 10

9 Token database: Store the tokens in the database. The first column holds the sorted tokens. The second column holds the document name or number. The rest of the tuple holds the locations of the token. This is the so-called inverted list. –(Optional) Compress the sequence of locations into some new data type.
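The table layout above could be sketched as follows. The project targets an Oracle database; this example uses Python's built-in `sqlite3` module as a stand-in so that it runs without a server. The table name `inverted_list`, its column names, and the helper functions are all hypothetical. For the optional compression step it uses gap encoding (store the first position, then the differences between neighbouring positions), which is one common choice and not one prescribed by the slides.

```python
import sqlite3


def gap_encode(positions):
    """Replace each position by its distance from the previous one."""
    return [p - prev for prev, p in zip([0] + positions[:-1], positions)]


def gap_decode(gaps):
    """Rebuild the original positions by accumulating the gaps."""
    positions, total = [], 0
    for gap in gaps:
        total += gap
        positions.append(total)
    return positions


# Inverted file for document 1, as listed on slide 7.
postings = {
    "!": [7, 9, 12], "a": [3], "and": [10], "Dumb": [4, 6, 8, 11],
    "He": [1], "is": [2], "teacher": [5],
}

# SQLite stands in for Oracle here; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inverted_list ("
    "token TEXT, doc_no INTEGER, locations TEXT, "
    "PRIMARY KEY (token, doc_no))"
)
for token, positions in postings.items():
    encoded = ",".join(str(gap) for gap in gap_encode(positions))
    conn.execute("INSERT INTO inverted_list VALUES (?, ?, ?)", (token, 1, encoded))
conn.commit()


def lookup(token, doc_no):
    """Return the decoded positions of a token in a document, or []."""
    row = conn.execute(
        "SELECT locations FROM inverted_list WHERE token = ? AND doc_no = ?",
        (token, doc_no),
    ).fetchone()
    return gap_decode([int(gap) for gap in row[0].split(",")]) if row else []


print(lookup("Dumb", 1))
```

Gaps are usually smaller than absolute positions, so they compress well with a variable-length integer encoding. Storing them as comma-separated text here only keeps the sketch short.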

