Oracle 8i interMedia Text Presented by Jorge Rimblas 4-Feb-2002 SSI Worldwide.


3 Oracle 8i interMedia Text
a.k.a.
Before 8.1.5: Oracle ConText
In Oracle 9i: Oracle Text
Tightly integrated with Oracle 8i to provide better search performance and greater ease of use.

4 What does it do?
Extends Oracle8i by indexing any text or documents stored in Oracle8i, in operating system flat files, or at URLs.
Enables content-based queries using standard SQL.

5 How do you use it?
Create an index on the item description (a VARCHAR2 column):
create index mtl_system_items_ctx on mtl_system_items_b(description) indextype is ctxsys.context;

6 How do you use it?
We can run content-based queries with the CONTAINS function:
select segment1, description from mtl_system_items_b where contains(description, 'Monitor') > 0;
NOTE: The > 0 part is necessary to make the query legal Oracle SQL, which does not support boolean return values for functions (yet).

7 Results…
select segment1, description, score(1) from mtl_system_items_b where contains(description, 'monitor',1) > 0

8 Results…
select segment1, description, score(1) from mtl_system_items_b where contains(description, 'monitor and LCD',1) > 0

9 Operators and Querying
CONTAINS, SCORE
Score can be between 0 and 100, but the top result will not necessarily have a score of 100.
Salton formula used for the score: 3f(1+log(N/n))
Operators: AND (&), OR (|), NOT (~), FUZZY (?), SOUNDEX (!), EQUIV (=), ABOUT ( ) and { }
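The score formula above can be worked through numerically. The sketch below is only an illustration, not Oracle's implementation. It assumes that f is the number of times the term occurs in the document, N is the total number of rows, and n is the number of rows containing the term. The base-10 logarithm, the rounding, and the function name salton_score are assumptions made for this example; the cap at 100 follows from the stated 0 to 100 range.

```python
import math


def salton_score(f, N, n, log=math.log10):
    """Illustrative score 3f(1 + log(N/n)), capped at 100.

    f: occurrences of the term in the document (assumed meaning)
    N: total number of rows in the table (assumed meaning)
    n: number of rows containing the term (assumed meaning)
    log: logarithm function; base 10 is an assumption of this sketch
    """
    if f < 0 or n <= 0 or N < n:
        raise ValueError("expected f >= 0 and 0 < n <= N")
    raw = 3 * f * (1 + log(N / n))
    # Rounding to an integer is an assumption; the cap keeps the 0..100 range.
    return min(100, round(raw))


# A term that appears once in every row carries little information.
common = salton_score(f=1, N=100, n=100)
# A rare term that appears many times in one document hits the cap.
rare = salton_score(f=50, N=1000, n=10)
print(common, rare)
```

Because the score depends on how often the term occurs in each document and how rare it is overall, the best match for a query can score well below 100, which is the point the slide makes.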

10 Creating Indexes
create index INDEXNAME on TABLE(COLUMN) indextype is ctxsys.context;
Only one column is allowed in the column list.
Supported column types: CHAR, VARCHAR, VARCHAR2, LONG, LONG RAW, BLOB, CLOB, BFILE

11 DML Processing
The text index is not updated as part of the DML statement itself, because:
1. Text indexing a document is a lot of work.
2. Inverted indexes, composed of lists of documents by word, are best updated in batches of documents at a time.
3. Most text applications are fairly static, with relatively low DML frequency.

12 INSERT: The document rowid is placed into a queue for later addition to the text index. Queries run before this DML is processed will not find the new document contents.
UPDATE: The old document contents are invalidated immediately, and the document rowid is placed into the queue for later reindexing. Queries run before this DML is processed will not find the old contents, but neither will they find the new contents.
DELETE: The old document contents are invalidated immediately.
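The INSERT, UPDATE and DELETE behavior described above can be modeled with a toy inverted index and a pending queue. This is a conceptual sketch only: the class ToyTextIndex and its methods are hypothetical names invented for the illustration, the tokenizer is a plain whitespace split, and none of it reflects how interMedia Text is actually implemented.

```python
from collections import defaultdict


class ToyTextIndex:
    """Hypothetical model of a text index with deferred DML processing."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of rowids (inverted index)
        self.docs = {}                    # rowid -> current text in the base table
        self.pending = []                 # rowids waiting to be (re)indexed

    def _invalidate(self, rowid):
        # Remove the rowid from every posting list so old contents stop matching.
        for rowids in self.postings.values():
            rowids.discard(rowid)

    def insert(self, rowid, text):
        self.docs[rowid] = text
        self.pending.append(rowid)        # indexed later, at sync time

    def update(self, rowid, text):
        self.docs[rowid] = text
        self._invalidate(rowid)           # old contents vanish immediately
        self.pending.append(rowid)        # new contents appear only after sync

    def delete(self, rowid):
        self.docs.pop(rowid, None)
        self._invalidate(rowid)           # nothing to queue for a delete

    def sync(self):
        # Process the whole queue as one batch.
        for rowid in self.pending:
            text = self.docs.get(rowid)
            if text is None:              # row was deleted while queued
                continue
            self._invalidate(rowid)
            for word in text.lower().split():
                self.postings[word].add(rowid)
        self.pending.clear()

    def contains(self, word):
        return sorted(self.postings.get(word.lower(), set()))


idx = ToyTextIndex()
idx.insert(1, "LCD monitor")
print(idx.contains("monitor"))   # not found yet: the insert is still queued
idx.sync()
print(idx.contains("monitor"))   # found after the queue is processed
```

Between an update and the next sync, the row matches neither its old words nor its new ones, which is the window the slide warns about.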

13 Processing Additions
alter index myindex rebuild online parameters ('sync');
The ONLINE keyword is very important: without it, queries are blocked and DML on the base table fails while the sync is running.

14 Other Stuff…
Stoplist, lexer, filters, storage_clause, wordlist, etc…
CTXCAT index

