Module 20 Working with Full-Text Indexes and Queries.

1 Module 20 Working with Full-Text Indexes and Queries

2 Module Overview
- Introduction to Full-Text Indexing
- Implementing Full-Text Indexes in SQL Server
- Working with Full-Text Queries

3 Lesson 1: Introduction to Full-Text Indexing
- Discussion: The Need for More Flexible User Interaction
- Why LIKE Isn't Enough
- Fuzziness in Queries
- Demonstration 1A: Using Full-Text Queries

4 Discussion: The Need for More Flexible User Interaction
- Consider this search page:
- It's hard to imagine anyone wanting separate fields, etc., yet this is exactly how we build most business applications today
  - Why?
- Ask yourself if you'd prefer an interface like this:

5 Why LIKE Isn't Enough
- We can try to build more flexible search using the T-SQL LIKE operator
- A search for someone named Lee returns Ashlee, Carolee, Colleen, Kathleen, Kaylee, Lee, Shirleen, etc.
    SELECT DISTINCT FirstName
    FROM Person.Person
    WHERE FirstName LIKE '%Lee%';
- Substrings aren't words: a search for Pen returns pen, pencil, pendulum, penitentiary, open, etc.
- Searching for two words gets even more complicated:
    WHERE Details LIKE '%Fred%Terry%'
       OR Details LIKE '%Terry%Fred%'

6 Fuzziness in Queries
- IT professionals tend to like to work in an exact and precise way
- End users prefer flexible and fuzzy search capabilities
- You might be able to find substrings in T-SQL, but how would you find:
  - the word Kathleen near the word bicycle?
  - the word Client when the search term was Customer?
  - the words Driving or Drove when the search term was Drive?
  - rows relating to Attempts to Improve Solar Energy Efficiency?

7 Demonstration 1A: Using Full-Text Queries
In this demonstration you will see why full-text indexing is important for creating advanced and flexible user interfaces

8 Lesson 2: Implementing Full-Text Indexes in SQL Server
- Discussion: Search-related Options
- Full-Text Search in SQL Server
- Core Components of Full-Text Search
- Language Support and Supported Word Breakers
- Implementing Full-Text Indexes
- Demonstration 2A: Implementing Full-Text Indexes

9 Discussion: Search-related Options
- Which forms of search are you familiar with?
  - Bing?
  - Search in the operating system?
  - Search in Outlook?
  - Full-text search in earlier versions of SQL Server?
  - Other search engines?

10 Full-Text Search in SQL Server
- Full-text search allows full-text queries against character-based data stored in SQL Server
  - char, varchar, nchar, nvarchar
  - text, ntext, image
  - xml
  - varbinary(max)
- Full-text indexes
  - are created on the tables containing the character-based data
  - are stored in the database along with other data
  - allow columns to be written in many languages
  - can be queried with simple words or phrases
  - support ranking of results via table-valued functions

11 Core Components of Full-Text Search
Which of these components are likely to be specific to a language?

12 Language Support and Supported Word Breakers
Arabic Bengali Brazilian British English Bulgarian Canadian English Catalan Chinese (Simplified) Chinese (Traditional) Chinese (Hong Kong) Chinese (Macau) Chinese (Singapore) Croatian Danish Dutch English French German Gujarati Hebrew Hindi Icelandic Indonesian Italian Japanese Korean Latvian Lithuanian Malay - Malaysia Malayalam Marathi Neutral Norwegian Polish Portuguese Punjabi Romanian Russian Serbian (Cyrillic) Serbian (Latin) Slovak Slovenian Spanish Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese
Query sys.fulltext_languages to see the current list
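As a small illustration of that last point, the catalog view can be queried directly. This is only a sketch; the rows returned depend on the SQL Server version and on which word breakers are installed on the instance.

```sql
-- List the languages the instance's full-text engine knows about.
-- lcid is the locale identifier used in LANGUAGE clauses (1033 = English).
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;
```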

13 Implementing Full-Text Indexes
Steps to implement full-text indexing:
1. Have a table with character-based data
2. Create a full-text catalog (if none exists already)
3. Create a full-text index on the table
4. Populate the index
5. Query the index using full-text predicates or table-valued functions (TVFs)
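The steps above might look roughly like this in T-SQL. This is a minimal sketch, not the demonstration script: the dbo.Messages table and its Description column are borrowed from the query slides later in this module, while the catalog name MessagesCatalog and the key index name PK_Messages are assumptions made up for illustration.

```sql
-- Create a full-text catalog (MessagesCatalog is a hypothetical name)
-- and make it the default catalog for the database.
CREATE FULLTEXT CATALOG MessagesCatalog AS DEFAULT;

-- Create the full-text index on the character-based column.
-- KEY INDEX must name a unique, single-column, non-nullable index
-- on the table; PK_Messages is assumed to exist here.
CREATE FULLTEXT INDEX ON dbo.Messages
    (Description LANGUAGE 1033)      -- 1033 = English
    KEY INDEX PK_Messages
    ON MessagesCatalog
    WITH CHANGE_TRACKING AUTO;       -- population starts automatically

-- Check whether population has finished.
-- TableFulltextPopulateStatus returns 0 when the index is idle.
SELECT OBJECTPROPERTYEX(OBJECT_ID('dbo.Messages'),
                        'TableFulltextPopulateStatus') AS PopulateStatus;
```

With CHANGE_TRACKING AUTO the engine populates the index and keeps it current as rows change, so no separate population statement is needed in this sketch.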

14 Demonstration 2A: Implementing Full-Text Indexes
In this demonstration you will see:
- how to create a full-text catalog
- how to create a full-text index
- how to check when a full-text index is fully populated

15 Lesson 3: Working with Full-Text Queries
- CONTAINS Queries
- FREETEXT Queries
- Table Functions and Ranking Results
- Thesaurus
- Stopwords and Stoplists
- SQL Server Management of Full-Text
- Demonstration 3A: Working with Full-Text Queries

16 CONTAINS Queries
- Searches for words
- Can use operators
  - AND, OR, AND NOT
- Can use proximity (NEAR), inflectional, and thesaurus forms
    SELECT MessageID, Description
    FROM dbo.Messages
    WHERE CONTAINS(Description, 'filing')
    ORDER BY MessageID;
    SELECT MessageID, Description
    FROM dbo.Messages
    WHERE CONTAINS(Description, 'file AND NOT boundary')
    ORDER BY MessageID;
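The proximity and inflectional forms mentioned on this slide could be written as follows. This is a sketch against the same dbo.Messages table used on the slide; the search terms are picked purely for illustration.

```sql
-- Proximity: rows where "file" occurs near "boundary".
SELECT MessageID, Description
FROM dbo.Messages
WHERE CONTAINS(Description, 'file NEAR boundary')
ORDER BY MessageID;

-- Inflectional forms: matches word forms of "terminate"
-- such as terminates, terminated, and terminating.
SELECT MessageID, Description
FROM dbo.Messages
WHERE CONTAINS(Description, 'FORMSOF(INFLECTIONAL, terminate)')
ORDER BY MessageID;
```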

17 FREETEXT Queries
- Are used to search for values that match the meaning, not just the wording
- Internally assign each term a weight and then find matches
- Work on a single table but can work with joins of multiple tables in a FROM clause
    SELECT MessageID, Description
    FROM dbo.Messages
    WHERE FREETEXT(Description, 'statement was terminated')
    ORDER BY MessageID;

18 Table Functions and Ranking Results
- Table-valued function versions of CONTAINS and FREETEXT provide ranking of relevance
  - CONTAINSTABLE
  - FREETEXTTABLE
- KEY and RANK columns are provided by the functions
    SELECT m.MessageID, m.Description, ft.RANK
    FROM dbo.Messages AS m
    INNER JOIN FREETEXTTABLE(dbo.Messages, Description, 'statement was terminated') AS ft
        ON m.MessageID = ft.[KEY]
    ORDER BY ft.RANK DESC;
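CONTAINSTABLE can be used the same way. The sketch below assumes the same dbo.Messages table; the weights and the limit of 10 rows are arbitrary values chosen for illustration.

```sql
-- ISABOUT lets each term carry a weight (0.0 to 1.0) that influences RANK.
-- The final argument (10) asks for only the ten highest-ranked matches.
SELECT m.MessageID, m.Description, ct.RANK
FROM dbo.Messages AS m
INNER JOIN CONTAINSTABLE(dbo.Messages, Description,
        'ISABOUT (file WEIGHT (0.8), boundary WEIGHT (0.2))',
        10) AS ct
    ON m.MessageID = ct.[KEY]
ORDER BY ct.RANK DESC;
```

RANK values are only meaningful relative to other rows returned by the same query, so they are best used for ordering rather than as absolute scores.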

19 Thesaurus
- Allows searching for words other than those specified
- Provides Replacements and Expansions
- Is implemented as an XML file at the SQL Server instance level
    <diacritics_sensitive>0</diacritics_sensitive>
    <expansion>
        <sub>user</sub>
        <sub>operator</sub>
        <sub>developer</sub>
    </expansion>
    <replacement>
        <pat>NT5</pat>
        <pat>W2K</pat>
        <sub>Windows 2000</sub>
    </replacement>
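A thesaurus entry only takes effect once the file for that language has been loaded. As a sketch, assuming the English (LCID 1033) thesaurus file contains the NT5/W2K replacement from this slide and that dbo.Messages is full-text indexed:

```sql
-- Reload the thesaurus file for LCID 1033 after editing it.
EXEC sys.sp_fulltext_load_thesaurus_file 1033;

-- Ask CONTAINS to apply the thesaurus explicitly: with the replacement
-- rule in place, W2K is searched for as "Windows 2000".
SELECT MessageID, Description
FROM dbo.Messages
WHERE CONTAINS(Description, 'FORMSOF(THESAURUS, W2K)')
ORDER BY MessageID;
```

FREETEXT queries apply the thesaurus automatically, so the explicit FORMSOF form is only needed with CONTAINS.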

20 Stopwords and Stoplists
- Not all words in any language are useful in an index
- Company names, etc. are often in every document and often useless to index
- sys.fulltext_system_stopwords shows the built-in stopwords by language
- Stoplists can be created manually
- Words in Stoplists are not indexed by iFTS
    SELECT * FROM sys.fulltext_system_stopwords WHERE language_id = 1033;
    CREATE FULLTEXT STOPLIST CompanyNames;
    ALTER FULLTEXT STOPLIST CompanyNames ADD 'Microsoft' LANGUAGE 1033;
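A stoplist has no effect until a full-text index uses it. The following sketch assumes the CompanyNames stoplist from this slide and an existing full-text index on dbo.Messages.

```sql
-- Attach the custom stoplist to the table's full-text index.
-- By default this starts a repopulation of the index.
ALTER FULLTEXT INDEX ON dbo.Messages SET STOPLIST CompanyNames;

-- Inspect the words currently held in the custom stoplist.
SELECT sl.name AS StoplistName, sw.stopword, sw.language
FROM sys.fulltext_stoplists AS sl
INNER JOIN sys.fulltext_stopwords AS sw
    ON sw.stoplist_id = sl.stoplist_id
WHERE sl.name = 'CompanyNames';
```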

21 SQL Server Management of Full-Text
- Full-text indexes live within the database
  - Indexes are backed up and/or restored along with the database
  - ALTER INDEX REORGANIZE can be used to defragment a full-text index
  - ALTER FULLTEXT CATALOG REORGANIZE causes a master merge of the full-text indexes in the catalog
- sys.dm_fts_parser and other DMVs are useful for troubleshooting
- sys.fulltext_document_types shows indexable document types
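These management features can be tried out with a few short statements. This is a sketch only: MessagesCatalog is a hypothetical catalog name, and the sample phrase is arbitrary.

```sql
-- Show how the full-text engine breaks a phrase into terms.
-- Arguments: query string, LCID (1033 = English),
-- stoplist id (0 = system stoplist), accent sensitivity (0 = insensitive).
SELECT keyword, special_term, display_term, occurrence
FROM sys.dm_fts_parser('"the statement was terminated"', 1033, 0, 0);

-- Trigger a master merge of the indexes in a catalog
-- (MessagesCatalog is an assumed name).
ALTER FULLTEXT CATALOG MessagesCatalog REORGANIZE;

-- List the document types (file extensions) that can be indexed.
SELECT document_type, path
FROM sys.fulltext_document_types;
```

In the parser output, the special_term column distinguishes noise words (stopwords) from exact matches, which helps explain why a search term returns no rows.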

22 Demonstration 3A: Working with Full-Text Queries
In this demonstration, you will see how to:
- query a full-text index
- locate the built-in stopwords
- create a stoplist and add a value to it
- check the parsing of text by the full-text engine

23 Lab 20: Working with Full-Text Indexes and Queries
- Exercise 1: Implement a full-text index
- Exercise 2: Implement a stoplist
- Challenge Exercise 3: Create a stored procedure to implement a full-text search (only if time permits)
Logon information
Estimated time: 45 minutes

24 Lab Scenario
Users have been complaining about the limited querying ability provided in the marketing system. You are intending to use full-text indexing to address these complaints. You will implement a full-text index on the Marketing.ProductDescription table to improve this situation. You will implement a stoplist to avoid excessive unnecessary index size. If you have time, your manager would like you to help provide a more natural interface for your users. This will involve creating a new stored procedure.

25 Lab Review
- What sorts of values would be useful in stoplists?
- What sorts of values would be useful in a thesaurus?

26 Module Review and Takeaways
- Review Questions
- Best Practices

27 Course Evaluation

