Introduction to Full-Text Searching in SQL Server 2012 Adolfo J. Socorro, Ph.D. IT Impact, Inc.

1 Introduction to Full-Text Searching in SQL Server 2012 Adolfo J. Socorro, Ph.D. IT Impact, Inc. asocorro@itimpact.com

2 Outline
- What can we do with FTS?
- How to install FTS
- FTS components
- Creating FTS indexes
- How to query with FTS
- FILESTREAM and FileTable

3 FTS Basics
FTS allows searching against character-based data in columns of these types:
- char, varchar, nchar, nvarchar
- text, ntext
- image
- xml
- varbinary, varbinary(max)

4 Search Functionality
- Specific words or phrases: “hotel” => “hotel”
- Prefixes: “fan” => “fantastic”, “fantasy”; “local store” => “locally stored”
- Inflectional forms: “minimized” => “minimizing”, “minimise”

5 Search Functionality
- Proximity: “search,query” => “query to perform search”
- Synonyms: “folder” => “directory”
- Weighted values

6 A First Look
Let’s run some simple examples to get a feel for FTS!

7 LIKE vs FTS
- LIKE works on character patterns only
- The LIKE predicate cannot be used to query formatted binary data
- FTS is much faster against large amounts of unstructured text data
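
The contrast can be sketched with two queries. This is only an illustration: the table `dbo.Documents` and its columns are hypothetical, and the second query assumes a full-text index already exists on `Body`.

```sql
-- Hypothetical table: dbo.Documents(DocId, Title, Body),
-- assumed to have a full-text index on Body.

-- Pattern match: a leading wildcard generally forces a scan of every row,
-- and it also matches substrings such as 'hotels' or 'hotelier'.
SELECT DocId, Title
FROM dbo.Documents
WHERE Body LIKE N'%hotel%';

-- Full-text match: resolved through the full-text index,
-- and matches the word 'hotel' rather than any substring.
SELECT DocId, Title
FROM dbo.Documents
WHERE CONTAINS(Body, N'hotel');
```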

8 Supported SQL Server Editions
- Enterprise
- Business Intelligence
- Standard
- Web
- Express with Advanced Services
Available since at least SQL Server 2000

9 FTS Components
- Word breaker
- Stemmer
- Stoplists
- Thesaurus
- Filters
- Property lists

10 Language Support
- 50+ languages
- Language-specific components
  - Word breakers and stemmers
  - Stoplists
  - Thesaurus files
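
One way to see which languages an instance supports is to query the `sys.fulltext_languages` catalog view. A minimal sketch:

```sql
-- Lists every language registered for full-text search on this instance,
-- with the LCID that LANGUAGE clauses and sys.dm_fts_parser accept.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;
```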

11 How to Install
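
The installation screens are not part of this transcript. As a small supplementary check, the following query reports whether the full-text component is installed on the current instance:

```sql
-- Returns 1 when Full-Text Search is installed on this instance, 0 otherwise.
SELECT FULLTEXTSERVICEPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;
```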

12 Default FTS Language
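
The slide's screenshot is not in the transcript. The same setting can be read through `sp_configure`, where it appears as the advanced option `default full-text language`. A sketch, assuming you have permission to change server configuration; the LCID 1033 (US English) is only an example value:

```sql
-- 'default full-text language' is an advanced option, so advanced options
-- must be visible first. Note that this changes a server-level setting.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Show the current value; config_value and run_value are LCIDs.
EXEC sp_configure 'default full-text language';

-- To change it (example value: 1033 = US English), uncomment:
-- EXEC sp_configure 'default full-text language', 1033;
-- RECONFIGURE;
```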

13 FTS Indexes
- One index per table or indexed view
- Must have a unique, single-column, non-nullable index on the table
- Grouped within the same database into one or more full-text catalogs (“containers”)

14 Full-Text Catalogs
- A logical construct
- A way to manage FT indexes together

15 Index Population
Population: the addition of data to full-text indexes
- Automatic
- Manual
- On request
- Scheduled
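
The population modes map roughly onto `ALTER FULLTEXT INDEX` options. The statements below are alternatives rather than a script to run top to bottom, and `dbo.Documents` is a hypothetical table assumed to already have a full-text index.

```sql
-- Automatic: changes are tracked and propagated to the index in the background.
ALTER FULLTEXT INDEX ON dbo.Documents SET CHANGE_TRACKING AUTO;

-- Manual: changes are tracked but applied only when you ask for it.
ALTER FULLTEXT INDEX ON dbo.Documents SET CHANGE_TRACKING MANUAL;
ALTER FULLTEXT INDEX ON dbo.Documents START UPDATE POPULATION;

-- On request: no change tracking; rebuild the index contents explicitly.
ALTER FULLTEXT INDEX ON dbo.Documents SET CHANGE_TRACKING OFF;
ALTER FULLTEXT INDEX ON dbo.Documents START FULL POPULATION;

-- Scheduled: run one of the START ... POPULATION statements above
-- from a SQL Server Agent job.

-- Check progress: 0 means idle, other values mean a population is under way.
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.Documents'),
                        'TableFulltextPopulateStatus') AS PopulateStatus;
```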

16 Steps to Set Up an Index on a Table
1. Create a full-text catalog
2. For each column to index:
   - Indicate the language
   - Indicate the document type *
3. Choose a change-tracking mechanism

17 Full-Text Index Wizard

18 Example: Create Catalog and Index
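
The presenter's demo script is not part of this transcript. A minimal sketch of the same steps might look like this; all object names (`dbo.Documents`, `PK_Documents`, `DocumentsCatalog`) are hypothetical.

```sql
-- Hypothetical table; the primary key supplies the unique, single-column,
-- non-nullable index that a full-text index requires.
CREATE TABLE dbo.Documents (
    DocId int IDENTITY(1,1) NOT NULL,
    Title nvarchar(200) NOT NULL,
    Body  nvarchar(max) NULL,
    CONSTRAINT PK_Documents PRIMARY KEY CLUSTERED (DocId)
);
GO

-- Step 1: create the catalog that will contain the index.
CREATE FULLTEXT CATALOG DocumentsCatalog AS DEFAULT;
GO

-- Steps 2 and 3: choose the columns and their language (1033 = US English),
-- then the change-tracking mechanism. A binary column would also need
-- TYPE COLUMN <column holding the file extension> to indicate the document type.
CREATE FULLTEXT INDEX ON dbo.Documents
(
    Title LANGUAGE 1033,
    Body  LANGUAGE 1033
)
KEY INDEX PK_Documents
ON DocumentsCatalog
WITH (CHANGE_TRACKING = AUTO, STOPLIST = SYSTEM);
GO
```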

19 CONTAINS
- Precise or prefix matches to single words and phrases
- Proximity matches
- Logical operations between conditions: AND, OR, AND NOT
- Optional use of inflectional forms and thesaurus
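
Each capability corresponds to a form of the CONTAINS search condition. The queries below are a sketch against a hypothetical `dbo.Documents` table, assumed to have a full-text index on `Title` and `Body`. The search terms echo the earlier slides.

```sql
-- Precise match on a phrase.
SELECT DocId FROM dbo.Documents WHERE CONTAINS(Body, N'"local store"');

-- Prefix match: 'fantastic', 'fantasy', ...
SELECT DocId FROM dbo.Documents WHERE CONTAINS(Body, N'"fan*"');

-- Proximity: both terms within 5 terms of each other
-- (this NEAR syntax with a maximum distance is available from SQL Server 2012).
SELECT DocId FROM dbo.Documents WHERE CONTAINS(Body, N'NEAR((search, query), 5)');

-- Logical operations between conditions, here across two columns.
SELECT DocId FROM dbo.Documents WHERE CONTAINS((Title, Body), N'hotel AND NOT motel');

-- Inflectional forms: 'minimized', 'minimizing', ...
SELECT DocId FROM dbo.Documents WHERE CONTAINS(Body, N'FORMSOF(INFLECTIONAL, minimize)');

-- Thesaurus: matches depend on the entries in the language's thesaurus file.
SELECT DocId FROM dbo.Documents WHERE CONTAINS(Body, N'FORMSOF(THESAURUS, folder)');
```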

20 FREETEXT
- Matching the meaning, but not the exact wording, of specified words or phrases
- Always uses inflectional forms and thesaurus
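
A FREETEXT query takes plain text rather than a structured search condition. A sketch, again assuming a hypothetical `dbo.Documents` table with a full-text index on `Title` and `Body`:

```sql
-- The string is word-broken and stemmed, and thesaurus entries are applied,
-- so rows mentioning 'minimized' or 'minimizing' can match as well.
SELECT DocId, Title
FROM dbo.Documents
WHERE FREETEXT(Body, N'minimize folder size');

-- Several columns and an explicit query language (1033 = US English).
SELECT DocId, Title
FROM dbo.Documents
WHERE FREETEXT((Title, Body), N'minimize folder size', LANGUAGE 1033);
```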

21 CONTAINSTABLE and FREETEXTTABLE
- Return a relevance ranking value (RANK) and full-text key (KEY) for each row
- The actual RANK values are unimportant and typically differ each time the query is run
- ISABOUT/WEIGHT influence the ranking in CONTAINSTABLE
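
Both functions return a table of KEY and RANK that is joined back to the base table. A sketch, assuming a hypothetical `dbo.Documents` table whose full-text key is `DocId`; the weights are arbitrary example values.

```sql
-- CONTAINSTABLE with ISABOUT: 'hotel' contributes more to the rank than 'motel'.
SELECT TOP (10) d.DocId, d.Title, ft.[RANK]
FROM CONTAINSTABLE(dbo.Documents, Body,
         N'ISABOUT (hotel WEIGHT (0.8), motel WEIGHT (0.2))') AS ft
JOIN dbo.Documents AS d
    ON d.DocId = ft.[KEY]
ORDER BY ft.[RANK] DESC;

-- FREETEXTTABLE, limited to the 10 highest-ranked matches.
SELECT d.DocId, d.Title, ft.[RANK]
FROM FREETEXTTABLE(dbo.Documents, Body, N'cheap hotel near the beach', 10) AS ft
JOIN dbo.Documents AS d
    ON d.DocId = ft.[KEY]
ORDER BY ft.[RANK] DESC;
```

Use RANK only to order rows within one result set, not to compare results across queries.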

22 Example: Queries

23 Stoplists
A mechanism to discard commonly occurring strings that do not help the search
- a, is, the, by, and, …
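
A custom stoplist can start as a copy of the system stoplist and be extended from there. A sketch: the stoplist name, the added word, and the `dbo.Documents` table (assumed to be full-text indexed) are all hypothetical.

```sql
-- Start from a copy of the system stoplist.
CREATE FULLTEXT STOPLIST DocumentsStoplist FROM SYSTEM STOPLIST;
GO

-- Add a word that is noise for this particular data set.
ALTER FULLTEXT STOPLIST DocumentsStoplist ADD N'lorem' LANGUAGE 'English';
GO

-- Attach the stoplist to an existing full-text index (this triggers a repopulation).
ALTER FULLTEXT INDEX ON dbo.Documents SET STOPLIST DocumentsStoplist;
GO

-- Inspect the stopwords it now contains.
SELECT sw.stopword, sw.language
FROM sys.fulltext_stopwords AS sw
JOIN sys.fulltext_stoplists AS sl
    ON sl.stoplist_id = sw.stoplist_id
WHERE sl.name = N'DocumentsStoplist';
```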

24 Thesaurus
- Nicknames: Robert/Bob
- Common misspellings: calendar/calender
- Homophones: Geoff/Jeff
- Technical terms: proc/procedure
Very powerful if you log searches and learn what users are commonly searching for

25 Thesaurus
- One file per language
- Expansions: “bike” in addition to “bicycle”
- Replacements: “calendar” instead of “calender”
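
The thesaurus itself is an XML file per language (for US English, typically `tsenu.xml` in the instance's `FTData` folder). The sketch below shows the slide's two entries as a comment, then reloads and uses the file. The `dbo.Documents` table is hypothetical and assumed to be full-text indexed.

```sql
/* Fragment of a thesaurus file with the slide's two entries:

<XML ID="Microsoft Search Thesaurus">
  <thesaurus xmlns="x-schema:tsSchema.xml">
    <diacritics_sensitive>0</diacritics_sensitive>
    <expansion>
      <sub>bike</sub>
      <sub>bicycle</sub>
    </expansion>
    <replacement>
      <pat>calender</pat>
      <sub>calendar</sub>
    </replacement>
  </thesaurus>
</XML>
*/

-- Reload the thesaurus for LCID 1033 (US English) after editing the file.
EXEC sys.sp_fulltext_load_thesaurus_file 1033;

-- With the expansion above, this also matches rows that contain 'bicycle'.
SELECT DocId, Title
FROM dbo.Documents
WHERE CONTAINS(Body, N'FORMSOF(THESAURUS, bike)');
```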

26 Filters
- Extract textual information from the document (removing the formatting)
- Send the text to the word-breaker component for the language associated with the column
- Need to manually install Office 2010 and PDF filters
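
After a filter pack is installed at the operating-system level, SQL Server has to be told to load it. A sketch of the usual sequence; it needs server-level permissions.

```sql
-- Document types (file extensions) that currently have a filter registered.
SELECT document_type, path, manufacturer
FROM sys.fulltext_document_types
ORDER BY document_type;

-- Allow filters and word breakers registered with the operating system
-- to be loaded, then restart the filter daemon hosts so they are picked up.
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;
EXEC sp_fulltext_service @action = 'restart_all_fdhosts';
```

Rerunning the first query afterwards should list the newly installed extensions, such as `.docx` or `.pdf`.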

27 Example: FTS Components
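
The demo for this slide is not in the transcript. One way to watch the components at work is `sys.dm_fts_parser`, which shows how a string is broken, stemmed and filtered without touching any table. A sketch using LCID 1033 (US English):

```sql
-- Stemmer: the inflectional forms generated for 'minimized'.
-- Arguments: query string, LCID, stoplist id (0 = system stoplist),
-- accent sensitivity (0 = insensitive).
SELECT display_term, expansion_type, source_term
FROM sys.dm_fts_parser(N'FORMSOF(INFLECTIONAL, "minimized")', 1033, 0, 0);

-- Word breaker and stoplist: special_term marks which tokens are noise words.
SELECT occurrence, display_term, special_term
FROM sys.dm_fts_parser(N'"the hotel is by the beach"', 1033, 0, 0);

-- Thesaurus: expansions and replacements for a term, if any are defined.
SELECT display_term, expansion_type, source_term
FROM sys.dm_fts_parser(N'FORMSOF(THESAURUS, "bike")', 1033, 0, 0);
```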

28 Where to Store Large Objects?
Database vs. file system, compared on:
- Security
- Manageability, recoverability
- Transactional consistency
- Performance

29 Why Store in the Database?
Integrating unstructured data into the relational database provides significant benefits:
- Integrated storage and data management capabilities (e.g., backup)
- Ease of administration and policy management
- Full-text search

30 FILESTREAM
- A database/file system hybrid
- FILESTREAM is an attribute that can be assigned to a varbinary(max) column
- Allows storing BLOB data in the file system
- Not restricted to the 2 GB limit SQL Server imposes on BLOBs

31 FILESTREAM
- SQL Server buffer pool is not used
- Isolation semantics are governed by Database Engine transaction isolation levels

32 Steps to FILESTREAM
1. Enable at OS level
2. Configure at instance level
3. Create a filegroup
4. Add a file to the filegroup (indicate root folder)

33 OS-level Configuration of FILESTREAM

34 Instance-level Configuration of FILESTREAM
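
The configuration screenshots are not in the transcript. The instance-level step and the two filegroup steps can also be done in T-SQL. A sketch: the database name, filegroup name and folder path are hypothetical, and FILESTREAM must already be enabled at the OS level in SQL Server Configuration Manager.

```sql
-- Instance level: 0 = disabled, 1 = T-SQL access only,
-- 2 = T-SQL and Win32 streaming access.
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO

-- Create a filegroup that holds FILESTREAM data.
ALTER DATABASE DocumentsDb
ADD FILEGROUP DocumentsFsGroup CONTAINS FILESTREAM;
GO

-- Add a "file" to the filegroup. FILENAME is the root folder for the data:
-- its parent (C:\Data here) must exist, the last folder must not.
ALTER DATABASE DocumentsDb
ADD FILE (NAME = N'DocumentsFsData', FILENAME = N'C:\Data\DocumentsFsData')
TO FILEGROUP DocumentsFsGroup;
GO
```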

35 Example: FILESTREAM
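
The demo script is not in the transcript. A minimal sketch of a table with a FILESTREAM column, assuming the database already has a FILESTREAM filegroup; the table and column names are hypothetical.

```sql
-- A FILESTREAM table needs a unique, non-null uniqueidentifier ROWGUIDCOL column.
-- Without a FILESTREAM_ON clause the default FILESTREAM filegroup is used.
CREATE TABLE dbo.DocumentFiles (
    FileId   uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    FileName nvarchar(260) NOT NULL,
    Content  varbinary(max) FILESTREAM NULL
);
GO

-- Through T-SQL the column behaves like any varbinary(max) column;
-- the bytes are stored in the file system container.
INSERT INTO dbo.DocumentFiles (FileName, Content)
VALUES (N'hello.txt', CAST('Hello, FILESTREAM' AS varbinary(max)));

SELECT FileName, DATALENGTH(Content) AS Bytes
FROM dbo.DocumentFiles;
```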

36 FILESTREAM
- All data access must be transactional
- Must use specific APIs for file I/O
- Do not edit the files directly!
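
For streaming access, the server side of those APIs comes down to two values fetched inside a transaction. A sketch, assuming a hypothetical table `dbo.DocumentFiles` with a FILESTREAM column `Content`:

```sql
BEGIN TRANSACTION;

-- PathName() returns the logical path for the BLOB; the transaction context
-- ties the file handle opened by the client to this transaction.
SELECT Content.PathName()                    AS FilePath,
       GET_FILESTREAM_TRANSACTION_CONTEXT()  AS TxContext
FROM dbo.DocumentFiles
WHERE FileName = N'hello.txt';

-- The client passes both values to OpenSqlFilestream (Win32) or
-- SqlFileStream (.NET), reads or writes the stream, closes the handle,
-- and only then commits.
COMMIT TRANSACTION;
```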

37 When to Use FILESTREAM
- Objects that are being stored are, on average, larger than 1 MB
  - Store smaller objects in the database
- Fast read access is important
- You are using a middle tier for application logic

38 FileTables
- A special, fixed-schema kind of table
- Builds on top of existing FILESTREAM capabilities
- Store files and documents in the database, but access them from Windows applications as if they were stored in the file system (Win32 API)

39 FileTables
- Hierarchical namespace
- Includes file system properties as columns
- Preserves full file names
- Non-transactional access through the file system

40 FileTables
- Calls to create or change a file or directory through the Windows share are intercepted by a SQL Server component and reflected in the corresponding relational data in the FileTable

41 Example: FTS over FileTables
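
The demo is not in the transcript. A sketch of the idea: it assumes FILESTREAM is enabled, the database has a FILESTREAM filegroup, and a full-text catalog named `DocumentsCatalog` exists. All object and directory names are hypothetical.

```sql
-- Allow non-transactional file access and name the database-level directory.
ALTER DATABASE DocumentsDb
SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'DocumentsDb');
GO

-- A FileTable has a fixed schema, so no column list is given.
-- Naming the stream_id unique constraint makes it easy to reference below.
CREATE TABLE dbo.DocumentStore AS FILETABLE
WITH (
    FILETABLE_DIRECTORY = N'DocumentStore',
    FILETABLE_STREAMID_UNIQUE_CONSTRAINT_NAME = UQ_DocumentStore_StreamId
);
GO

-- Index the file contents; the built-in file_type column tells FTS
-- which filter to use for each file.
CREATE FULLTEXT INDEX ON dbo.DocumentStore
(
    name        LANGUAGE 1033,
    file_stream TYPE COLUMN file_type LANGUAGE 1033
)
KEY INDEX UQ_DocumentStore_StreamId
ON DocumentsCatalog
WITH (CHANGE_TRACKING = AUTO);
GO

-- Files copied into the share become searchable once they are indexed.
SELECT name, file_stream.GetFileNamespacePath() AS RelativePath
FROM dbo.DocumentStore
WHERE CONTAINS(file_stream, N'hotel');
```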

42 FileTables vs FILESTREAM
- File and directory hierarchy maintained in the database
- Windows application compatibility
- Relational access to file attributes
- Both are available in all editions

43 Wrap Up
- Advanced searching on character-based data, including documents
- FTS setup, components, and queries
- FILESTREAM
- FileTables

44 Other Topics
- Document-property search
- Semantic search
- Optimizations
- Query plans and execution traces

45 References
- Posts and presentations by Bob Beauchemin: http://www.sqlskills.com/blogs/bobb/
- SQL Server FTS Team Blog: http://blogs.msdn.com/b/sqlfts
- SQL Server 2012 Books Online: http://msdn.microsoft.com/en-us/library/cc645577(SQL.110).aspx

46 Filter Packs
- Adobe PDF Filter: http://www.adobe.com/support/downloads/thankyou.jsp?ftpID=4025&fileID=3941
- Office 2010 Filters: http://www.microsoft.com/en-us/download/details.aspx?id=17062
