Technical Overview of FAST Search Server 2010 for SharePoint
Sezai Komur
SharePoint Solutions Architect, CSG
What is FAST Search Server 2010 for SharePoint?
Microsoft bought FAST Search and Transfer in 2008 for US$1.2 billion.
A port of FAST ESP, integrated with SharePoint.
FS4SP is a new, enhanced search engine integrated with SharePoint Server 2010.
FS4SP, FSIS, FSIA, FIS-E
How is FAST Search better than SharePoint Server Search?
Better search result quality – better relevancy and use of linguistics, stemming and lemmatisation in search processing
Extreme Scale Search – search billions of documents with sub-second search times
Search Platform Extensibility
Advanced Content Processing
  – Property Extraction
  – Document Processing Pipeline Extensibility
Deep Refinement – exact refinement counts; SharePoint Server Search refiners are not computed over the entire result set
Advanced Filter Pack – support for indexing 200+ file formats without the need to purchase numerous IFilters
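The Deep Refinement point can be illustrated with a toy sketch. It is not how either product computes refiners. The sample data, the function names and the `sample_size=50` cut-off are all assumptions made for illustration. It only shows why counts taken over part of a result set differ from exact counts.

```python
from collections import Counter

# Hypothetical result set: each hit carries one managed-property value (file type).
results = [{"id": i, "filetype": "docx" if i % 3 else "pdf"} for i in range(1, 101)]

def deep_counts(hits, prop):
    """Count refiner values over every hit in the result set."""
    return Counter(hit[prop] for hit in hits)

def shallow_counts(hits, prop, sample_size=50):
    """Count refiner values over only the first `sample_size` hits (assumed cut-off)."""
    return Counter(hit[prop] for hit in hits[:sample_size])

deep = deep_counts(results, "filetype")
shallow = shallow_counts(results, "filetype")
print("deep:", dict(deep))
print("shallow:", dict(shallow))
```

The shallow counts always sum to the sample size, so they understate any value that is common further down the result set.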
How is FAST Search better than SharePoint Server Search? (continued)
Advanced Sorting – sort on Managed Properties and Rank Profiles
Tuneable Relevance with Multiple Rank Profiles
FQL Query Language
Contextual Search – tailor results and refinement to user profile or audience
Rich Web Indexing – dynamic web content and JavaScript, highly customisable connector
Similar Results Detection & Results Collapsing
Thumbnails and Previews – SharePoint 2010 Word and PowerPoint results via Office Web Apps
Visual Best Bets
FAST Search on the Internet
FAST Search for Internet Sites provides a search solution no matter what technology or platform is used to build a website.
[Diagram: a SharePoint Farm alongside a FAST Search Server 2010 for SharePoint Farm. Labelled components: Web Server, Web Parts, Custom front-end, Query Web Service, Federation Object Model, FAST Search Query, Query SSA (Search Service Application), Content SSA (FAST Search Connector: SharePoint, BDC, Exchange, Web), People Search (query/crawl), User Profiles, Central Administration UI (property mapping, property extraction, spell-checking), Site Collection Admin UI (deployment, user context management, promotion/demotion), PowerShell (schema configuration, admin configuration, deployment configuration), Query Processing, Query Matching, Indexing, Item Processing, Web Link Analysis, FAST Search Authorization (FSA), FAST Indexing Connectors, Administration, Monitoring, Microsoft System Center Operations Manager, Content, Active Directory, External federation sources.]
FAST Search Service Applications
Two SharePoint Service Applications communicate with the FAST servers.
FAST Content Search Service Application
  – Connector configuration
  – Crawling
FAST Query Search Service Application
  – Queries and results from associated Web Applications
  – Managed Property mapping configuration
  – People Search
FAST Web Services
Services that SharePoint communicates with:
  – Content Distributors
  – Query Service
  – Administration Service
  – Resource Store
  – Log Server
See Install_Info.txt in the FAST install folder and look in IIS.
Simple Conceptual Architecture
Topology Diagram
FAST Search Sizing
Use at least one dedicated FAST server for production.
Physical is better than virtual.
Good disk I/O is important.
1 x SharePoint + 1 x SQL + 1 x FS4SP server is an ‘extra small’ deployment.
Estimate the number and size of the items to be crawled to work out the disk space required.
See the Capacity Planning white paper.
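The disk space estimate can be sketched as simple arithmetic. The function name, the item count, the average item size and the index-to-content ratio are all placeholders rather than figures from the Capacity Planning white paper. Real ratios should come from that paper and from measurement.

```python
def estimate_index_disk_gb(item_count, avg_item_kb, index_ratio):
    """Rough estimate: raw content size scaled by an assumed index-to-content ratio."""
    raw_gb = item_count * avg_item_kb / (1024 * 1024)  # KB -> GB
    return raw_gb * index_ratio

# Placeholder inputs for illustration only.
estimate = estimate_index_disk_gb(item_count=5_000_000, avg_item_kb=200, index_ratio=0.5)
print(f"Estimated index disk space: {estimate:.1f} GB")
```

Keeping the ratio as an explicit parameter makes it easy to rerun the estimate once a measured value is available.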
Medium Farm
Search Engine Basics
Crawling – gathering content to store in an index
Indexing – storing content in an index optimised for searching
Querying – users execute searches against the index
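The indexing and querying steps can be illustrated with a toy inverted index. This is a general search-engine sketch, not FAST's index format. The sample items and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical crawled items: item id -> extracted text.
crawled_items = {
    "doc1": "FAST Search Server for SharePoint",
    "doc2": "SharePoint Server Search overview",
    "doc3": "Capacity planning for FAST Search",
}

def build_index(items):
    """Indexing: map each lower-cased term to the set of item ids containing it."""
    index = defaultdict(set)
    for item_id, text in items.items():
        for term in text.lower().split():
            index[term].add(item_id)
    return index

def query(index, terms):
    """Querying: return ids of items that contain every query term (AND semantics)."""
    postings = [index.get(term.lower(), set()) for term in terms.split()]
    return sorted(set.intersection(*postings)) if postings else []

index = build_index(crawled_items)
print(query(index, "fast search"))
print(query(index, "sharepoint server"))
```

Looking terms up in a prebuilt index avoids scanning every document at query time, which is the point of an index optimised for searching.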
Crawling
Connecting to sources of content to download files and data for processing
Downloading documents or files (items)
Working through URLs
  – List or directory of items to crawl
  – Following links to other items
Extracting information from files
  – Converting file formats to text for processing
  – Identifying properties or fields of information
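Working through URLs and following links can be sketched as a breadth-first traversal. The in-memory `SITE` dictionary stands in for real downloads, and its URLs and the `crawl` function are hypothetical. The FAST connectors are considerably more involved than this.

```python
from collections import deque

# Hypothetical content source: URL -> (text, outgoing links).
SITE = {
    "http://intranet/": ("Home", ["http://intranet/docs", "http://intranet/news"]),
    "http://intranet/docs": ("Documents", ["http://intranet/docs/plan.docx"]),
    "http://intranet/news": ("News", ["http://intranet/"]),
    "http://intranet/docs/plan.docx": ("Project plan", []),
}

def crawl(start_urls, fetch):
    """Breadth-first crawl: download each item once and follow its links."""
    queue = deque(start_urls)
    seen = set(start_urls)
    items = {}
    while queue:
        url = queue.popleft()
        fetched = fetch(url)
        if fetched is None:  # missing or unreachable item
            continue
        text, links = fetched
        items[url] = text
        for link in links:
            if link not in seen:  # avoid re-crawling and link cycles
                seen.add(link)
                queue.append(link)
    return items

items = crawl(["http://intranet/"], SITE.get)
print(list(items))
```

The `seen` set is what stops the link from the news page back to the home page from causing an endless loop.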
DEMO: FAST SEARCH SERVICE APPLICATIONS, FAST SYSTEM DIRECTORY, FAST WEB SERVICES, CONNECTORS & CRAWLING
Processing & Indexing
Item Processing
Format conversion
  – IFilters
  – Advanced Filter Pack (Oracle Outside In) formats
Language and encoding detection
Lemmatizer – linguistic normalization
Tokenizer – word breaking
Entity extraction – companies, locations
DateTimeNormalizer – date normalization
Vectorizer – creates a document vector for similarity searching
WebAnalyzer – anchor text and link cardinality analysis
PropertiesMapper – maps to crawled properties
PropertiesReporter – reports detected properties
Optional content pipeline stages:
  – XML properties mapper
  – Offensive content filter
  – Verbatim (whole word) extractor (loads a dictionary for custom extraction, e.g. product names)
  – Field collapsing
  – Entity extraction (persons)
  – Document Processing Pipeline Extension
FAST Search stores data in its search index after processing completes.
Document Processing Pipeline
… → Format Conversion → Language Detection → Entity Extraction → Lemmatization → Mapper → …
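The idea of an item passing through ordered stages can be sketched as a list of functions applied in turn. The stage implementations are deliberately crude stand-ins: the lemma table and company dictionary are invented, and none of this reflects FAST's internal code.

```python
import re

def tokenize(item):
    """Tokenizer: break the text into lower-cased word tokens."""
    item["tokens"] = re.findall(r"[a-z0-9]+", item["text"].lower())
    return item

def lemmatize(item):
    """Toy lemmatizer: a tiny lookup table stands in for real linguistics."""
    lemmas = {"documents": "document", "searches": "search", "indexed": "index"}
    item["lemmas"] = [lemmas.get(tok, tok) for tok in item["tokens"]]
    return item

def extract_entities(item):
    """Toy entity extraction: match tokens against a small hypothetical company list."""
    companies = {"microsoft", "contoso"}
    item["companies"] = sorted(set(item["tokens"]) & companies)
    return item

PIPELINE = [tokenize, lemmatize, extract_entities]

def process(item):
    """Run the item through every stage in order."""
    for stage in PIPELINE:
        item = stage(item)
    return item

doc = process({"text": "Microsoft searches indexed documents."})
print(doc["lemmas"])
print(doc["companies"])
```

Because each stage reads and writes fields on the same item, stage order matters: here the lemmatizer and entity extractor both depend on the tokenizer having run first.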
Property Extraction
Extract metadata from unstructured content.
Document Processing Pipeline Extensibility
Items are processed in the Document Processing Pipeline after they are crawled and before they are stored in the index.
Create and alter crawled property data.
You can run code and pass data to other systems:
  – CRM/ERP and other line-of-business systems
  – Geocoding
  – OCR
  – Audio and video transcription – ramp.com
  – ‘Deep’ search of raw data
… The sky is the limit!
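A custom stage that creates crawled property data might look roughly like the sketch below. The XML element and attribute names (`Document`, `CrawledProperty`, `propertySet`, `propertyName`, `varType`), the `varType` values, the all-zero property set GUID and the `wordcount` property are assumptions for illustration. Check them against the product documentation before relying on them. A real stage would also be invoked as an external program with input and output files rather than as a Python function.

```python
import xml.etree.ElementTree as ET

# Placeholder property set GUID; not taken from the slides.
PROPERTY_SET = "00000000-0000-0000-0000-000000000000"

def run_stage(input_xml):
    """Read crawled properties, derive a new one, and return the output document."""
    source = ET.fromstring(input_xml)
    body = ""
    for prop in source.iter("CrawledProperty"):
        if prop.get("propertyName") == "body":
            body = prop.text or ""

    output = ET.Element("Document")
    derived = ET.SubElement(output, "CrawledProperty", {
        "propertySet": PROPERTY_SET,
        "propertyName": "wordcount",  # hypothetical derived property
        "varType": "20",              # assumed to denote an integer value
    })
    derived.text = str(len(body.split()))
    return ET.tostring(output, encoding="unicode")

sample = (
    '<Document><CrawledProperty propertySet="%s" propertyName="body" varType="31">'
    "FAST Search for SharePoint</CrawledProperty></Document>" % PROPERTY_SET
)
result = run_stage(sample)
print(result)
```

The same pattern applies to the integrations listed above: read the properties the stage needs, call the external system, and write the returned values back as new crawled properties.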
DEMO: DOCUMENT PROCESSING PIPELINE, CRAWLED PROPERTIES, MANAGED PROPERTIES
Search UI
Web Parts
Refiners
Refinement Panel Web Part
Add and edit the refiners displayed by changing the filter category definition XML.
Property names are specified in lower case; the managed property must have refinement enabled.
Rank Profiles
Configure multiple rank profiles.
Allow selection of a rank profile in the search UI to change sorting.
Defaulting based on user profile.
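The effect of switching rank profiles can be sketched as applying different weights to the same per-item signals. The profile names, signal names and weights are invented for illustration and do not correspond to FS4SP's actual rank profile settings.

```python
# Hypothetical rank profiles: each weights the same signals differently.
RANK_PROFILES = {
    "default": {"text_match": 1.0, "freshness": 0.2},
    "newest_first": {"text_match": 0.3, "freshness": 1.0},
}

results = [
    {"title": "Search architecture guide", "text_match": 0.9, "freshness": 0.1},
    {"title": "Search release notes", "text_match": 0.5, "freshness": 0.9},
]

def rank(hits, profile_name):
    """Order hits by the weighted sum of their signals under the chosen profile."""
    weights = RANK_PROFILES[profile_name]

    def score(hit):
        return sum(weight * hit[signal] for signal, weight in weights.items())

    return sorted(hits, key=score, reverse=True)

for name in RANK_PROFILES:
    print(name, [hit["title"] for hit in rank(results, name)])
```

The same two hits come back in opposite order under the two profiles, which is the behaviour a user would see when picking a different profile in the search UI.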
DEMO: SEARCH UI, REFINERS, DOCUMENT PREVIEWS, VISUAL BEST BETS
QUESTIONS?
Related Links
Sezai’s Blog
Enterprise Search IT Professional Training
Debugging and Tracing Pipeline Extensibility Stages
Shyam Nyaran’s blog – Visual Refiner Web Part
Phonetic People Name Search
Reasons to go with FAST Search for SharePoint instead of regular SharePoint 2010 Search
Three Main Reasons Why You Should Upgrade to FAST for SharePoint