MEC 2014 4/11/2017 9:45 AM © 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks.

Presentation transcript:

MEC 2014 4/11/2017 9:45 AM © 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Search in Exchange
Kumar Venkateswar, Sr. Program Manager
Kutlay Topatan, Sr. Program Manager
Microsoft

Search in Exchange 2013
Infrastructure
  How does information get into the index and how is it maintained?
  What types of information can be indexed?
  What processes are essential and what resources are used?
  How can search be monitored and managed?
  How does search provide high availability?
  How does querying work, and what problems do end users encounter?
What’s new in O365?

Big picture
Diagram labels: Indexing, Query Processing, Index

Indexing flow (mailbox)
Good: Works well
Less good:
  Can’t write back to database
  Per-copy processing
  Per-mailbox processing (DLs)
How do we mitigate these? Can we pre-process what goes into Store? Yes!
Diagram labels: Store, DB, Notifications, DocIDs, Fetch Content, Search service, Content Engine, Content flow operators (simplified): Filter, Word Break, Content XForm, Index Writer; Document Parser(s), Tokens, Index Node, Index

Indexing flow (transport)
Good:
  Writes metadata to message before delivery
  Writes natural language processing data
Less good: Best effort processing
Both flows together let us index reliably!
Diagram labels: Transport, Content Engine, Content flow operators (simplified): Filter, Word Break, Content XForm, Index Writer; Document Parser(s), Store, DB

Indexing
Metadata (aka annotation stream):
  Words to index only
  Word break boundaries
  Stored on message
  Usually much smaller than body
Diagram labels: Transport, Annotation stream writer, Store, DB, Search service, Annotation stream reader, Content Engine, Filter, Word Break, Content XForm, Index Writer, Document Parser(s), Index Node, Index

Mailbox Flow: More Operator Detail
Document ID enters here…
Amount of data read from the mailbox item will be reduced if an “annotation stream” is already present
Tokens (or an error record) are written out to the index at the end of the flow…
Diagram labels: Mailbox Database, Document, Tokens, Index System

Index – parts and merging
Diagram labels: Master, L5, L4, L3, L2, L1
Two update groups: %default and folder update group
On-disk index parts have five levels plus the master index (L0 is in memory)
A merge occurs when three parts at a lower level are “full”
Merging is throttled: <=4 simultaneous merges, <=1 master merge
A master merge occurs when 20% of content is outside the master index
This is subject to future tuning
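
The merge rules on this slide can be sketched as a toy model. This is only an illustration, not the actual index engine: the class name ToyIndex, the representation of a part as an item count, the choice to measure the 20% against all indexed content, and the choice to let L5 parts wait for the master merge are all assumptions. Throttling (<=4 simultaneous merges) is not modeled.

```python
from dataclasses import dataclass, field

PARTS_PER_MERGE = 3        # slide: a merge occurs when three lower parts are "full"
MASTER_MERGE_PERCENT = 20  # slide: master merge at 20% of content outside master
ON_DISK_LEVELS = 5         # slide: L1..L5 on disk, plus the master index


def _empty_levels():
    return [[] for _ in range(ON_DISK_LEVELS)]


@dataclass
class ToyIndex:
    """Hypothetical model; each part is represented only by its item count."""
    master: int = 0
    levels: list = field(default_factory=_empty_levels)  # levels[0] is L1

    def outside_master(self) -> int:
        """Total number of items held in parts that are not yet in the master."""
        return sum(sum(parts) for parts in self.levels)

    def add_part(self, items: int) -> None:
        """Add a full L1 part, cascade level merges, then check the master."""
        self.levels[0].append(items)
        # Assumption: L5 parts wait for the master merge, so only L1..L4 cascade.
        for level in range(ON_DISK_LEVELS - 1):
            if len(self.levels[level]) >= PARTS_PER_MERGE:
                self.levels[level + 1].append(sum(self.levels[level]))
                self.levels[level] = []
        outside = self.outside_master()
        total = outside + self.master
        # Assumption: the 20% is measured against all indexed content.
        if total and outside * 100 >= MASTER_MERGE_PERCENT * total:
            self.master = total
            self.levels = _empty_levels()
```

Starting from a master of 1000 items and adding parts of 10 items each, the model merges L1 parts into L2 after every third part and folds everything into the master once 250 items sit outside it.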

Message and Attachment Processing
Message structure can be complex
Even if format is supported, not all attachments may be processed because:
  There are too many attachments, or
  Nesting is too deep, or
  Processing is disabled
Administrators can change the defaults:
  HKLM\Software\Microsoft\ExchangeServer\v15\Search\SystemParameters
    MaxAttachmentDepth (default value: 2)
    MaxAttachmentCount (default value: 10)
    ProcessImages (default value: 0)
    MarkSkippedImagesAsPartiallyProcessed (default value: 0)
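
To make the three skip reasons concrete, here is a minimal sketch of how such limits could be applied to a nested attachment tree. The Attachment class and plan_processing function are hypothetical, and the exact semantics are assumptions: depth is counted from 1 at the top level, the count limit applies to the whole message, and a skipped container hides its children. Only the default values (2, 10, 0) come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    """Hypothetical attachment node; children are nested attachments."""
    name: str
    is_image: bool = False
    children: list = field(default_factory=list)


def plan_processing(attachments, max_depth=2, max_count=10, process_images=False):
    """Return (processed_names, skipped) where skipped holds (name, reason)."""
    processed, skipped = [], []

    def walk(items, depth):
        for att in items:
            if depth > max_depth:
                skipped.append((att.name, "nesting too deep"))
            elif att.is_image and not process_images:
                skipped.append((att.name, "image processing disabled"))
            elif len(processed) >= max_count:
                skipped.append((att.name, "too many attachments"))
            else:
                processed.append(att.name)
                # Only descend into attachments that were actually processed.
                walk(att.children, depth + 1)

    walk(attachments, 1)
    return processed, skipped
```

With the defaults, a .zip containing a .msg containing a .doc yields two processed items and one item skipped for nesting depth.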

Formats Supported “Out of the Box”
Group             Format                   Handled by Parsers
Microsoft Office  Excel                    .xls, etc.
                  OneNote                  .one
                  Outlook                  .msg
                  PowerPoint               .ppt, etc.
                  Publisher                .pub
                  Visio                    .vsd, .vsdx, etc.
                  Word                     .doc, .rtf, etc.
Other Microsoft   E-Mail                   .eml, .mhtml, .rss, etc.
                  XML Paper Specification  .xps
Adobe             Acrobat                  .pdf
OpenOffice.org    OpenOffice               .odp, etc.
Image             GIF                      .gif, etc.
                  JPEG                     .jpg, etc.
                  TIFF                     .tif, etc.
Other             HTML                     .htm, etc.
                  Plain Text               .txt, .csv, etc.
                  XML                      .xml, etc.
                  ZIP                      .zip
FAST engine has expanded the list of supported formats
Third party IFilters will be picked up and used
Example: New-SearchDocumentFormat -Name "Proprietary SCT Formats" -MimeType text/scriptlet -Extension .sct -Identity ProprietarySCT1

New Search Processes
Host Controller Service is started by the Exchange Search service.
Host Controller starts 4 worker processes, each named NodeRunner.
These communicate with each other and with Exchange using Windows Communication Foundation.
Processes shown on the slide:
  HostControllerService.exe (Windows service), XML config
  NodeRunner.exe (“Admin Node”), XML config
  NodeRunner.exe (“Content Engine Node”), XML config
  NodeRunner.exe (“Interaction Engine Node”), XML config
  NodeRunner.exe (“Index Node”), XML config
  ParserServer.exe (IFilter sandbox)

Resource consumption
Disk
  Per-item index size is approximately 10% of the per-item database size
  Merges cause the index to need up to 20% of database size temporarily
  IO is relatively sequential, since items are appended to parts and then merged
Memory
  Rule of thumb: around 15% of RAM for search
  More precise: constant cache cost + constant per-index system cost + constant per-item cost
  Capacity planning spreadsheet gives the best estimate
CPU
  Variable, based on rate, size, and content type of incoming items
  Merges consume CPU as well
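
The rules of thumb above reduce to simple arithmetic. The sketch below applies them as stated; the function name and the reading of the 20% figure as the peak index size during merges (rather than an additional 20%) are assumptions, and the capacity planning spreadsheet mentioned on the slide remains the better estimate.

```python
INDEX_PERCENT_OF_DB = 10       # slide: index is approximately 10% of database size
MERGE_PEAK_PERCENT_OF_DB = 20  # slide: up to 20% of database size during merges
SEARCH_PERCENT_OF_RAM = 15     # slide: around 15% of RAM for search


def estimate_search_resources(db_size_gb, ram_gb):
    """Rough sizing from the slide's rules of thumb (all values in GB)."""
    return {
        "index_gb": db_size_gb * INDEX_PERCENT_OF_DB / 100,
        # Assumption: 20% is the temporary peak, not an amount added to 10%.
        "index_peak_gb": db_size_gb * MERGE_PEAK_PERCENT_OF_DB / 100,
        "search_ram_gb": ram_gb * SEARCH_PERCENT_OF_RAM / 100,
    }
```

For a 1000 GB database on a server with 96 GB of RAM, this gives an index of about 100 GB, a merge-time peak of about 200 GB, and roughly 14.4 GB of RAM for search.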

Search Management
Same cmdlets as in Exchange 2010
Some new properties exposed by Get-MailboxDatabaseCopyStatus
Check search health per Server or per MDB with Test-ExchangeSearch
List unindexed documents per Server or per MDB or per MBX with Get-FailedContentIndexDocuments

Search Monitoring: Perf Counters
MS Exchange Search Indexes object (one instance per database):
  Crawler: Items Processed
  Crawler: Items Sent for Processing
  Crawler: Mailboxes Remaining
  Crawler: Submission Delay Time
  Crawler: Submission Delays
  Failed Items
  Feeding Sessions
  Items Processed
  Items Processed/sec
  Notifications: Age of Last Notification Processed
  Notifications: Awaiting Processing
  Notifications: Creates Processed
  Notifications: Creates Processed/sec
  Notifications: Delayed Items
  Notifications: Deletes Processed
  Notifications: Deletes Processed/sec
  Notifications: Items Processed
  Notifications: Items Sent for Processing
  Notifications: Last Successful Poll Timestamp
  Notifications: Moves Processed
  Notifications: Moves Processed/sec
  Notifications: Processed
  Notifications: Processed/sec
  Notifications: Queue Length
  Notifications: Stall Time
  Notifications: Updates Processed
  Retry: Deleted Mailboxes Remaining
  Retry: Items Deleted
  Retry: Items Processed
  Retry: Items Sent for Deletion
  Retry: Items Sent for Processing
  Retry: Retriable Items
  Retry: Submission Delay Time
  Retry: Submission Delays
Callouts on the slide:
  Rate at which items are being processed into the index(es)
  Number of items that are scheduled for reprocessing (because of a previous timeout or failure)

High Availability for Search
Based around Database High Availability architecture
Indexing always reads from the active database copy
Look for reductions in network usage in the future
Diagram labels: Transport Role, Transport Content Node Flow Before DL Expansion, Local delivery, MBX1, MBX2, MBX3 (each with DB and Index), Retrieval/indexing, Log shipping

Copy selection, failovers, and seeding
Copy selection
  Index health is an important factor in best copy selection, second only to DB health
  Healthy > Crawling > other status
  This is health, not queue length.
Search status can trigger failovers
  Disabled or FailedAndSuspended index on a mounted database
  Index is suspended but database is not
  Stalled seed, not able to reseed, or failed for too long on passive, plus restarting services doesn’t help
  No results from query
Seeding
  Since the index is smaller, seeding it takes much less time than seeding the database
  Used to remedy a variety of index issues, on passives and actives
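
The ordering described for copy selection (DB health first, then Healthy > Crawling > other index status) can be sketched as a sort key. This is a simplified illustration under stated assumptions: the tuple layout and the rank_copies name are hypothetical, DB health is reduced to a boolean, and the real best copy selection logic weighs criteria that are not modeled here.

```python
# Lower rank sorts first; any status not listed counts as "other".
INDEX_STATUS_RANK = {"Healthy": 0, "Crawling": 1}
OTHER_STATUS_RANK = 2


def rank_copies(copies):
    """Order (name, db_healthy, index_status) tuples from best to worst.

    DB health is compared first; index status only breaks ties, which
    mirrors "second only to DB health" on the slide.
    """
    return sorted(
        copies,
        key=lambda c: (not c[1], INDEX_STATUS_RANK.get(c[2], OTHER_STATUS_RANK)),
    )
```

Because sorted is stable, copies that tie on both criteria keep their input order.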

Big picture revisited: Query
Diagram labels: Indexing, Query Processing, Index

Queries are composed of AQS and query restrictions
Queries are wordbroken and parsed to FQL plus “that which is residual” (TWIR)
Index returns document IDs from FQL, and store processes TWIR and links in message data before returning
Diagram labels: Middle Tier (depends on protocol), Query, Word break, Parse, Query Plan, TWIR, Filter, Content XForm, Store, Interaction Engine, Index Node, DB, Index
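
The two-stage split described here (the index resolves the full-text part to document IDs, then the store applies the residual restriction and attaches message data) can be illustrated with a toy example. Everything in it is hypothetical: the index is a plain dict of term to document-ID set, the store is a dict of document ID to message, and the residual is a Python predicate rather than a real query restriction. It does not parse AQS or produce FQL.

```python
def index_lookup(index, terms):
    """Stage 1: return the doc IDs that contain every term."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def run_query(index, store, terms, residual):
    """Stage 2: the store filters by the residual and returns message data."""
    doc_ids = index_lookup(index, terms)
    return [store[doc_id] for doc_id in sorted(doc_ids) if residual(store[doc_id])]


# Example data: three messages, two of them in the Inbox.
toy_index = {"budget": {1, 2, 3}, "report": {2, 3}}
toy_store = {
    1: {"id": 1, "folder": "Inbox"},
    2: {"id": 2, "folder": "Inbox"},
    3: {"id": 3, "folder": "Archive"},
}
inbox_hits = run_query(
    toy_index, toy_store, ["budget", "report"], lambda m: m["folder"] == "Inbox"
)
```

Here the index narrows the search to documents 2 and 3, and the residual folder restriction leaves only document 2.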

Language detection
Transport flow
  Body is run through language detection
  If the body is <12 characters or language can’t be detected, this is left blank
  Wordbreaking uses the detected language, or English if not detected
Mailbox flow
  Body + subject + contacts are run through language detection
  Since this is a greater number of characters it is more likely to succeed
  Fall back to English
Query
  Mailbox session culture is used for language identification
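
The fallback rules for the three paths can be written down as small functions. This is a sketch only: the detector is passed in as a callable (no real language-detection library is assumed), the function names are hypothetical, "en" stands in for English, and joining body, subject, and contacts with spaces is an assumption about how the mailbox flow combines them.

```python
MIN_BODY_CHARS = 12       # slide: bodies under 12 characters are left blank
FALLBACK_LANGUAGE = "en"  # slide: English is the fallback


def transport_detected_language(body, detect):
    """Language recorded by the transport flow; blank if undetected."""
    if len(body) < MIN_BODY_CHARS:
        return ""
    return detect(body) or ""


def transport_wordbreak_language(body, detect):
    """Language used for wordbreaking in the transport flow."""
    return transport_detected_language(body, detect) or FALLBACK_LANGUAGE


def mailbox_wordbreak_language(body, subject, contacts, detect):
    """Mailbox flow: more text is available, so detection is likelier to work."""
    text = " ".join([body, subject, *contacts])
    return detect(text) or FALLBACK_LANGUAGE


def query_language(session_culture):
    """Query path: the mailbox session culture is used directly."""
    return session_culture
```

Any callable that returns a language code, or None when it cannot decide, can be passed as detect.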

Query troubleshooting – common problems
Queries return a maximum of 250 results
  Capping results improves query latency
  Most users don’t want to search through many results, so our focus is on tools to improve querying
Indexing issues (transient or permanent)
  Attachments that are excessively complex
  Attachments without filters installed
  IRM server unavailable

Demo: Query troubleshooting
Kumar Venkateswar

What is new in O365 mail search?
Kutlay Topatan, Program Manager

Let’s talk numbers
93 YEARS before an O365 user needs to delete any email (20 emails/day @ 75 KB)
19%: average IW spends searching for and gathering information
4.5 BILLION business emails sent during this talk
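
As a sanity check on the 93-year figure, the slide's own numbers imply a mailbox capacity, which can be worked out directly. The only assumptions are that leap days are ignored and that 1 GB is taken as 1,000,000 KB; the slide does not state the mailbox size itself.

```python
YEARS = 93           # slide
EMAILS_PER_DAY = 20  # slide
KB_PER_EMAIL = 75    # slide
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def implied_mailbox_kb():
    """Total mail volume accumulated over the period, in KB."""
    return YEARS * DAYS_PER_YEAR * EMAILS_PER_DAY * KB_PER_EMAIL


def implied_mailbox_gb():
    """Same volume in GB, assuming 1 GB = 1,000,000 KB."""
    return implied_mailbox_kb() / 1_000_000
```

The result is 50,917,500 KB, or roughly 51 GB under the stated unit assumption.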

New search experiences raised user expectations
Relevant refiners
Suggestions
Instant results

Users struggle to find what they are looking for, re-searching frequently
PAIN POINTS
  Too slow
  Refining is difficult & inefficient
  Pre-Organizing doesn’t improve search success
  Recall - Difficult to remember context
Source: OXG Search Focus Groups

Introducing personalized search
Lightning fast results that utilize the new indexing and query pipeline
Personalized suggestions that help interpret user intentions
Content based refiners that help find the best results with minimal effort
Hit highlighting to easily find relevant sections in long conversations

Demo

Additional details
Availability
  Hit-highlighting is already available today
  Lightning search, dynamic refiners and suggestions available for service customers first, on-prem with next release
Suggestions
  Keyword suggestions are populated from search history and mailbox content
  People suggestions are a combination of matches from recipient cache and directory
  To: search suggestions in “Sent Items”
  More suggestion sources and types are being planned
Refiners
  From:, Folder, Attachment and Date refiners
  More refiner categories being planned


Query troubleshooting demo
