Manageability Services at Microsoft

Presentation on theme: "Manageability Services at Microsoft"— Presentation transcript:

1 Manageability Services at Microsoft
Abstract: This IT Showcase presentation discusses how the Microsoft Manageability Services group manages the global Microsoft IT environment by using the Microsoft Operations Framework.
Introduction: This presentation focuses primarily on the roles, responsibilities, and functions of the Microsoft Manageability Services group as they relate to manageability of the global Microsoft Information Technology (Microsoft IT) environment. It covers the four key services that the Manageability Services group is responsible for: server life cycle, client life cycle, configuration management, and monitoring and control.
The server life cycle and client life cycle operate in all four quadrants of the Microsoft Operations Framework (MOF). However, the Manageability Services group focuses primarily on the Changing and Operating quadrants. Configuration management fits primarily in the Changing quadrant; the monitoring and control service fits primarily in the Operating quadrant.
This presentation is for technical decision makers who are interested in how Microsoft operates internally and the best practices that Microsoft IT uses to manage the global infrastructure. The audience should be familiar with Microsoft technologies and general network terminology.
Published: December 2006

2 Microsoft IT Environment
340,000+ computers 121,000 end users 98 countries 441 buildings 15,000 clients running Windows Vista™ 25,000 clients running the 2007 Microsoft Office system 5,700 Exchange Server 2007 mailboxes 31 servers running Windows Server “Longhorn” 46 million+ remote connections per month 189,000+ SharePoint sites 4 data centers 8,400 production servers Microsoft IT Environment The Microsoft IT environment is a large global enterprise operation that spans 98 countries. It represents one of the prime proving grounds for Microsoft enterprise products like Microsoft® Operations Manager (MOM), Microsoft Systems Management Server (SMS), and Microsoft SharePoint® Products and Technologies, which must be fully deployed in this environment before new versions can be approved for release. messages per day: million internal 10 million incoming 9 million filtered out 37 million instant messages per month 120,000+ server accounts 2 2

3 Microsoft IT As a Microsoft Customer
Possible Similarities Possible Differences Security is mission critical Mix of Microsoft operating systems and configurations Balancing security, cost, and efficiency is the bottom line Heterogeneous network environment Need to integrate disparate management systems Being the first and best customer of Microsoft Software deployed more than once Majority of users are technical, local administrators High-priority target for security attacks State-of-the-art networks and latest operating systems Windows-only environment Microsoft IT As a Microsoft Customer In many ways, Microsoft IT is a typical enterprise customer because it uses Microsoft products to provide many application and infrastructure services by using the Microsoft Windows Server® platform and Windows®-based clients. Microsoft IT is responsible for providing a high-security environment while still being conscious of cost and efficiency. Disparate management systems for servers, clients, and network components must be integrated for efficient operation, and a mix of Microsoft operating systems must be supported. Microsoft IT is also different from most enterprise customers in some regards. Being the “first and best customer” of Microsoft means that Microsoft IT is typically running many of its systems on beta code and must often deploy multiple versions of beta code prior to establishing the final shipping version of the product. This may also require some additional redundancy in the architecture to allow problems to be debugged offline without incurring service outages. Another difference is the fact that Microsoft is a Windows-only environment, so other operating systems are not managed. 3 3

4 Primary Challenges
Pressure to reduce IT management costs
Continuous new software versions (beta releases)
Rapid updates
New computers and servers configured daily
Wide variety of hardware (various laptops, desktop computers, and Tablet PCs)
Need to constantly monitor and control the health and security of the network

5 Dogfood and IT Scorecard
Shared goals Product feedback Planning and testing “Dogfooding” and running a world-class utility—IT Scorecard Showcase Dogfood and IT Scorecard Microsoft IT participates jointly with the product groups in establishing shared goals early in the product cycle. Shared goals define the requirements that products must meet before they can be released. In addition to deploying and using beta products, Microsoft IT must closely track the metrics that make up the shared goals to determine when a product is ready to be released. Deploying early beta versions of Microsoft products internally within the Microsoft IT environment is often referred to as “eating our own dogfood.” There is a constant tension between the need to “dogfood” early beta versions of products in the production environment and the need to run a world-class IT utility. Some service outages can be tolerated internally if they can be shown to result in bugs being detected and fixed prior to release of a product. Close tracking of availability and performance of services is required to determine when products have reached the necessary level of reliability and scalability for release. The IT Scorecard, built through Microsoft Office Business Scorecard Manager, is one of the primary tools that Microsoft IT uses to measure and report service availability against expected service level agreements (SLAs) and shared goals. Key performance indicators are carefully tracked and reported for each service provided, with clear indicators of success or failure to meet established goals. The process of planning, testing, deploying, and operating enterprise products must also be captured in considerable detail so that it can be documented and made available to customers upon product release in the form of white papers, webcasts, and presentations. 5 5

6 Manageability Services Model
MSTManage Product Groups Partners Third-Party Software Manageability Services Model Server life-cycle services in the Change quadrant, including: Building and configuring servers by using a scripted custom solution today and moving to an automated solution with Windows Deployment Services and System Center Configuration Manager (SCCM) Operating System Deployment (OSD). Deploying updates by using the patch management process and SMS. Client life-cycle services, including: How Microsoft IT manages 233,000 clients effectively. Installation and configuration through Remote Installation Services (RIS) and Windows Deployment Services. Client update and patch management process through SMS. Customers Program Management Business Units Server Life Cycle Configuration Management Service Monitoring Client Life Cycle Microsoft IT (Security) Image Management Operating System Provisioning Patch Management Software Distribution 3 Software Distributions 4 Updates 2,000 Images CMDB Server and Network Tools Management Enterprise Reporting 500,000 Configuration Items 15,000 Devices Managed 100+ Metrics Managed Server and Network Fault Management Alert Stream MP Onboarding 16,000 Devices Monitored 37,000/1 Million Alerts 11 Base Management Packs Image Management Operating System Provisioning Patch Management Software Distribution 12 Software Distributions 7 Updates 6,000 Images Service Management End Users External Customers Tiered Support (Helpdesk, Shared T2 Globally) 6 6

7 Manageability Services Scope
4 enterprise data centers and 50 remote locations globally Server Life Cycle Configuration Management Service Monitoring Client Life Cycle Manageability Services Scope Next, the presentation will explain the Configuration Management Service Management Function (SMF), including: How Manageability Services gathers and stores information about every managed server in the environment. How Manageability Services determines which servers should be managed through a Source of Record and the IT configuration (IT Config) database, in addition to using the Device Manageability Index (DMI). Finally, the presentation will discuss the Service Monitoring and Control SMF, including: The steps that Microsoft IT has taken to integrate all monitoring and control functions by using MOM. The specific processes that Microsoft IT uses to ensure that proper monitoring occurs. Clients (~233,000) Servers (~10,000) Local administrators Compliance through SMS Multiple desktops Frequent rebuilds IPsec for Secure Net Network (~10,000) Telephony (~10,000) 5 Active Directory forests Standardized on Windows Server 2003 200 servers provisioned each month 441 buildings globally 7 7

8 Microsoft Operations Framework
Structured approach to achieving operational excellence; collection of best practices, principles, and models; guidance on achieving high availability, reliability, and security; 21 service management functions
Microsoft Operations Framework: MOF Quadrants
The MOF quadrants are as follows. The primary focus of the Manageability Services group falls in the Changing and Operating quadrants, so this presentation mostly discusses those two quadrants.
Changing: The overall purpose of the Changing quadrant and its SMFs is to introduce new service solutions, technologies, systems, applications, hardware, and processes. This is the Manageability Services group’s field on a global basis. The Changing quadrant includes the following SMFs: Change Management, Configuration Management, and Release Management.
Operating: The Operating quadrant is concerned with performing day-to-day operations tasks effectively and efficiently. Again, Manageability Services does this on a global basis. The Operating quadrant includes the following SMFs: Service Monitoring and Control, System Administration, Network Administration, Directory Services Administration, Security Administration, Storage Management, and Job Scheduling.
Supporting: The Supporting quadrant’s focus is resolving incidents, problems, and inquiries in a timely, efficient manner. This quadrant is generally outside the scope of the Manageability Services group, except as it pertains to specific services that the group offers. The Supporting quadrant includes the following SMFs: Service Desk, Incident Management, and Problem Management.
Optimizing: The Optimizing quadrant drives changes to optimize cost, performance, capacity, and availability. This quadrant leads into, and is closely associated with, the Changing quadrant. The Manageability Services group performs many of this quadrant’s services on a global basis. However, the group’s primary focus is at the handoff to the Changing quadrant, where the group implements the optimizations designed by other groups. The Optimizing quadrant includes the following SMFs: Availability Management, Capacity Management, Service Level Management, Security Management, Infrastructure Engineering, Financial Management, Workforce Management, and Service Continuity Management.

9 MOF-Based Operations $100 Million 3-Year Spend Reduction
IT Utility (Cost per Head), with cumulative reduction relative to FY03: FY03 $7,220; FY04 $6,159 (-15%); FY05 $5,778 (-20%); FY06 $4,739 (-34%)
MOF-Based Operations
MOF describes proven team structures and operational processes and applies IT best practices to improve the efficiency and quality of IT operations. It is based on the IT Infrastructure Library (ITIL), published by the U.K. Office of Government Commerce (OGC). MOF extends ITIL through the inclusion of guidance and best practices derived from the experience of Microsoft operations groups, partners, and customers. In keeping with ITIL’s spirit to “adopt and adapt,” Microsoft has chosen to provide additional, specific guidance, which applies to customers who use Microsoft technologies in their environments.
MOF was designed to complement the well-established Microsoft Solutions Framework for solution and application development. Together, the frameworks provide guidance throughout the IT life cycle. MOF provides a well-structured approach to achieving operational excellence in an organization of any size. The collection of best practices, principles, and models provides clear guidance on achieving high availability, reliability, and security.
The MOF process model shown on this slide describes a life cycle that can be applied to releases of any size and relating to any service solution. The model groups similar SMFs into each of the four quadrants. Each quadrant owns a specific mission of service.
Automation, Consolidation, Centralization: 90% auto-ticketing; single MOM console; alert-to-ticket ratio = 1.4:1; CMDB drives MOF processes; duplicate/No Problem Found tickets decreased by 90%; critical updates improved from 28 to 21 days and emergency updates from 15 to 8 days; change and release processes centralized; 143 offices connected via Internet; 450:1 server-to-staff ratio (remote support); 200:1 server-to-staff ratio (on-site support); Tier 2 support moved to India; 30% reduction in infrastructure servers; Exchange servers down from 74 to 4 sites globally; 500+ virtual servers (16:1 guest-to-host ratio); Data Protection Manager (eliminated 115 tape libraries)
While improving security and productivity: zero service impacts from Denial of Service attacks; increased patching speed; 700+ application security and privacy audits; significant improvement in customer satisfaction score; increased mobility with Microsoft Office Outlook® Web Access, Smartphones, and RPC over HTTP; greater collaboration with SharePoint, MySites, and Document Workplace
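The cumulative reduction percentages on this slide can be checked against the cost-per-head figures. The short sketch below assumes that the percentages are measured against the FY03 baseline and that the FY05 and FY06 values read $5,778 and $4,739; the function name is illustrative only.

```python
# Cost per head by fiscal year, as listed on the slide (US dollars).
cost_per_head = {"FY03": 7220, "FY04": 6159, "FY05": 5778, "FY06": 4739}


def cumulative_reduction(costs, baseline_year):
    """Return the rounded percent change of each other year versus the baseline."""
    base = costs[baseline_year]
    return {
        year: round((value - base) / base * 100)
        for year, value in costs.items()
        if year != baseline_year
    }


# Assumption: FY03 is the baseline for the "cumulative reduction" row.
reductions = cumulative_reduction(cost_per_head, "FY03")
for year, pct in reductions.items():
    print(f"{year}: {pct}%")
```

Under that assumption the computed values match the slide's -15%, -20%, and -34%.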

10 Life Cycle Management Server and Client Software Life Cycle
Image Management Seven base client images MUI for international languages Group Policy for standard registry key changes and security configurations Operating System Provisioning Software Distribution Server and Client Software Life Cycle Bare metal—fully automated via RIS and PXE (Windows Deployment Services/RIS) Scripted automated build-outs of base operating system Product key management Package, test, and deploy security and software update packages Baseline packages (N, N+1) Patch Management Security and emergency updates Windows and Office using ITMU ITCU for third party 1. Deploy 2. Baseline 3. Inventory 4. Update Scripted builds, server joins domain SMS post-build updates SMS inventories for configuration and compliance SMS deploys security updates and other software updates 10 10

11 Patching Methodology Server and Client (Critical Updates)
Timeline figure labels: Patch Released; Testing/Evaluation/Installation; two-week grace period; Sustainer Remediation; Forced Remediation. Servers: update available to server owners for testing and deployment; servers 99.5% updated. Desktops: update available to desktops via SMS, Windows Update, or Automatic Updates; desktops 98% updated.
Patching Methodology
There are three update deployment cycles that can be applied to any update:
Critical: three weeks to meet the compliance SLA. Update installation is enforced at the end of two weeks, but users and server owners can install before that. For data-center servers, maintenance windows are available prior to the enforcement date so that SMS can install the updates when it is most convenient for the server users.
Accelerated: 48 hours to meet the compliance SLA. Same general approach as for critical updates but much faster.
Emergency: Do everything as fast as possible. This leaves no time for application compatibility testing and similar activities. Microsoft IT rarely uses this cycle.
The figure on this slide details the timeline for the critical cycle. The accelerated cycle is comparable except that everything is compressed to fit within 48 hours. The emergency cycle is compressed to be done as fast as the administrators can complete the tasks.
Microsoft IT will apply all updates to desktop computers by using the accelerated schedule if any of the updates in a given month are accelerated (as directed by the internal Corporate Security group). This means that desktop computers are restarted only once a month for patching. Otherwise, the computers would be restarted once soon after Patch Tuesday (the designated day for updates) for the accelerated updates, and again two weeks later for the critical updates.
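The three cycles and the once-a-month restart rule for desktops can be expressed as a small scheduling sketch. The durations come from the notes above; everything else (the dictionary layout, the function names, the release date) is a hypothetical illustration and not how SMS or Microsoft IT implements the process. The notes give no enforcement point for the accelerated cycle and no fixed window for the emergency cycle, so those are modeled as `None`.

```python
from datetime import datetime, timedelta

# Durations from the notes. None means the notes do not state a fixed value.
CYCLES = {
    "critical": {"enforce_after": timedelta(weeks=2), "sla": timedelta(weeks=3)},
    "accelerated": {"enforce_after": None, "sla": timedelta(hours=48)},
    "emergency": {"enforce_after": None, "sla": None},
}


def schedule(cycle, released):
    """Return the enforcement and SLA deadlines for an update released at `released`."""
    params = CYCLES[cycle]
    enforce = params["enforce_after"]
    sla = params["sla"]
    return {
        "enforcement": released + enforce if enforce is not None else None,
        "sla_deadline": released + sla if sla is not None else None,
    }


def desktop_cycle_for_month(update_cycles):
    """Desktops get one cycle per month: accelerated if any update is accelerated.

    Emergency updates are outside this monthly rule and are not handled here.
    """
    return "accelerated" if "accelerated" in update_cycles else "critical"


# Illustrative release date only (a Patch Tuesday in December 2006).
released = datetime(2006, 12, 12, 10, 0)
critical = schedule("critical", released)
monthly = desktop_cycle_for_month(["critical", "accelerated", "critical"])
print(critical["enforcement"], critical["sla_deadline"], monthly)
```

Collapsing the month's updates into a single cycle is what keeps desktop restarts to one per month.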

12 Degrees of Client Management
Diagram labels: IPsec boundary creates the Secure Net environment. All Devices ~330,000; Secure Net Devices ~270,000; Devices managed through SMS ~265,000; ~16,000 servers; Remote access clients/dial-up; Workgroups; Labs; Unique management challenges.
Degrees of Client Management
This slide gives a logical illustration of how the various degrees of client management are divided among all the devices. The outer portion (about 330,000) is the superset of all clients that are attached to the network and have obtained an IP address in the past 14 days. This includes all Remote Access Service (RAS) systems, labs, and temporary workgroups. Not all of these systems are actively managed through SMS, but they all require a certain level of compliance before they are allowed to connect to the network. For instance, noncompliant RAS clients are automatically put into quarantine until they have at least a base level of compliance, such as antivirus software and the latest critical updates.
The second circle in is Secure Net, which consists of about 270,000 manageable systems. These are systems that have proper IP security (IPsec) settings to join the domain and receive source code; they include all computers in the Active Directory® directory service that have been active in the past 14 days. These clients meet all mandatory security requirements.
A subset of those clients is the approximately 265,000 systems that are actively managed through SMS. These consist of data-center servers, desktop computers, and lab and pilot computers. The lab and pilot computers sometimes cross boundaries, so the numbers are constantly in flux.
The approximately 16,000 servers in the diagram’s innermost circle are the data-center servers that the Data Center Operations team handles. Therefore, of the 265,000 actively managed computers, 249,000 are considered clients or desktop computers.
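Because the circles are nested, the number of devices that sit in one circle but not in the next one inward is a simple difference. The sketch below works through that arithmetic with the approximate counts from the slide; the ring names and the helper function are illustrative only.

```python
# Approximate counts from the slide, ordered from the outermost ring inward.
rings = [
    ("all devices", 330_000),
    ("Secure Net devices", 270_000),
    ("managed through SMS", 265_000),
    ("data-center servers", 16_000),
]


def ring_only_counts(rings):
    """Return how many devices belong to each ring but not to the next inner ring."""
    result = {}
    for (name, count), (_, inner_count) in zip(rings, rings[1:]):
        result[name] = count - inner_count
    # The innermost ring has nothing inside it, so it keeps its full count.
    last_name, last_count = rings[-1]
    result[last_name] = last_count
    return result


counts = ring_only_counts(rings)
for name, value in counts.items():
    print(f"{name}: ~{value:,}")
```

The "managed through SMS" difference is the 249,000 clients or desktop computers quoted in the notes, and the four differences add back up to the outer total of about 330,000.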

13 Microsoft Update; E-mail and ITWeb Notification (Optional)
Multiple-Phased Approach to Client Management
Microsoft IT uses a multiple-phased approach for patching client systems. Deployment and enforcement of emergency and critical updates at Microsoft consist of the following phases:
1. Notification of Helpdesk: Microsoft IT notifies Helpdesk about the impending deployment of an update so that Helpdesk can be ready to support users.
2. Announcement through Windows Update: Microsoft IT fully supports using Windows Update for patching client systems. When a product development group releases a software update, Microsoft IT uses a Group Policy setting to notify employees that they can voluntarily go to the Windows Update Web site and install the update on their own. Microsoft uses this method because of its educated user base and the fact that the majority of employees enjoy high-speed, reliable network connections and full access to the Internet. This method enables employees to manage any necessary restarts in order to minimize the impact on their work.
3. Announcement through e-mail and the Microsoft IT internal Web site: Within four to six hours of the initial Windows Update publication, Microsoft IT sends e-mail throughout the organization and posts a notification on the Microsoft IT internal Web site that a critical update is available and must be installed by a certain date and time. Typically, in the Microsoft SMS 2003 environment at Microsoft IT, about 70 percent of desktop users voluntarily respond and install critical updates within the first 24 hours after the update is announced.
4. SMS 2003 software distribution forced patch management: Microsoft IT uses the SMS 2003 Advanced Client persistent icon in the Windows desktop notification area. The persistent icon reminds users of an impending update’s enforcement date and time three to five times a day until the grace period expires. After the grace period expires, Microsoft IT initiates an SMS 2003–based forced deployment of an update. For example, if Microsoft IT first advertises the update on Wednesday at 10:00 A.M. Pacific Time, Microsoft IT begins the forced deployment the next day (Thursday) at 5:00 A.M. Pacific Time so that the installation is available as users on the East Coast begin their day. In the Distribution Software Update Wizard, Microsoft IT selects Time Authorized (instead of Time Detected) for enforcement. This option offers a shorter time frame for forced deployment and helps Microsoft IT be confident that computers are patched as soon as users turn them on.
5. Logon script forced patch management for unmanaged desktop computers: Microsoft IT deploys the software update in a user logon script to patch the desktop computers in the unmanaged space, for example, computers in test labs that otherwise would not be forced to install the update because SMS does not manage them.
6. Environment scan: Twenty-four hours after the initiation of the SMS forced distribution, Corporate Security uses an internally developed vulnerability assessment tool to scan the entire environment (both managed and unmanaged desktop computers) and patches any computers that have not installed the update. After this round of forced patching, the vulnerability assessment tool creates a list of noncompliant computers; for example, those that were turned off at the time of the forced patch or those on slow links that are still unpatched. Corporate Security then uses an internally developed port shutdown tool to remove these unpatched systems from the corporate network. However, use of the port shutdown tool is rare at Microsoft.
Diagram labels, from low client impact to high client impact: Microsoft Update; E-mail and ITWeb Notification (Optional); SMS Patch Management (Voluntary > Forced); SER Scanning and Scripted Patching; Port Shutdown
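The timing in phases 4 and 6 can be worked through with a few lines of date arithmetic. This is a minimal sketch that generalizes the single example in the notes (advertise Wednesday at 10:00 A.M., force Thursday at 5:00 A.M., scan 24 hours later) into a "next day at 5:00 A.M." rule; that generalization, the function names, and the calendar date are assumptions. All times are treated as naive Pacific Time values.

```python
from datetime import datetime, time, timedelta


def forced_deployment_start(advertised, start_of_day=time(5, 0)):
    """Return 5:00 A.M. on the calendar day after the update was first advertised."""
    next_day = advertised.date() + timedelta(days=1)
    return datetime.combine(next_day, start_of_day)


def environment_scan_start(forced_start):
    """The environment scan begins 24 hours after the forced distribution starts."""
    return forced_start + timedelta(hours=24)


# Illustrative date only: a Wednesday at 10:00 A.M. Pacific Time.
advertised = datetime(2006, 12, 13, 10, 0)
forced = forced_deployment_start(advertised)
scan = environment_scan_start(forced)
print("forced deployment:", forced)
print("environment scan: ", scan)
```

With these inputs the voluntary window is 19 hours, which is consistent with the observation that most desktop users install within the first 24 hours.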

14 SMS Architectures Systems Management Server Data Center Lab Desktop
Central Site Central Site Central Site SMS Architectures The SMS environment at Microsoft IT consists of three main SMS hierarchies: one for managing data-center servers, one for managing lab systems, and one for managing client or desktop systems. Microsoft IT uses a separate hierarchy for managing data-center servers to ensure high-speed patching capabilities as well as rapid feedback in the form of frequently updated inventory data. Microsoft IT uses a separate hierarchy to manage lab systems because it can delegate more authority to local lab managers who need local control over software and update distribution. Note that on this slide, EMEA stands for Europe, Middle East, Africa, and Asia. Server Patch Management Redmond Puget Sound Primary Site Singapore Primary Site Dublin Australia-Asia EMEA North America Primary Sites Primary Site Puget Sound Distribution Points Distribution Points Distribution Points Primary Sites 14

15 SMS Redmond Redmond Primary Site Management Points SQL Replication
The largest single SMS site in the Microsoft IT SMS environment is the site that manages client systems in Redmond. Microsoft IT uses standard documented SMS scalability techniques, including multiple distribution points and multiple management points (within a Network Load Balancing [NLB] cluster). Microsoft also uses Structured Query Language (SQL) replication to give each management point its own copy of the database; the results are fast local access to data and removal of the workload from the server database at the primary site. Management Points SQL Replication Distribution Points NLB Cluster Random Selection Clients 15

16 Configuration Management Model
Self-Service Portal Asset management and reporting tightly linked to support operations Service management drives end-to-end IT services Metadata: manually populated Service > asset mapping Service scoping Exception tracking Element management “One Tool to Rule All” does not exist Federated model Integration Extensible modeling Data Analysis Problem Mgmt Incident Mgmt Change Mgmt Configuration Management Model This slide illustrates the enterprise configuration management architecture and basic information flow. Microsoft IT uses the Management Source of Record to determine exactly what infrastructure the team is supposed to be managing. The Source of Record is process driven. The integration framework ties that information into the business support applications, such as the change control processes, ticketing processes, and enterprise applications like SAP. In the same integration framework, details about those managed elements reach the infrastructure management applications, such as fault notification in MOM, SMS for updated configuration, and security in Active Directory Group Policy. Data aggregation and the data warehouse use the same integration framework to extract details about the managed infrastructure from the element managers (such as applications like Microsoft Exchange, telephony, network elements, and the server’s hardware and operating system). They then expose that information in a common repository so that the business support applications, such as ticketing, can use those details. The IT Config database also integrates with the next SMF to be discussed, Enterprise Monitoring and Control. Data Warehousing and Reporting CMDB Integration Framework Management Applications Fault : Config : Accounting : Performance Security : Audit Managed Infrastructure Telephony : Applications : Network : Server/Operating System 16 16

17 SQL Server Integration Services SQL Server Integration Services
Configuration and Reporting IT Services Catalog Self-Service Portal Configuration and Reporting One of the primary cost-saving initiatives of Microsoft IT is the drive toward self-service access to manageability data. Two key technologies in this arena are Microsoft SQL Server™ Report Builder and SQL Server Reporting Services. SQL Server Report Builder can provide a simplified interface to the management data collected by Microsoft System Center products. SQL Server Reporting Services provides ready access to canned reports via a Web browser and also enables more knowledgeable users to build their own reports, provided that they understand the database schemas of the source databases. Another focus area for self-service access to manageability services data is the use of bundled SQL Server Reporting Services reports that MOM and System Center Operations Manager management packs provide. Microsoft IT is in the forefront of providing requirements for, and evaluating, bundled reports in product management packs. Microsoft IT custom reports are generally submitted to the product group as suggestions for bundled reports to be included in management packs. SQL Server Analysis Services enables IT professionals to build cubes that provide concisely summarized data that directly addresses the business needs of specific consumers of data collected by MOM and SMS. Cubes that Microsoft IT builds in response to real business needs are routinely submitted to the product groups for inclusion in future management solutions. Microsoft IT provides ongoing feedback to the product groups regarding the need for denormalized database views that provide a simplified interface for IT professionals who are knowledgeable in basic SQL queries and can build their own reports or run their own queries, provided that they have a relatively simplified view into the data collected by System Center products. 
Offline Data Storage (ODS): Another key aspect of the need for near real-time reporting is the need for near real-time summarization of data. System Center Operations Manager 2007 will offer the ability to bypass the operational database for specific types of data and instead send data directly to the data warehouse, where it will be automatically summarized on a scheduled basis (such as hourly). Interval data is summarized and made available on an ongoing basis, eliminating the need to access the real-time operational database for near real-time data. SQL Server Report Builder SQL Server Reporting Services Views Scorecards Reports Data Warehousing And Analysis Services SQL Server Integration Services SQL Server Integration Services ODS Offload Other ODS SCCM/MOM IT Config 17 17
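The idea of summarizing interval data on an hourly schedule can be illustrated with a small aggregation routine. This is only a sketch of the general technique; it does not reflect the actual schema or aggregation logic of the Operations Manager data warehouse, and the sample values and field names are made up.

```python
from collections import defaultdict
from datetime import datetime


def summarize_hourly(samples):
    """Group (timestamp, value) samples by hour and compute basic statistics."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {
        hour: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for hour, values in sorted(buckets.items())
    }


# Hypothetical performance samples (for example, percent CPU utilization).
samples = [
    (datetime(2006, 12, 1, 9, 5), 20.0),
    (datetime(2006, 12, 1, 9, 35), 40.0),
    (datetime(2006, 12, 1, 10, 10), 30.0),
]
summary = summarize_hourly(samples)
for hour, stats in summary.items():
    print(hour, stats)
```

Reports can then read the small summary table instead of querying the raw samples in the operational database.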

18 Enterprise Monitoring and Control
Diagram labels: Systems Integration (Connectors); Ad-Hoc Gap Analysis; Self-Help UI; Multiple Console Views; Presentation Layer; Integrated Real-Time Tools
Enterprise Monitoring and Control
The Service Monitoring and Control SMF falls into the Operating quadrant of the Microsoft Operations Framework. The primary goal of enterprise monitoring and control is to observe the health of IT services and initiate remedial actions to minimize the impact of service incidents and system events. The Service Monitoring and Control SMF at Microsoft provides an end-to-end process that can be used to monitor services or individual components. It also provides core data about component or service trends and performance for other SMFs so that they can optimize the performance of IT services.
The diagram represents the integration that has occurred to unify the monitoring and control processes and environment:
The purple box represents a node on the network (a managed server), which contains hardware agents such as HP Systems Insight Manager and Dell OpenManage™. These hardware agents then connect to MOM through a MOM agent that resides on the managed node. The MOM agent not only acts as a relay for hardware agents but also collects and sends other pertinent metrics and information.
Specialized servers, such as Web servers and servers running SQL Server, use specific management packs, such as Internet Information Services (IIS) Management Pack and SQL Server Management Pack. They also connect directly to MOM to give Microsoft IT a single fault process and a single view of the health and availability of Hypertext Transfer Protocol (HTTP) and SQL Server services.
Any event information is first sent up through a Microsoft IT middle-tier MOM 2005 console that enables viewing and action by specialized personnel, depending on what information is being sent.
The same is true for messaging information and alerts; a middle-tier console captures specialized events and uses the same MOM agents deployed on the managed servers. All events and alerts eventually go to a centralized MOM 2005 console for a clear, single view of the network, and to allow for integration with other important management tools such as the Configuration Management Database (CMDB) and the trouble ticketing processes. Application-level monitoring uses the same MOM agent to gather information. However, this information, because it does not directly affect the network as a whole, is routed to a specialized application management console for the business unit manager’s consumption. Network-specific monitoring, such as routers and switches, relies on separate third-party network agents. Events and alerts are first piped to a third-party network monitoring console, but MOM enables direct integration of information from this console, providing this information to the centralized management console. This holistic integration gives Microsoft IT a consolidated network view and unified processes, reducing the time and effort required to manage the extremely large environment. Console Self-Help Reporting Ad Hoc Management Pack Baseline Reduce No Problem Found/Duplicate tickets Event-to-Ticket Ratio Event Stream Cleanup Alert Stream Notification Workflow Alert Stream Environment Consolidation Onboarding MOM V3 Architecture Audit Event Collection Network Management Network Source Information Internal Network Labs Extranet MMS 18 18

19 MOM 2005 Architecture Real-Time Monitoring Tools
IT Config CMDB MOM 2005 Master Centralized Monitoring Console
MOM 2005 Architecture
Microsoft IT uses a tiered implementation of MOM 2005 for the following reasons:
A tiered implementation provides scalability by handling the high-volume performance and event data collection at the middle tier or zone layer. Only configuration and alert data are forwarded to the top tier or master management group.
A tiered implementation enables monitoring data from separate network environments to be consolidated onto a single console with minimal security exposure. Only a limited number of TCP ports need to be opened in network firewalls between specific MOM servers to allow forwarding of alert data and discovered topology data.
A tiered implementation enables monitoring rules to be centrally maintained in a single top-tier management group and systematically propagated to distributed zone management groups in a controlled manner, providing a structured change management process for enterprise monitoring.
Microsoft IT uses management packs from Microsoft product groups as well as from hardware vendors such as HP and Dell. These management packs enable Microsoft IT to cover hardware, operating system, and Windows infrastructure services monitoring with a single monitoring infrastructure. Application-layer services such as SQL database services, IIS, and custom line-of-business (LOB) applications can also be monitored in the same infrastructure.
The MOM Connector Framework provides standardized integration with other management systems, such as network management, asset management, and ticketing systems. Microsoft IT uses a MOM connector provided by EMC to integrate network monitoring alerts with server monitoring alerts in the MOM 2005 master management group. MOM is also used to provide end-to-end monitoring of availability and performance of Web-based services.
Service Desk VM Network Intranet MOM 2005 Zone Messaging MOM 2005 Zone Extranet MOM 2005 Zone MOM 2005 Data Warehouse VM VM VM EMC Smarts Business Unit Application Console MOM Agents Intranet Management Group 2,039 agents Intranet Management Group 2,060 agents Extranet Management Group 1,988 agents MOM 2005 Applications MG VM
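The first reason given for the tiered design, that high-volume performance and event data stays at the zone layer while only configuration and alert data moves up, can be illustrated with a minimal filter. The Record type, its field names, and forward_to_master are hypothetical names introduced for this sketch; they do not correspond to MOM 2005 interfaces.

```python
from dataclasses import dataclass

# Kinds of data that a zone management group forwards to the master
# management group, according to the notes. Everything else stays local.
FORWARDED_KINDS = {"alert", "configuration"}


@dataclass(frozen=True)
class Record:
    """Hypothetical unit of monitoring data collected in a zone."""
    zone: str
    kind: str    # "performance", "event", "alert", or "configuration"
    detail: str


def forward_to_master(records):
    """Return only the records that would travel to the top tier."""
    return [r for r in records if r.kind in FORWARDED_KINDS]


collected = [
    Record("Intranet", "performance", "CPU sample"),
    Record("Intranet", "alert", "disk space low"),
    Record("Extranet", "event", "service started"),
    Record("Extranet", "configuration", "agent discovered"),
]
forwarded = forward_to_master(collected)
```

In this sample, two of the four collected records are forwarded, which is the scalability point the notes make: the bulk of the data never crosses the firewall between tiers.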

20 MOM 2005 Architecture Drill-Down
Multi-Homed Agents Application Monitoring Management Group Infrastructure Monitoring Management Group Hardware Operating System Infrastructure services Application SQL Server IIS
MOM 2005 Architecture Drill-Down
Multi-homed Agents
The primary monitoring MOM infrastructure is responsible for deploying and managing the health of MOM agents. This infrastructure:
Builds on MOM agent infrastructure.
Enables autonomous monitoring of specialized applications and services.
Provides a simple architecture for production and pre-production environments.
Managed Server Production Management Group Production Management Group Pre-Production Management Group
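A multi-homed agent reports to more than one management group at once, with each group interested in a different layer of the managed server. The sketch below models that idea only; the MultiHomedAgent class, its methods, and the server name are hypothetical, and the component names are taken from the slide labels.

```python
class MultiHomedAgent:
    """Hypothetical model of one agent reporting to several management groups."""

    def __init__(self, server_name):
        self.server_name = server_name
        self._memberships = {}  # group name -> set of monitored components

    def join(self, group_name, monitored_components):
        """Register the agent with a management group and its area of interest."""
        self._memberships[group_name] = set(monitored_components)

    def recipients(self, component):
        """Return the management groups that receive data about `component`."""
        return sorted(
            group
            for group, components in self._memberships.items()
            if component in components
        )


agent = MultiHomedAgent("web01")  # server name is illustrative
agent.join("infrastructure",
           ["hardware", "operating system", "infrastructure services"])
agent.join("application", ["SQL Server", "IIS"])
```

With this setup, data about IIS goes only to the application group and data about hardware goes only to the infrastructure group, so each team can monitor autonomously while sharing one agent on the managed server.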

21 Event Pattern Monitoring WMI Subscriber
ACS Architecture Collection Databases Reporting Databases Collectors Intranet Domain Controllers
ACS Architecture
The current Audit Collection System (ACS) has a set of collectors, collection databases, and reporting databases. Custom applications are also used to subscribe to a subset of the consolidated event stream for real-time monitoring purposes. Separate sets of collectors, collection databases, and reporting databases are deployed for each network environment.
ACS facilitates the collection of high volumes of security audit events into a central reporting/auditing database. These events provide an audit trail of activities determined by Active Directory audit policy and can be used for after-the-fact investigations. The central event collectors also provide a central point for monitoring security audit event streams.
ACS provides a mechanism for WMI Query Language (WQL) subscription to specific subsets of centrally collected security audit event streams. This mechanism provides a real-time monitoring capability that can detect distributed attacks.
More than 80 million events are collected per day on a single ACS collector. Total event collections average about 275 million per day in the Microsoft IT environment.
ACS replaces a dedicated MOM management group for audit event collection.
Zero-configuration forwarder deployments lower total cost of ownership (TCO). Forwarder deployments are automated through SMS.
To help detect cross-server event patterns indicative of suspicious behavior, custom code performs WMI subscriptions to consolidated event streams.
Collected data is migrated to separate reporting servers to offload reporting/query activity and to enable longer-term event storage without loss of performance on the collection databases.
DTS SQL Event Pattern Monitoring WMI Subscriber WMI Intranet Exceptions DTS SQL WMI DTS Extranet SQL WMI
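The notes do not say which cross-server patterns the custom code looks for. As one plausible illustration, the sketch below flags an account that appears in events from several different servers within a short time window. It consumes plain tuples rather than a real WMI subscription; the class name, the thresholds, and the choice of pattern are all assumptions made for this example.

```python
from collections import defaultdict, deque


class CrossServerDetector:
    """Hypothetical sliding-window check over a consolidated event stream.

    Flags an account once it has been seen on at least `min_servers`
    distinct servers within `window_seconds`. Events are assumed to
    arrive in non-decreasing timestamp order.
    """

    def __init__(self, window_seconds=60, min_servers=3):
        self.window_seconds = window_seconds
        self.min_servers = min_servers
        self._recent = defaultdict(deque)  # account -> deque of (time, server)

    def observe(self, timestamp, server, account):
        """Record one event and return True if the pattern is now present."""
        events = self._recent[account]
        events.append((timestamp, server))
        # Drop events that have fallen out of the time window.
        cutoff = timestamp - self.window_seconds
        while events and events[0][0] < cutoff:
            events.popleft()
        distinct_servers = {name for _, name in events}
        return len(distinct_servers) >= self.min_servers
```

Keeping only a per-account window of recent events bounds memory per account, which matters at the event volumes quoted above: about 275 million events per day is roughly 3,200 events per second on average.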

22 Intranet Operations Manager
Operations Manager 2007 Planned Architecture IT Config CMDB Operations Manager Centralized Monitoring Console
Operations Manager 2007
System Center Operations Manager 2007 management servers can consolidate the functionality of ACS. Similarly, the database collecting security audit events can be co-located on the System Center operational database (for small to medium customers), or it can be separated onto a distinct database of security audit events for security reasons or for scalability reasons at the discretion of enterprise customers.
Service Desk VM Network Intranet Operations Manager Server Zone Intranet Operations Manager Client Zone Extranet Operations Manager Zone Operations Manager Data Warehouse VM VM VM EMC SMARTS Audit Collection Database Audit Collection Database Audit Collection Database

23 Manageability Best Practices
Outsource to Automation
Self-service manageability services
Single console for operations
Automated agent management
Automated ticketing
Drive down alerts/tickets
MOF processes drive services
Implement service catalog and CMDB
Smart Consolidation
Infrastructure: Exchange
Internet connected offices (ICOs): consider ICOs and modified SLAs
Use virtual servers (utility model)
Consider backup to disk
Thresholds for Logical Drives Security Update Status Maintenance Windows Internet Connection Local Server Directory Services Exchange Backup Server

24 Manageability Best Practices
Centralization Through Processes (MOF)
Processes first, tools after
Change, configuration, monitoring, incident/problem management
Tier 3 (base) support
Service Focus
End-to-end service management ownership
Service level management
IT tax vs. customer-driven chargeback
People Tools MOF Processes IT Services Service Life Cycle

25 For More Information
Additional content about Microsoft IT deployments and best practices can be found on Microsoft TechNet and in the Microsoft Case Study Resources.
About Microsoft IT Showcase
Microsoft IT Showcase is a collection of key business applications, deployment strategies, early adopter experiences, best practices, and leading-edge initiatives direct from the Microsoft IT organization. IT Showcase features case studies, white papers, presentations, and multimedia presentations that illustrate internal business applications, product deployment experiences, and other key IT initiatives being implemented within Microsoft.
The Microsoft IT Experience
Early adopter: Microsoft IT is often the first to implement new Microsoft products in a production environment, and to develop line-of-business applications based on Microsoft technologies. Knowing what challenges we've faced and how we dealt with them can help you as you plan and execute similar projects.
Large-scale deployments: Microsoft IT oversees worldwide deployments, both of Microsoft products and those of other vendors. The issues we have to deal with and the lessons we learn along the way can help you as you gear up for your own large rollouts.

26 This document is provided for informational purposes only.
© 2006 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY. Microsoft, Active Directory, Outlook, SharePoint, Windows, Windows Server, and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
