Brand Niemann, US EPA & Co-Chair SICoP


1 A Needle in a Haystack: What Web Users Are Searching For: The Federal Sitemaps Initiative
Brand Niemann, US EPA & Co-Chair SICoP
Excellence in Government Conference, Washington Convention Center
Breakout Session II: April 4, 11:15 am to 12:15 pm
Google: Federal Sitemaps
Google: SICoP

2 Prospectus: "A Needle in a Haystack: What Web Users Are Searching For"
The federal government is both the world's largest information source and the inventor of the Internet. So why is it so hard for federal employees and citizens to find the information they need? Hear how Google and other leaders in Internet search technology make government more open, transparent and customer-focused. Gain new perspectives on how to embrace technologies to increase your agency's presence on the Web.

3 Agenda
Moderator: Jon Desenberg, PerformanceWeb.Org
Google: JL Needham. Sitemaps at FOSE 2007 and the need for agencies to balance their investment in web and site search (see next slide).
Science.Gov: Walt Warnick. OSTI's specific (and ongoing) experience with implementing sitemaps to make deep web information accessible to researchers using search engines.
State Department: Luigi Canali. Managing its web publishing centrally and, in particular, implementing sitemaps to ensure automatic communication of newly added content to Google.
Federal CIO Council's Sitemaps Initiative: Brand Niemann. Broader policy context and the value of the Federal government as a whole embracing the Sitemap protocol and similar standards.

4 Web search vs. site search
Supporting the two levels of search
Attribute       | Web search                                    | Site search
Search scope    | All of the open and accessible deep web       | A segment of your public sites' content
User            | Citizens and professionals                    | Professionals and citizens
Freshness       | Search engine crawling intervals              | Customizable
Crawling        | Limited by robots.txt, dynamic content        | Limited by server capacity and cost
Reporting tools | High-level stats                              | More detailed, all facets
Cost            | Free                                          | Varies
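The robots.txt limit on web-search crawling mentioned in this slide can be checked before a site is published. The following is a minimal sketch using Python's standard-library robots.txt parser; the robots.txt content and the example.gov URLs are hypothetical and are not taken from any agency site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative agency site (an assumption,
# not a real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /search
"""


def crawl_report(robots_txt, urls, agent="*"):
    """Return {url: True/False} telling whether `agent` may fetch each URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(agent, url) for url in urls}


report = crawl_report(
    ROBOTS_TXT,
    [
        "https://www.example.gov/reports/index.html",   # static page
        "https://www.example.gov/cgi-bin/query?id=42",  # dynamic database page
    ],
)
for url, allowed in report.items():
    print("crawlable" if allowed else "blocked", url)
```

In this sketch the dynamic database page is the one reported as blocked, which is the kind of content the slide describes as out of reach for web search.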

5 Federal Government Context
Government information is estimated to be about 80% unstructured, and about 90% of the structured information is estimated to be invisible to search engine crawlers and users. In addition: (1) the UK government recently announced that hundreds of its websites are being consolidated or shut down to make access to information easier for people, and (2) the recent SICoP Special Conference on Building DRM 3.0 and Web 3.0 was held in support of the Federal CIO Council Strategic Plan for FY Goal 2 (Information securely, rapidly, and reliably delivered to our stakeholders) to provide implementation strategies, best practices, and success stories. It therefore seems appropriate to pilot a process that deals with all of these issues at the same time.
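To make the two estimates above concrete, the short calculation below combines them. It assumes the 90% figure applies to the structured 20% only, as the slide states, and simply multiplies the shares; the variable names are illustrative.

```python
from fractions import Fraction

# Estimates quoted on the slide.
unstructured = Fraction(80, 100)             # share of all government information
invisible_of_structured = Fraction(90, 100)  # share of the structured part

structured = 1 - unstructured
structured_invisible = structured * invisible_of_structured
structured_visible = structured - structured_invisible

print(f"structured but invisible to crawlers: {float(structured_invisible):.0%}")
print(f"structured and visible to crawlers:   {float(structured_visible):.0%}")
```

Under these assumptions, roughly 18% of all government information is structured but invisible to crawlers, and only about 2% is both structured and visible.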

6 EPA Context
Sample list of EPA sites with uncrawlable elements. Total: 27

7 EPA Webmaster Experience
“Sitemaps as a method for discovering database content is something that I heartily endorse. It makes sense, and it's good to have a data standard for doing it. Google, et al. are to be commended for that. Too bad it's such a minimalist protocol! As we work to expose database contents to our internal search engine, we will keep in mind the need to express that content in a Sitemap protocol as well. EIMS is our first target database, hopefully tackling it this spring.” Source: John Shirey, Notes on Federal Sitemaps Discussion, January 10, 2007.
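Expressing database content in the Sitemap protocol, as the quote describes, amounts to listing one URL per record in a small XML file. The sketch below is one possible way to do that with Python's standard library; it is not EPA's implementation, and the record list and example.gov URLs are hypothetical stand-ins for the result of a database query. The `urlset`, `url`, `loc`, and `lastmod` elements and the namespace are those defined by the Sitemap protocol at sitemaps.org.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(records):
    """Build a Sitemap XML string from (url, lastmod) pairs.

    `lastmod` is a YYYY-MM-DD string, or None to omit the element.
    """
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in records:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # ElementTree escapes "&" in query strings as "&amp;" on output.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical rows, standing in for records pulled from a database.
records = [
    ("https://www.example.gov/db/record?id=101&view=full", "2007-01-10"),
    ("https://www.example.gov/db/record?id=102", None),
]
print(build_sitemap(records))
```

The protocol is minimal, as the quote notes: beyond the URL itself, each entry carries only optional hints such as the last modification date.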

8 EPA Pilot
March 15th, EPA Web Workgroup Presentation
Objectives:
Structure unstructured EPA information.
Make EPA databases visible to search engine crawlers and users.
Consolidate EPA information to make it easier to use.
Provide semantic metadata and linking in support of DRM 3.0 and Web 3.0 applications.
Pilot Content: The new EPA Strategic Plan, Report on the Environment, Enterprise Architecture, and Performance Results were used to illustrate the “long tail” of search (being successful with obscure queries). See

9 Policy Context
The CIO Council's XML Community of Practice (xml.gov) and the Semantic Interoperability Community of Practice (SICoP) encourage adoption and implementation of the Sitemap protocol by federal agencies because it:
Supports the E-Government Act of 2002 (Pub. L. No ).
Supports the Federal Enterprise Architecture's Data Reference Model 2.0.
Supports the SICoP DRM 2.0 Implementation - Knowledge Reference Model.
Supports the new CIOC Strategic Plan FY

10 Policy Context
Policy | Response
E-Government Act of 2002 | Organize and categorize information intended for public access and ensure it is searchable across agencies.
Federal Enterprise Architecture's Data Reference Model 2.0 | Identify how information and data are created, maintained, accessed, and used.
SICoP DRM 2.0 Implementation - Knowledge Reference Model | Use of increasing metadata to provide increasingly powerful search results. See next slide.
CIOC Strategic Plan FY | Provide updates to the FEA Data Reference Model (DRM) and establish DRM implementation strategies, best practices, and success stories.

11 From Search to Knowing Source: Figure 10 in SICoP White Paper Series Module 2: Semantic Wave Executive Guide to the Business Value of Semantic Technologies, May 15, 2006, Principal Author Mills Davis, Project10X.

12 From Search to Knowing
From bottom to top, the amount, kinds, and complexity of metadata, modeling, context, and knowledge representation increase. From left to right, reasoning capabilities advance from (a) information recovery based on linguistic and statistical methods, to (b) discovery of unexpected relevant information and associations through mining, to (c) intelligence based on correlation of data sources, connecting the dots, and putting information into context, to (d) question answering ranging from simple factoids to complex decision support, and (e) smart behaviors including robust adaptive and autonomous action.

13 From Search to Knowing
Moving from lower left to upper right, the diagram depicts a spectrum of progressively more capable categories of knowledge representation together with standards and formalisms used to express metadata, associations, models, contexts, and modes of reasoning. As the amount and expressive power of the semantics and knowledge increase, so does the value of the reasoning capacity they enable.

14 Upcoming Events
April 25, 2007: SICoP Special Conference 2: Building Knowledgebases for Cross-Domain Semantic Interoperability. Google: DRM 3.0 and Web 3.0
May 6-8, 2007: The 22nd Semi-Annual Spring Government CIO Summit. Government by Wiki: New Tools for Collaboration, Information-Sharing, and Decision-Making. Web 2.0 Essentials for Government: Tying It All Together in a Service System.

