Enhanced Data Description for End Users ScribeKey, LLC Brian Hebert, Solutions Architect


1 Enhanced Data Description for End Users ScribeKey, LLC Brian Hebert, Solutions Architect

2 ScribeKey Project Experience
- Global FGDC metadata production for large commercial data provider(s).
- Federal agency assistance: assess, describe, and standardize a large collection of geospatial datasets.
- Experience with data cleansing, metadata, integration, presentation, and application development.
Project scale (slide callouts):
- 200+ countries, 72 layers, 100s of attributes, 100s of domains, quarterly updates
- 50+ states, 400 layers, 1000s of attributes, 100s of domains, annual updates

3 Goal: Make Data Easy to Understand and Use
- Data users today have more information than ever to keep track of.
- An individual provider's data may be just one part of a larger data use and mission.
- Learning about data can take considerable time and effort.
- How can we best help data customers understand and use data effectively? Reduce the learning curve.

4 Multiple Data Description Sources
Users learn how to use data through a variety of sources.
Diagram labels: Website, Documentation, Metadata, User, Tech Support, Data Itself, Email

5 Data Description Checklist
- Is there a Data User Guide? A glossary and index?
- Are primary data categories and entities fully described?
- Are all acronyms, abbreviations, and provider vocabulary terms explained?
- Are short, cryptic database field names and values explained?
- Are data types, lengths, keys, nulls allowed, formats, and lists clear enough to help the user form SQL queries?
- Is FGDC/ISO Metadata available?
- Are sample values and data profiles available?
- Are data presentations, maps, symbols, and reports prepared for quick start?
- Is all this info in one place?
Complete metadata describes Meaning, Structure, and Contents. Maximize understanding by the end user to help write queries/reports.

6 Solution: Lightweight HTML Data Dictionary
- Full descriptions of data categories, entities, attributes, and domain values.
- Information integrated from documentation, data profiles, metadata, and the data provider website.
- Available as standalone HTML or on a web site.
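As a rough illustration of the standalone HTML idea, the sketch below renders a small in-memory description of layers and attributes into a single HTML page. It is not ScribeKey's implementation: the `layers` structure, the function name `render_dictionary`, and the sample layer and field names are all hypothetical.

```python
from html import escape

# Hypothetical dictionary content; layer and attribute names are placeholders.
layers = [
    {
        "name": "Roads",
        "geometry": "Line",
        "definition": "Street centerlines & highway segments.",
        "attributes": [
            {"name": "RD_NAME", "type": "Text(60)", "definition": "Full street name."},
            {"name": "RD_CLASS", "type": "Text(5)", "definition": "Road class code."},
        ],
    },
]


def render_dictionary(layers):
    """Return one standalone HTML page describing every layer and its attributes."""
    parts = [
        "<!DOCTYPE html>",
        "<html><head><meta charset='utf-8'><title>Data Dictionary</title></head><body>",
        "<h1>Data Dictionary</h1>",
    ]
    for layer in layers:
        # escape() keeps characters such as '&' and '<' from breaking the markup.
        parts.append(f"<h2>{escape(layer['name'])} ({escape(layer['geometry'])})</h2>")
        parts.append(f"<p>{escape(layer['definition'])}</p>")
        parts.append("<table border='1'>")
        parts.append("<tr><th>Attribute</th><th>Type</th><th>Definition</th></tr>")
        for attr in layer["attributes"]:
            parts.append(
                "<tr>"
                f"<td>{escape(attr['name'])}</td>"
                f"<td>{escape(attr['type'])}</td>"
                f"<td>{escape(attr['definition'])}</td>"
                "</tr>"
            )
        parts.append("</table>")
    parts.append("</body></html>")
    return "\n".join(parts)


page = render_dictionary(layers)
```

Writing `page` to a file gives a document that opens in any browser without a server, which is the "lightweight" property the slide emphasizes.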

7 Dataset Overview
- A Library Science indexing/abstracting approach is taken to ensure the most important and useful information is seen first.
- The focus here is on clearly describing top-level data categories, layers, and tables.
- Key data provider terminology and concepts are explained.

8 Layer and Table Details
- Includes Name, Geometry Type, Definition, Attribute List, Keywords, and a link to standard FGDC/ISO Metadata.
- Drill down to review Attributes and Domains.
- Search/Highlight/Filter/Sort.
- FGDC metadata is typically organized and accessed as a set of separate XML documents. ScribeKey's approach integrates these separate documents, making all information available at a single access point.

9 Attributes and Domain Values
- Core Data Info: all dataset metadata, including Data Type, Length, Format, Nulls Allowed, Primary and Foreign Keys, Join Information, Sample Values, and Percent Complete.
- This data profiling information is essential for end users who want to generate information products such as reports, maps, charts, and graphs from SQL queries.
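The per-attribute items named on this slide (data type, length, nulls, sample values, percent complete) can be computed directly from column values. The following is a minimal sketch under stated assumptions: the function name `profile_column`, the treatment of empty strings as missing, and the sample values are illustrative and not taken from the presentation.

```python
from collections import Counter


def profile_column(name, values, sample_size=3):
    """Summarize one attribute: type, length, nulls, completeness, sample values.

    Assumption: None and the empty string both count as missing.
    """
    non_null = [v for v in values if v is not None and v != ""]
    total = len(values)
    counts = Counter(non_null)
    return {
        "attribute": name,
        # Type is inferred from the first populated value only (a simplification).
        "data_type": type(non_null[0]).__name__ if non_null else "unknown",
        "max_length": max((len(str(v)) for v in non_null), default=0),
        "nulls_present": len(non_null) < total,
        "percent_complete": round(100.0 * len(non_null) / total, 1) if total else 0.0,
        # Most frequent values first, so users see typical contents.
        "sample_values": [v for v, _ in counts.most_common(sample_size)],
    }


# Hypothetical column contents for illustration.
profile = profile_column("RD_CLASS", ["A10", "B20", None, "A10", ""])
```

A profile like this tells a query author up front whether a field is populated enough to filter or join on.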

10 Helping with the Data Provider/End User Communication Gap
- Data providers and users have different languages and understandings of data.
- Use of keywords, aliases, and definitions in the data dictionary helps bridge this gap; it provides a translation between the two.
User Language: Layer, Table, Attribute, Map, Symbol, Centroid, Join, Report
Provider Language: Impute, FROMHN, EDGES, ADDRFN, Internal Point, MTFCC, S1100
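One way to picture the "translation" role of aliases is a lookup that resolves either a provider term or a user-language alias to the same dictionary entry. This is only a sketch: the terms come from the slide, but the alias pairings and the short definitions in `GLOSSARY` are illustrative assumptions, as are the function names.

```python
# Provider terms are from the slide; aliases and definitions are assumed glosses.
GLOSSARY = {
    "FROMHN": {
        "definition": "Beginning house number of an address range (assumed gloss).",
        "aliases": ["from house number"],
    },
    "Internal Point": {
        "definition": "Representative point inside a feature (assumed gloss).",
        "aliases": ["centroid"],
    },
    "MTFCC": {
        "definition": "Feature classification code (assumed gloss).",
        "aliases": ["feature class code"],
    },
}


def build_index(glossary):
    """Map every term and alias, lowercased, to its canonical provider term."""
    index = {}
    for term, entry in glossary.items():
        index[term.lower()] = term
        for alias in entry["aliases"]:
            index[alias.lower()] = term
    return index


def lookup(word, glossary, index):
    """Return (provider_term, definition) for a user or provider word, else None."""
    term = index.get(word.strip().lower())
    if term is None:
        return None
    return term, glossary[term]["definition"]


INDEX = build_index(GLOSSARY)
result = lookup("Centroid", GLOSSARY, INDEX)
```

Here a user searching for "Centroid" lands on the provider's "Internal Point" entry, which is the kind of bridging the slide describes.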

11 How Does Data Profiling Help?
An essential tool for enhanced metadata: it shows the end user actual sample values, data types, lengths, formats, percent complete, etc. This valuable contents information is typically not found in metadata.

NUM  FIELD          DESCRIPTION
1    DatasetId      A unique identifier for the dataset
2    DatabaseName   The name of the source database
3    TableName      The name of the source database table
4    RecordCount    The number of records in the table
5    ColumnCount    The number of columns in the table
6    NumberOfNulls  The number of null values in the table
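The six fields listed in the table above can be filled in for any database table with a few counting queries. The sketch below uses Python's built-in `sqlite3` module and an in-memory table as a stand-in; the function name `profile_table`, the sample `roads` table, and the dataset identifier are hypothetical, and the presentation does not say which database or tooling was actually used.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier for SQLite (doubles any embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'


def profile_table(conn, dataset_id, database_name, table_name):
    """Build a table-level profile record with the fields shown on the slide."""
    table = quote_ident(table_name)
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    record_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    null_count = 0
    for col in columns:
        query = f"SELECT COUNT(*) FROM {table} WHERE {quote_ident(col)} IS NULL"
        null_count += conn.execute(query).fetchone()[0]
    return {
        "DatasetId": dataset_id,
        "DatabaseName": database_name,
        "TableName": table_name,
        "RecordCount": record_count,
        "ColumnCount": len(columns),
        "NumberOfNulls": null_count,
    }


# Hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (id INTEGER, name TEXT, road_class TEXT)")
conn.executemany(
    "INSERT INTO roads VALUES (?, ?, ?)",
    [(1, "Main St", "A10"), (2, None, "B20"), (3, "Elm St", None)],
)
table_profile = profile_table(conn, "DS-001", "demo.db", "roads")
```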

12 ScribeKey Metadata Generation
- Sample data is reviewed and profiled. Any existing metadata is imported into the repository.
- From the profile, existing user documentation, technical support staff, and the website, a metadata repository is populated and metadata document templates are developed.
- FGDC/ISO Metadata is generated, as XML/HTML reports, from the metadata repository.
Diagram labels: Metadata Repository, Metadata Templates, Metadata Export App; outputs: FGDC XML, HTML, PDF, DOC
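To make the repository-to-XML step concrete, the sketch below turns one repository record into a small XML fragment using the FGDC CSDGM short element names (`idinfo`, `citation`, `citeinfo`, `title`, `eainfo`, `detailed`, `enttyp`, `attr`, and so on). It is an assumption-laden illustration, not the ScribeKey export application: the `record` structure and function name are hypothetical, and the output omits the many other sections a complete, valid FGDC record requires.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository record; a real repository row would carry far more.
record = {
    "title": "Roads",
    "entity": "roads",
    "entity_definition": "Street centerline segments.",
    "attributes": [
        {"label": "RD_NAME", "definition": "Full street name."},
        {"label": "RD_CLASS", "definition": "Road class code."},
    ],
}


def build_fgdc_fragment(record):
    """Return a partial FGDC-style XML string for one entity and its attributes."""
    root = ET.Element("metadata")

    # Identification section: only the citation title is filled in here.
    idinfo = ET.SubElement(root, "idinfo")
    citation = ET.SubElement(idinfo, "citation")
    citeinfo = ET.SubElement(citation, "citeinfo")
    ET.SubElement(citeinfo, "title").text = record["title"]

    # Entity and attribute section: one entity type plus its attributes.
    eainfo = ET.SubElement(root, "eainfo")
    detailed = ET.SubElement(eainfo, "detailed")
    enttyp = ET.SubElement(detailed, "enttyp")
    ET.SubElement(enttyp, "enttypl").text = record["entity"]
    ET.SubElement(enttyp, "enttypd").text = record["entity_definition"]
    for attr in record["attributes"]:
        node = ET.SubElement(detailed, "attr")
        ET.SubElement(node, "attrlabl").text = attr["label"]
        ET.SubElement(node, "attrdef").text = attr["definition"]

    return ET.tostring(root, encoding="unicode")


xml_text = build_fgdc_fragment(record)
```

Because the XML is generated from structured records rather than edited by hand, the same record could also feed HTML or other report templates.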

13 Map, Query, Report Preparation
- Prepared for end user quick start: can include symbol setup, joins/relates, maps, queries, and reports.
- Use metadata to create GIS layers that allow a variety of map presentations, reports, etc. to summarize and highlight datasets by metadata values.
Diagram labels: Metadata Layers, .MXD Preparation

14 The Geospatial Metadata Repository
The Metadata Repository, implemented as an RDBMS, is populated with automated tools and then used to generate metadata outputs, data dictionary content, schemas, maps, etc.
Repository contents (diagram): Areas, Entities, Attributes, Domains
Inputs (diagram): Data Layers, Metadata Documents, Assessments
Outputs (diagram): Derivative Datasets, Meta-Maps, Pivot Tables, Schemas, Data Dictionary, Enhanced User Views
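A relational repository along these lines could be sketched as four linked tables, one per diagram label. The slide gives only the labels (Areas, Entities, Attributes, Domains); every column name, the foreign-key layout, the sample rows, and the use of SQLite below are assumptions made for illustration.

```python
import sqlite3

# Assumed schema: table names echo the slide's labels, columns are hypothetical.
SCHEMA = """
CREATE TABLE areas (
    area_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE entities (
    entity_id  INTEGER PRIMARY KEY,
    area_id    INTEGER NOT NULL REFERENCES areas(area_id),
    name       TEXT NOT NULL,
    definition TEXT
);
CREATE TABLE attributes (
    attribute_id INTEGER PRIMARY KEY,
    entity_id    INTEGER NOT NULL REFERENCES entities(entity_id),
    name         TEXT NOT NULL,
    data_type    TEXT,
    definition   TEXT
);
CREATE TABLE domains (
    domain_id    INTEGER PRIMARY KEY,
    attribute_id INTEGER NOT NULL REFERENCES attributes(attribute_id),
    value        TEXT NOT NULL,
    meaning      TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical sample content.
conn.execute("INSERT INTO areas VALUES (1, 'Transportation')")
conn.execute("INSERT INTO entities VALUES (1, 1, 'Roads', 'Street centerlines')")
conn.executemany(
    "INSERT INTO attributes VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "RD_NAME", "TEXT", "Full street name"),
        (2, 1, "RD_CLASS", "TEXT", "Road class code"),
    ],
)
conn.executemany(
    "INSERT INTO domains VALUES (?, ?, ?, ?)",
    [(1, 2, "1", "Primary"), (2, 2, "2", "Secondary")],
)

# One query yields data dictionary rows: entity, attribute, domain value, meaning.
rows = conn.execute(
    """
    SELECT e.name, a.name, d.value, d.meaning
    FROM entities e
    JOIN attributes a ON a.entity_id = e.entity_id
    LEFT JOIN domains d ON d.attribute_id = a.attribute_id
    ORDER BY e.name, a.name, d.value
    """
).fetchall()
```

The `LEFT JOIN` keeps attributes that have no coded domain, so free-text fields still appear in the generated dictionary.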

15 Recap: ScribeKey Data Description Support
- Generate or upgrade FGDC/ISO Metadata.
- Profile data to provide the user with actual contents information.
- Help develop Data User Guides (PDF) and website copy.
- Help author indexes, abstracts, and glossaries.
- Integrate multiple and separate data description materials in a single lightweight HTML front end.
- Help prepare ArcMap .mxd documents, symbols, joins, reports, and maps.
Result: Data is as easy to understand and use as possible.

16 About ScribeKey, LLC
- Massachusetts corporation.
- Brian Hebert, PMP: 30+ years designing and building desktop and web DB/GIS solutions.
- Extensive experience producing metadata and data dictionaries for data providers and end users.
- Extensive experience with data integration, data quality assessments, data cleansing, ETL, and application development with ESRI/ArcObjects, .NET, SQL, XML, and HTML.
- Small focused teams, template approach, quick turnarounds, practical approach.
