Data Management: Data Processing Types of Data Processing at USGS There are several ways to classify Data Processing activities at USGS, and here are some.

Presentation transcript:

Data Management: Data Processing
PROCESS Landing Page Strawman

Types of Data Processing at USGS
There are several ways to classify Data Processing activities at USGS, and here are some of them.

A Process Can Exist Anywhere Within the Data Lifecycle
The Process “stage” of the data lifecycle is not limited to data preparation activities after Acquisition and before Analysis, but includes all data handling activities from obtaining data and initial storage, through basic data screening and preparation, iterating with data changes prompted during analysis, and culminating with actions that prepare data for long-term preservation and sharing. Processes may also be created for producing documentation, managing data quality, and data protection.

Definition
Data Processing covers any data manipulation activity resulting in the alteration or integration of source data, including the preparation of data for preservation and sharing. Process components can support retrieval, filtering, screening, transformation, translation, classification, transfer, and integration, among others. Data Processing typically produces data ready for use, but can also result in graphs and reports.

Process Documentation, Diagrams, and Workflow Tools
Capturing and communicating information about how data were processed is critical for reproducible science. In addition to descriptive metadata, the use of flow charts, data flow diagrams, and workflow tools can help.

The Importance of Standards to Data Processing
The use of data standards facilitates the creation of automated data processing procedures and scripts. For example, the use of common data models provides a structural consistency for creating and sharing reusable process components and tools to serve maintenance and analytical needs for multiple projects using the same kind of data.

ETL – Extract, Transform, and Load
ETL is a term representing the overall process of moving data from one form or environment to another. ETL integrates and chains together processes that (1) gather data from a source, (2) screen and transform it, and (3) load it into a target data store. ETL processes are usually automated to support data warehouses, online portals, and integrated data environments such as The National Map.
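The three ETL steps can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the site identifiers, the readings, the column names, the negative-value sentinel convention, and the choice of an in-memory SQLite database as the target data store are hypothetical and do not describe any actual USGS system.

```python
import csv
import io
import sqlite3

# Hypothetical source: a small CSV of discharge readings (made-up values).
RAW_CSV = """site_id,timestamp,discharge_cfs
SITE-A,2015-01-01T00:00,10200
SITE-A,2015-01-01T00:15,
SITE-A,2015-01-01T00:30,-999
SITE-B,2015-01-01T00:00,8400
"""

CFS_TO_CMS = 0.0283168  # cubic feet per second -> cubic metres per second


def extract(text):
    """Step 1: gather rows from the source (here, an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Step 2: screen out unusable values, then convert units."""
    cleaned = []
    for row in rows:
        value = row["discharge_cfs"].strip()
        # Screening: drop blanks and negative sentinel values (assumed convention).
        if not value or float(value) < 0:
            continue
        cleaned.append(
            (row["site_id"], row["timestamp"], round(float(value) * CFS_TO_CMS, 3))
        )
    return cleaned


def load(records, conn):
    """Step 3: load the screened records into the target data store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS discharge "
        "(site_id TEXT, timestamp TEXT, discharge_cms REAL)"
    )
    conn.executemany("INSERT INTO discharge VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Chain the three steps and return the number of rows now in the target."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*) FROM discharge").fetchone()[0]


if __name__ == "__main__":
    target = sqlite3.connect(":memory:")
    print(run_etl(RAW_CSV, target), "rows loaded")
```

Keeping each step in its own function means the screening rules or the target store can be changed without touching the other steps, which is what makes a chain like this straightforward to automate.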

Data Management: Data Processing
PROCESS Landing Page Strawman, Cont’d

Examples of Data Processing at USGS
USGS produces extensive datasets and interpretive products using a variety of data processing techniques and methods. This section provides examples of data processing for satellite imagery, sensor networks (earthquakes, real-time stream data), telemetry from ocean-going vessels and wandering animals, and for the production of aggregate datasets in portals and data warehouse access points.

ETL – Extract, Transform, and Load
ETL is a term used to represent a very common chain of integrated process activities. Extraction of data from one or more sources is followed by screening and transformation of the data into a form that is then loaded into a target data store. ETL processes are frequently automated and used to keep data current in online portals, data warehouses, and integrated data environments such as The National Map.

Process Component Library
Current best practices for coding promote the creation of reusable modular components to manipulate datasets and other objects in a consistent and documented way. USGS shares components via GitHub and other venues.

Process Automation and Scripting
Data processing can range from a manual set of actions performed by a single person to meet specific research needs, to a fully automated operation using scripts or programs to ensure repeated production of high-quality datasets in a consistent and documented way. Automating even simple processes helps to provide documented consistency and repeatability, and to generate the necessary documentation. [R projects]
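As a rough illustration of reusable components and of automation that documents itself, the sketch below chains two small functions and records what each one did. It is written in Python purely for illustration; the function names, the sample records, and the layout of the log entries are hypothetical and are not taken from any USGS component library.

```python
from datetime import datetime, timezone


def drop_missing(records, field):
    """Reusable component: remove records where `field` is None."""
    return [r for r in records if r.get(field) is not None]


def convert_units(records, source, target, factor):
    """Reusable component: replace `source` with `target` = source * factor."""
    converted = []
    for r in records:
        new_record = {k: v for k, v in r.items() if k != source}
        new_record[target] = r[source] * factor
        converted.append(new_record)
    return converted


def run_pipeline(records, steps):
    """Apply each (function, parameters) step in order and log what was done."""
    log = []
    for func, params in steps:
        records_in = len(records)
        records = func(records, **params)
        log.append({
            "step": func.__name__,
            "parameters": params,
            "records_in": records_in,
            "records_out": len(records),
            "run_at": datetime.now(timezone.utc).isoformat(),
        })
    return records, log


# Made-up sample data and a two-step process definition.
SAMPLE = [{"depth_ft": 10.0}, {"depth_ft": None}, {"depth_ft": 3.0}]
STEPS = [
    (drop_missing, {"field": "depth_ft"}),
    (convert_units, {"source": "depth_ft", "target": "depth_m", "factor": 0.3048}),
]

if __name__ == "__main__":
    result, log = run_pipeline(SAMPLE, STEPS)
    for entry in log:
        print(entry)
```

Because the log is produced by the same run that produced the data, it can serve as raw material for the process-step portion of a metadata record without any separate note-taking.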

Data Management: Data Processing
PROCESS Landing Page Strawman, Cont’d

What the U.S. Geological Survey Manual Says
Policies that apply to the Process stage largely deal with providing appropriate documentation of the methods and actions used to modify data from its raw form to the form used for research or produced for sharing. Metadata standards (FGDC, ISO) include sections for describing the ‘provenance’ of data, meaning that enough information is provided for the user to determine where data originated and what changes were made to get to the form being described.

The USGS Manual Chapter Fundamental Science Practices: Planning and Conducting Data Collection and Research discusses the requirements for data documentation:

"Documentation: Data collected for publication in databases or information products, regardless of the manner in which they are published (such as USGS reports, journal articles, and Web pages), must be documented to describe the methods or techniques used to collect, process, and analyze data (including computer modeling software and tools produced by USGS); the structure of the output; description of accuracy and precision; standards for metadata; and methods of quality assurance."

Further:

"Standard USGS methods are employed for distinct research activities that are conducted on a frequent or ongoing basis and for types of data that are produced in large quantities. Methods must be documented to describe the processes used and the quality-assurance procedures applied."

The USGS Manual Chapter Fundamental Science Practices: Review, Approval, and Release of Information Products covers the documentation of methodology:

"Methods used to collect data and produce results must be defensible and adequately documented."

Software Release: [put a reference here that describes how scripts and software that perform data processing need to be fully documented, reviewed, and released.]

Data Management: Data Processing
PROCESS Landing Page Strawman, Cont’d

Recommended Reading:

References:

Page layout template: Sub Part; Definition; More Defs; Etc.; Best Practices; What the Survey Manual Says; References; Key Points Bubble; Etc.; Recommended Reading; 1st Sublevel Page; Special Call-out