Components of a Data Analysis System: Scientific Drivers in the Design of an Analysis System


Data Import
  Format
    –Either widely used/accepted, or
    –Can be converted easily from something widely used
    –User need not know the details of the format
    –Well documented (e.g., which flavor of latitude)
  Fast Access
    –Disk I/O speeds do not follow Moore’s law
    –Read speed is more important than write speed
    –Caching
    –File size is only important to keep access times low
  Content must represent the details of the data
  E2E - Full intent of the observer must be embedded

Data Export
  Format
    –Either widely used/accepted, or
    –Can be converted easily into something widely used
    –User need not know the details of the format
    –Well documented (e.g., which flavor of latitude)
  You can read what you write
    –Import format == Export format
  Fast Access
    –Disk I/O speeds do not follow Moore’s law
    –Read speed is more important than write speed
  Content must represent the details of the data
  E2E - Full intent of the observer must be embedded
  Includes user annotation/comments
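
The "you can read what you write" requirement can be checked with a round-trip test. The sketch below is only an illustration: it uses JSON as a stand-in for a widely used format, and the record fields (`scan`, `source`, `data`, `comment`) and function names are hypothetical, not taken from any actual system.

```python
import json
import os
import tempfile


def export_scans(scans, path):
    """Write scans, including user comments, to a JSON file (stand-in format)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"format_version": 1, "scans": scans}, fh)


def import_scans(path):
    """Read back a file written by export_scans."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["scans"]


def round_trip(scans):
    """Export then import; the result should equal the input."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scans.json")
        export_scans(scans, path)
        return import_scans(path)


# Hypothetical example record, with a user annotation attached.
example_scans = [
    {"scan": 12, "source": "W3OH", "data": [0.1, 0.2], "comment": "windy"},
]
```

If `round_trip(example_scans)` differs from `example_scans`, the export has lost information, such as the annotation.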

Data Base System
  Ability to work with more than one data set
  Data base for both export and import files
  Large data volumes
    –Access using scan numbers is no longer sufficient
    –Require the ability to select subsets of data via sophisticated data-base queries
    –Moderate number of columns in data base index
    –‘Index’ to data kept in memory to speed data access
    –File summaries at various levels of detail
  Various levels of ‘granularity’
  Calibrated and raw data
  E2E - User can add annotation/comments
  Security – Only the observer can access data
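
One possible reading of "index kept in memory" plus "select subsets via data-base queries" is sketched below with an in-memory SQLite table. The column names (`scan`, `source`, `freq_mhz`, `calibrated`, `file_offset`) and the sample rows are assumptions made for the illustration. The `where` clause is concatenated into the SQL text, so this sketch assumes it comes from trusted code rather than from raw user input.

```python
import sqlite3


def build_index(rows):
    """Load index rows into an in-memory SQLite table and return the connection."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE idx ("
        "scan INTEGER, source TEXT, freq_mhz REAL, "
        "calibrated INTEGER, file_offset INTEGER)"
    )
    con.executemany("INSERT INTO idx VALUES (?, ?, ?, ?, ?)", rows)
    return con


def select_offsets(con, where, params=()):
    """Return file offsets of the records matching a WHERE clause, in scan order."""
    sql = "SELECT file_offset FROM idx WHERE " + where + " ORDER BY scan"
    return [row[0] for row in con.execute(sql, params)]


# Hypothetical index rows: (scan, source, freq_mhz, calibrated, file_offset)
example_rows = [
    (101, "W3OH", 1665.4, 1, 0),
    (102, "W3OH", 1420.4, 1, 4096),
    (103, "ORIONA", 1665.4, 0, 8192),
    (104, "W3OH", 1667.4, 0, 12288),
]
example_index = build_index(example_rows)
```

A selection that scan numbers alone cannot express, such as all W3OH records between 1600 and 1700 MHz, becomes `select_offsets(example_index, "source = ? AND freq_mhz BETWEEN ? AND ?", ("W3OH", 1600, 1700))`.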

Data Archive
  Write speed more important than read speed
  File size is very important
  Cannot anticipate types of user queries
    –Large number of columns in data base index
    –Very sophisticated/fast RDBMS
  Storage need not be a widely used data format
    –Format can be very different from that used by analysis system
  Export format should be a widely used data format

Interactive On-Line Data Analysis
  The ability to access data ASAP
    –Import file updates automatically as observations proceed (real-time “filler”)
    –Index to file updates automatically
    –Updates happen per ‘integration’ (spectral-line) or per N seconds (continuum)
    –Minimum integration time ~ few times the minimum time of real-time “filler”
    –Analysis system automatically is aware of updated index
    –Read-protect online/filled data?
  User should be able to ‘see’ the data within an ‘integration’ of when it was taken (or N seconds)

User Interface
  Command line
    –Familiar syntax better than a good syntax
    –Procedural with byte-wise compiling (performance)
    –History, min-match or command completion
    –Useful error messages
    –Interruptible
    –Error trapping and exception handling
    –Ability to “Undo”
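
As an illustration of "min-match" together with "useful error messages", the sketch below resolves an abbreviation to the single command it prefixes and reports the candidates when it is ambiguous. The command names are hypothetical.

```python
def min_match(abbrev, commands):
    """Resolve an abbreviation to the one command it uniquely prefixes.

    An exact match always wins. Otherwise the abbreviation must be a prefix
    of exactly one command; if not, the error message says why.
    """
    if abbrev in commands:
        return abbrev
    hits = sorted(c for c in commands if c.startswith(abbrev))
    if len(hits) == 1:
        return hits[0]
    if not hits:
        raise ValueError(f"unknown command: {abbrev!r}")
    raise ValueError(
        f"ambiguous command {abbrev!r}: could be {', '.join(hits)}"
    )


# Hypothetical command set for the example.
COMMANDS = ["baseline", "boxcar", "show", "showall"]
```

Here `min_match("base", COMMANDS)` resolves to `"baseline"`, while `min_match("b", COMMANDS)` raises an error that lists both `baseline` and `boxcar`.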

User Interface
  GUIs are best for:
    –Interacting with data visualizations
    –Filling in forms
        data base queries
        options for data pipelines
    –Browsing for data files
    –Defining E2E data flow (à la LabVIEW)

Imaging Tools
  Visualization
    –Shouldn’t try to recreate those things already available in another package – export instead
  Data Flagging – Pick a system that works
  Graphics
    –Traditional capabilities (zoom in/out, scroll, print, save, …)
    –Data volume requires great performance, smart libraries (screen resolution << # data pts)
    –Interactive feedback (e.g., defining baseline regions)
  Publishable plots or export into something else?
    –Default plot style
    –Ability to tweak everything (label formats; char sizes; add, remove, move annotation; tick mark size; major/minor ticks, full box; grid; multiple X and Y axes, …)
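
When the screen resolution is far below the number of data points, one common technique is min/max decimation: draw one vertical segment per pixel column spanning the extremes of the points that fall in it, so narrow spikes stay visible. The slides do not name a specific method, so the sketch below is an assumption about what a "smart library" might do, and the function name is hypothetical.

```python
def minmax_decimate(values, n_pixels):
    """Reduce values to at most n_pixels (min, max) pairs, one per pixel column."""
    if n_pixels <= 0:
        raise ValueError("n_pixels must be positive")
    n = len(values)
    if n <= n_pixels:
        # Fewer points than pixels: nothing to reduce.
        return [(v, v) for v in values]
    out = []
    for p in range(n_pixels):
        lo = p * n // n_pixels
        hi = (p + 1) * n // n_pixels  # exclusive; always > lo because n > n_pixels
        chunk = values[lo:hi]
        out.append((min(chunk), max(chunk)))
    return out
```

Plotting cost then scales with the pixel count rather than the data volume, at the price of hiding structure finer than one pixel.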

Analysis Algorithms
  Algorithms well documented
  Study what exists in other packages
  Robustness very important but so is speed
    –Provide less robust but faster alternatives
  Developers should not force an algorithm on users
  Developers should provide ‘defaults’ only
  Building blocks better than a do-all algorithm
  Ability to use and modify ‘header’ information as well as data
  E2E – do-alls are built out of the same building blocks
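
The "building blocks" idea can be sketched as small functions plus a do-all that merely chains them, with the chain supplied as an overridable default. The two blocks below (a zeroth-order baseline and a boxcar smooth) and all names are illustrative choices, not algorithms prescribed by the slides.

```python
def subtract_offset(spectrum, regions):
    """Remove a zeroth-order baseline: the mean over the given channel ranges."""
    chans = [spectrum[i] for lo, hi in regions for i in range(lo, hi)]
    offset = sum(chans) / len(chans)
    return [v - offset for v in spectrum]


def boxcar(spectrum, width=3):
    """Boxcar-smooth a spectrum; windows are truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(spectrum)):
        window = spectrum[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def reduce_spectrum(spectrum, regions, steps=None):
    """A 'do-all' built from the same blocks; steps is only a default."""
    if steps is None:
        steps = [lambda s: subtract_offset(s, regions), boxcar]
    for step in steps:
        spectrum = step(spectrum)
    return spectrum
```

A user who disagrees with the default passes their own `steps` list, for example a wider smooth or a different baseline routine, without touching the do-all.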

Documentation
  On-line and hardcopy
    –Tutorials/Quick Guides
    –Cookbook
        Based on observing types
    –Reference Manuals
        Full, gory details
        Data Formats
        Algorithms
    –Searchable by keywords
  Quick, interactive command help from within the system
  Never release until these are in place

User Support/Feedback
  A familiar system minimizes staff support
  Easily accessed, on-line “help desk” and “Suggestion” box
  Automatic generation of “bug” reports
  Observers of observers

Marketing
  A familiar system already has a market
  Don’t be another cereal on the supermarket shelf
  Workshops are better than papers
  Create a User Community
  Responsive feedback from developers
  Independent Beta testers
  Reputation & first experiences are everything

User Community
  User Forums
  Newsletters
  Accept User Contributions/Additions
    –Sourceforge-like system
    –NRAO-seal-of-approval
  NRAO Moderator

Real-Time Data Display
  To guarantee data quality
    –Product is not stored (except for hardcopy)
    –Sequential processing -- different from E2E/Data pipeline
    –Fast is more important than accurate
    –Few bells and whistles -- must avoid the RTD black hole
    –A simple display for all observation types more important than sophisticated displays for a few data types
  Display happens within an ‘integration’ of when data were taken – tied to real time filler
  GUI based – underlying language is unimportant
  Output understandable by an operator

Real Time Data Analysis
  Pointing/Focus/Tipping/… are different from RTD
    –Results should be stored (Data Base)
    –Results are used by the control system (pointing/focus) or by subsequent analysis (tipping)
    –Accuracy is as important as speed
    –More bells, whistles, user-options
    –Sequential processing (non E2E/data pipeline)
    –Only a few observation types are handled
  Analysis happens within an ‘integration’ of when data were taken
  GUI based – underlying language is unimportant
  Output understandable by an operator

IDL Work Package
  SDFITS
    –Interim solution for data import/export
    –Class/IDL specific; soon Aips++/Aips/UniPOPS?
    –MD/BDFITS next generation (keywords, incompleteness of contents, versatility, …)
  IDL – Tom Bania
    –Uses UniPOPS as a ‘model’ – familiar to many
    –Very good reproduction
    –Bania-centric – needs to be generalized

IDL Work Package
  Glen Langston
    –Assess whether IDL will meet performance, extensibility, usability, … goals
    –Generalization to other observing types
    –Real-Time data access and display
    –Developed on top of and in parallel with Tom’s work (so, implementations have diverged)
    –Works well for Glen’s own experiments

IDL Work Package
  Institutionalize what Tom and Glen have done
    –Code management
    –Code review
    –Combine Tom’s and Glen’s branches
    –Generalize code
    –Provide ways for Tom and Glen to contribute within the same revision-control branch
  Develop ‘Institutionalized’ code
    –Improve performance, usability, maintenance
    –Add/Replace I/O components with better CS methods

Calibration Work Package
  User-tunable algorithms
    –Options for the ‘real-time filler’ – sequential
    –Options for E2E pipeline – non-sequential
    –Options for interactive data reduction
  Default algorithms for all observing cases
  Extensible as new algorithms are developed
  User-defined/tweaked algorithms
  Robust and not-so-robust algorithms

Calibration Work Package
  Opacity/atmosphere model
  Output units
  Efficiencies
    –Source size
    –Telescope model
  Tsys(f) estimates
  Differencing schemes
  Non-linearities/template fitting/…
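
To make two of these items concrete, the sketch below shows one standard differencing scheme (position switching, Ta = Tsys · (sig − ref) / ref) and a plane-parallel opacity correction (multiply by exp(τ · airmass), with airmass ≈ 1 / sin(elevation)). These are textbook single-dish relations chosen for illustration, not necessarily the algorithms this work package adopted. The function names are hypothetical, and Tsys is taken as a single known number rather than the Tsys(f) the slide calls for.

```python
import math


def position_switch(sig, ref, tsys):
    """Antenna temperature per channel from on-source and off-source spectra."""
    return [tsys * (s - r) / r for s, r in zip(sig, ref)]


def opacity_correct(ta, tau_zenith, elevation_deg):
    """Correct for atmospheric attenuation using a plane-parallel atmosphere.

    Assumes airmass = 1 / sin(elevation), which is a poor approximation
    close to the horizon.
    """
    airmass = 1.0 / math.sin(math.radians(elevation_deg))
    factor = math.exp(tau_zenith * airmass)
    return [t * factor for t in ta]
```

Keeping each step as a separate function leaves room for alternative differencing schemes or atmosphere models to be swapped in.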