Co-Directors: Yigal Arens USC / Information Sciences Institute Judith Klavans Columbia University.

Presentation transcript:

Co-Directors: Yigal Arens USC / Information Sciences Institute Judith Klavans Columbia University

2 The purpose of DGRC To Make Digital Government Happen Advance information systems research Bring the benefits of cutting edge IS research to government systems Help educate government and the community Learn needs from government partners to drive next stage system development Build pilot systems as part of new infrastructure

3 The problem and the solution Problem: FedStats has thousands of databases in over seventy Government agencies: data is duplicated and near-duplicated, and even Government officials and specialists cannot find it. Solution: Create a system to provide easy standardized access: need multi-database access engine, need powerful user interface, need terminology standardization mechanism.

4 The Vision: Ask the Government... How have property values in the area changed over the past decade? How many people had breast cancer in the area over the past 30 years? Is there an orchestra? An art gallery? How far are the nightclubs? We’re thinking of moving to Denver...What are the schools like there? Census Labor Stats

5 Research challenges Scale to incorporate many databases … build data models automatically Process large and disparate data efficiently … develop fast processing techniques … create aggregation and substitution operators Integrate data models across sources and agencies … take a large ontology and link the models into it automatically … develop ways to automatically harvest glossary data for building ontologies Develop new ways to interact with data … use language processing tools for question-answering Display complex information from distributed sources … develop and evaluate new presentation techniques

6 The Energy Data Consortium EDC members: Information Sciences Institute, USC; Columbia University. Government partners: Energy Information Admin. (EIA); Bureau of Labor Statistics (BLS); Census Bureau. Research challenge: Make accessible in a standardized way the contents of thousands of data sets, represented in many different ways (webpages, pdf, MS Access, text…)

7 The Vision: Ask the Government... Are alternative energy sources any cheaper to use? Which state has the highest oil production? How long has the nuclear plant been in service? We’re thinking of moving to Cambridge…How much does gas cost there? Census Labor Stats

8 Data Integration Labor EPA EIA Census Heterogeneous Data Sources User Interface Information Access Definition Ontology query

9 From Phase I to Phase II Phase One Terminology/ontology Information integration and in-memory data analysis New Interfaces for Complex Human-computer interaction Phase Two Question-Answering Usability Testing and Evaluation Privacy Portal

10 Data Integration Labor EPA EIA Census Heterogeneous Data Sources User Interface Information Access Definition Ontology Trade Main Memory Query Processing Question-Answer Access User Evaluation Task-based Evaluation query

11 Data Integration Labor EPA EIA Census Heterogeneous Data Sources User Interface Information Access Definition Ontology Trade Main Memory Query Processing Question-Answer Access User Evaluation Task-based Evaluation query

12 Data Integration ??? EPA EIA Census Heterogeneous Data & Meta-data Sources User Interface Information Access Data Definitions (Ontology) interface query Labor definitions Metadata mediates

13 Recent example EIA problem: Data cleared for publication is grouped together across states Also need data gathered by state separately Need general ability to ungroup and reaggregate data
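
The regrouping need described above can be sketched as follows. This is a minimal illustration, not the DGRC implementation: it assumes state-level records are still available underneath the published grouping, and the record layout, the group names, and the helper `reaggregate` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical state-level records: (state, year, value). Not real EIA data.
records = [
    ("CA", 2000, 10.0),
    ("OR", 2000, 4.0),
    ("WA", 2000, 6.0),
    ("NY", 2000, 8.0),
]

# Hypothetical state-to-group mappings. The first mimics a published grouping,
# the second is an alternative grouping a user might ask for.
published_groups = {"CA": "West", "OR": "West", "WA": "West", "NY": "East"}
custom_groups = {"CA": "California", "OR": "Northwest", "WA": "Northwest", "NY": "East"}


def reaggregate(rows, grouping):
    """Sum values per (group, year) under the given state-to-group mapping."""
    totals = defaultdict(float)
    for state, year, value in rows:
        totals[(grouping[state], year)] += value
    return dict(totals)


published = reaggregate(records, published_groups)
custom = reaggregate(records, custom_groups)
```

Keeping the grouping as a separate mapping means the same state-level rows can be rolled up in any number of ways. If only the grouped totals were retained, the per-state values could not be recovered from them.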

14 Main Memory Achievements on large data manipulation – optimization for efficiency and speed New input for visualization with dials that user can manipulate Applications with electoral boundaries

15 GetGloss The Identification of Glossaries in High Fan-out Websites Large sites with many links Glossaries hidden all over No coherent view within and across sites No way to determine who is defining what and how

16 Glossary Finding Function Function to compute a best guess score Ranked list Higher is better Evaluation to determine how likely it is that a high score will be associated with a (large) glossary.
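
One way to picture a best-guess scoring function with a ranked list is the sketch below. The slides do not say which features the actual function used, so the cue words, the entry pattern, the weights, and the names `glossary_score` and `rank_pages` are all assumptions made for illustration.

```python
import re

# Hypothetical cue words; the slides do not list the features actually used.
CUE_WORDS = ("glossary", "definitions", "terms", "dictionary")
# A line shaped like "Term: definition text" or "Term - definition text".
ENTRY_PATTERN = re.compile(r"^[A-Z][\w /()-]{1,40}\s*(:|-)\s+\S.{15,}$")


def glossary_score(url, title, text):
    """Return a best-guess score; higher means more likely a (large) glossary."""
    haystack = (url + " " + title).lower()
    # Assumed weight: 2 points for each cue word found in the URL or title.
    score = sum(2.0 for cue in CUE_WORDS if cue in haystack)
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # One point per definition-like line, so larger glossaries score higher.
    score += sum(1 for line in lines if ENTRY_PATTERN.match(line))
    return score


def rank_pages(pages):
    """Sort (url, title, text) tuples from highest to lowest score."""
    return sorted(pages, key=lambda page: glossary_score(*page), reverse=True)


# Made-up sample pages, for illustration only.
pages = [
    ("http://example.gov/about", "About us",
     "We publish data.\nContact the office."),
    ("http://example.gov/glossary", "Energy Glossary",
     "Barrel: A unit of volume equal to 42 U.S. gallons.\n"
     "Btu: A measure of the heat content of fuels."),
]
ranked = rank_pages(pages)
```

The evaluation mentioned on the slide would then check, against pages labelled by hand, how often a high score really corresponds to a large glossary.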

17 ParseGloss Once a glossary is found, how can individual definitions be analyzed? Once analyzed into components, how can they be loaded into the ontology? GetGloss ParseGloss Ontology
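
As a rough illustration of the two questions above, the sketch below splits a single glossary line into components and stores the result in a dictionary standing in for the ontology. The "Term: definition. See also X, Y." entry format, the `GlossEntry` fields, and the functions `parse_entry` and `load_into_ontology` are assumptions, not the ParseGloss design.

```python
import re
from dataclasses import dataclass, field


@dataclass
class GlossEntry:
    term: str
    definition: str
    see_also: list = field(default_factory=list)


# Assumed entry shape: "Term: definition text. See also Other, Another."
ENTRY = re.compile(r"^(?P<term>[^:]+):\s*(?P<body>.+)$")
SEE_ALSO = re.compile(r"\(?[Ss]ee(?: also)?\s+([^.)]+)\)?\.?\s*$")


def parse_entry(line):
    """Split one glossary line into term, definition, and cross-references."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None  # not shaped like a glossary entry
    body = match.group("body").strip()
    refs = []
    see = SEE_ALSO.search(body)
    if see:
        parts = re.split(r",|;| and ", see.group(1))
        refs = [part.strip() for part in parts if part.strip()]
        body = body[: see.start()].strip()
    return GlossEntry(match.group("term").strip(), body, refs)


def load_into_ontology(ontology, entry):
    """Store the entry under its lower-cased term (a stand-in for real linking)."""
    ontology[entry.term.lower()] = entry


ontology = {}
sample = ("Crude oil: A mixture of hydrocarbons found in underground "
          "reservoirs. See also Petroleum, Barrel.")
entry = parse_entry(sample)
load_into_ontology(ontology, entry)
```

A real ontology load would need to link each term to existing concepts rather than insert it under a flat key; the dictionary here only shows where the parsed components would go.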

18 Evaluation New Effort Peter Sommer, Director of Education Center for New Media Teaching and Learning Focus on purposeful use of emerging technologies for researchers, students, teachers, analysts… Funded by NSF and BLS

19 Privacy Portal Increasing multiple access to databases creates a security problem Original DGRC proposal included component on privacy Newly funded NSF SGER proposal Columbia – Computer Science and School of Business (Stolfo and Johnson)

20 Privacy and Government Websites What are user fears? What are their preferences? What are their perceptions of privacy issues? What are the implications for design of systems and interfaces?

21 Social Science Research Explorations of “dial manipulation” application for health databases for dynamic querying Useful for interactive mapping for redistricting Use statistics on neighborhoods, e.g. CPS (long and wide) Census summary data is another source – tables compiled for various levels Joint with ISERP Social Science Research Center

22 Proposals SGER proposal funded Topic: Urban transportation study—new methods for freight tracking in LA by comparing across databases Grant awarded to USC, shared by ISI and USC’s Dept of Policy and Planning White paper to DoT Topic: Searching for patterns in freight traffic Submitted by USC campus people and Jose Luis Ambite ITR proposal submitted Topic: Semi-automated topic hierarchy creation Partners: Eduard Hovy communicated with EPA group If funded will use EPA’s CARAT ontology as starting point and evaluation standard

23 Digital Government is Here! An increasing quantity and variety of information is available in digital form Government agencies already collect much digital information Government is a holder and provider of often unique data and services Access to information/services by industry and citizen-users must be facilitated, while limiting cost and risk

24 Well – Not Quite... Expectations are very high due to the pervasiveness of Web/Internet information technology Government IT/IS is behind best practices Legacy, stovepipe systems designed for trusted staff Failed very large modernization efforts A disconnect exists between the research community and government IS

25 The purpose of DGRC To Make Digital Government Happen Advance information systems research Bring the benefits of cutting edge IS research to government systems Help educate government and the community Learn needs from government partners to drive next stage system development Build pilot systems as part of new infrastructure

26 Thank you! Any questions?