Geographic Information Retrieval: Searching Text and Data via Common Geography

Slide 1: SEARCHING TEXT AND DATA via COMMON GEOGRAPHY
Geographic Information Retrieval: Searching Text and Data via Common Geography
IASSIST 2002 Conference, June 12-14, 2002, Storrs, CT
PIs: Fredric C. Gey, Michael Buckland, Aitao Chen, Ray Larson, University of California, Berkeley
Students: Vivien Petras, Natalia Perleman
Work performed under Institute for Museum and Library Services (IMLS) national leadership grant and DARPA research contract N ;AO#F477: Search support for unfamiliar metadata vocabularies ( ) and IMLS proposal Going Places in the Catalog: Improved Geographic Access
Fredric C. Gey

Slide 2: MOTIVATION
The purpose of social science research is question answering:
– What are the population characteristics of Visalia, California?
– What is the history of Visalia?
Answering such questions requires cross-genre search
Currently only humans can search cross-genre information
Search across genres requires metadata linkage
Geography is a major linkage between numbers which describe a place and text which explains it
Gazetteers uniquely identify places in space

Slide 3: HETEROGENEOUS DIGITAL INFORMATION SEARCH
Current Search Technology (multiple independent searches without search aids)
[Diagram: a single QUERY issued separately against each resource type: Bibliography; Full Text; Maps and other Geospatial data; Music and other media; Numeric Statistical Databases; Patents]

Slide 4: OUR PRIOR RESEARCH LINKED TEXT AND NUMBERS: U.S. STANDARD INDUSTRIAL CLASSIFICATION SYSTEM
U.S. Standard Industrial Classification System (SIC)
Used to classify and aggregate industrial activity in the U.S.
Codes defined by Office of Management and Budget
Descriptions are incomplete
Example: a search for “Lobster” in the U.S. SIC System returns “Nothing found”

Slide 5: MINING TEXT TO SEARCH NUMBERS WITH ENTRY VOCABULARY TECHNOLOGY
Maps between ordinary language and specialized classifications
Implemented using text categorization techniques
Requires text collections which have been manually indexed
Preserves and leverages investment in creation of complex classification structures
Example: for “Lobster” in the U.S. SIC classification, try “Shellfish”
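The entry vocabulary idea on this slide can be illustrated with a minimal Python sketch. It assumes a tiny, hand-made set of manually indexed records (the texts and category labels below are hypothetical), and it ranks categories by simple word and category co-occurrence counts. That scoring is a simplification chosen for illustration, not necessarily the text categorization technique the project used.

```python
from collections import Counter, defaultdict

# Hypothetical training records: (free text, manually assigned category).
# A real index would be built from a large, manually indexed collection.
TRAINING = [
    ("lobster and crab catch landed in maine", "Shellfish"),
    ("lobster traps and oyster beds", "Shellfish"),
    ("salmon and tuna fishing fleet", "Finfish"),
    ("tuna cannery output", "Finfish"),
]


def build_index(records):
    """Count, for each word, how many records of each category contain it."""
    word_to_categories = defaultdict(Counter)
    for text, category in records:
        for word in set(text.lower().split()):
            word_to_categories[word][category] += 1
    return word_to_categories


def suggest(index, query, top_n=3):
    """Rank categories by summed co-occurrence with the query words."""
    scores = Counter()
    for word in query.lower().split():
        for category, count in index.get(word, {}).items():
            scores[category] += count
    return [category for category, _ in scores.most_common(top_n)]


INDEX = build_index(TRAINING)
print(suggest(INDEX, "lobster"))
```

With this toy data, the ordinary-language query "lobster" leads to the "Shellfish" heading even though the word never appears in the classification itself, which is the behavior the slide's example describes.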

Slide 6: Kinds of metadata
Physical metadata for location of data
Structural metadata to identify the logical structure of the data
– matrix of age(31) by race(5) by sex(2)
Measurement metadata (units of measure, universe of discourse)
Semantic metadata which describes the meaning of the data (usually a descriptive segment of text which identifies the meaning of a classification, e.g. value label)
Locational metadata for identifying the geospatial and temporal aspects of data
– latitude, longitude, altitude, time

Slide 7: Why Geographic Searching Presents Special Opportunities
Place is pivotal for interdisciplinary inquiry: anthropologists, economists, historians, military strategists, and political scientists have common ground in space, place, and spatial changes over time.
Geography links numeric information about a place with textual information which elucidates the place.
One can do new and unique things in library catalogs:
– “find books describing the history of all towns within 30 miles of Visalia CA”

Slide 8: LINKING NUMERIC AND TEXT DATABASES
Effective search requires clues and evidence
Numeric/statistical datasets
– Limited textual descriptive information
Library catalogs (bibliographic databases)
– Ambiguity of places (Vienna, VA or Vienna, Austria)
Improve search between these evidence-poor and ambiguous databases by metadata linkage via common geography

Slide 9: Ambiguity of Geography in Text
Ambiguity of identical names: Alameda (city) or Alameda (county)? Galicia (region of Spain) or Galicia (region of Poland)?
Different transliterations from non-Roman scripts: Peking is a variant spelling of Beijing.
Different names in different languages: Deutschland, Allemagne, Germany
Name changes: Bombay is now Mumbai; St Petersburg became Leningrad became St Petersburg again.
Political changes disrupt place-name stability: Poland ceased to be a country (late 18th century) when partitioned between Austria, Prussia and Russia; Prussia is no longer a country
Footprint: even when a place’s name is stable the area it denotes may not be.
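A small Python sketch can make the name-ambiguity problem concrete. The record layout, identifiers and `lookup` function below are assumptions made for illustration, not the structure of any particular gazetteer. One name can return several candidate entries, and variant or former names resolve to a single preferred entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    entry_id: str          # hypothetical identifier
    name: str              # preferred name
    feature_type: str
    parent: str            # larger region containing the place
    variants: tuple = ()   # transliterations, former names, other languages


# Hypothetical mini-gazetteer built from the examples on the slide.
ENTRIES = [
    Entry("g1", "Alameda", "city", "California"),
    Entry("g2", "Alameda", "county", "California"),
    Entry("g3", "Beijing", "city", "China", ("Peking",)),
    Entry("g4", "Mumbai", "city", "India", ("Bombay",)),
    Entry("g5", "Germany", "country", "Europe", ("Deutschland", "Allemagne")),
]


def lookup(name, feature_type=None):
    """Return every entry whose preferred or variant name matches `name`.

    An optional feature type narrows ambiguous results.
    """
    wanted = name.casefold()
    hits = [
        e for e in ENTRIES
        if wanted in {e.name.casefold(), *(v.casefold() for v in e.variants)}
    ]
    if feature_type is not None:
        hits = [e for e in hits if e.feature_type == feature_type]
    return hits


print([(e.name, e.feature_type) for e in lookup("Alameda")])
print([e.name for e in lookup("Peking")])
```

The bare name "Alameda" stays ambiguous until extra evidence, here the feature type, selects one entry. Time-dependent cases such as St Petersburg and Leningrad would additionally need a date range on each name, which this sketch leaves out.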

Slide 10: Gazetteer Characteristics can be Exploited
Gazetteers map place-names to unique positions: Washington DC – Latitude: North – Longitude: West (geographic centroid of the polygon representing the city)
Gazetteers may contain other useful features:
– Feature type: city, lake, church, bridge, …
– Larger region in which the place resides: county, state, country
– Additional items such as a reference to a map showing the place-name
– Information about the time-range in which a particular place was or is current
Gazetteers allow spatial relationships between named places to be calculated and utilized
– Numeric data: How many people live within 30 miles of Visalia?
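The "within 30 miles of Visalia" question can be sketched in Python as a radius query over gazetteer coordinates. This is only an illustration. The coordinates are approximate, the population figures are made-up placeholders rather than census values, and the great-circle (haversine) distance between point locations is an assumed choice that ignores each place's actual footprint.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

# Approximate (latitude, longitude) in decimal degrees, for illustration only.
GAZETTEER = {
    "Visalia": (36.33, -119.29),
    "Tulare": (36.21, -119.35),
    "Hanford": (36.33, -119.65),
    "Fresno": (36.75, -119.77),
}

# Placeholder counts, NOT real population data.
POPULATION = {"Visalia": 1000, "Tulare": 500, "Hanford": 400, "Fresno": 5000}


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def places_within(center_name, radius_miles):
    """Names of gazetteer places within the radius of the named center."""
    center = GAZETTEER[center_name]
    return sorted(
        name for name, pos in GAZETTEER.items()
        if haversine_miles(center, pos) <= radius_miles
    )


def population_within(center_name, radius_miles):
    """Sum the numeric data attached to every place inside the radius."""
    return sum(POPULATION[n] for n in places_within(center_name, radius_miles))


print(places_within("Visalia", 30))
print(population_within("Visalia", 30))
```

The gazetteer supplies the positions, and the numeric dataset is joined to them by place name. That join is the linkage between numbers and named places that the slide describes.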

Slide 11: Our Gazetteer Research Project Proposal
Make use of library catalog MARC record geographic features in new ways
Connect numeric data with library record information using gazetteers as intermediate metadata
Utilize gazetteer information to extend library catalog search
Display catalog search results in map displays to enable users to visualize search results
Exploit feature type metadata to develop more complex spatial queries: “What books on travel and description concern places within 25 miles of dams in California?”
Utilize map interfaces to allow users to generate visual queries
Extend searches from catalogs to other geo-referenced datasets, e.g. museum and digital cultural heritage collections.
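The feature-type query quoted on this slide can be sketched in Python as a two-step join: first find places near any gazetteer entry of the requested feature type, then filter catalog records by subject and place. Everything here is an assumption for illustration. The coordinates are approximate, the catalog records are hypothetical, and the dictionary layout is not a MARC structure.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

# Approximate positions, for illustration only.
GAZETTEER = [
    {"name": "Terminus Dam", "feature_type": "dam", "pos": (36.42, -119.00)},
    {"name": "Friant Dam", "feature_type": "dam", "pos": (37.00, -119.70)},
    {"name": "Visalia", "feature_type": "city", "pos": (36.33, -119.29)},
    {"name": "Fresno", "feature_type": "city", "pos": (36.75, -119.77)},
    {"name": "Bakersfield", "feature_type": "city", "pos": (35.37, -119.02)},
]

# Hypothetical catalog records reduced to an id, a place and a subject.
CATALOG = [
    {"id": "rec-001", "place": "Visalia", "subject": "Description and travel"},
    {"id": "rec-002", "place": "Bakersfield", "subject": "Description and travel"},
    {"id": "rec-003", "place": "Fresno", "subject": "History"},
]


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def places_near_feature(feature_type, radius_miles):
    """Names of places lying within the radius of any entry of that type."""
    anchors = [g for g in GAZETTEER if g["feature_type"] == feature_type]
    return {
        g["name"]
        for g in GAZETTEER
        if g["feature_type"] != feature_type
        and any(haversine_miles(g["pos"], a["pos"]) <= radius_miles
                for a in anchors)
    }


def search(subject, feature_type, radius_miles):
    """Catalog record ids on a subject about places near a feature type."""
    nearby = places_near_feature(feature_type, radius_miles)
    return [r["id"] for r in CATALOG
            if r["subject"] == subject and r["place"] in nearby]


print(search("Description and travel", "dam", 25))
```

The catalog itself never records which places are near dams. That relationship comes entirely from the gazetteer's feature types and coordinates, which is why the gazetteer acts as intermediate metadata.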

Slide 12: SUMMING UP
Semantic metadata for numbers often has limited textual content
Semantic metadata for text may be ambiguous in time or space
– Aberdeen Scotland or Aberdeen Maryland
– St. Petersburg or Leningrad
– Prussia is no longer a country
Gazetteers may be exploited to uniquely specify location and provide clues for disambiguation.
Linking numbers, text and gazetteers allows for new ways to search library catalogs

Slide 13: Further Information: Michael Buckland and Ray Larson (buckland, Fredric Gey