Applying Text Classification in Conference Management: Some Lessons Learned
Andreas Pesenhofer, Helmut Berger, Michael Dittenbach, Andreas Rauber


Overview
- Conference Management Systems
- Classification & Clustering
- Case Studies
  - ECDL 2005
  - ECR
- Conclusions

Conference Management Systems
- Set of tools to support the conference workflow
- Basic support for paper submission and review collection
- Many tasks remain open for further automation:
  - Selection of the program committee
  - Topic assignment of submissions
  - Paper-to-reviewer assignment
  - Support in review generation
  - Poster arrangement
  - Post-conference access to papers

Classification & Clustering
- Topic assignment of submissions
  - Problem: authors are uncertain about the precise topic assignment (conference terminology)
  - Solution: support by automatic assignment
  - Method: automatic text classification (ATC) based on abstracts
- Poster arrangement & post-conference access to papers
  - Problem: topic-based arrangement
  - Solution: clustering
  - Method: self-organizing map (SOM) & Mnemonic SOM

ATC for topic assignment
- Train a model based on previous conferences
- Abstract submission
- Automatic assignment
- Confirmation
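The four steps on this slide can be sketched as a small pipeline. The slides do not name the classifier, so the sketch below swaps in a simple nearest-centroid model with cosine similarity purely for illustration. The function names (`train`, `suggest_topic`, `assign`), the `confirm` callback and the toy abstracts are all hypothetical, and treating confirmation as a callback that returns the final topic is an assumption.

```python
import math
import re
from collections import Counter, defaultdict

def tokens(text):
    """Lower-case the text and return its alphabetic (ASCII) tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train(labelled_abstracts):
    """Step 1: sum the term counts of all training abstracts per topic."""
    centroids = defaultdict(Counter)
    for topic, abstract in labelled_abstracts:
        centroids[topic].update(tokens(abstract))
    return dict(centroids)

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def suggest_topic(model, abstract):
    """Step 3: propose the topic whose centroid is most similar to the abstract."""
    vec = Counter(tokens(abstract))
    return max(sorted(model), key=lambda topic: cosine(vec, model[topic]))

def assign(model, abstract, confirm):
    """Steps 2-4: take a submitted abstract, suggest a topic, let `confirm` decide."""
    return confirm(suggest_topic(model, abstract))

# Toy training data (invented for this sketch, not taken from ECDL or ECR).
previous = [
    ("Digital Preservation", "long term preservation of web archives"),
    ("Information Retrieval", "ranking search results for user queries"),
]
model = train(previous)
submitted = "A preservation strategy for archived web pages"
accepted = assign(model, submitted, confirm=lambda suggestion: suggestion)
overridden = assign(model, submitted, confirm=lambda suggestion: "Information Retrieval")
```

Keeping the confirmation step separate from the suggestion means the automatic assignment only proposes a topic; whoever confirms it can still override the proposal.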

Clustering for organization
- Arrange posters thematically
- Non-rectangular SOMs reflecting the conference site
- Mnemonic SOMs simplify post-conference paper access
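To make the idea of a non-rectangular map concrete, here is a minimal sketch of a self-organizing map whose units are restricted to an arbitrary set of grid positions, for example an L-shaped room. It is a generic SOM written for illustration, not the authors' Mnemonic SOM implementation. The learning-rate and radius schedules, the L-shaped layout and the two-dimensional poster vectors are all assumptions, and the neighbourhood uses plain grid distance, which ignores the outline of the shape.

```python
import math
import random

def best_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(weights, key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)))

def train_som(data, units, iterations=200, lr0=0.5, radius0=2.0, seed=0):
    """Train a SOM whose units sit on the given (row, col) positions; any shape is allowed."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {u: [rng.random() for _ in range(dim)] for u in units}
    for step in range(iterations):
        frac = 1.0 - step / iterations          # decays linearly towards 0
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        x = data[rng.randrange(len(data))]
        bmu = best_unit(weights, x)
        for u, w in weights.items():
            grid_d2 = (u[0] - bmu[0]) ** 2 + (u[1] - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2 * radius ** 2))  # Gaussian neighbourhood
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])
    return weights

# Hypothetical L-shaped poster area: a column of three units plus a bottom row.
units = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]

# Invented two-dimensional "poster vectors"; real input would be term vectors.
posters = {
    "poster A": [0.1, 0.1],
    "poster B": [0.0, 0.2],
    "poster C": [0.9, 0.9],
    "poster D": [1.0, 0.8],
}
weights = train_som(list(posters.values()), units)
placement = {name: best_unit(weights, vec) for name, vec in posters.items()}
```

Each poster ends up on the unit whose weight vector is closest to it, so thematically similar posters tend to land on nearby positions of the chosen shape.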


ECDL 2005 – ATC data
- English abstracts of previous ECDL conferences
- Seven categories defined from the topics of the conference call
- Pre-processing: removal of all numbers, punctuation marks and special characters; transformation to lower case
- tf-idf weighting
- 4,141 unique terms
- Information gain (IG): 3,460 top-ranked terms; average accuracy over all categories of 58.60%
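The pre-processing and weighting steps listed on this slide can be sketched as follows. The slide only says "tfidf-weighting", so the exact variant used here (raw term frequency times log(N / df)) is an assumption, as are the function names and the three toy abstracts.

```python
import math
import re
from collections import Counter

def preprocess(text):
    """Lower-case the text and drop numbers, punctuation and special characters."""
    # Everything that is not an ASCII letter or whitespace becomes a space.
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return cleaned.split()

def tfidf(documents):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc_tokens in tokenized for term in set(doc_tokens))
    vectors = []
    for doc_tokens in tokenized:
        tf = Counter(doc_tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors

# Toy abstracts, invented for this sketch.
abstracts = [
    "Digital libraries: 2 new metadata models!",
    "Metadata harvesting for digital archives.",
    "Video retrieval in digital libraries",
]
vectors = tfidf(abstracts)
unique_terms = sorted({term for vec in vectors for term in vec})
```

With this variant, a term that occurs in every document gets weight zero, while terms confined to a few documents receive higher weights.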

ECDL – training data
class-ID | class description | sum
1 | Concepts of Digital Libraries, Concepts of Documents and Metadata | 34
2 | System Architectures, Open Archives, Collection Building, Integration and Interoperability | 40
3 | Information Retrieval, Information Organization, Search and Usage | 67
4 | User Studies, System Evaluation, Personalization, User Interfaces and User Centered Design | 50
5 | Digital Preservation, Web Archiving and Long Term Access | 12
6 | Digital Library Applications and Case Studies | 65
7 | Multimedia, Mixed Media, Audio, Video, 3D and non-traditional Objects | 43
sum over the selected abstracts | | 311

ECDL 2005 – classification results
class-ID | total | recall | F1 | precision
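The column names that survive in this slide's header (total, recall, precision, F1) are the usual per-class evaluation measures. As a reminder of how they are computed, here is a small sketch. The function name and the label lists are invented for illustration and are not the ECDL results.

```python
def per_class_scores(true_labels, predicted_labels):
    """Compute total, recall, precision and F1 for every class ID."""
    pairs = list(zip(true_labels, predicted_labels))
    scores = {}
    for cls in sorted(set(true_labels) | set(predicted_labels)):
        tp = sum(1 for t, p in pairs if t == cls and p == cls)  # correctly assigned
        fp = sum(1 for t, p in pairs if t != cls and p == cls)  # wrongly assigned to cls
        fn = sum(1 for t, p in pairs if t == cls and p != cls)  # missed members of cls
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[cls] = {"total": tp + fn, "recall": recall, "precision": precision, "f1": f1}
    return scores

# Invented class IDs for six test abstracts.
true_ids = [3, 3, 3, 6, 6, 5]
predicted_ids = [3, 3, 6, 6, 6, 6]
scores = per_class_scores(true_ids, predicted_ids)
accuracy = sum(t == p for t, p in zip(true_ids, predicted_ids)) / len(true_ids)
```

A class with few test documents, such as class 5 in this toy data, can drop to zero on every measure after a single misclassification, which is worth keeping in mind when reading per-class results for small classes.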

ECDL 2005 – SOM data
- Poster and paper organization
  - Full text of the accepted posters of ECDL 2005
  - Term selection based on minimal word length and document frequencies
  - 30 posters – […] terms
- Post-conference access
  - 71 papers and posters – 5,654 terms

ECDL 2005 – SOM

ECDL 2005 – SOM (2)


ECR – Data
- Abstracts of the ECR (European Congress of Radiology)
- Training set: ECR 2003 & […],952 documents
- Test set: ECR […] documents
- Same pre-processing steps as for the ECDL data
- Resulting in 14,887 unique terms
- IG: 5,720 top-ranked terms; average accuracy over all categories of 73.57%
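The IG step ranks terms by information gain and keeps only the top-ranked ones. A minimal sketch of that ranking is shown below, using the standard definition IG(t) = H(C) - P(t) H(C | t) - P(not t) H(C | not t). The function names and the four toy documents are invented; only the two class names are borrowed from the ECR category list.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(docs, labels, term):
    """docs: list of term sets; labels: the class label of each document."""
    n = len(docs)
    with_term = Counter(lab for doc, lab in zip(docs, labels) if term in doc)
    without_term = Counter(lab for doc, lab in zip(docs, labels) if term not in doc)
    n_with = sum(with_term.values())
    n_without = n - n_with
    h_prior = entropy(list(Counter(labels).values()))
    h_conditional = (n_with / n) * entropy(list(with_term.values())) \
        + (n_without / n) * entropy(list(without_term.values()))
    return h_prior - h_conditional

def top_terms(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
    return ranked[:k]

# Toy documents, invented for this sketch.
docs = [{"ct", "liver"}, {"ct", "lung"}, {"mri", "liver"}, {"mri", "lung"}]
labels = ["Abdominal and Gastrointestinal", "Chest", "Abdominal and Gastrointestinal", "Chest"]
selected = top_terms(docs, labels, k=2)
```

In this toy data the imaging modality says nothing about the class, so "ct" and "mri" have zero gain, while the organ terms separate the classes perfectly and are the ones kept.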

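ECR – training data
class-ID | class description | sum
1 | Abdominal and Gastrointestinal | […]
2 | Breast | […]
3 | Cardiac | […]
4 | Chest | […]
5 | Computer Applications | […]
6 | Contrast Media | […]
7 | Genitourinary | […]
8 | Head and Neck | […]
9 | Interventional Radiology | […]
10 | Musculoskeletal | […]
11 | Neuro | […]
12 | Pediatric | […]
13 | Physics in Radiology | […]
14 | Radiographers | […]
15 | Vascular | […]
sum over the selected abstracts | | […]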

ECR 2005 – classification results
class-ID | total | recall | F1 | precision

Conclusions
- Classification quality increases with the number of training documents
- Structure of the classes matters (overlapping?)
- The bulk of submissions can be dealt with automatically
- May be used for session assignment
- Arrange posters & papers thematically
- Easy to memorize & find

Questions?
E-Commerce Competence Center, Donau-City-Strasse, Vienna, Austria

ECDL 2005