Use of Hierarchical Keywords for Easy Data Management on HUBzero HUBbub Conference 2013 September 6th, 2013 Gaurav Nanda, Jonathan Tan, Peter Auyeung,

Presentation transcript:

Use of Hierarchical Keywords for Easy Data Management on HUBzero
HUBbub Conference 2013, September 6th, 2013
Gaurav Nanda, Jonathan Tan, Peter Auyeung, Bill Gaskill, Chris Smoak, Mark Lehto
School of Industrial Engineering, Purdue University

Reliability Tools as Resources

- Failure Mode Effects and Criticality Analysis (FMECA): analyzes failures of a system through failure modes, then identifies causes and effects, detection procedures, and corrective actions for each failure mode.
- Reliability Growth Analysis: uses Logistics to model various developmental data such as time-to-failure, discrete (success/failure), and reliability values at different times or stages.
- Shakedown Testing: records results of equipment testing during development or installation.
- Functional Block Diagram: used for process planning by describing all the input and output relations.

HUBzero Implementation Challenges

- Collecting data from people
- Getting the owner's consent before publishing
- Selecting good-quality resources for publishing
- Interfacing HUBzero with other software/groupware
- Access control of the files
- Selection of a server to host HUBzero
- Maintaining security of the HUBzero server

HUBzero Implementation Summary

- Automated the process of acquiring, publishing, and sharing data.
- Linked HUBzero with existing software in the organization.
- Developed new navigational features on HUBzero to improve the search and review process.
- Semi-automated keyword assignment based on the content of the RE tool file.

HUBzero Customizations

- Sophisticated search mechanisms using metadata
- Multiple views of the information
- Different navigation layouts (Tag Browser, Lists, Filters)
- Automated tagging based on content
- Social networking features: reviews and comments
- Automated keyword assignment for each RE tool usage

HUBzero Customizations: Navigation Made Easy

Customization done to provide a quick summary of the quality and popularity of a resource.

Keywords/Tags Use in Knowledge Management

- Content organization
- Content discovery
- Widely used in Web 2.0
- Ontologies have been proven to be good additions to knowledge management systems:
  - CoMMA (Corporate Memory Management through Agents)
  - FRODO (a Framework for Distributed Organizational Memories)
- Keywords summarize a document concisely and give a high-level description of the document's content.

Keyword Extraction: Different Approaches

- User centered: uses the historical tagging behavior of the user.
  - Needs a large user group; suffers from vague tag meanings.
- Document centered: uses the document content.
  - Keyword assignment: draws from a controlled vocabulary of terms.
  - Keyword extraction:
    - Linguistics: lexical analysis, syntactic analysis.
    - Machine learning: naïve Bayes, support vector machines, etc.
    - Simple statistics: n-grams, word frequency, term frequency * inverse document frequency (TF-IDF), etc. Better suited to RE data because it requires neither proper sentence structure nor training cases.
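To make the "simple statistics" option concrete, here is a minimal sketch of term frequency * inverse document frequency. The slides do not give the exact formula used, so the weighting below (relative term frequency times the natural log of N/df), the tokenizer, and the sample file names and contents are all assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters; no sentence structure is needed."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(docs):
    """Return {doc_name: {term: tf * idf}} for a dict of raw texts."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    # Document frequency: the number of files in which each term appears.
    df = Counter(term for c in counts.values() for term in c)
    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores[name] = {
            term: (freq / total) * math.log(n_docs / df[term])
            for term, freq in c.items()
        }
    return scores


# Hypothetical file contents, not taken from the presentation.
docs = {
    "fmeca_pump.txt": "pump seal leak pump bearing failure",
    "fmeca_valve.txt": "valve seal leak valve actuator failure",
}
scores = tf_idf(docs)
```

In this toy corpus, terms that occur in every file (such as "seal") score zero, while terms specific to one file (such as "pump") score highest, which is the behavior that makes the measure useful for picking file-specific keywords.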

Keyword Extraction: Steps Involved

1. Read and parse the reviewed RE tool files.
2. Count the file-specific and overall word frequencies.
3. Calculate the file and global scores and normalize them.
4. Recommend a set of keywords to the administrator for each file based on the criteria.
5. The administrator selects the final set of keywords for a file and publishes them to HUBzero.
6. The system recommends a set of possible global keywords.
7. The administrator chooses global keywords and publishes them to HUBzero.
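The counting, scoring, and recommendation steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the slides do not define the score formulas or the selection criteria, so this sketch assumes max-normalized word counts as scores and a simple top-N cutoff. The function name `recommend_keywords` and the sample data are hypothetical, and the administrator's review and publishing steps are left outside the code.

```python
from collections import Counter


def recommend_keywords(files, top_n=3, stopwords=frozenset()):
    """files maps a file name to its list of parsed words.

    Returns (file_keywords, global_keywords); each keyword list holds
    (word, normalized_score) pairs in order of decreasing score.
    """
    # File-specific word frequencies.
    file_counts = {
        name: Counter(w for w in words if w not in stopwords)
        for name, words in files.items()
    }
    # Overall word frequencies across all files.
    overall = Counter()
    for counts in file_counts.values():
        overall.update(counts)

    def normalize(counter):
        # Assumed normalization: scale so the most frequent word scores 1.0.
        peak = max(counter.values(), default=0)
        return {w: n / peak for w, n in counter.items()} if peak else {}

    def top(scores):
        # Decreasing score; ties broken alphabetically for stable output.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

    file_keywords = {name: top(normalize(c)) for name, c in file_counts.items()}
    global_keywords = top(normalize(overall))
    return file_keywords, global_keywords


# Hypothetical parsed files.
files = {
    "a": ["pump", "seal", "pump", "leak"],
    "b": ["valve", "seal", "valve", "leak", "seal"],
}
file_kw, global_kw = recommend_keywords(files, top_n=2)
```

The returned lists are only recommendations; in the workflow above, the administrator still chooses which of them to publish.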

Keyword Extraction

- File keywords: represent the specific content of an RE file.
- Global/popular keywords: represent a group of RE files.
- Both types of keywords are displayed in order of decreasing score.

Keywords Display: Keywords on HUBzero Resource Page

Keywords Display: Keywords on HUBzero Resource List Page

Future Work

- Implement more sophisticated algorithms for keyword assignment to handle complexities such as misspellings and synonyms.
- Prepare a training dataset from the growing number of RE tool files and use data mining techniques.
- Compare the results of different methods for keyword assignment.
- Perform a usability analysis to check whether users find the keywords helpful for browsing.
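As one possible direction for the misspelling and synonym item, the sketch below maps a raw word onto a controlled vocabulary using Python's standard `difflib.get_close_matches`. The presentation does not propose this technique; the function `canonicalize`, the vocabulary, the synonym table, and the 0.8 similarity cutoff are all hypothetical choices made for illustration.

```python
import difflib


def canonicalize(word, vocabulary, synonyms=None, cutoff=0.8):
    """Map a raw word to a controlled-vocabulary term where possible.

    Order of checks: explicit synonym table, exact vocabulary match,
    then the closest fuzzy match above `cutoff`. Unmatched words are
    returned unchanged so that nothing is silently dropped.
    """
    word = word.lower()
    synonyms = synonyms or {}
    if word in synonyms:
        return synonyms[word]
    if word in vocabulary:
        return word
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Hypothetical controlled vocabulary and synonym table.
vocabulary = ["bearing", "corrosion", "leakage", "valve"]
synonyms = {"leak": "leakage"}

fixed = [canonicalize(w, vocabulary, synonyms) for w in ["Bearng", "leak", "gasket"]]
```

Running word counts on the canonicalized terms instead of the raw words would merge misspelled and synonymous variants before scoring; the cutoff would need tuning against real RE tool files.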

Thank You
Questions?