Which Feature Location Technique is Better?
Emily Hill, Alberto Bacchelli, Dave Binkley, Bogdan Dit, Dawn Lawrie, Rocco Oliveto


Motivation: Differentiating FLTs
[Figure labels: "Totally unrelated", "In vicinity"; Precision = 0.20]

Example
– Developer works down the ranked list
– At each item, the developer can explore or not
– When exploring structure, the developer can bail at any time

Proposed Approach: Rank Topology
– Use evaluation measures that consider the likelihood of a developer finding fix locations
– Use textual information to approximate the developer's interest in (i.e., likelihood of) following a trail in the structural topology, starting from the ranked list
– Rank topology = inverse of the number of hops in the topology

Example
– Developer works down the ranked list
– At each item, the developer can explore or not
– 3rd-ranked result + 4 structural hops = 7 total hops
– Rank topology metric = 1 / 7
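The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the slide's numbers, not the authors' implementation; the function name `rank_topology` and its parameters are assumptions introduced here.

```python
def rank_topology(rank: int, structural_hops: int) -> float:
    """Inverse of the total hops a developer takes to reach a fix location.

    rank: 1-based position in the ranked list where exploration starts.
    structural_hops: hops followed through program structure from that result.
    (Both parameter names are hypothetical; the slides only give the formula.)
    """
    if rank < 1 or structural_hops < 0:
        raise ValueError("rank must be >= 1 and structural_hops must be >= 0")
    total_hops = rank + structural_hops
    return 1.0 / total_hops


# The slide's example: 3rd-ranked result plus 4 structural hops.
score = rank_topology(rank=3, structural_hops=4)
print(score)
```

Under this reading, a fix found directly at rank 1 scores 1.0, and every extra rank or structural hop lowers the score.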

How smart is the user?
– No discrimination: explores everything
– Semi-intelligent: only follows a structural hop if the next method exhibits textual clues
  – Rank topology uses VSM cosine similarity (tf-idf)
  – A structural edge is added if both methods score above the median for the query
  – Supported by user studies of information foraging theory [Lawrance et al., TSE 2013]
– Omniscient: makes no wrong choices, exploring only those ranks and structural hops that lead to a bug
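One way to picture the semi-intelligent user is as a search over a filtered structural graph. The sketch below is a rough illustration under several assumptions that the slides do not state: similarity scores are supplied as a precomputed dictionary (standing in for VSM cosine similarity), structural edges are undirected, the median is taken over all scored methods, and the metric uses the smallest rank-plus-hops total over the ranked list. All function and variable names are hypothetical.

```python
from collections import deque
from statistics import median


def filter_edges(edges, scores):
    """Keep a structural edge only if both endpoints score above the median."""
    cutoff = median(scores.values())
    return [(a, b) for a, b in edges
            if scores.get(a, 0.0) > cutoff and scores.get(b, 0.0) > cutoff]


def hops_to_fix(start, edges, fixes):
    """Breadth-first search: fewest structural hops from start to any fix."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node in fixes:
            return dist
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no fix reachable from this result


def semi_intelligent_rank_topology(ranked, edges, scores, fixes):
    """1 / (rank + hops), minimized over starting ranks (an assumption)."""
    kept = filter_edges(edges, scores)
    best = None
    for rank, method in enumerate(ranked, start=1):
        hops = hops_to_fix(method, kept, fixes)
        if hops is not None and (best is None or rank + hops < best):
            best = rank + hops
    return 0.0 if best is None else 1.0 / best


# Toy data: method names and scores are made up for illustration.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1, "e": 0.05}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
result = semi_intelligent_rank_topology(["d", "a"], edges, scores, {"b"})
print(result)
```

In the toy data only the edge between `a` and `b` survives the median filter, so the fix is reached from the second-ranked result after one structural hop.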

Preliminary Study: Distinguish QLM from Random
The ranked lists of results all have the same bug fixes at exactly the same ranks

Conclusion
Rank topology differentiates between randomly ordered lists and a state-of-the-art IR technique (QLM) with relevant results at the exact same ranks
Future work
– How well does rank topology mimic developer behavior in practice?
– How closely can/should we model user behavior?
Our question: Does the research community need to revise how we evaluate FLTs?

Preliminary Study
[Figure: Effect of program structure on the rank topology metric for each JabRef bug used in the case study]