Example
Developer works down the ranked list
At each item, the developer can explore the structure or not
When exploring structure, the developer can bail at any time
Proposed Approach: Rank Topology
Use evaluation measures that consider the likelihood of a developer finding fix locations
Use textual information to approximate the developer's interest (i.e., likelihood) in following a trail in the structural topology, starting from the ranked list
Rank topology = inverse of the number of hops in the topology
Example
Developer works down the ranked list
At each item, the developer can explore the structure or not
3rd-ranked result + 4 structural hops = 7 total hops
Rank topology metric = 1/7
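The arithmetic in this example can be written as a small function. This is a minimal sketch, assuming total hops are the rank of the starting result plus the structural hops taken from it, as the example suggests; the function name `rank_topology` and its parameters are illustrative, not taken from the study's tooling.

```python
from fractions import Fraction


def rank_topology(rank: int, structural_hops: int) -> Fraction:
    """Inverse of the total hops: rank in the list plus structural hops.

    Assumption: ranks are 1-based and a result that is itself the fix
    location needs 0 structural hops.
    """
    if rank < 1 or structural_hops < 0:
        raise ValueError("rank must be >= 1 and structural_hops >= 0")
    return Fraction(1, rank + structural_hops)


# The slide's example: 3rd-ranked result, then 4 structural hops.
score = rank_topology(rank=3, structural_hops=4)
print(score)  # 1/7
```

Under this reading, a fix location that appears directly at rank 1 scores 1, and the score falls as the developer needs more list positions or more structural hops to reach the fix.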
How smart is the user?
No discrimination: explores everything
Semi-intelligent: only follows a structural hop if the next method exhibits textual clues
– Rank topology uses VSM cosine similarity (tf-idf)
– Structural edge added if both methods score above the median for the query
– Supported by user studies of information foraging theory [Lawrance et al., TSE 2013]
Omniscient: makes no wrong choices, exploring only those ranks and structural hops that lead to a bug
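The semi-intelligent model can be sketched as an edge filter followed by a shortest-path search. The code below is an illustration under several assumptions that the slides do not state: methods are pre-tokenized bags of words, idf is computed as log(N/df) + 1, structural edges are treated as undirected, and the metric takes the cheapest route over all ranked results. The method names, tokens, and helper functions are hypothetical.

```python
import math
from collections import Counter, deque
from fractions import Fraction
from statistics import median


def tfidf(tokens, idf):
    """Weight raw term counts by idf; terms unseen in the corpus are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def query_scores(methods, query):
    """VSM cosine similarity between each method and the query."""
    n = len(methods)
    df = Counter(t for tokens in methods.values() for t in set(tokens))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # assumed idf variant
    q = tfidf(query, idf)
    return {m: cosine(tfidf(toks, idf), q) for m, toks in methods.items()}


def keep_edges(edges, scores):
    """Keep an edge only if both endpoints score above the median."""
    cutoff = median(scores.values())
    return [(u, v) for u, v in edges
            if scores[u] > cutoff and scores[v] > cutoff]


def hops(edges, start, goal):
    """Breadth-first search; edges are treated as undirected (assumption)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal not reachable from start


def rank_topology(ranked, edges, bug):
    """1 / (rank + structural hops), minimized over ranked results."""
    totals = []
    for rank, method in enumerate(ranked, start=1):
        d = hops(edges, method, bug)
        if d is not None:
            totals.append(rank + d)
    return Fraction(1, min(totals)) if totals else Fraction(0)


# Hypothetical toy program: six methods, the bug fix is in D.
methods = {
    "A": ["save", "file", "dialog"], "B": ["save", "file", "export"],
    "C": ["paint", "icon"],          "D": ["save", "export", "entry"],
    "E": ["parse", "author"],        "F": ["render", "table"],
}
edges = [("A", "B"), ("B", "D"), ("A", "C"), ("E", "D")]
ranked = ["E", "A", "C"]

scores = query_scores(methods, ["save", "export", "file"])
kept = keep_edges(edges, scores)

print(rank_topology(ranked, edges, "D"))  # no discrimination: 1/2
print(rank_topology(ranked, kept, "D"))   # semi-intelligent:  1/4
```

In this toy data, the no-discrimination user reaches D from the top-ranked E in one hop, while the semi-intelligent user never follows the E to D edge because E shares no terms with the query, so the cheapest route starts at rank 2 and takes two hops.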
Preliminary Study: Distinguish QLM from Random
The ranked lists of results all have the same bug fixes at exactly the same ranks
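One way such a random list could be built is to pin the bug-fix methods at their original ranks and fill every other position with randomly chosen non-fix methods. This is a sketch of that idea only; the slides do not describe how the random lists were generated, and the function name, parameters, and example data here are hypothetical.

```python
import random


def random_baseline(ranked, relevant, all_methods, seed=0):
    """Return a list with relevant items at the same ranks as in `ranked`.

    Assumptions: `ranked` has no duplicates and every entry also appears
    in `all_methods`; non-relevant slots are filled by a seeded shuffle.
    """
    rng = random.Random(seed)
    pool = [m for m in all_methods if m not in relevant]
    rng.shuffle(pool)
    filler = iter(pool)
    return [m if m in relevant else next(filler) for m in ranked]


# Hypothetical data: the fix locations are "fix1" and "fix2".
all_methods = ["m1", "m2", "m3", "m4", "m5", "fix1", "fix2"]
ranked = ["m1", "fix1", "m2", "m3", "fix2"]
relevant = {"fix1", "fix2"}

baseline = random_baseline(ranked, relevant, all_methods)
```

Because the fix locations keep their ranks, a purely rank-based measure scores both lists identically, so any difference in rank topology comes from the structural neighborhoods of the non-fix results.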
Conclusion
Rank topology differentiates between randomly ordered lists and a state-of-the-art IR technique (QLM) with relevant results at the exact same ranks
Future work
– How well does rank topology mimic developer behavior in practice?
– How closely can/should we model user behavior?
Our question: Does the research community need to revise how we evaluate FLTs?
Preliminary Study
Effect of program structure on the rank topology metric for each JabRef bug used in the case study.