Predicting Reuse of End-User Web Macro Scripts
Chris Scaffidi 1,2, Chris Bogart 2, Margaret Burnett 2, Allen Cypher 3, Brad Myers 1, Mary Shaw 1
1 Carnegie Mellon University, 2 Oregon State University, 3 IBM-Almaden

2 Repositories of end-user code: The good, the great, and the “other”
C. Bogart, et al. End-User Programming in the Wild: A Field Study of CoScripter Scripts. VL/HCC 2008.
Previous study: Of 1445 CoScripter macros…
–~10% had many runs
–~10% had many users
–~80% were “other”
This is the largest web macro repository
–> 6000 users, > 3000 “public” scripts
Problem  Traits  Predictions  Conclusion

3 What if our repositories could…
–… omit pieces of code from search results if they are unlikely to be reused anyway?
–… provide a UI for administrators to review (and remove?) old code that’s unlikely to be used?
–… advise programmers, when they upload code, about how to improve the reusability of their code?

4 So how do we separate the wheat from the chaff?
Providing such features requires predicting whether code will ever be reused
–Without relying on information that becomes available only after code is reused (“chicken and egg”)
  Ratings, reviews, etc.
  (For some features, of course, we can always add this information in later.)
–With a fairly simple model for making predictions
  So that predictions can be explained to users
  Especially when we’re advising users about how to improve the reusability of their code!

5 Needed: a model for predicting reuse
Key questions for discovering such a model…
–What information about the code indicates reusability?
–How do we combine this information to predict reuse?
Similar models have been successful on OO code
–Predicting reuse based on coupling & cohesion
–Predicting bugginess based on code complexity metrics, information about code authors, code churn, …
Web macros are much simpler (don’t call each other, don’t have loops, etc.), so we need different information here.

6 Approach
–Consider the steps required for reusing code
–Identify macro traits that might support reusing code
–Empirically test whether code with these traits is more likely to be reused
–Empirically test whether these traits together can accurately predict reuse (using machine learning)

7 What are the traits of reusable web macros?
Four fundamental steps of reuse in general:
–Finding code
–Understanding it
–Modifying it
–Composing it
We expect that code is more reusable if it does not need modification to be reused.
Users rarely combine CoScripter web macros.
Traits should support finding, understanding, and not needing to modify.

8 We identified 35 candidate traits in 8 categories

Category             Example                           Supports
Mass appeal          popular keywords                  F
Language             data values are in English        U
Annotations          comments                          U
Flexibility          parameterization (variables)      M
Length               small # distinct lines of code    U, M
Author information   at IBM IP address                 M
Advanced syntax      “control-click” keyword           U, M
No preconditions     no cookies needed                 M

F = findability, U = understandability, M = not modifying
All candidate trait values are computed fully automatically.

9 Getting some data to work with
Extracted 6 months of IBM wiki data
–Source code & usage logs for 937 public scripts
–Four (binary) measures of reuse
  Execution by author > 24 hours after initial creation
  Execution by any other user
  Editing by any other user
  Clone/copy-paste by any other user
–Why not use non-binary, absolute reuse counts?
  Macros that call themselves (infinite loops)
  Macros called periodically by other (non-macro) programs
  Information cascades: popularity leads to popularity (purely an artifact of the wiki’s UI)
  (But we come back to absolute numbers later on…)
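As a rough illustration of how the four binary measures might be derived from a usage log, here is a minimal Python sketch. The `LogEvent` record, its field names, and the action labels ("create", "run", "edit", "clone") are all hypothetical; the slides do not describe the actual log format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical log record; the real wiki log schema is not given in the slides."""
    script_id: int
    user_id: int
    action: str      # assumed labels: "create", "run", "edit", "clone"
    when: datetime


def reuse_measures(events, script_id, author_id):
    """Return the four binary reuse measures for one script."""
    mine = [e for e in events if e.script_id == script_id]
    created = min(e.when for e in mine if e.action == "create")
    one_day = timedelta(hours=24)

    def by_other(action):
        # True if any user other than the author performed this action.
        return any(e.action == action and e.user_id != author_id for e in mine)

    return {
        # Author ran the script more than 24 hours after creating it.
        "author_rerun": any(
            e.action == "run" and e.user_id == author_id and e.when - created > one_day
            for e in mine
        ),
        "other_run": by_other("run"),
        "other_edit": by_other("edit"),
        "other_clone": by_other("clone"),
    }
```

Because each measure is a yes/no flag, a macro that is run thousands of times by a scheduler or a loop counts the same as one that is run once, which is the motivation the slide gives for avoiding absolute counts.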

10 Testing for correspondence
For each candidate trait, divide scripts into two groups
–For boolean traits, based on true/false
–For numerical traits, based on above/below mean
Performed z-test of proportions:
–Does the trait correspond as expected to higher likelihood of reuse?
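The test described above can be sketched in a few lines of Python. This is a standard pooled two-proportion z-test with a one-sided p-value, not necessarily the exact variant used in the study; the function names and the choice of a one-sided test are assumptions.

```python
import math


def split_by_mean(trait_values, reused):
    """Split scripts into above-mean and at-or-below-mean groups.

    Boolean traits also work here if encoded as 0/1.
    Returns (reused_above, n_above, reused_below, n_below).
    """
    mean = sum(trait_values) / len(trait_values)
    above = [r for v, r in zip(trait_values, reused) if v > mean]
    below = [r for v, r in zip(trait_values, reused) if v <= mean]
    return sum(above), len(above), sum(below), len(below)


def two_proportion_z(reused_a, n_a, reused_b, n_b):
    """Pooled z-test that group A has a higher reuse proportion than group B.

    Returns (z, one_sided_p_value).
    """
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups must be non-empty")
    pooled = (reused_a + reused_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in reuse; test is undefined")
    z = (reused_a / n_a - reused_b / n_b) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p_value
```

For a trait where lower values are expected to help (for example, fewer hard-coded literals), the caller would pass the below-mean group as group A.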

11 We found many traits that empirically corresponded to reuse.
Traits significant at p<0.00036 wrt at least one reuse measure
–If websites hit by the macro contain certain keywords
–If the macro was intended by IBM as a “tutorial” script
–Number of comments in the macro’s code
–If the macro has a title
–Number of parameters in the macro
–Number of literals hard-coded in the macro
–Number of distinct lines of code in the macro
–ID number of the macro author (indicates early adopter)
–ID number of the script (generally lower for early adopters)
–If the author was at an IBM IP address
–Number of author’s previous scripts that had been reused
–If the macro used ordinal advanced syntax
–If the macro used “control-click”/“control-select” syntax
–If the macro required user to be at a certain URL prior to run
–If the macro hits a lot of different websites
Callout labels on the slide: mass appeal traits, annotation traits, length traits, traits hinting higher author expertise, use of advanced syntax

12 These traits are “raw materials” for a predictive model.
A model of the form reuse-measure = F(trait_1, trait_2, …, trait_N)
–For starters, continue to use binary reuse measures.
–Approachable with supervised machine learning.
–F should be pretty simple, so that we can generate those explanations.

13 Model that we developed (in words & pictures)
For each trait
–Find the threshold that optimally divides the reused macros from the un-reused macros
–Retain the trait only if its optimal threshold does a good job of dividing reused macros from un-reused macros
We call each trait-based constraint a “reuse predictor”.
[Figure labels: Trait level, Threshold]

14 Predicting if a macro will be reused
Count how many predictors are satisfied
Predict that the macro will be reused if this count exceeds some minimum
–Also a tunable parameter
–A higher minimum implies a higher bar that a macro must clear to be predicted as reused
  Fewer false positives, more false negatives
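The counting rule above is simple enough to sketch directly. In this Python sketch the `Predictor` type and function names are hypothetical, the predictor values are borrowed from the example slide, and "exceeds" is read as a strict comparison (reading it as "at least" would be equally plausible).

```python
import operator
from typing import Callable, NamedTuple


class Predictor(NamedTuple):
    """One trait-based constraint, e.g. comments >= 3 (hypothetical structure)."""
    trait: str
    compare: Callable[[float, float], bool]
    threshold: float


def count_satisfied(traits, predictors):
    """Number of predictors that this macro's trait values satisfy."""
    return sum(1 for p in predictors if p.compare(traits[p.trait], p.threshold))


def predict_reuse(traits, predictors, minimum):
    """Predict reuse when the satisfied count strictly exceeds the minimum."""
    return count_satisfied(traits, predictors) > minimum


predictors = [
    Predictor("comments", operator.ge, 3),
    Predictor("inet_urls", operator.le, 1),
    Predictor("prev_created", operator.ge, 10),
    Predictor("literals", operator.le, 4),
]

# Made-up trait values for one macro; it satisfies the first two predictors only.
macro = {"comments": 5, "inet_urls": 0, "prev_created": 2, "literals": 9}
print(predict_reuse(macro, predictors, minimum=1))  # 2 satisfied > 1
print(predict_reuse(macro, predictors, minimum=2))  # 2 satisfied is not > 2
```

Raising `minimum` from 1 to 2 flips the prediction for this macro, which is the "higher bar" trade-off the slide describes.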

15 Example
E.g.: Suppose the predictors were…
–comments ≥ 3
–inet_urls ≤ 1
–prev_created ≥ 10
–literals ≤ 4
Explanations might someday be formed like,
–“You might be able to raise the reusability of this macro by providing a few more comments and by replacing some literals with variables.” [Show me how]
–“Macros in the search results have many comments, experienced authors, few inaccessible URLs, and few hardcoded literals.” [Tell me more]
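One way such advice could be generated is to list the predictors a macro fails and attach a suggestion to each. The following Python sketch assumes this approach; the tuple layout, function names, and hint wording are illustrative and do not come from CoScripter.

```python
import operator

# (trait, comparison, threshold, suggestion). A suggestion of None marks a
# predictor the author cannot act on by editing the script (author history).
PREDICTORS = [
    ("comments", operator.ge, 3, "providing a few more comments"),
    ("inet_urls", operator.le, 1, "removing references to inaccessible URLs"),
    ("prev_created", operator.ge, 10, None),
    ("literals", operator.le, 4, "replacing some literals with variables"),
]


def unmet(traits):
    """Names of the predictors this macro does not satisfy."""
    return [name for name, cmp, limit, _ in PREDICTORS if not cmp(traits[name], limit)]


def advice(traits):
    """Build a one-sentence suggestion, or None if nothing actionable is unmet."""
    hints = [
        hint
        for name, cmp, limit, hint in PREDICTORS
        if hint is not None and not cmp(traits[name], limit)
    ]
    if not hints:
        return None
    return (
        "You might be able to raise the reusability of this macro by "
        + " and by ".join(hints)
        + "."
    )


# Made-up trait values: too few comments and too many literals.
macro = {"comments": 1, "inet_urls": 0, "prev_created": 12, "literals": 9}
print(advice(macro))
```

Because each predictor is a single threshold on a single trait, every piece of advice maps back to one concrete, checkable property of the macro.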

16 Algorithm accuracy with varying values for tunable parameters
[Figure: true positive rate plotted against false positive rate, with alternate machine learning algorithms shown for comparison]
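A curve of this kind can be traced by sweeping the minimum predictor count and recording the true and false positive rates at each setting. The Python sketch below shows the bookkeeping only; the function name is hypothetical, the data are made up for illustration, and the numbers are not results from the study.

```python
def rates_by_minimum(counts, reused, max_count):
    """Return (minimum, true_positive_rate, false_positive_rate) per setting.

    counts[i] is how many predictors macro i satisfies; reused[i] is whether
    it was actually reused. A macro is predicted as reused when its count
    strictly exceeds the minimum.
    """
    positives = sum(1 for r in reused if r)
    negatives = len(reused) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one reused and one un-reused macro")

    results = []
    for minimum in range(max_count + 1):
        tp = sum(1 for c, r in zip(counts, reused) if c > minimum and r)
        fp = sum(1 for c, r in zip(counts, reused) if c > minimum and not r)
        results.append((minimum, tp / positives, fp / negatives))
    return results


# Made-up data: five macros with 0..4 satisfied predictors.
counts = [0, 1, 2, 3, 4]
reused = [False, False, True, True, True]
for minimum, tpr, fpr in rates_by_minimum(counts, reused, max_count=4):
    print(minimum, round(tpr, 2), round(fpr, 2))
```

Each setting of the minimum gives one point on the plot; a higher minimum moves toward the origin (fewer false positives, fewer true positives).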

17 Absolute level of reuse rose sharply with the number of matches

19 Conclusions and future work
Traits contain enough information to predict reuse
–Can we improve accuracy by tweaking how predictors are built and selected?
–Can we improve accuracy by using more traits and/or information available after reuse is attempted?
–Can we generalize to other kinds of programs?
–Can we also predict reusability?
Predictions combine trait data fairly simply
–Work with IBM to enhance the CoScripter wiki
  Improving the search results
  Providing UI for administrators to review macros
  Giving programmers advice automatically

20 Thank You
–To the VL/HCC for this opportunity
–To the EUSES Consortium for feedback
–To NSF for funding

21 On closer examination…
Many rarely-reused macros…
–Reference non-public (intranet) URLs
–Assume the browser is at a certain URL
–Require user to be logged into a site
–Hardcode values for form fields

22 Typical activities toward achieving end-user programming goals
–Create a new end-user program from scratch
–Clone or copy-paste from existing end-user program
–Tweak code
–Programmatically call existing end-user code (rare)
–Manually run a series of existing end-user programs
–Or any combination of the above.
Note: 4 activities reuse or operate on existing code.

23 Details of picking thresholds and choosing whether to retain traits
Pick trait’s threshold by maximizing difference between (fraction of above-threshold macros that are reused) & (fraction of below-threshold macros that are reused)
Retain trait if its difference exceeds a minimal distance
Raising the minimum means that
–Only the best traits are retained
–Information can be lost
Similar to a z-test, but simpler to explain
–“95% of macros with >3 comments were reused, versus only 24% of macros with fewer comments”
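The threshold search described above can be sketched as a scan over candidate thresholds. This Python sketch assumes candidates are the observed trait values and handles only traits where higher values are expected to help; a "lower is better" trait could be negated before calling it. The function names are hypothetical.

```python
def best_threshold(values, reused):
    """Find the threshold maximizing
    (fraction reused among macros above it) - (fraction reused at or below it).

    Returns (threshold, difference), or None if no threshold splits the data.
    """
    best = None
    for t in sorted(set(values)):
        above = [r for v, r in zip(values, reused) if v > t]
        below = [r for v, r in zip(values, reused) if v <= t]
        if not above or not below:
            continue  # threshold leaves one side empty; skip it
        diff = sum(above) / len(above) - sum(below) / len(below)
        if best is None or diff > best[1]:
            best = (t, diff)
    return best


def retain(values, reused, min_difference):
    """Keep the trait only if its best difference exceeds the minimal distance."""
    found = best_threshold(values, reused)
    return found is not None and found[1] > min_difference


# Made-up data: comment counts for six macros and whether each was reused.
comments = [0, 1, 2, 3, 4, 5]
was_reused = [False, False, False, True, True, True]
print(best_threshold(comments, was_reused))
```

The two fractions computed for the winning threshold are exactly the numbers quoted in an explanation of the form "X% of macros above the threshold were reused, versus Y% below".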
