Presentation is loading. Please wait.

Presentation is loading. Please wait.

Automatically Grading Programming Assignments with Web-CAT Stephen H. Edwards Virginia Tech Dept. of Computer Science

Similar presentations


Presentation on theme: "Automatically Grading Programming Assignments with Web-CAT Stephen H. Edwards Virginia Tech Dept. of Computer Science"— Presentation transcript:

1 Automatically Grading Programming Assignments with Web-CAT Stephen H. Edwards Virginia Tech Dept. of Computer Science

2 Stephen EdwardsVirginia Tech My goals today are to … Explain how requiring students to formulate and test hypotheses about their own code can improve their understanding and performance Describe our experiences with an alternate grading approach supported by a new tool: Web-CAT Describe some of the flexibility in Web-CAT for supporting other approaches Convince you software testing can be an important—and practical—addition to classroom practices Automatically Grading Programming Assignments with Web-CAT

3 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Students hold onto ineffective techniques Too often, intro students believe that if their code: compiles, the errors are mostly gone runs correctly when I try it once, it is correct runs on the instructor-provided sample input, it is correct has a problem, it can be fixed by trial and error

4 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT What is reflection-in-action? For an expert, when the current technique is failing … Step back and reflect: “I must be missing something” Re-examine the situation, your solution, and your implicit assumptions about the problem Leads to guesses (hypotheses) about why the solution isn’t working or why something else will be better “[Carry] out an experiment which serves to generate both a new understanding of the phenomenon and a change in the situation”

5 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Practicing software testing will help students frame and carry out experiments The problem: too much focus on synthesis and analysis too early in teaching CS Need to be able to read and comprehend source code Envision how a change in the code will result in a change in the behavior Need explicit, continually reinforced practice in hypothesizing about program behavior and then experimentally verifying their hypotheses

6 Stephen EdwardsVirginia Tech Student comments suggest their current testing practices are often weak I run them through some simple tests to ensure that it is operating as expected. But for the most part I have always relied on supplied test data I don’t think about test cases until I am confident my program is 100% working. Of course, it almost never is … I usually write the whole thing up and then start doing rapid-fire tests of everything I can think of. Automatically Grading Programming Assignments with Web-CAT

7 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT A comprehensive strategy is necessary for a culture shift in what students do Students cannot test their own code Want a culture shift in student behavior A single upper-division course would have little impact on practices in other classes So: Systematically incorporate testing practices across many courses CS1 CS2 OO Design OO Design Data Struct Data Struct Testing Practices Testing Practices

8 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Expect students to apply their testing skills all the time in programming assignments Expect students to test their own work Empower students by engaging them in the process of assessing their own programs Require students to demonstrate the correctness of their own work through testing Do this consistently across many courses

9 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT What tools and techniques should I teach? We want to start with skills that are directly applicable to authentic student-oriented tasks Don’t want to add bureaucratic busywork to assignments Without tool support, this is a lost cause! It is imperative to give students skills they value … But most textbooks only give a “conceptual” intro to idealized industrial practices, not techniques students can use in their own assignments

10 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Test-driven development is very accessible for students Also called “test-first coding” Focuses on thorough unit testing at the level of individual methods/functions “Write a little test, write a little code” Tests come first, and describe what is expected, then followed by code, which must be revised until all tests pass Encourages lots of small (even tiny) iterations See for on-line references

11 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Students can apply TDD in assignments and get immediate, useful benefits Conceptually, easy for students to understand and relate to Increases confidence in code Increases understanding of requirements Preempts “big bang” integration

12 Stephen EdwardsVirginia Tech The problem is devising an effective assessment strategy Need to assess student performance at testing Need to give productive feedback Need to provide rapid turnaround Cannot afford huge increase in resources required Automatically Grading Programming Assignments with Web-CAT

13 Stephen EdwardsVirginia Tech Conventional automated assessment does not encourage good testing habits Student uploads program Program is compiled Executed against test data Scored based on output Automatically Grading Programming Assignments with Web-CAT

14 Stephen EdwardsVirginia Tech The conventional approach provides useful benefits that do lead to a cultural change Fast, precise feedback to students Chance(s) to improve based on feedback Good assessment of behavior Systematic use resulted in culture change Automatically Grading Programming Assignments with Web-CAT

15 Stephen EdwardsVirginia Tech But the conventional approach may discourage desired behavior and skills Focus is on output correctness, first and foremost “Get it working first, work on commenting, structure, etc. later” Students not encouraged or rewarded for testing on their own Students often do less testing Automatically Grading Programming Assignments with Web-CAT

16 Stephen EdwardsVirginia Tech Proper grading and feedback can provide positive incentive for desirable behavior Decide what behavior to foster Choose a corresponding scoring/reward system Design feedback approach Use students’ adaptive nature to drive cultural change Automatically Grading Programming Assignments with Web-CAT

17 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Proper grading and feedback is critical to reinforcing desired behavior Assess test validity: correctness of student’s tests Assess test completeness: the “thoroughness” of student’s tests Assess program correctness: behavior of student’s solution Multiply scores as percentages

18 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Students improve their code quality when using Web-CAT Newly written “untested” code Commerical-quality code

19 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Students start earlier and finish earlier when they use Web-CAT

20 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT An evaluation of submitted code indicates students program more effectively Bold  p =.05 significance WithoutWith TDD Recorded grades90.2%96.1% TA assessment98.1%98.2% Automated grader assessment76.8%94.0% Faults on master test suite36.7%24.9% Projected Defects/KSLOC 7038 (45% less!) How early was first submission?2.2 days4.2 days

21 Stephen EdwardsVirginia Tech After using TDD and Web-CAT, students clearly perceive practical benefits AgreeDisagree More helpful at detecting errors than Curator4.3 Provides excellent support for TDD4.1 Increases my confidence in correctness3.9 Increases my confidence when making changes3.8 Makes me test my solution more thoroughly3.8 Makes me more systematic in devising tests3.8 Would like to use, even if not required3.8 Automatically Grading Programming Assignments with Web-CAT

22 Stephen EdwardsVirginia Tech Student reactions are very positive toward TDD I am very excited about using TDD. I agree that TDD can be beneficial and I’m glad we are being required to experiment with it in this course. If it increases the effectiveness of my programming and decreases the time I spend debugging, then I am all for it. [Previously,] I had to quit my detailed testing and stick to making the program appear to work with the sample data given every time a deadline drew near. With [TDD], the tests are such an integral part of the project that no time- conserving measure will save me. Automatically Grading Programming Assignments with Web-CAT

23 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT We use Web-CAT to automatically process student submissions and check their work Web application written in 100% pure Java Deployed as a servlet Built on Apple’s WebObjects Uses a large-grained plug-in architecture internally, providing for easily extensible data model, UI, and processing features

24 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Web-CAT’s strengths are targeted at broader use Security: mini-plug-ins for different authentication schemes, global user permissions, and per-course role- based permissions Portability: 100% pure Java servlet for Web-CAT engine Extensibility: Completely language-neutral, process- agnostic approach to grading, via site-wide or instructor- specific grading plug-ins Manual grading: HTML “web printouts” of student submissions can be directly marked up by course staff to provide feedback

25 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Grading plug-ins are the key to process flexibility and extensibility in Web-CAT Processing for an assignment consists of a “tool chain” or pipeline of one or more grading plug-ins The instructor has complete control over which plug-ins appear in the pipeline, in what order, and with what parameters A simple and flexible, yet powerful way for plug-ins to communicate with Web-CAT, with each other We have a number of existing plug-ins for Java, C++, Scheme, Prolog, Pascal, Standard ML, … Instructors can write and upload their own plug-ins Plug-ins can be written in any language executable on the server (we usually use Perl)

26 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT The most well-known plug-in is for grading Java assignments that include student tests ANT-based build of arbitrary Java projects PMD and Checkstyle static analysis ANT-based execution of student-written JUnit tests Carefully designed Java security policy Clover test coverage instrumentation ANT-based execution of optional instructor reference tests Unified HTML web printout Highly configurable (PMD rules, Checkstyle rules, supplemental jar files, supplemental data files, java security policy, point deductions, and lots more)

27 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Web-CAT supports a variety of languages, and its Java plug-in is aimed at software testing ANT-based build of arbitrary Java projects PMD and Checkstyle static analysis ANT-based execution of student-written JUnit tests Carefully designed Java security policy Clover test coverage instrumentation ANT-based execution of optional instructor reference tests Unified HTML web printout Highly configurable (PMD rules, Checkstyle rules, supplemental jar files, supplemental data files, java security policy, point deductions, and lots more)

28 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Web-CAT provides timely, constructive feedback on how to improve performance Indicates where code can be improved Indicates which parts were not tested well enough Provides as many “revise/ resubmit” cycles as possible

29 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT The most important step in writing testable assignments is … Learning to write tests yourself Writing an instructor’s solution with tests that thoroughly cover all the expected behavior Practice what you are teaching/preaching

30 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Students get frustrated without feedback, so reference tests must provide some If students only get a score, but no other feedback for how to improve, they get easily frustrated We augment our reference tests to provide “hints” for failed tests, cross-referenced to the program assignment Requirements in assignment spec mul : this command takes two arguments from the evaluation stack and multiplies them 11. Feedback to student on failed test Your testing does not fully cover (11) More detailed alternate feedback (11) mul command failed, expected 4 but received 8

31 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Students will try to get Web-CAT to do their work for them Students appreciate the feedback, but will avoid thinking at (nearly) all costs Too much feedback encourages students to use Web- CAT for testing instead of writing their own tests— they use it as a development tool instead of simply to check their work This limits the learning benefits, which come in large part from students writing their own tests Lesson: balance providing suggestive feedback without “giving away” the answers: lead the student to think about the problem

32 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT We have also tried to influence student work habits to improve their success Encourage early submission by providing extra incentives or using late penalties Score bonuses and/or penalties are easy Another useful approach: Generous limit on the total number of submissions (60) Hints disappear one day before the due date Project closes for one day to encourage students to step away and reflect on “the last bug” Project opens again for one day with hints re- enabled, but with a cap on how much the score can improve

33 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Lessons for writing program assignments intended for automatic grading Requires greater clarity and specificity Requires you to explicitly decide what you wish to test, and what you wish to leave open to student interpretation Requires you to unambiguously specify the behaviors you intend to test Requires preparing a reference solution before the project is due, more upfront work for professors or TAs Grading is much easier as many things are taken care by Web-CAT; course staff can focus on assessing design

34 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Areas to look out for in writing “testable” assignments How do you write tests for the following: Main programs Code that reads/write to/from stdin/stdout or files Code with graphical output Code with a graphical user interface

35 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Testing main programs The key: think in object-oriented terms There should be a principal class that does all the work, and a really short main program The problem is then simply how to test the principal class (i.e., test all of its methods) Make sure you specify your assignments so that such principal classes provide enough accessors to inspect or extract what you need to test

36 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Testing input and output behavior The key: specify assignments so that input and output use streams given as parameters, and are not hard-coded to specific sources destinations Then use string-based streams to write test cases; show students how In Java, we use BufferedReaders and PrintWriters for all I/O In C++, we use istreams and ostreams for all I/O

37 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Testing programs with graphical output The key: if graphics are only for output, you can ignore them in testing Ensure there are enough methods to extract the key data in test cases We use this approach for testing Karel the Robot programs, which use graphic animation so students can observe behavior

38 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Testing programs with graphical UIs This is a harder problem—maybe too distracting for many students, depending on their level The key question: what is the goal in writing the tests? Is it the GUI you want to test, some internal behavior, or both? Three basic approaches: Specify a well-defined boundary between the GUI and the core, and only test the core code Switch in an alternative implementation of the UI classes during testing Test by simulating GUI events

39 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Conclusion: including software testing helps promote learning and performance If you require students to write their own tests … Our experience indicates students are more likely to complete assignments on time, produce one third less bugs, and achieve higher grades on assignments It is definitely more work for the instructor But it definitely improves the quality of programming assignment writeups and student submissions

40 Stephen EdwardsVirginia Tech Automatically Grading Programming Assignments with Web-CAT Visit our SourceForge project! Info about using our automated grader, getting trial accounts, etc. Movies of making submissions, setting up assignments, and more Custom Eclipse plug-ins for C++- style TDD Links to our own Eclipse feature site


Download ppt "Automatically Grading Programming Assignments with Web-CAT Stephen H. Edwards Virginia Tech Dept. of Computer Science"

Similar presentations


Ads by Google