Enhancing Configuration Management Systems with Information of Parallel Activities Topic Proposal Anita Sarma October 2005
2 A Typical Development Scenario CM repository Pete’s workspace A Ellen’s workspace DCE
3 A Typical Development Scenario CM repository Pete’s workspace A Ellen’s workspace DCE Pete and Ellen modify entirely different artifacts with no dependencies No conflicts
4 A Typical Development Scenario CM repository Pete’s workspace A Ellen’s workspace DCE Ellen starts to modify artifact “C” No conflicts C
5 A Typical Development Scenario CM repository Pete’s workspace A Ellen’s workspace DCE No conflicts: Pete and Ellen modify entirely different files with no dependencies. Direct conflicts: Pete and Ellen concurrently modify the same artifact “C” (diagram annotations: lines changed; lines 5-10 changed).
6 A Typical Development Scenario CM repository Pete’s workspace A Ellen’s workspace DCE Pete and Ellen modify entirely different files with no dependencies No conflicts C Direct conflicts Pete modifies artifact “B” on which E depends B
7 A Typical Development Scenario CM repository; Pete’s workspace (A, B, C); Ellen’s workspace (C, D, E). No conflicts: Pete and Ellen modify entirely different files with no dependencies. Direct conflicts: Pete and Ellen modify the same file (C). Indirect conflicts: Pete modifies “B” that affects “E”; Ellen modifies “E” in parallel (diagram annotations: signature of interface I1 changed; changes to method body that calls interface I1).
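The three cases in the scenario can be told apart mechanically once the set of modified artifacts per workspace and the dependencies between artifacts are known. The following is a minimal sketch under that assumption; the function name `classify_conflicts` and the `depends_on` mapping are hypothetical and are not taken from the proposal.

```python
from typing import Dict, List, Set, Tuple


def classify_conflicts(
    mine: Set[str],
    theirs: Set[str],
    depends_on: Dict[str, Set[str]],
) -> Dict[str, list]:
    """Classify parallel changes as direct or indirect conflicts.

    mine / theirs: artifacts modified in each workspace.
    depends_on[x]: artifacts that x depends on (assumed to be known).
    """
    # Direct conflict: the same artifact is modified in both workspaces.
    direct: List[str] = sorted(mine & theirs)

    # Indirect conflict: two different modified artifacts, one of which
    # depends on the other.
    indirect: List[Tuple[str, str]] = []
    for a in sorted(mine):
        for b in sorted(theirs):
            if a == b:
                continue
            if b in depends_on.get(a, set()) or a in depends_on.get(b, set()):
                indirect.append((a, b))
    return {"direct": direct, "indirect": indirect}


# Scenario from the slides: E depends on B, and both modify C.
pete = {"A", "B", "C"}
ellen = {"C", "D", "E"}
deps = {"E": {"B"}}
result = classify_conflicts(pete, ellen, deps)
```

Artifacts that appear in neither list (A and D here) correspond to the "no conflicts" case.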
8 Goal Help mitigate the impact of direct and indirect conflicts
9 Traditional CM Approaches
Pessimistic (RCS, DSEE). Coordination mechanism: locking before changes are made. Direct conflicts: avoided, at the expense of project delays. Indirect conflicts: not addressed.
Optimistic (CVS, ClearCase). Coordination mechanism: automated merging after changes have been made. Direct conflicts: resolved, except for overlapping changes. Indirect conflicts: not addressed.
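The difference between the two coordination mechanisms can be illustrated with a deliberately simplified sketch. `LockingRepository` and `can_auto_merge` are hypothetical names; real CM systems work on revisions and diff hunks rather than on bare sets of line numbers, so this only shows the principle.

```python
class LockingRepository:
    """Pessimistic coordination: an artifact must be locked before editing."""

    def __init__(self):
        self._locks = {}  # artifact name -> developer holding the lock

    def checkout(self, artifact, developer):
        holder = self._locks.get(artifact)
        if holder is not None and holder != developer:
            return False  # someone else is editing; this developer must wait
        self._locks[artifact] = developer
        return True

    def checkin(self, artifact, developer):
        # Only the lock holder can release the lock.
        if self._locks.get(artifact) == developer:
            del self._locks[artifact]


def can_auto_merge(changed_a, changed_b):
    """Optimistic coordination: parallel changes merge automatically
    only when the changed line numbers do not overlap."""
    return not (set(changed_a) & set(changed_b))


repo = LockingRepository()
pete_got_lock = repo.checkout("C", "Pete")     # Pete locks C
ellen_got_lock = repo.checkout("C", "Ellen")   # Ellen is blocked
```

Neither mechanism looks at dependencies between artifacts, which is why indirect conflicts are not addressed in either row.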
10 Field Studies Perry, et al. – 1994 –Number of conflicts proportional to amount of parallel development Grinter – 1995 –Developers use information from CM systems to pace their development De Souza, et al. – 2003 –Developers send messages detailing changes and their expected effects before check-in
11 Inferences from Field Studies Conflicts regularly occur and affect productivity – (Perry, et al.,1994) Developer responses: –Actively obtain information from the CM systems – (Grinter,1995) –Pace their development to avoid conflict resolution – (Grinter,1995) –Send extra information to enable detection of effects of changes on other artifacts – (De Souza,2003) –Place their work in the context of others’ changes – (De Souza,2003)
12 Developer Responses CM repository Pete’s workspace CBA Ellen’s workspace CED Informal coordination conventions
13 Hypothesis Providing information of parallel development activities and their effects enables developers to: –place their work in the context of others’ –self-coordinate their actions –reduce the magnitude and occurrence of direct and indirect conflicts
14 Current Developer Response CM repository Pete’s workspace CBA Ellen’s workspace CED Informal coordination conventions
15 More efficient and effective coordination Solution: Support Informal Coordination CM repository Pete’s workspace CBA Ellen’s workspace CED Enhanced workspace
16 Context
Columns: Current infrastructure; Issues (late conflict detection); Proposed solution (early conflict detection)
Private workspaces. Issue: isolated workspaces. Proposed: insulated workspaces.
Well structured coordination protocols. Issue: “pull-based”, information only at specific synchronization points. Proposed: “push-based”, relevant real-time information provided continuously.
Ensure syntactic correctness (merge). Issue: overlapping direct conflicts not resolved. Proposed: identify and characterize direct conflicts through severity analysis.
Ensure semantic correctness (builds). Issue: indirect conflicts undetected until build, test, or even deployment stage. Proposed: identify and characterize indirect conflicts through impact analysis.
17 Overarching Plan of Attack Research question: Does information of parallel activities and their effects provide an effective context in which to place one’s own work? Approach: Provide real-time information of parallel development activities continuously Validation: –Implicit validation from field studies –Evaluation results: Usability case studies – observations on how users monitor information provided and interact with each other Conflict detection case studies – analysis of the number and magnitude of conflicts detected in a real-life project
18 Research Questions 1. How can a tool provide this information in a usable, scalable, and effective manner? 2. To what extent does this information affect self-coordination? 3. To what degree can self-coordination reduce the occurrence of conflicts? 4. To what degree can self-coordination reduce the magnitude of conflicts?
19 Approach Provide information of parallel activities continuously in real time Identify direct and indirect conflicts Provide metrics to denote the size and effect of conflicts Allow detection of conflicts earlier while changes are still in progress Build on existing CM infrastructure Present information through visualizations that are unobtrusive and contextualized Filter events based on relevance Research prototype Palantír embodies this approach
20 Palantír Architecture Visualization Extractor Internal State Palantír Client Visualization Extractor Internal State Palantír Client Event Database Palantír Server BootstrapCapture Workspace Wrapper CM System CM Server Repository The Eclipse Platform Event Listeners CM Plug-in Workspace The Eclipse Platform Event Listeners CM Plug-in Workspace Pete’s WorkspaceEllen’s Workspace
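The event flow implied by the architecture (workspace events captured, stored at the server, and delivered to each client's internal state) might look roughly like the sketch below. All class, field, and event-kind names are hypothetical illustrations and do not reflect Palantír's actual interfaces; the relevance filter simply assumes that an event matters to a client only if it concerns an artifact in that client's workspace.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class WorkspaceEvent:
    developer: str
    artifact: str
    kind: str  # hypothetical event kinds, e.g. "changed" or "checked-in"


@dataclass
class Client:
    owner: str
    workspace: Set[str]
    # Internal state: remote events grouped by artifact name.
    internal_state: Dict[str, List[WorkspaceEvent]] = field(default_factory=dict)

    def receive(self, event: WorkspaceEvent) -> None:
        # Relevance filter: ignore own events and artifacts not held locally.
        if event.developer == self.owner:
            return
        if event.artifact not in self.workspace:
            return
        self.internal_state.setdefault(event.artifact, []).append(event)


@dataclass
class Server:
    event_log: List[WorkspaceEvent] = field(default_factory=list)  # stands in for the event database
    clients: List[Client] = field(default_factory=list)

    def publish(self, event: WorkspaceEvent) -> None:
        self.event_log.append(event)   # store every event
        for client in self.clients:    # push to every connected client
            client.receive(event)


pete = Client("Pete", {"A", "B", "C"})
ellen = Client("Ellen", {"C", "D", "E"})
server = Server(clients=[pete, ellen])
server.publish(WorkspaceEvent("Ellen", "C", "changed"))
server.publish(WorkspaceEvent("Ellen", "D", "changed"))
```

After these two events, Pete's client holds one event for C, Ellen's client holds nothing (the events are her own), and the server log keeps both.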
21 Palantír Client – Visualizations
22 Severity Analysis
Severity Analysis – The amount (size) of change between two versions of an artifact
–Artifact severity: the percentage of lines of code that have changed. Example: 25 lines of 100 = 25%
–Directory severity: (∑ actual artifact severities) / (∑ possible artifact severities)
Example:
dirA – 12.5%
  dirB – 25%
    foo.c – 50 lines of 100 changed – 50%
    bar.c – 0 lines of 100 changed – 0%
  dirC – 0%
    head.c – 0 lines of 100 changed – 0%
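The two severity metrics can be reproduced with a short sketch. It assumes that dirA contains dirB and dirC (a nesting inferred from the example's numbers) and that a subdirectory counts as one child whose maximum possible severity is 100%; the function names and the dictionary layout are hypothetical.

```python
def artifact_severity(changed_lines, total_lines):
    """Percentage of lines of code that have changed."""
    if total_lines == 0:
        return 0.0
    return 100.0 * changed_lines / total_lines


def directory_severity(directory):
    """Sum of the children's actual severities divided by the sum of
    their possible severities (100% per child), as a percentage.

    directory maps a name either to a nested dict (a subdirectory)
    or to a (changed_lines, total_lines) tuple (an artifact).
    """
    severities = []
    for child in directory.values():
        if isinstance(child, dict):
            severities.append(directory_severity(child))
        else:
            changed, total = child
            severities.append(artifact_severity(changed, total))
    if not severities:
        return 0.0
    possible = 100.0 * len(severities)
    return 100.0 * sum(severities) / possible


dir_a = {
    "dirB": {"foo.c": (50, 100), "bar.c": (0, 100)},
    "dirC": {"head.c": (0, 100)},
}
dir_b_severity = directory_severity(dir_a["dirB"])  # 25.0
dir_a_severity = directory_severity(dir_a)          # 12.5
```

Under these assumptions the sketch yields 25% for dirB and 12.5% for dirA, matching the figures in the example.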
23 Impact Analysis – Remaining Research Impact of changes – “The effect of changes on my current work(space)” Requirements –Identification of levels of analysis Local workspace level Remote workspace level Repository level –Identification and presentation of effects of indirect conflicts Outgoing conflicts – artifacts that affect others Incoming conflicts – artifacts that are affected –Identification and implementation of metrics to denote the size of “effects” of a change
24 Impact Analysis – Local Workspace CM repository Pete’s workspace CB A Ellen’s workspace CED Local workspace level Effect of local changes on local workspace (A → B)
25 Impact Analysis – Remote Workspace CM repository Pete’s workspace CB A Ellen’s workspace C E D Local workspace level Remote workspace level Effect of changes in remote workspace on local workspace (E → A)
26 Impact Analysis – Repository CM repository Pete’s workspace CBA Ellen’s workspace C E D Local workspace level Remote workspace level Repository level Effect of changes in repository on local workspace (E → A) E A
27 Impact Analysis – Outgoing/Incoming Conflicts CM repository Pete’s workspace CB A Ellen’s workspace C E D Outgoing conflicts Incoming conflicts Outgoing conflicts – artifacts that affect others (C) Incoming conflicts – artifacts that are affected (A,B)
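Given a dependency mapping, outgoing and incoming conflicts could be computed as sketched below. The dependencies used here (E depends on C; A and B depend on D) are invented for illustration, since the diagram's arrows are not recoverable from the slide text, and the function names are hypothetical.

```python
def outgoing_conflicts(local_changes, remote_artifacts, depends_on):
    """Locally changed artifacts that some other remote artifact depends on."""
    return sorted(
        a for a in local_changes
        if any(a in depends_on.get(r, set()) for r in remote_artifacts if r != a)
    )


def incoming_conflicts(local_artifacts, remote_changes, depends_on):
    """Local artifacts that depend on an artifact changed remotely."""
    remote = set(remote_changes)
    return sorted(a for a in local_artifacts if depends_on.get(a, set()) & remote)


# Hypothetical dependencies, seen from Pete's workspace.
depends_on = {"E": {"C"}, "A": {"D"}, "B": {"D"}}
outgoing = outgoing_conflicts({"C"}, {"C", "D", "E"}, depends_on)
incoming = incoming_conflicts({"A", "B", "C"}, {"D"}, depends_on)
```

With these invented dependencies, Pete's change to C is an outgoing conflict (it affects Ellen's E), while A and B are incoming conflicts (they are affected by Ellen's change to D).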
28 Options for Impact Analysis Program analysis –Dependency analysis –Call graphs –Abstract syntax trees Eclipse IDE capabilities –Refactoring –Outline of a program XML tags – Java-XML, Castor, Apache (XML Beans) …
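As one illustration of the program-analysis option, a call graph can be derived from an abstract syntax tree and then walked backwards to find what a change may affect. The sketch uses Python's standard `ast` module purely as a stand-in; the proposal targets Eclipse, so an actual implementation would use different tooling. The sample source and function names are hypothetical, and only calls to plain names (not method calls) are tracked.

```python
import ast
from typing import Dict, Set


def build_call_graph(source: str) -> Dict[str, Set[str]]:
    """Map each function name to the plain names it calls."""
    graph: Dict[str, Set[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for inner in ast.walk(node):
                # Track calls such as f(x); skip attribute calls such as obj.f(x).
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
            graph[node.name] = callees
    return graph


def affected_by(changed: str, graph: Dict[str, Set[str]]) -> Set[str]:
    """All functions that call `changed`, directly or transitively."""
    affected: Set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for caller, callees in graph.items():
            if current in callees and caller not in affected:
                affected.add(caller)
                frontier.append(caller)
    return affected


SAMPLE = '''
def parse(text):
    return text.split()

def load(path):
    return parse(open(path).read())

def main():
    return load("input.txt")
'''

graph = build_call_graph(SAMPLE)
impacted = affected_by("parse", graph)
```

Here a change to `parse` reaches `load` directly and `main` transitively, which is the kind of effect an impact metric would need to size.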
29 Evaluation – Criteria for Success 1. Usability study RQ#1: tool provides information in a usable, scalable, and effective manner RQ#2: information of parallel activities increases self-coordination 2. Conflict detection RQ#3: reduction in occurrence of direct and indirect conflicts RQ#4: reduction in magnitude of direct and indirect conflicts
30 Evaluation 1 – Usability Study Five or more confederate case studies –At least one of them without the use of Palantír Subjects: –Undergraduates, graduates, volunteer programmers –Role: Complete given programming assignment in teams within the time limit Confederate: –Fellow graduate students, research group members –Role: Act as a team member, but run predefined scripts that lead to conflicts in the assignment The subject is not aware of the identity and intentions of the confederate
31 Usability Studies: Goals How subjects monitor and interact with Palantír Whether subjects monitor conflict warnings How subjects interact with their partners on detecting potential conflicts Identify differences (if any) in coordination efforts: –Only direct conflicts –Direct and indirect conflicts Conduct structured interviews
32 Evaluation 2 – Conflict Detection Create light-weight Palantír client –Use in research group, open-source and commercial companies –Capture relevant Palantír events onsite and investigate them later offsite –Compare direct conflict detection with SCM merge conflicts –Compare indirect conflict detection with build logs, bug trackers, build managers…
33 Timeline
Fall 05: Preliminary research on Impact Analysis (Oct – Dec 05); Investigate and design Impact Analysis metrics (Oct – Dec 05); Restructure Palantír events (Nov – Dec 05)
Winter 06: Build Impact analysis tool / algorithm (Jan – Mar 06); Create Palantír client wrapper (Feb – Mar 06)
Spring 06: Test Impact analysis tool (Apr 06); Reconstruct Palantír to be plug-in oriented (May 06); Integrate Impact analysis component with Palantír (May 06); Evaluate Palantír with Severity and Impact (Jun 06); Release Palantír client wrapper (Jun 06)
Summer 06: Collect data from Palantír client wrapper (Jul – Sept 06); Create scenario for evaluation (Jul – Sept 06)
Fall 06: Schedule user case studies (Jul – Sept 06); Analyze data from Client wrapper (Oct – Dec 06)
Winter 07: Conduct user case studies (Jan – Apr 07); Dissertation writing (Jan – Mar 07)
Spring 07: Evaluate scenario case studies (Apr – May 07); Complete dissertation (Apr – Jun 07)
34 Contributions A coordination approach built on traditional CM coordination protocols to help mitigate the effects of conflicts –Enhances workspaces with information of parallel activities –Enables placing one’s work in the context of others –Sparks self-coordination to avoid conflicts Approach that leads to reduction in occurrence and magnitude of conflicts Palantír embodies this approach –Detection of potential conflicts –Metrics to represent the magnitude of conflicts –Set of visualizations with varying degree of obtrusiveness –Easy adoption into current practices
35 Questions???