Harvesting & Analyzing Interaction Data in R: The Case of MyLyn
Sean P. Goggins, PhD, Drexel University, email@example.com
MyLyn Research Collaborators: Peppo Valetto, PhD (PI) & Kelly Blincoe
I Study Small Groups
MyLyn – Software Engineering
CANS/Sakai – Online Learning
Virtual Math Teams, http://www.mathforum.org
Health Care Communities (Under NDA)
Small Group Interactions: I use electronic trace data, interviews, field notes, electronic content & surveys for raw data.
Coolest Open* Data to Me
Groups: Emerging & Evolving
Group Formation & Development
The long tail of social computing, which I describe as everything *except* Wikipedia & Facebook
Groups constructing knowledge, creating information and forming identity
*Available, but not always easy to get in an analyzable form
Points
Harvesting Small, Open Data [MyLyn]
Analyzing Temporal Changes in the MyLyn Network: Work & Talk
Libraries Used & Source Code: StatNet, iGraph, TNET
R source code and data will be available for download at http://www.groupinformatics.org.
If you use this data or these scripts, please cite http://www.groupinformatics.org and:
Goggins, S. P., Laffey, J., Amelung, C., and Gallagher, M. 2010. Social Intelligence In Completely Online Groups. IEEE International Conference on Social Computing. 500-507. DOI=10.1109/SocialCom.2010.79.
Blincoe, K., Valetto, G., and Goggins, S. 2011. Leveraging Task Contexts for Managing Developers’ Coordination. Under Review.
Data, Harvest, Analyze
Data for R: An Example From the MyLyn Project
More About MyLyn: http://tasktop.com/blog/ and http://www.eclipse.org/mylyn/
Diagram labels, Talk: Bug Database, HTML Parser, MySQL Database
Diagram labels, Work: MyLyn Context Uploads, .zip file
Coordination Requirements & Dependencies
MyLyn data has 2 advantages for analysis compared to source control systems analysis:
1. You see files *viewed* together.
2. Discourse on a bug is directly connected to the files read and edited, which gives a closer connection between the analysis of work & talk.
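A minimal base R sketch of the first advantage: developers who touch the same files are linked, weighted by how many files they share. The column names, developer labels (d1 to d3) and file names are hypothetical illustrations, not the MyLyn schema or data, and this is not necessarily how the project computed coordination requirements.

```r
# Hypothetical context records: one row per file a developer viewed or edited.
access <- data.frame(
  developer = c("d1", "d1", "d1", "d2", "d2", "d3"),
  file      = c("A.java", "A.java", "B.java", "A.java", "B.java", "C.java"),
  kind      = c("view", "edit", "view", "view", "edit", "view"),
  stringsAsFactors = FALSE
)

# Developer-by-file incidence matrix: 1 if the developer touched the file at all,
# so a view counts just as much as an edit.
incidence <- (unclass(table(access$developer, access$file)) > 0) * 1

# Developer-by-developer matrix: number of files each pair has in common.
shared <- tcrossprod(incidence)
diag(shared) <- 0  # drop self-ties

print(shared)
```

Because views are kept alongside edits, a pair of developers can be linked even when only one of them changed the file.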
Harvesting Data for R: An Example From the MyLyn Project
MyLyn Interaction Datamart
Developer Context: Files Accessed & Edited; Bugs/Tasks Worked on
Bug/Task Context: Initial Bug Description -> Task; Discussion Related to Bugs
Work – Bug/Task Action
Talk – Bug/Task Discussion
Integrated Repository: Interaction Warehouse (MyLyn, CANS, etc.)
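One way to picture how work and talk meet in such a datamart is a join on the bug/task. The sketch below uses base R's merge(); the two tables, their column names and their rows are hypothetical, and the real warehouse schema may differ.

```r
# Hypothetical "work" table: files touched while working on a bug/task.
work <- data.frame(
  bug_id    = c(1, 1, 2),
  developer = c("d1", "d2", "d1"),
  file      = c("A.java", "A.java", "B.java"),
  stringsAsFactors = FALSE
)

# Hypothetical "talk" table: comments posted on a bug/task.
talk <- data.frame(
  bug_id    = c(1, 1, 3),
  commenter = c("d2", "d3", "d1"),
  stringsAsFactors = FALSE
)

# Inner join on the bug: each row pairs a file touched on a bug
# with a person who discussed that same bug.
linked <- merge(work, talk, by = "bug_id")

print(linked)
```

Only bugs that have both work and talk records survive the inner join; passing all = TRUE to merge() would keep the unmatched bugs as well.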
Analyzing Open Data with R: An Example From the MyLyn Project
Analysis Tools
Eight Mylyn Releases (Temporal Analysis)
R Packages Used: TNET, iGraph, Statnet
Release 1 (2.0): iGraph & Statnet
Talk Clusters; In Degree & Out Degree
Red = Bug Commenter, Blue = Bug Opener
Figure panels: iGraph, StatNET
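In degree and out degree can be read straight off a directed edge list. The base R sketch below assumes that a tie runs from a bug commenter to the bug opener; that direction, the column names and the labels d1 to d3 are assumptions for illustration, not the project's data. On a graph object, igraph's degree() with mode = "in" or mode = "out" reports the same quantities.

```r
# Hypothetical directed talk ties: commenter -> bug opener (direction is an assumption).
edges <- data.frame(
  from = c("d2", "d3", "d2", "d1"),
  to   = c("d1", "d1", "d3", "d2"),
  stringsAsFactors = FALSE
)
people <- sort(unique(c(edges$from, edges$to)))

# Out degree: ties a person sends. In degree: ties a person receives.
out_degree <- vapply(people, function(p) sum(edges$from == p), integer(1))
in_degree  <- vapply(people, function(p) sum(edges$to == p), integer(1))

print(data.frame(person = people, in_degree, out_degree, row.names = NULL))
```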
Release One (2.0): Filtered Code & Discussion
304, 373, 399 & 143 form the strongest connections in both networks (Talk and Work).
457, 391 & 159 – Comment & Open
Red = Bug Commenter, Blue = Bug Opener
Figure annotation: Google Summer Coder
Release 8 (3.3): Filtered Code & Discussion
Red = Bug Commenter, Blue = Bug Opener
Nobody is “Just Blue”.
Notice 416 in the Talk graph & the second Coder graph.
Release 8 (3.3): iGraph & Statnet
Talk Clusters; In Degree & Out Degree
Red = Bug Commenter, Blue = Bug Opener
399, 118 & 159 are significant, but play with different clusters of other people.
Figure panels: iGraph, StatNET
Figure annotation: Blue Cluster
Releases One to Eight: High-Level Views Over Time
Discussion, Releases 1 – 8
Where there is no color, there are multiple, incomplete graphs.
Code, Releases 1 – 8
One possible explanation: a few central people who slowly but observably begin to engage other contributors in an open source software development project.
Structure evolves. Key groups evolve.
Figure panel: iGraph
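A release-by-release comparison can start from something as simple as counting participants and distinct ties per release. The base R sketch below uses a hypothetical edge list with a release column; the rows are invented for illustration and say nothing about the actual MyLyn networks.

```r
# Hypothetical ties tagged with the release in which they occurred.
edges <- data.frame(
  release = c(1, 1, 1, 8, 8, 8, 8, 8),
  from    = c("d1", "d1", "d1", "d1", "d2", "d3", "d4", "d2"),
  to      = c("d2", "d3", "d2", "d2", "d3", "d4", "d1", "d4"),
  stringsAsFactors = FALSE
)

# One summary row per release: distinct people and distinct directed ties.
per_release <- do.call(rbind, lapply(split(edges, edges$release), function(e) {
  data.frame(
    release = e$release[1],
    people  = length(unique(c(e$from, e$to))),
    ties    = nrow(unique(e[, c("from", "to")]))
  )
}))

print(per_release)
```

Growth in both columns from one release to the next would be consistent with a core group drawing in more contributors, although counts alone do not show who is doing the engaging.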
Next Step: The Story
But that’s the research part, not the cool “R Stuff” part.
The People: 399, 304, 159, 143, 373
Our next step is piecing together a narrative about the groups that emerged on this project, and describing each of the individuals. This is all open data. When we finish this part, we will publish one or more papers. For now, let’s look at the cool “R Stuff”.
Interaction Traces from Small Groups: The Case of MyLyn
Sean P. Goggins, PhD, Drexel University, firstname.lastname@example.org
Collaborators: Peppo Valetto, PhD & Kelly Blincoe
Questions? In the after session.