
1 By Chris Zachor

2
• Introduction
• Background
• Changes
• Methodology
• Data Collection
• Network Topologies
• Measures
• Tools
• Conclusion
• Questions

3
• Use network analysis to better understand the SourceForge and GitHub developer communities
• Identify key differences (if any) between the two communities
• Examine the diversity of collaborations within these two communities

4
• GitHub has been added to the study
• It contains some of the same attributes as SourceForge, which allows for a comparison
• Other communities were considered, but they either were not large enough or did not provide enough public data

5
• Crawl the website using a simple Perl script and regular expressions
• Collect a project list from SourceForge
• www.sourceforge.net/projects/projectTitle
• No specified request limit
• Check for duplicates
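
The crawl-and-deduplicate step above could look roughly like the sketch below. It is written in Python rather than the Perl script the slide mentions, and it is not that script. The URL pattern is taken from the slide; the function name, the character class for project names, and the choice to lowercase names before comparing them are assumptions made for illustration.

```python
import re

# Assumed URL shape, from the slide: sourceforge.net/projects/<projectTitle>.
# The allowed characters in a project name are a guess.
PROJECT_LINK = re.compile(r"sourceforge\.net/projects/([A-Za-z0-9_.-]+)")


def extract_project_names(html, seen):
    """Return project names found in `html` that are not yet in `seen`.

    `seen` is a set shared across pages; it is updated in place so that
    a project listed on several pages is only collected once.
    """
    new_names = []
    for name in PROJECT_LINK.findall(html):
        key = name.lower()  # assumption: project names are case-insensitive
        if key not in seen:
            seen.add(key)
            new_names.append(key)
    return new_names
```

Calling the function once per downloaded page with the same `seen` set gives the duplicate check for free.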

7
• The GitHub API provides our data
• Limited to 60 API calls per minute
• Use multiple computers to collect all 1.5 million projects
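
A minimal sketch of how a collector might stay under the 60 calls per minute limit, and why several machines help. The class and function names are hypothetical. The time estimate assumes one API call per project and that the limit applies to each machine separately; neither assumption is stated on the slide.

```python
import time


class RateLimiter:
    """Enforce a minimum gap of period / max_calls seconds between calls."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period / max_calls
        self.clock = clock      # injectable so the limiter can be tested
        self.sleep = sleep
        self.next_allowed = float("-inf")

    def wait(self):
        """Block until the next call is allowed, then reserve the slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = max(self.clock(), self.next_allowed)
        self.next_allowed = now + self.min_interval


def collection_hours(total_calls, calls_per_minute=60, machines=1):
    """Hours needed if every machine runs at the full rate limit."""
    return total_calls / (calls_per_minute * machines) / 60.0
```

Under these assumptions, 1.5 million calls on a single machine take about 417 hours (roughly 17 days), which is the motivation for spreading the work across computers. A collector would call `wait()` immediately before each API request.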

12
• Degree
• Clustering Coefficient
• Modularity
• Power Law
• Small World Phenomenon

13 Degree
• Average number of projects worked on by a developer
• Average number of collaborations
• Average number of developers on a project
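
The three averages above can be sketched from a project membership table. The input shape (project name mapped to a set of developers) and the function name are assumptions, and "collaborations" is read here as the number of distinct co-developers a developer shares at least one project with; the slide does not define it.

```python
from itertools import combinations


def degree_measures(project_members):
    """Compute the three averages from a non-empty {project: set(devs)} map."""
    projects_per_dev = {}
    collaborators = {}
    for members in project_members.values():
        for dev in members:
            projects_per_dev[dev] = projects_per_dev.get(dev, 0) + 1
            collaborators.setdefault(dev, set())
        # Every pair of developers on the same project collaborates.
        for a, b in combinations(members, 2):
            collaborators[a].add(b)
            collaborators[b].add(a)

    n_devs = len(projects_per_dev)
    n_projects = len(project_members)
    return {
        "avg_projects_per_developer":
            sum(projects_per_dev.values()) / n_devs,
        "avg_collaborators_per_developer":
            sum(len(c) for c in collaborators.values()) / n_devs,
        "avg_developers_per_project":
            sum(len(m) for m in project_members.values()) / n_projects,
    }
```

The `collaborators` map built along the way is also the adjacency structure of the developer collaboration network, so it can feed the other measures.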

14 Clustering Coefficient
• Examine how likely developers are to stick together in groups
• Examine both the average clustering coefficient for the entire network and the local clustering coefficient for nodes of interest
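
A sketch of both clustering measures on an undirected network stored as an adjacency map. The function names are illustrative. Nodes with fewer than two neighbours are counted as 0 in the average, which is one common convention; Pajek's own output may follow a different one.

```python
def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbours that are linked to each other."""
    neighbours = list(adj[node])
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention: no pairs of neighbours means coefficient 0
    links = 0
    for i in range(k):
        for j in range(i + 1, k):
            if neighbours[j] in adj[neighbours[i]]:
                links += 1
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local coefficients over every node in a non-empty network."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)
```

For a triangle a, b, c with one extra developer d attached to c, the local values are 1 for a and b, 1/3 for c and 0 for d, so the average is about 0.583.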

15 Modularity
• Provides a measure of how diverse developer collaborations are
• Range: -1 < Q < 1
• Values closer to 1 show less diversity in collaboration choices
• Values closer to -1 show more diversity in collaboration choices
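
A sketch of how Q can be computed, assuming the slide's Q is the Newman-Girvan modularity: for each community, the fraction of edges that fall inside it minus the fraction expected if edges were placed at random with the same degrees. The partition into communities is taken as an input here; the slide does not say how it is obtained, and the function name is hypothetical.

```python
def modularity(edges, community):
    """Newman-Girvan Q for an undirected graph.

    edges: non-empty list of (u, v) pairs, no self-loops or duplicates.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    internal = {}     # community -> number of edges with both ends inside
    degree_sum = {}   # community -> sum of degrees of its nodes
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())
```

Two triangles joined by a single edge, with each triangle as its own community, give Q = 5/14 (about 0.357). Putting all six nodes in one community gives Q = 0, since nothing distinguishes the grouping from chance.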

16 Power Law
• Previous studies have found that the SourceForge community follows a power law
• No such study has been done on the GitHub community
• Few developers should be a part of many projects, while many developers should be involved with only one project
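
One rough way to check for power-law behaviour is to fit a straight line to the degree distribution on log-log axes and read the exponent from the slope. The sketch below does that with ordinary least squares; the function names are hypothetical. This is a quick diagnostic under the assumption that P(k) is proportional to k^(-gamma), not a rigorous statistical test, and the slide does not say which method the study uses.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Map each degree k to the fraction of nodes that have it."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in counts.items()}


def fit_power_law_exponent(distribution):
    """Least-squares slope of log P(k) against log k, returned as gamma.

    Needs at least two distinct positive degrees in `distribution`.
    """
    points = [(math.log(k), math.log(p))
              for k, p in distribution.items() if k > 0 and p > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx
```

Here `degrees` would be, for example, the number of projects per developer. A sample in which 64 developers have 1 project, 16 have 2, 4 have 4 and 1 has 8 lies exactly on a line with gamma = 2.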

17 Small World Phenomenon
• Previous studies have shown that the SourceForge community exhibits small world properties
• Once again, no study has been done on the GitHub community
• Using Pajek, I will create a random network with the same number of nodes and edges
• Then compare the clustering coefficient and the average shortest path length
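
The comparison described above is done in Pajek in the study. The sketch below only illustrates the idea in Python: build a random network with the same number of nodes and edges, then compare average clustering and average shortest path length. A small-world network typically shows much higher clustering than the random one with a similar path length. All function names are hypothetical. Path lengths are averaged over reachable pairs only, and the all-pairs search shown here would need sampling on a network with millions of nodes.

```python
import random
from collections import deque


def random_graph(nodes, n_edges, seed=0):
    """Place n_edges distinct undirected edges uniformly at random."""
    nodes = list(nodes)
    if n_edges > len(nodes) * (len(nodes) - 1) // 2:
        raise ValueError("too many edges for this many nodes")
    rng = random.Random(seed)
    adj = {n: set() for n in nodes}
    placed = 0
    while placed < n_edges:
        u, v = rng.sample(nodes, 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            placed += 1
    return adj


def average_clustering(adj):
    """Mean local clustering; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Each link between two neighbours is seen from both ends.
        links = sum(len(adj[n] & nbrs) for n in nbrs) / 2
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path(adj):
    """Mean breadth-first distance over all reachable ordered pairs."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


def small_world_summary(adj, seed=0):
    """(clustering, path length) for the network and a same-size random one."""
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    rand = random_graph(adj.keys(), n_edges, seed)
    return {
        "observed": (average_clustering(adj), average_shortest_path(adj)),
        "random": (average_clustering(rand), average_shortest_path(rand)),
    }
```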

18
• Perl
• Pajek
• cURL
• wget
• GUESS

19
• Through the use of network analysis, we hope to gain a better understanding of the developers in the SourceForge and GitHub communities.

20 Suggestions? Comments?

