What’s on Wikipedia, and What’s Not…? Completeness of Information on the Online Collaborative Encyclopedia. Cindy Royal, Ph.D., Assistant Professor.


1 What’s on Wikipedia, and What’s Not…? Completeness of Information on the Online Collaborative Encyclopedia
Cindy Royal, Ph.D., Assistant Professor, Texas State University, School of Journalism and Mass Communication
Deepina Kapila, Graduate Student, Texas State University, School of Journalism and Mass Communication

2 Introduction - Wikipedia
-Wikipedia (www.wikipedia.com), deemed “the free encyclopedia,” was launched on the web in 2001.
-Since then, it has become the Web’s 3rd most popular news and information source.
-It uses the Wiki software format, which allows a community of users to develop and monitor content.
-Wikipedia operates under the assumption that the public will act as a policing force, keeping content reliable and up to date.

3 Introduction - Research
-Denning et al. (2005) listed the risks inherent in Wikipedia’s model: accuracy, motives, uncertain expertise, volatility, coverage, sources.
-Bopp and Smith (2001) state that coverage in an encyclopedia should be “even across all subjects.”
-Shoemaker and Reese (1995) identified the individual as a news influencer. Web users and content creators tend to be young.
-Tankard/Royal (2005): inherent biases in Web content, based on systematic searches.

4 Research Questions
This project measures the content of Wikipedia against various indexes or standards of completeness to identify and uncover potential inherent biases. We are asking:
1. Are there some systematic gaps or biases in the overall presentation of information made available on Wikipedia?
2. Is recency (or currency) a predictor of amount of information on Wikipedia?
3. Is importance of information a predictor of amount of information on Wikipedia?
4. Is population a predictor of amount of information about particular countries on Wikipedia?
5. Is economic power a predictor of amount of information about individual corporations on Wikipedia?

5 Method
-Using predictors of recency, importance, country population, and economic power, several systematic searches were conducted on Wikipedia.
-Each article for each topic was visited, the relevant content was highlighted, and the words in the selection were counted.
-Word counts were captured in a spreadsheet, and items were plotted on charts in two ways: in ascending order of word count, and in order of the predictor variable.
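The results slides report a Spearman correlation between each predictor and article word count. The slides do not say which tool computed it, so the following is only a minimal Python sketch of the standard rank-based definition (Pearson correlation of average ranks). The function names and the sample data are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman(predictor, word_counts):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predictor), average_ranks(word_counts))


if __name__ == "__main__":
    # Hypothetical (year, word count) pairs, for illustration only.
    years = [1900, 1925, 1950, 1975, 2000, 2005]
    counts = [410, 380, 520, 700, 1900, 4200]
    print(round(spearman(years, counts), 2))
```

Because the measure depends only on ranks, a few very long articles affect it less than they would affect a Pearson correlation on raw word counts, which suits the skewed, backward L-shaped distributions described in the results.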

6 Topics Covered
-Years (1900-2010)
-Academy Award Winning Films
-Time Magazine’s Person of the Year
-#1 Song on Billboard Top 100 (1940-2006)
-Encyclopedia Terms
-Countries in the United Nations
-Fortune 1000 companies

7 Results - Years
[Charts: Ascending Order; Chronological Order]
-Backward L-shaped curve
-Clear progression of article length with year; dramatic increase in years after 2001
-Years in the future displayed understandably shorter word counts
-Spearman correlation between variables: 0.79

8 Results - Films
[Charts: Ascending Order; Chronological Order]
-Backward L-shaped curve is apparent.
-With few exceptions (e.g., Gone with the Wind, 1939, and Casablanca, 1943), the results show a progression favoring more current films. Recency is important, but certain films transcend time and are deemed important for other reasons.
-Average word count for films since 2001 was 80% higher than word count before 2001.
-Spearman correlation between variables: 0.49; increased to 0.62 simply by removing 2 outliers

9 Results - Person of the Year
[Charts: Ascending Order; Chronological Order]
-Softer backward L-shaped curve
-Even distribution shows that the bias is unrelated to recency; it is measured by another variable, importance
-Spearman correlation between variables: 0; there was no relationship with time.

10 Results - Billboard Top 100
[Charts: Ascending Order; Chronological Order]
-Backward L-shaped curve
-Although average word count was 32% higher for artists since 1990, the distribution shows a trend similar to movies in that some artists transcend time.
-Spearman correlation between variables: 0.40 (after eliminating 2 outliers)

11 Encyclopedia Terms
[Chart: Ascending Order]
-Comparison between Encyclopedia Britannica and Wikipedia articles
-Backward L-shaped distribution apparent
-Spearman correlation used to compare inches of content in Encyclopedia Britannica with word count in Wikipedia: 0.26
-Of 100 terms, 14 were not represented in Wikipedia

12 Results - UN Countries
[Charts: Ordered by Population; Ascending Order]
-Backward L-shaped curve: although fairly evenly distributed, a SHARP increase appears for the top 22 countries.
-Gradual upward curve in 2nd chart shows that as population increases, so does word count
-Average word count for top 10% of countries was 63% higher than the rest on the list
-Spearman correlation between variables: 0.55

13 Results - Fortune 1000
[Charts: Ascending Order; Ordered by Revenue]
-Backward L-shaped curve
-SHARP increase for top 10% of companies by revenue
-Top 10% of companies by revenue accounted for 30% of total word count on companies
-Spearman correlation between variables: 0.49
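Several results are stated as a share of total word count held by the top 10% of items, or as one group's average being some percentage higher than another's. The slides do not give the exact calculation, so the sketch below shows one plausible reading. The function names, the rounding rule for the size of the top group, and the sample figures are all assumptions made for illustration.

```python
def top_share(rows, fraction=0.10):
    """Share of total word count held by the top `fraction` of rows.

    rows: list of (predictor_value, word_count) pairs, e.g. (revenue, words).
    The top group is chosen by predictor value, not by word count.
    """
    ranked = sorted(rows, key=lambda row: row[0], reverse=True)
    # Assumption: round to the nearest whole item, with at least one item.
    n_top = max(1, round(len(ranked) * fraction))
    top_words = sum(words for _, words in ranked[:n_top])
    total_words = sum(words for _, words in ranked)
    return top_words / total_words


def percent_higher(group_a, group_b):
    """How much higher the mean of group_a is than the mean of group_b, in percent."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return (mean_a - mean_b) / mean_b * 100


if __name__ == "__main__":
    # Hypothetical (revenue, word count) pairs, for illustration only.
    companies = [(100, 300)] + [(rev, 100) for rev in range(1, 10)]
    print(top_share(companies))           # share held by the single top company
    print(percent_higher([180], [100]))   # mean of first group vs. second group
```

Note that "X% higher than the rest" compares the top group with the remaining items, so the caller would pass the two groups separately rather than the top group and the full list.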

14 Conclusion
-Information on Wikipedia is volatile, dynamic, and constantly changing over time
-Wikipedia’s purpose is to serve as a general reference source, but the content is weighted by its contributors’ demographics
-In each search performed for the dimensions, strong biases were evident and strong correlations were observed:
  -Currency/Recency: the more current topics were covered the most
  -Random Selection: Encyclopedia terms showed clear bias towards more common or popular terms
  -Relevancy: Wikipedia’s word count correlates with inches in a traditional encyclopedia, showing a strong agenda by each publication
  -Population: the larger a country’s population, the higher the word count
  -Revenue: the larger the revenue, the higher the word count

