Collaborative Writing: Wiki and Wikipedia, by Keshava P Subramanya and Roopa Kannan

3 Today’s Talk  Quick introduction about the wiki and collaborative writing idea.  Wikipedia  Two views of how Wikipedia works  Criticisms  Details about the Community  Future

4 What is collaborative writing?  Projects where written works are created by multiple people together (collaboratively) rather than individually  Some projects are overseen by an editor or editorial team  Many grow without any top-down oversight.

5 Computer-based collaborative writing  Revision control software providing check-in/check-out (for example, Subversion, CVS)  Enterprise information portal, Content management system  SharePoint  Wikis

6 Some Collaborative Projects  Novel Twists: an online collaborative novel in which each of the 150 pages is written, one at a time, by a different person.  The Linux Documentation Project  OOoAuthors

7 What is a Wiki?  Essentially a dynamic, collectively authored set of web pages.  Invented in 1995 by Ward Cunningham to facilitate online collaboration about programming and design best practices.  Evolved by the early 2000s into a way to facilitate all kinds of online collaboration.

8 Wiki – Definition  A wiki (according to Ward Cunningham) is a type of website that allows users to add and edit content and is especially suited for constructive collaborative authoring.  In essence, a wiki is a simplification of the process of creating HTML pages combined with a system that records each individual change that occurs over time, so that at any time, a page can be reverted to any of its previous states. As defined in Wikipedia.
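
The definition above (editable pages plus a record of every change, so that any earlier state can be restored) can be sketched in a few lines of Python. This is only an illustration of the idea, not how MediaWiki or any other wiki engine stores revisions; the `WikiPage` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WikiPage:
    """Hypothetical page that keeps every saved revision (illustration only)."""
    title: str
    revisions: list[str] = field(default_factory=list)

    def edit(self, new_text: str) -> int:
        """Save a new revision and return its index in the history."""
        self.revisions.append(new_text)
        return len(self.revisions) - 1

    @property
    def current(self) -> str:
        """Text of the latest revision, or an empty string for a new page."""
        return self.revisions[-1] if self.revisions else ""

    def revert_to(self, index: int) -> int:
        """Restore an earlier state by saving it again as a new revision.

        The history is never rewritten, so the revert itself can be undone.
        """
        return self.edit(self.revisions[index])


page = WikiPage("Wiki")
good = page.edit("A wiki is a collaboratively edited website.")
page.edit("WIKIS ARE BAD!!!")   # an unwanted change
page.revert_to(good)            # back to the earlier state
print(page.current)
```

Saving the revert as a new revision, rather than deleting the unwanted one, is one possible design choice: it keeps the full history available for later review.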

9 How the Wiki Got Its Name  Wiki is the Hawaiian word meaning “quick”, “fast”, or “to hasten”.  Wiki-Wiki is the name of the bus line in the Honolulu International Airport.

10 How the Wiki Got Its Name

11 “Wiki-wiki to the beach.” - Elvis Presley (as Chad Gates) in the movie Blue Hawaii (1961). The line was said with a snap of the fingers.

12 Some more … Wiki (according to UIC Prof. Steve Jones) Web-based Interactive Kollaborative (collaborative) Iterative Wiki is sometimes interpreted as the backronym for “What I Know Is”, which describes the knowledge contribution, storage and exchange function.

13 More Uses for a Wiki  100 things to do before you die  The world’s largest “How-To” manual – wikiHow  Things to do in Seattle  World-wide travel guide –  Everything you want to know about VoIP  All about the flu – Flu Wiki

14 Free Hosting of Wikis

15 What is Wikipedia?  Wikipedia is a freely licensed encyclopaedia written by thousands of volunteers in many languages  The free license allows others to freely copy, redistribute, and modify the work, commercially or non-commercially  Founded January 15, 2001  Run by the Wikimedia Foundation.

16 What is the Wikimedia Foundation?  Non-profit foundation. Its 4th Quarter 2005 costs were $321,000 USD, with hardware making up almost 60% of the budget. Where does it get the money?  Aim: to distribute a free encyclopaedia to every single person on the planet in their own language  Wikipedia and its sister projects

17 Advantages of Freely Licensed Content  GNU Free Documentation Licence  Remains non-proprietary  Enhances the popularity of Wikipedia  Decreases individual sense of ownership  Increases a sense of shared ownership

18 Free Software  MediaWiki is licensed under the GPL  Uses all free software on the website  GNU/Linux  Apache  MySQL  PHP

19 How big is Wikipedia?  English Wikipedia is largest and has over 260 million words  English Wikipedia larger than Britannica and Microsoft Encarta combined  In 15 months the publicly distributed compressed database dumps may reach 1 terabyte total size

20 How big is Wikipedia Globally? Total more than 5 million articles!  English – 1,412,000 articles  German – 172,000 articles  Japanese – 87,000 articles  French – 66,000 articles  Swedish – 53,000 articles  Over 5 million across 250 languages  19 with >10, with >1000 (statistics could be dated)

21 How popular is Wikipedia?  According to, Wikipedia (ranked ~ 20th) is more popular than the websites of:  IBM  PayPal  Open Directory Project  GeoCities  ~400 million page views monthly

22 Wikipedia vs. Britannica  AP article on CNN website. This study was challenged by Encyclopædia Britannica, which described it as "fatally flawed."

23 Wikimedia Projects  Wikipedia  Wiktionary  Wikibooks  Wikiquote  Wikispecies  Wikimedia Commons  Wikinews

24 Wikinews  Community-edited news along the same principles as Wikipedia  Fairly new project  Aim of the project

25 Wikimedia’s Hardware  30+ servers  Squid caching servers in front to serve cached objects quickly  Apache/PHP web servers in the middle  Database backend (MySQL)
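
The layering above (a cache in front, web servers in the middle, a database behind) can be illustrated with a toy request path. This is a sketch of the general caching idea only; it does not model Squid, Apache, PHP, or MySQL, and every name in it is hypothetical.

```python
# Toy model of a cache sitting in front of a slower rendering backend.
# All names are hypothetical; the real server software behaves far more richly.

database = {"Wiki": "A wiki is a collaboratively edited website."}
cache: dict[str, str] = {}
backend_hits = 0


def render_from_backend(title: str) -> str:
    """Stand-in for a web server rendering a page from the database."""
    global backend_hits
    backend_hits += 1
    return f"<h1>{title}</h1><p>{database[title]}</p>"


def serve(title: str) -> str:
    """Answer from the cache when possible; otherwise render and remember."""
    if title not in cache:
        cache[title] = render_from_backend(title)
    return cache[title]


def save_edit(title: str, text: str) -> None:
    """Store new text and drop the cached copy so readers see the edit."""
    database[title] = text
    cache.pop(title, None)


serve("Wiki")
serve("Wiki")                       # second request is answered by the cache
save_edit("Wiki", "Updated text.")
serve("Wiki")                       # cached copy was dropped, so the backend runs again
print(backend_hits)
```

Because most requests are reads, answering repeat views from the cache keeps the slower rendering and database work for pages that have actually changed.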

26 MediaWiki  MediaWiki is one of many wiki engines  Collaborative software that allows users to add or edit content  Primarily developed for Wikipedia from 2002 onwards  Scalable and multilingual  Free license

27 MediaWiki features  Quality control features (versioning)  Editing features (simple markup)  Community features (talk pages, profiles, access levels)

28 Page History DEMO

29 Interlanguage linking DEMO

30 Criticism Workshop Hints:  Can Wikipedia Content Be Trusted?  Systematic bias  Reliability of Information  Technology requirement

31 Can Wikipedia Content Be Trusted?  Review processes  Partly post-moderation, partly reactive moderation  Linking to particular revisions  Development of a stable version  Free license allows you to modify it

32 Reliability of Information  Criticism  The community contribution approach allows for too much false information.  Without an expert background, a person cannot present an unbiased, factual position.  Rebuttal  The open source approach allows for new information to be added on a daily basis.  The articles that exist on Wikipedia are a group effort in which any wrong information can be edited.  Group editing also lets people combine information to give a broad background.

33 Reliability of Information  Criticism  The large quantity of information added daily prevents proper fact checking.  The daily edits allow too many mistakes to go unnoticed or be reintroduced.  Rebuttal  Wikipedia does maintain a staff whose sole purpose is to review and edit articles.  Each day articles are viewed by thousands of people, and any one of them can make changes to correct mistakes.  Printed encyclopedias cannot fix errors once released, while Wikipedia is always able to make corrections.

34 Systematic Bias  Criticism  Systematic bias exposes Wikipedia to unbalanced amounts of information.  People are more likely to write about topics that interest them than about more historically significant topics.  Rebuttal  Past requests for information have been met with quick action.  These responses have created huge increases in the amount of coverage of topics.  Wikipedia also includes an inquiry page. Any topic can be requested and the Wiki community is quick to respond.

35 Technology Requirements  Criticism  Wikipedia faces technology constraints as an online encyclopedia.  A reader must have Internet access at all times.  The possibility of technical failure on Wikipedia’s end also presents problems.  Rebuttal  The technology constraints constantly decrease as the world becomes more advanced.  The student population has almost 100% Internet access due to school resources and class requirements.

36 Latest Information  Wikipedia is built on the belief that collaboration among users will improve articles over time.  The software of Wikipedia allows for rapid updating of existing articles, as well as constant introduction of new topics.

37 Quick Vandalism Response  Most vandalism on Wikipedia is reverted within five minutes.  There is a record of every change made to each page, and Wikipedia volunteers watch the list of recent changes.  If a user repeatedly vandalizes Wikipedia pages, that user can be blocked and the pages can be locked down.

38 Neutral Point of View  Three sides to everything: your version, my version, and the truth  Editors are asked to maintain a neutral point of view when writing for Wikipedia.  When edit wars break out and a neutral point of view is not maintained, Wikipedia volunteers usually remove the information posted.

39 Two Views of Wikipedia Emergent Community of thoughtful users

40 Emergent  Thousands of individual users who don’t know each other each contribute a little bit  Out of this emerges a coherent body of work

41 A Community? A dedicated group of a few hundred volunteers who know each other and work to guarantee the quality and integrity of the content. London Berlin Genoa

42 Implications Emergent Model  Need reputation mechanisms like eBay, Slashdot  Users are tiny, have no power Community Model  Reputation is a natural outcome of human interactions  Users are powerful, must be respected

43 80/10 Rule  Counting only logged in users, and even excluding some prominent approved bot users  10 percent of all users make 80% of all edits  5 percent of all users make 66% of edits  Half of all edits are made by just 2 1/2 percent of all users
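
A concentration figure like those above can be computed from per-user edit counts. The sketch below shows one way to do it; the `edit_share` helper is hypothetical and the counts are made-up sample data, not Wikipedia statistics.

```python
def edit_share(edit_counts: list[int], top_fraction: float) -> float:
    """Fraction of all edits made by the most active `top_fraction` of users."""
    ranked = sorted(edit_counts, reverse=True)
    # Always include at least one user, even for very small fractions.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Made-up data: ten users, one very active, the rest occasional editors.
counts = [800, 50, 40, 30, 20, 20, 15, 10, 10, 5]
print(edit_share(counts, 0.10))  # share of edits from the top 10% of users
```

Applying the same function with different values of `top_fraction` (0.05, 0.025) would reproduce the other breakdowns on this slide, given real per-user counts.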

44 Edits by Anons  Controversial, intriguing  Yes, you can edit this page  Without logging in!  Anonymous users (identified only by IP address) can edit Wikipedia  But these edits make up only around 18% of all edits, with some evidence of a downward trend over time

45 Edits across namespaces  Articles 85%  Talk pages 8%  User pages 3%  User talk pages 4% These percentages were stable in 2003 and 2004

46 Wikipedia is a community… How does it work? Who are the users? How do they self-regulate?

47 Many types of users  As in any society, there are many types of people -- these types are reflected in editing patterns  Individual users may not fit cleanly into a single type, but thinking about editing patterns is a helpful way to understand the community

48 Broad Types  Worker Bees, POV pushers  Police, Judges  Controversy lovers - Moths  Pseudo-users - Sock puppets, Vandals  Extra-Wiki - Mailing list, IRC, Board activities, Developers

49 Bees  The most important users at Wikipedia  But may go unnoticed unless special attention is given  Generalists  Specialists  Proof-readers Question: What attracts the bees?

50 Sock Puppet  Not all sock puppets are bad  Privacy  The chance to start over  But misusing one is among the worst offenses

51 Moth  Drawn to flames  Not necessarily a bad thing - some people thrive on controversy

52 Vandal  Less of a problem for the community than most people assume  Vandalism is easy to revert, and blocking vandals (temporarily) slows them down and takes the fun away

53 Outside the Wiki  Developers - coders and system admins  IRC Channels  Mailing lists

54 Wikipedia Governance  A confusing but workable mix of  Consensus  Democracy  Aristocracy  Monarchy

55 Community Challenges  How can such a large community scale? –Through software features –Through policy (mediation, arbitration) –Through an atmosphere of love and respect

56 Community Self-Regulation  Quality control features: recent changes, watchlists, related changes, page histories, user contributions lists  Community features: talk pages, user profiles, access levels, user-to-user , message notification.

57 International Community  Interlanguage linking of articles  Choice of language interface  Global newsletter: Quarto  “Translation of the week”

58 Conclusion  Wikipedia is a community  Automated and artificial Slashdot-style reputation metrics are not needed and may not be desirable  Achieving quality levels equalling or exceeding traditional publishing models can be expected without “emergent” magic

59 Credits  and related sites  Some slides adapted from –Jimmy Wales President, Wikimedia Foundation Wikipedia Founder –Prof. Burks Oakley II Prof of E.C.E University of Illinois

