1 Plagiarism Investigation: Practical Free-text Plagiarism Investigation. Fintan Culwin, School of Computing, London South Bank University, London SE1 0AA
2 Important Disclaimer(?) On 18th August, the JISC plagiarism service launched an improved version, addressing some of the points that are raised in these notes! On the same day all SBU servers were taken off-line due to the slammer worm!! Some of the comments made on the JISC service may have been addressed by the improvements! These notes and the JISC brochure refer largely to the old version; comments on the new version are shown in red.
3 General Description, as advertised: This tutorial will introduce the processes involved in using the JISC service and give examples of its use from the 300 plus final year computing and BIT projects that were processed in the summer of 2003. The limitations of the service as discovered will be explored, and the design and implementation of additional tools needed to complement the JISC service will be presented. Free text plagiarism is a large and growing problem. Tools to assist with its detection are sadly necessary but unfortunately not sufficient. Some of the reasons why some students resort to cheating will be explored and some of the pedagogic responses that can possibly forestall it will be presented.
4 Specific Learning Objectives: Following the tutorial attendees will have knowledge of: The nature and limitations of the JISC plagiarism detection service. The operation of the JISC service. Interpretation of the results of the JISC service and use of the JiscView utility. The use of the OrCheck tool to follow up a JISC investigation. The use of the PRAISE tool to detect intra-corporal plagiarism. The use of FreeStyler to investigate single documents.
5 My Qualifications: Much of the background and experience for this tutorial results from processing over 300 final year projects through the JISC system in the summer of 2003. Additionally, a number of utilities and systems have been developed at SBU for free text originality investigation. Previously, experience of developing and operating source code detection systems since circa … This led to the JISC commissioned report on source code plagiarism detection.
6 Available Services: JISC plagiarism detection service using iParadigms (aka TurnItIn) technology. Free to UK institutions for at least the next year. URKUND, originally a Swedish service now based in Brussels (awaiting evaluation). CopyCatch, desktop intra-corporal system, now free of charge. OrCheck (also PRAISE, VAST & FreeStyler), free of charge from SBU. Various other systems with varying degrees of capability and availability (FindSame, HowOriginal etc.).
7 Classification space: desktop / server; in-house / remote; document / corpora; intra-corporal / extra-corporal; commercial / free; database / stylistics; text-only / styled documents; open / proprietary
8 Why do students cheat? because the task they have been set is too difficult for them; because they are not capable of doing the task set; because they are capable but not sufficiently organised; because they are capable but want a better mark; because their families want them to get a better mark; because they are not prepared to devote the amount of time the task would take; because they have devoted the time and feel they deserve the mark; because the number of assessment tasks set is unreasonable; because the resources required are not available; because cheating has become a habit; because they do not agree that they are cheating; because the institution is inhumane; because everyone else is cheating; because the tutor connives with the cheating
9 essentially . . . Because the perceived chances of being caught and the perceived punishment if caught are less than the perceived benefit of cheating, at the time when the cheating occurs.
10 JISC Plagiarism Report: “. . . technology can only assist us, it will never replace the expertise of humans ... the answer to problems usually lies in process and procedures not technology alone. Electronic detection has its place in institutions but the real solutions lie in appropriate assessment mechanisms, supportive institutional culture, clear definitions of plagiarism and policies for dealing with it and adequate training for staff and students. If these areas are improved, the need, desire, and appeal of plagiarism can be taken away for most students.”
11 Implications for Practice: change the assignment specification for every presentation; assess process as well as product; assess at a higher level (of Bloom’s taxonomy); individualise assessment tasks; participate in group work; innovate assessment techniques. It is your responsibility to educate your registrar about the exact nature of academic misconduct. It is your responsibility to educate your students about the boundaries between cooperation, collusion and copying. It is your responsibility to ensure that an average student can complete an assessed task in a reasonable time.
18 Final Year Project: About 20 reports were categorised as ‘extensive’, ‘substantive’, or ‘significant’. Summary notes were made on all of these and JiscView and/or OrCheck visualisations produced. The project panel decided to proceed with the 9 ‘extensive’ and ‘substantive’ cases. First supervisors (some of whom should have known better!) were prepared to excuse extensive (~50%) demonstrated non-originality and/or suggested informal capping. Of the 9 cases processed formally, penalties ranged from cancellation of all level 3 marks (and award of DipHE), through cancellation of the project mark (and award of unclassified), to cancellation of the project mark (but allowed to resubmit next year).
19 Quantitative Corpora Analysis: [Figure: plot annotated with a hypothesised ‘real’ line and a ‘ColdFusion’ area]
22 Revised Service - Side by Side Comparison: the two panes are not hyperlinked.
23 Comments on the JISC service 1: The nature of the detection engine is unknown (although guesses can be made). It is (necessarily) administratively cumbersome. There is no facility for batch enrolment of students onto the system. (Possibly addressed.) There is no batch submission of documents (although a tutor can submit on behalf of a student). (Possibly addressed.) There is no facility for batch downloading. (I had to manually review about 50 originality reports over a weekend and had to obtain each one individually to take them home.) There is no batch submission of additional URLs; each has to be submitted individually (with a re-analysis after each one). The four hour turnaround on reanalysis of a document made semi-manual investigation cumbersome. (Addressed in the upgrade.)
24 Comments on the JISC service 2: There is no facility to integrate it with WebCT or BlackBoard. The system has some aspects of an MLE (e.g. peer review, on-line grading). (Not in the JISC version.) The precise quantitative degree of similarity is not stated or used to precisely order the list. (Possibly addressed.) There is no side by side comparison of submission and hit(s). (Addressed in the upgrade.) The significance and extent of the non-originality within the document can be unclear, particularly with large documents. (See the JiscView utility.) The system can lose some hits (i.e. a hit reported may disappear if a reanalysis of the document is requested). (Addressed in the upgrade.) There is no management reporting capability (e.g. a convenient printer friendly list of all submissions received, etc.).
26 JiscView: The JISC textual representations, whilst adequate for small documents, proved less useful for large projects. The colour coding did not give a precise quantitative measure and the relative location of the various non-original parts was also unclear. To address these problems a small utility, JiscView, was developed to provide a high level, non-interactive, ‘map’ of a JISC non-originality report. The utility may have been invalidated by the revised JISC service. It is only available upon request with many caveats and no documentation.
27 JiscView in Operation: A JiscView image contains one pixel for every character, colour coded as in the originality report. The width is arbitrary (just wide enough to accommodate the text at the top). It gives a precise quantitative measure of non-originality, in this case 24%.
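The one-pixel-per-character idea can be illustrated with a minimal Python sketch. This is not JiscView's code: the per-character `flags` input, the function names and the text rendering are all assumptions made for illustration, and a text grid stands in for the colour-coded image.

```python
def jisc_map(flags, width):
    """Lay out one cell per character in rows of `width` and measure non-originality.

    `flags` is a hypothetical input: one boolean per character of the document,
    True where the character falls inside a reported match.
    """
    rows = [flags[i:i + width] for i in range(0, len(flags), width)]
    percent = 100.0 * sum(flags) / len(flags) if flags else 0.0
    return rows, percent


def render(rows):
    """Text stand-in for the image: '#' for non-original, '.' for original."""
    return "\n".join("".join("#" if f else "." for f in row) for row in rows)


# A 100-character document in which characters 10 to 33 are flagged.
example_flags = [10 <= i < 34 for i in range(100)]
example_rows, example_percent = jisc_map(example_flags, width=20)
print(render(example_rows))
print(f"{example_percent:.0f}% non-original")
```

Because every character contributes exactly one cell, the percentage is simply the flagged cells divided by the total, whatever width is chosen for the map.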
28 OrCheck: OriginalityChecker is an in-house, desktop, single-document, free-of-charge, database (Google) driven, text only, non-proprietary tool. Essentially, it provides some assistance with the process of manually performing a Google driven keyword search and (in particular) with interpreting the extent and significance of any matches in the documents returned. In the final year project investigation it was used to locate URLs to manually feed into the JISC service. It was also used in ‘passive’ mode to prepare evidential reports for the investigation phase.
29 OrCheck in Operation 1: document loaded; concordance generated
30 OrCheck in Operation 2: search in progress; hits obtained
31 OrCheck in Operation 3: textual comparison; graphical representation
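The OrCheck steps named on these slides (concordance, keyword search, comparison) might be sketched roughly as follows in Python. Everything here is an assumption for illustration: the keyword heuristic (long, infrequent words) and the three-word-run matching are stand-ins rather than OrCheck's actual methods, and no search engine is called; the chosen terms would be typed into a search by hand.

```python
import re
from collections import defaultdict


def words(text):
    """Lower-cased word tokens of a text."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(text):
    """Map each word to the list of positions at which it occurs."""
    index = defaultdict(list)
    for pos, word in enumerate(words(text)):
        index[word].append(pos)
    return dict(index)


def search_terms(index, count=3, min_length=6):
    """Pick long, infrequent words as candidate search keywords (assumed heuristic)."""
    candidates = [w for w in index if len(w) >= min_length]
    candidates.sort(key=lambda w: (len(index[w]), -len(w), w))
    return candidates[:count]


def matched_fraction(submission, source, n=3):
    """Fraction of the submission's n-word runs that also occur in the source."""
    def runs(text):
        ws = words(text)
        return [tuple(ws[i:i + n]) for i in range(len(ws) - n + 1)]

    sub_runs = runs(submission)
    source_runs = set(runs(source))
    if not sub_runs:
        return 0.0
    return sum(r in source_runs for r in sub_runs) / len(sub_runs)
```

The matched fraction gives one rough way of expressing the extent of a match against a returned document; its significance still has to be judged by a person.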
32 PRAISE: Prioritised Ring to Assist In Similarity Evaluations is an in-house, desktop, intra-corporal, free-of-charge, stylistic, (text only), non-proprietary tool. It is used to detect and display the degree of similarity between the documents in a corpus. Although designed for text-only use it will operate upon styled texts (though its behaviour is somewhat unknown). It uses the words2 metric, shown in Thomas Lancaster’s thesis to be efficient and effective. It is intended to allow an OrCheck and/or VAST viewer to be spawned from it for detailed investigation.
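As a rough illustration of intra-corporal comparison, the Python sketch below scores every pair of documents in a corpus. The measure used, a Jaccard overlap of adjacent word pairs, is an assumption chosen for simplicity; it is not claimed to be the words2 metric, whose definition is not given in these notes. The function names and the `{name: text}` corpus layout are likewise hypothetical.

```python
import re
from itertools import combinations


def word_pairs(text):
    """Set of adjacent word pairs in a text (assumed stand-in measure)."""
    ws = re.findall(r"[a-z']+", text.lower())
    return set(zip(ws, ws[1:]))


def similarity(a, b):
    """Jaccard overlap of adjacent word pairs, from 0.0 to 1.0."""
    pa, pb = word_pairs(a), word_pairs(b)
    if not pa and not pb:
        return 0.0
    return len(pa & pb) / len(pa | pb)


def corpus_similarities(corpus):
    """Score every pair of documents in a {name: text} corpus."""
    return {
        (x, y): similarity(corpus[x], corpus[y])
        for x, y in combinations(sorted(corpus), 2)
    }
```

Every pair is compared, so the work grows with the square of the corpus size; that is usually acceptable for a single cohort's submissions.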
35 PRAISE in Operation 3: The documents are arranged on the torc in gross similarity sequence. Controls are provided to vary the number of documents and the degree of similarity shown. When one document is selected all other documents linked to it, at or above the similarity level, are also shown. (From here an OrCheck visualisation will be launched.) When two documents (i.e. one link) are selected details of that degree of similarity are shown. (From here a VAST visualisation will be launched.) An alternative tabular view of the information also needs to be provided. Extra-corporal Web sourced documents can be included and are shown in a different colour. (An OrCheck style capability to obtain such documents needs to be included.)
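The selection behaviour described above can be sketched in Python, given pairwise scores held as a `{(doc_x, doc_y): score}` dictionary. Two assumptions are made here: "gross similarity" is read as each document's total similarity to all the others, and the function names are invented for illustration. PRAISE itself may order and filter differently.

```python
def gross_similarity_order(scores):
    """Order documents by their total similarity to all others, highest first."""
    totals = {}
    for (x, y), s in scores.items():
        totals[x] = totals.get(x, 0.0) + s
        totals[y] = totals.get(y, 0.0) + s
    return sorted(totals, key=lambda d: (-totals[d], d))


def linked_to(selected, scores, threshold):
    """Documents linked to `selected` at or above the similarity threshold."""
    links = []
    for (x, y), s in scores.items():
        if s >= threshold and selected in (x, y):
            links.append((y if x == selected else x, s))
    return sorted(links, key=lambda link: (-link[1], link[0]))
```

Raising the threshold thins out the links shown for a selected document, which corresponds to the control over the degree of similarity displayed.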
36 VAST: Visual Analysis of Similarity Tool is an in-house, desktop, double-document, free-of-charge, stylistic driven, text only, non-proprietary tool. It provides a detailed OrCheck like visualisation and investigation of a pair of documents. VAST is more capable of fuzzy matching than OrCheck and so is more capable of detecting similarity beneath superficial disguises. However it is less precise in its highlighting and is unable to give a (precise) quantitative value to the similarity. VAST can also be used to track changes in the drafts of a document.
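Comparing a pair of documents, or two drafts of one document, can be illustrated with Python's standard `difflib` module. This is a swapped-in technique for illustration only, not VAST's own matching algorithm, and the function name `compare_drafts` is hypothetical.

```python
from difflib import SequenceMatcher


def compare_drafts(old, new):
    """Word-level comparison of two texts using the standard library's difflib.

    Returns an overall similarity ratio and the list of differing spans as
    (operation, words_in_old, words_in_new) tuples.
    """
    a, b = old.split(), new.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    changes = [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return matcher.ratio(), changes
```

Working at word level rather than character level means that a substituted word shows up as a single replaced span, which is closer to how a reader would describe a superficial disguise.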
38 FreeStyler: FreeStyler is an in-house, desktop, single-document, free-of-charge, stylistic, text only, non-proprietary tool. It provides rolling-average, interactive graphs of various stylistic measurements. The intention is that if there is more than one ‘voice’ in a document, the differences should become visible in the graphs. (In practice this has not proved to be so easy!) FreeStyler can also be used as a writing tool (checking reading age across a document, ensuring consistency of voice and spelling conventions etc.).
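A rolling average over a stylistic measurement can be sketched in a few lines of Python. Sentence length in words is used here purely as an example measurement, and the naive sentence splitting and function names are assumptions; the measurements FreeStyler actually graphs are not listed in these notes.

```python
import re


def sentence_lengths(text):
    """Number of words in each sentence (sentences split naively on . ! ?)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def rolling_average(values, window):
    """Mean of each run of `window` consecutive values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

Plotting the rolling average against sentence position gives the kind of curve in which a sustained shift might hint at a change of ‘voice’, although, as noted above, this has not proved easy to see in practice.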
40 Final Year Project 2004: Inform students clearly and demonstrate the technology at the first project lecture (as was done in 2002/3). Have students sign and return the JISC DPA form as part of project registration. Encourage final year core unit tutors to use the JISC service routinely. Require students to submit the body of the report (only) to JISC, but to submit the full report in-house. Provide staff development and clear, agreed guidelines to all tutors regarding the significance of non-originality. Have agreed time relief for coordinating the systems and advising on issues.