Building and analysing your own corpus 1. Building a corpus.

Why bother with corpora?
“Language users cannot accurately report language usage, even their own” (Sinclair, 1987)
“Using a language is a skill that most people are not conscious of; they cannot examine it in detail, but simply use it to communicate” (Sinclair, 1995)
“There are many facts about language that cannot be discovered by just thinking about it, or even reading and listening very intently” (Sinclair, 1995)
“As language teachers and professionals, we often have strong intuitions about language use… Corpus-based research, however, shows us that our intuitions are often completely wrong.” (Biber, 2005)

There are many free online corpora like COCA or COHA, but you could also build your own corpus.

1. Building a corpus. You can collect data from a variety of sources, but the most important thing to remember is that you need to save it in plain text (.txt) format. It also needs to be fairly big to make the corpus analysis worthwhile (I would recommend at least 100,000 tokens).
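To check whether a folder of plain text files has reached the suggested size, a rough token count can be sketched as follows. This is a minimal sketch, not AntConc's own method: the token pattern (runs of letters and digits, allowing internal apostrophes and hyphens) and the function names are assumptions for illustration, so the totals will not match AntConc's figures exactly.

```python
import re
from pathlib import Path

# Assumed token definition: letters/digits, with internal apostrophes or hyphens.
# Corpus tools each define tokens slightly differently, so treat counts as estimates.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*")


def count_tokens(text: str) -> int:
    """Return the approximate number of tokens in a string."""
    return len(TOKEN_PATTERN.findall(text))


def corpus_size(folder: str) -> int:
    """Sum the approximate token counts of every .txt file in a folder."""
    total = 0
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" avoids a crash on files saved in an unexpected encoding
        total += count_tokens(path.read_text(encoding="utf-8", errors="replace"))
    return total
```

Calling `corpus_size("my_corpus")` (where `my_corpus` is a hypothetical folder name) gives a figure to compare against the 100,000-token guideline.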

White House briefings: transcripts of the press conferences. room/press-briefings

The Brown family: the Brown family of corpora includes Brown, Frown, LOB, FLOB and BE06. /index.html

International Corpus of English (ICE): twenty-four research teams are preparing electronic corpora of their own national or regional variety of English for comparative purposes (e.g. Indian English, Australian English, South African English).

Corpora galore: learner corpora; courtroom discourse; academic English; specialised small corpora (RIP, sex education).

UK parliamentary discourse: /cmhansrd.htm Select committees: es/committees-a-z/commons-select/

Where you collect your data from will depend on the type of corpus that you need to create, but the principles remain the same.

Downloading a newspaper 1. Go to a database such as Lexis Nexis or Westlaw. 2. Check which newspapers are available. Be careful, because the listings are sometimes misleading: Westlaw claims to have Corriere but actually has only the articles that have been translated into English.

Downloading a paper 3. Choose the newspaper that you want. 4. Use the name of the newspaper as the search term.

Downloading a paper 5. If there is an option to remove duplicates, select it. 6. Choose one day only for the date range.

Downloading a paper 7. Download that day's articles and save them as .txt. To download the articles, click on the save icon on the right of the screen, which will open another window. Make sure that you download all the articles in text format.

Downloading a paper 8. If there are more than 500 articles for one day, you will have to download them in batches of 500 (or whatever the maximum is). Click on the link to open and save your new file. 9. Remember to save the file with a sensible name that includes the paper and date, e.g. GUA (for the Guardian from 15 Oct 2012).
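Steps 8 and 9 can be sketched in code: working out the batches needed when a day has more articles than the download limit, and building a consistent file name from the paper and date. This is only an illustration; the function names, the `CODE_YYYY-MM-DD.txt` naming pattern and the default limit of 500 are assumptions, not something the databases or the slides prescribe.

```python
from datetime import date


def batch_ranges(total_articles: int, batch_size: int = 500) -> list[tuple[int, int]]:
    """Split article numbers 1..total_articles into (first, last) download batches."""
    ranges = []
    first = 1
    while first <= total_articles:
        last = min(first + batch_size - 1, total_articles)
        ranges.append((first, last))
        first = last + 1
    return ranges


def corpus_filename(paper_code: str, day: date) -> str:
    """Build a file name from a short paper code and the publication date.

    The pattern CODE_YYYY-MM-DD.txt is a hypothetical convention; any scheme
    works as long as it is applied consistently.
    """
    return f"{paper_code.upper()}_{day:%Y-%m-%d}.txt"
```

For example, `batch_ranges(1200)` yields three batches, and `corpus_filename("GUA", date(2012, 10, 15))` produces one name for the Guardian from 15 Oct 2012. Putting the date in year-month-day order means the files sort chronologically in a folder listing.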

c. Building a corpus of fiction language from Project Gutenberg. Project Gutenberg contains about 30,000 books which are no longer bound by copyright restrictions. This could be very useful if you wanted to look at different time periods or different genres, e.g. children’s writing. Go to the Project Gutenberg website.

Building a corpus of fiction language from Project Gutenberg. Think of a book you would like to download, for instance The Princess and the Goblin. Type the title of the book into the search box on the left. Scroll down the page to select a text-only format and click to open it. The text file will open within your browser.

Building a corpus of fiction language from Project Gutenberg. Copy and paste the text into either Wordpad (look under ‘programs’ then ‘accessories’) or a Word document. Remember to save it as .txt. There is a large introductory section at the beginning of the file which could skew your results. To tell AntConc to ignore it, you will have to enclose it in angle brackets.

Building a corpus of fiction language from Project Gutenberg. Save your document as text with a sensible name, e.g. ‘PrincessGoblin’, and make sure it is saved somewhere that you can find it easily.
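The angle-bracket step for the introductory section can also be done with a short script rather than by hand. The sketch below assumes the file contains the usual "*** START OF THE (or THIS) PROJECT GUTENBERG EBOOK ..." line; the exact wording varies between files, so the pattern may need adjusting, and the function name is hypothetical. Whether AntConc then skips the bracketed text depends on its tag settings.

```python
import re

# Assumed marker line; older files say "THIS" instead of "THE".
START_MARKER = re.compile(
    r"^\*\*\* ?START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.MULTILINE,
)


def bracket_front_matter(text: str) -> str:
    """Enclose everything up to and including the start marker in angle brackets."""
    match = START_MARKER.search(text)
    if match is None:
        # No marker found: return the text unchanged rather than guess.
        return text
    front = text[: match.end()]
    body = text[match.end():]
    # Angle brackets inside the front matter would close the tag early,
    # so they are swapped for round brackets.
    front = front.replace("<", "(").replace(">", ")")
    return "<" + front + ">" + body
```

The function is applied to the contents of the saved file (for instance ‘PrincessGoblin’ read in as a string), and the result is written back out as .txt. The story text after the marker is left untouched.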

Class task. Academic English: research paper introductions. Go to the CL 2015 abstract book. Copy the introduction paragraph and save it in text format. You need to decide how to label the files. Collect as many as you can.

Introduction to research. What phraseologies can you discover from a corpus of research paper introductions? You can use this to help you write your own abstract for your project.