SAMUELS Closing Symposium Huddersfield Project Lesley Jeffries, Brian Walker and Jane Demmen.



Huddersfield Project Aims
1. Investigate language used to talk about labour relations, particularly trades unions, across time in parliamentary language
   - project title: Is there a Baron in the Commons?
   - builds on previous work into the way unions & their leaders are discussed in the British press (Language Unlocked 2012)
2. Include the analysis of semantic collocates
   - builds on previous work into lexical collocates of keywords in the British press when Tony Blair was UK prime minister (Jeffries & Walker 2012)
3. In the course of carrying out 1 & 2, assist with testing the Hansard Corpus data and the HTST (tagger)

Progress and interim findings
Data needed to meet aims 1 & 2:
- frequency counts of lexical items concerning labour relations from Hansard, HT-tagged
- broken down by diachronic periods
- semantic collocates enabled
Background research/lit review until Jan 2015
Corpus querying began early Feb with CQPWeb Hansard V3.0
Currently working with V3.1 to overcome some technical problems and access full data
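The frequency counts broken down by diachronic period could be normalised along the following lines. This is a minimal sketch, not the project's own procedure: it assumes hits have already been exported as a list of years (one per matched token) and that the total token count per decade is known. The function names and all figures are invented for illustration.

```python
from collections import Counter


def decade_of(year: int) -> int:
    """Map a year to the start of its decade, e.g. 1979 -> 1970."""
    return (year // 10) * 10


def per_million_by_decade(hit_years, corpus_sizes):
    """Return hits per million tokens for each decade.

    hit_years: iterable of years, one entry per matched token (assumed input).
    corpus_sizes: dict mapping decade -> total tokens in that decade.
    """
    counts = Counter(decade_of(y) for y in hit_years)
    return {
        decade: counts.get(decade, 0) * 1_000_000 / size
        for decade, size in sorted(corpus_sizes.items())
        if size > 0
    }


# Toy figures only; these are not Hansard counts.
hit_years = [1978, 1979, 1979, 1984, 1985, 1985, 1985]
sizes = {1970: 2_000_000, 1980: 4_000_000}
rates = per_million_by_decade(hit_years, sizes)
print(rates)
```

Normalising per million tokens keeps decades of very different sizes comparable, which matters when the larger decades dominate raw counts.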

Initial data and methods
Preliminary analysis of Callaghan/Thatcher data extract began September 2014
Testing of lexical searches and USAS (Rayson et al. 2004) semantic domains to find language used around unions/labour relations
Prototypical items (strike, union) mainly in
- I3.1 Work and employment: Generally
- G1.2 Politics
Broad categories which would require a lot of manual screening
Diverted to analysis of formulaic language (in progress)

Advantages of using HT tags for this study
HT offers more specific categories, so should enable a more nuanced analysis with less need for manual filtering of irrelevant items
HT overarching structure: 03 (Society) -> (Occupation and work) -> (working)
82 HT sub-categories relating to labour relations
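Because HT categories nest, selecting everything under one category amounts to a prefix match on the components of the code. The sketch below assumes dot-separated numeric codes; the codes used are hypothetical placeholders, not real HT codes, and the function name is invented.

```python
def is_within(code: str, ancestor: str) -> bool:
    """True if `code` equals `ancestor` or sits below it in the hierarchy.

    Compares whole dot-separated components, so "99.010" is not treated
    as a child of "99.01".
    """
    code_parts = code.split(".")
    ancestor_parts = ancestor.split(".")
    return code_parts[: len(ancestor_parts)] == ancestor_parts


# Hypothetical codes for illustration only.
tagged = ["99.01", "99.01.02", "99.010", "99.02.01", "98.01.02"]
selected = [c for c in tagged if is_within(c, "99.01")]
print(selected)
```

Matching on components rather than on the raw string avoids accidentally pulling in sibling categories whose codes merely start with the same characters.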

HT categories relating to labour relations
(Labour relations) -> 24 subcategories
(Association of employers/employees) -> 32 subcategories
(Those involved in labour relations/associations) -> 23 subcategories

Methods using CQPWeb Hansard (3.1, HT)

Building up the diachronic view of labour relations talk in Hansard

Identifying semantic collocates


Semantic collocates of and sub-categories
- The world > Space > Relative position > Arrangement/fact of being arranged > State of being gathered together > An assemblage/collection: group: a set of things forming a complex unity
- The world > Time > Frequency > rhythm/measure
- The mind > Mental capacity > Belief > Uncertainty, doubt, hesitation > Possibility
- Geographical names (extra HT code for SAMUELS project)
- 04.03 Grammatical items (extra HT code for SAMUELS project)
- NULL (Not recognised by the tagger)
- The world > Existence and causation > Existence > State/condition
- 04.06 Pronouns (extra HT code for SAMUELS project)
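The idea of a semantic collocate can be illustrated with a simple window-based count: for every token carrying the node tag, tally the semantic tags of its neighbours. This is a sketch under stated assumptions, not a description of how CQPWeb computes collocations. It counts raw co-occurrences only, with no association statistic, and the tag labels, window size and sample tokens are all invented.

```python
from collections import Counter


def semantic_collocates(tokens, node_tag, window=5):
    """Count the tags of tokens within +/- `window` of each node token.

    tokens: list of (word, tag) pairs in running-text order (assumed input).
    node_tag: the semantic tag whose neighbours are counted.
    """
    counts = Counter()
    for i, (_, tag) in enumerate(tokens):
        if tag != node_tag:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the node token itself
                counts[tokens[j][1]] += 1
    return counts


# Invented example; the tags are placeholder labels, not HT codes.
tokens = [
    ("the", "GRAM"),
    ("system", "GROUP"),
    ("of", "GRAM"),
    ("conciliation", "LABREL"),
    ("is", "GRAM"),
    ("possible", "POSS"),
]
collocates = semantic_collocates(tokens, "LABREL", window=2)
print(collocates.most_common())
```

In practice raw counts would be weighed against how common each tag is in the corpus overall, since frequent categories such as grammatical items co-occur with almost everything.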

Collocation of labour relations with An assemblage/collection: group: a set of things forming a complex unity
1 word at this level: system

Collocation of labour relations with State/condition

Interim findings and progress to date
- Some cases of measure (as a collocate) are tagged incorrectly.
- According to the online HT, conciliation does not have a meaning association with labour relations until
- We may find evidence of earlier cases, but, as with striking, we know some cases of conciliation are not tagged correctly. These would need manual filtering (conciliation was the most frequently-occurring item when the Labour Relations HT code was queried).

Interim findings and progress to date
- We hope to provide some feedback on accuracy of the tagging, once we can get data for all decades
- The larger decades have proved problematic in processing, so we are trying to create subcorpora (rather than use the Restricted Query form) to see if this works better
- We hope to complete the analysis in due course, for conferences and publications
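Feedback on tagging accuracy could be summarised per item from a manually checked sample of hits. The sketch below assumes each checked hit has been recorded as an (item, is_correct) pair; the function name and the judgements shown are invented for illustration and are not project results.

```python
from collections import defaultdict


def precision_by_item(judgements):
    """Return the share of correctly tagged hits for each lexical item.

    judgements: iterable of (item, is_correct) pairs from manual checking
    (assumed input format).
    """
    totals = defaultdict(lambda: [0, 0])  # item -> [correct, checked]
    for item, is_correct in judgements:
        totals[item][1] += 1
        if is_correct:
            totals[item][0] += 1
    return {item: correct / checked for item, (correct, checked) in totals.items()}


# Invented judgements; not real evaluation data.
checked = [
    ("conciliation", True),
    ("conciliation", False),
    ("conciliation", False),
    ("conciliation", True),
    ("measure", True),
    ("measure", False),
]
precision = precision_by_item(checked)
print(precision)
```

Reporting precision per item shows which high-frequency words need manual filtering before their counts are interpreted.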

Outputs: Conference papers
Abstracts accepted for:
- The 13th International Cognitive Linguistics Conference (ICLC-13), July 2015, Northumbria University, Newcastle
- Poetics and Linguistic Association Conference (PALA), July 2015, University of Kent, Canterbury
Abstract submitted for:
- Political Discourse: Multidisciplinary Approaches, June 2015, University College London

Outputs: Publications
- Language styles at the dispatch box: comparing the language used by two former UK Prime Ministers (in preparation; proposed submission for the Journal of Language & Politics)
- "Is there a Baron in the Commons?" An investigation of the way industrial unions and their leaders are discussed in the UK House of Commons (details to be decided once data retrieved and analysis under way)
- A diachronic study of language used to talk about labour relations in UK House of Commons debates (details to be decided once data retrieved and analysis under way)

References
Jeffries, L. and Walker, B. (2012) "Key words in the press". English Text Construction 5(2):
Language Unlocked (2012) 20 years of Unions21: Union identity in print media. Report to Unions21, Stylistics Research Centre, University of Huddersfield. See search/mhm/stylisticsresearchcentre/Unions21report pdf
Rayson, P. (2008) From key words to key semantic domains. International Journal of Corpus Linguistics, 13(4),
Rayson, P., Archer, D., Piao, S. & McEnery, T. (2004) The UCREL semantic analysis system. In Proceedings of the workshop on Beyond Named Entity Recognition Semantic labelling for NLP tasks in association with the 4th International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal, 25 May 2004 (pp. 7-12). See (last accessed March 2015).