The main advantage of concordancing tools is that they allow translators to see terms in a variety of contexts simultaneously to detect various kinds of.


1 The main advantage of concordancing tools is that they allow translators to see terms in a variety of contexts simultaneously to detect various kinds of linguistic and conceptual patterns. The majority of corpus analysis tools also offer a number of other features, which often combine the data produced by the concordancer and word frequency counts.

2 Towards a Methodology for a Corpus-Based Approach to Translation Evaluation By: Mojgan Heydarali Professor: Dr. Behbahani Course: Translation Assessment Azad University of Literature & Foreign Languages, Tehran South Branch

3 Content 1. Translator trainers' responsibility 2. Limitations of evaluation tools 3. Importance of the corpus-based approach 4. Characteristics of the corpus-based approach 5. Challenges facing evaluators in an academic context 6. Corpora and corpus analysis tools 7. Designing an evaluation corpus a. Comparable Source Corpus b. Quality Corpus c. Quantity Corpus d. Inappropriate Corpus

4 Translator trainers are responsible for: grading students' work and, importantly, providing useful feedback.

5 In the past, translators and trainers worked with resources such as: dictionaries, printed parallel texts, unverified intuition, and subject field experts. But these were not always conducive to providing the conceptual and linguistic knowledge necessary for an objective translation evaluation.

6 What is the importance of the corpus-based approach? 1: It removes a great deal of subjectivity. 2: It provides improved access to appropriate conceptual and linguistic information on a specialized subject field, as documented by experts in that field.

7 In other words, a specially designed evaluation corpus can act as a benchmark for comparing students' translations on a number of different levels.

8 So, by having access to a wide range of authentic and suitable texts, translator trainers can: verify or correct both conceptual and linguistic information, and provide more constructive feedback based on evidence.

9 What are the characteristics of a corpus-based approach? Firstly, it is based on the analysis of a comparatively large and carefully selected collection of naturally occurring texts that are stored in machine-readable form (i.e., a corpus).

10 Secondly, because it analyzes actual patterns of language use in the corpus, it is empirical and therefore objective.

11 Thirdly, this approach takes advantage of computational tools and methods for manipulating the corpus and arranging the data in ways that make it possible to spot items and patterns that would be difficult to identify in other types of resources.

12 Additionally, computers provide consistent and reliable analysis (i.e., they do not change their minds or get distracted).

13 Finally, the corpus-based approach combines both quantitative and qualitative techniques: a computer is capable of churning out counts of linguistic features, but the translator trainer is responsible for exploring and interpreting the data in order to learn about patterns of language use.

14 1. Challenges Facing Evaluators in an Academic Context A. The main difficulty surrounding translation evaluation is its subjective nature; the notion of quality has very fuzzy and shifting boundaries. B. Clients who commission translations are not interested in educating the translator, while the trainer has an obligation to help students improve their performance.

15 C. In order to be properly prepared for entering the translation profession, students need to be exposed to a wide range of translation material and text types, but naturally trainers are not experts in all subjects. A specially designed evaluation corpus can help to meet this need.

16 Corpora and corpus analysis tools Similarity between a corpus and conventional parallel texts: In a translation context, a suitable corpus might be one containing texts that correspond to the intended skopos of the target text. In this way a corpus is similar to the conventional parallel texts used by many translators.

17 However, an electronic corpus is generally much larger and can be processed with the help of computerized tools known as corpus analysis tools.

18 Most corpus analysis tools contain at least two main features: word frequency lists and concordancers. A. A word frequency list allows users to discover how many different words are in the corpus and how often each appears. For example:
DVD 765    video 126   not 89    player 80
is 341     we 121      said 85   all 79
will 208   have 116    PC 82     MPEG 81

19 “I really like translation because I think that translation is really, really interesting.” Tokens (total words) = 13. Types (different words) = 9. Frequency lists can be sorted in: * alphabetical order * ascending frequency * descending frequency
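The token and type counts on the slide above can be reproduced with a short script. This is only a minimal sketch: the `tokenize` helper and its rule (lowercase everything, split on anything that is not a letter, digit, or apostrophe) are assumptions made for illustration, and real corpus tools may tokenize differently.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase word tokens (assumed tokenization rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())

sentence = ("I really like translation because I think that "
            "translation is really, really interesting.")

tokens = tokenize(sentence)   # every running word
freq = Counter(tokens)        # one entry per distinct word (type)

print("tokens:", len(tokens))
print("types:", len(freq))

# The three sort orders mentioned on the slide.
alphabetical = sorted(freq.items())
ascending = sorted(freq.items(), key=lambda kv: (kv[1], kv[0]))
descending = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))

for word, count in descending:
    print(f"{word:<12} {count}")
```

Because the sketch folds case, the two occurrences of "I" count as one type, which is how the slide arrives at 9 types from 13 tokens.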

20 Words belonging to the same lemma can be counted together or separately, as can words beginning with upper or lower case. A lemma groups words which have the same stem and belong to the same major word class, differing only by spelling or inflection. Stop lists are lists of words to be ignored; they can be used to eliminate common function words such as prepositions or conjunctions. Frequency information can help translators decide which term to use when faced with a number of potential synonyms or translation equivalents.
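The counting options described on the slide above (lemmas together or separately, case folding, stop lists) might be sketched as follows. The `STOP_LIST` and `LEMMA_MAP` contents are tiny hand-made assumptions for illustration only; a real tool would use a proper lemmatizer and a much fuller stop list.

```python
import re
from collections import Counter

# Illustrative assumptions, not taken from any real tool.
STOP_LIST = {"the", "a", "of", "and", "in", "with", "but"}
LEMMA_MAP = {"players": "player", "played": "play",
             "plays": "play", "playing": "play"}

def frequency_list(text, fold_case=True, use_lemmas=False, stop_list=()):
    """Count words, optionally folding case, grouping lemmas, and skipping stop words."""
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    if fold_case:
        tokens = [t.lower() for t in tokens]
    if use_lemmas:
        # Inflected forms listed in LEMMA_MAP are counted under their lemma.
        tokens = [LEMMA_MAP.get(t.lower(), t) for t in tokens]
    tokens = [t for t in tokens if t.lower() not in stop_list]
    return Counter(tokens)

sample = ("The player played the DVD. Players with a DVD drive play DVDs, "
          "and the Player plays.")

separate = frequency_list(sample)
grouped = frequency_list(sample, use_lemmas=True, stop_list=STOP_LIST)
case_sensitive = frequency_list(sample, fold_case=False)

print(separate["player"], grouped["player"], case_sensitive["Player"])
```

With lemmas grouped, "player", "Players", and "Player" are counted as one entry, and function words such as "the" drop out of the list.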

21 B. Concordancer A concordancer retrieves all the occurrences of a particular search pattern in their immediate contexts and displays these in an easy-to-read format. The most commonly used format is KWIC (key word in context), which shows one occurrence of the search pattern per line, with the search pattern itself highlighted in the center of the screen. For example:
  Atsushita slick, portable DVD  player  with a color LCD and
Ndows explorer, but their movie  player  software refused to pla
Ers with a “record” button. The  player  will not even have the o
  three years,” he says. Such a  player  would have a display
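A KWIC display like the one on the slide above can be approximated in a few lines. This is a minimal sketch under stated assumptions: the `kwic` function, its `width` parameter, and the sample text are all invented for illustration, and the context window is measured in characters, which is why words at the edges can be cut off just as in the slide's example.

```python
import re

def kwic(text, pattern, width=25, ignore_case=True):
    """Return one line per match, with the match aligned in a fixed column."""
    flags = re.IGNORECASE if ignore_case else 0
    lines = []
    for m in re.finditer(pattern, text, flags):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so every keyword starts in the same column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

# Invented sample text, for illustration only.
corpus = ("The new DVD player has a colour display. "
          "Their movie player software refused to start. "
          "The player will not have a record button.")

for line in kwic(corpus, r"\bplayer\b"):
    print(line)
```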

22 * The extent of the context on either side of the search pattern is variable. * These contexts can be sorted in a variety of ways, such as: a. order of appearance in the corpus, b. alphabetically, c. by the words preceding or following the search pattern.
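The sorting options on the slide above could be sketched as below. The `concordance`, `sort_by_right`, and `sort_by_left` helpers are hypothetical names, the context here is measured in words rather than characters, and the sample text is invented.

```python
import re

def concordance(text, word, span=3):
    """Return (left words, keyword, right words) for each hit, in corpus order."""
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == word:
            hits.append((tokens[max(0, i - span):i], tok, tokens[i + 1:i + 1 + span]))
    return hits

def sort_by_right(hits):
    """Sort alphabetically by the words following the search word."""
    return sorted(hits, key=lambda h: h[2])

def sort_by_left(hits):
    """Sort alphabetically by the words preceding the search word, nearest first."""
    return sorted(hits, key=lambda h: h[0][::-1])

# Invented sample text, for illustration only.
text = ("Such a player would cost more. A portable player with a screen "
        "is handy. The player will not start.")

hits = concordance(text, "player")
for left, key, right in sort_by_right(hits):
    print(" ".join(left), f"[{key}]", " ".join(right))
```

Sorting on the neighbouring words groups recurring combinations together, which is what makes collocations easy to spot in a long concordance.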

23 Concordancers are flexible and allow functions such as: * Case-sensitive vs. non-case-sensitive searches (e.g. ‘Bill’, the former US president, vs. ‘bill’; ‘Polish’, of Poland, vs. ‘polish’) * Wildcard searches (e.g. ‘play*’ to retrieve ‘play’, ‘player’, ‘played’, etc.) * Context searches, where another term must appear within a user-specified distance of the search term (e.g. contexts where ‘play’ appears within five words of ‘DVD’)
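The three search options on the slide above can be imitated with regular expressions. This is a rough sketch, not how any particular concordancer works: the function names are hypothetical, `*` is assumed to be the only wildcard, and distance is counted in word tokens.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a simple wildcard pattern such as 'play*' into a whole-word regex."""
    return r"\b" + re.escape(pattern).replace(r"\*", r"\w*") + r"\b"

def search(text, pattern, case_sensitive=False):
    """Return all matching word forms, with or without case sensitivity."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.findall(wildcard_to_regex(pattern), text, flags)

def within_distance(text, term, other, max_words=5):
    """Return token positions (i, j) where `term` is within max_words of `other`."""
    tokens = re.findall(r"\w+", text.lower())
    term_pos = [i for i, t in enumerate(tokens) if t == term.lower()]
    other_pos = [i for i, t in enumerate(tokens) if t == other.lower()]
    return [(i, j) for i in term_pos for j in other_pos
            if 0 < abs(i - j) <= max_words]

# Invented sample text, for illustration only.
text = "Bill paid the bill. The DVD player played a film; you can play any DVD."

print(search(text, "bill"))                       # both 'Bill' and 'bill'
print(search(text, "Bill", case_sensitive=True))  # only the capitalized form
print(search(text, "play*"))                      # play, player, played
print(within_distance(text, "play", "DVD"))
```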

24 The majority of corpus analysis tools also offer a number of other features, which often combine the data produced by the concordancer and frequency counts. Two points must be considered: * The value of what comes out of a corpus is largely dependent on what texts are included in it. * Criteria for designing general language corpora have been well documented in the literature; however, these criteria cannot be adopted wholesale for the design of a special-purpose corpus such as an Evaluation Corpus.

25 Designing an Evaluation Corpus The Evaluation Corpus is the collective name for a collection of texts that is divided into four main sub-corpora: 1. The Comparable Source Corpus 2. The Quality Corpus 3. The Quantity Corpus 4. The Inappropriate Corpus These sub-corpora differ in content and intended function.

26 1. Comparable Source Corpus (CSC) The CSC is optional; whether it is used depends on factors such as time, text type, and the skopos of the target text. The CSC contains a selection of SL texts that are similar to the source text in terms of text type, publication date, and subject matter.

27 The purpose of the CSC Its purpose is to allow the evaluator to gauge the “normality” of the source text with regard to other source language texts of that type. Normalization is a feature of translated texts; normalized texts display exaggerated features of the target language and conform to its typical patterns (Baker, 1997).

28 Sanitization: The suspected adaptation of a source text reality to make it more palatable for target audiences (Kenny). Both normalization and sanitization result in deliberately chosen unconventional lexical or syntactic ST features being changed in translation so that the TT fits in with the conventions of the target language.

29 Determining inappropriate normalization or sanitization: First, evaluators can use the CSC as a reference corpus to establish the relative normality of the ST. Second, they can use the Quantity Corpus as a reference corpus to establish the relative normality of the TT.

30 If the ST is deemed to be normal (in vocabulary, register, style, etc.) with reference to texts in the Comparable Source Corpus, then the TT should be normal when compared with texts in the Quantity Corpus (and vice versa).

31 2. Quality Corpus The Quality Corpus is a high-quality sub-corpus consisting of texts hand-picked primarily for their conceptual content. It is very small by corpus linguistics standards, containing four or five texts with a total of 5,000 words.

32 The Quality Corpus is used primarily as a source of conceptual information rather than linguistic information, so it is not necessary for all texts to be of the same text type. But it is important that they be complete texts (not samples or extracts). At least some of the texts should be current. Using the Quality Corpus will help the translator trainer become familiar with basic concepts in the field and identify some of the key terms. If the texts are well chosen, they can serve as a benchmark for evaluating students' translations.

33 3. Quantity Corpus Why is it not appropriate to rely exclusively on the Quality Corpus? Firstly, because it is a relatively small collection, there is no real way to know whether the selected texts are truly representative of the text type at large. Secondly, the texts contained in the Quality Corpus may be “older” texts, and a term which was appropriate in the past may no longer be so.

34 The Quantity Corpus is designed to provide a larger and more representative sample of the specialized language in question. External factors such as time and availability of data influence how large and how representative it can be. In practice, Quantity Corpora of 20,000 to 200,000 words have proved useful: 20,000 for highly specialized subject fields, 200,000 for subject fields that are not extremely narrow.

35 It is useful to divide the Quantity Corpus into further sub-corpora, one for each year; this enables translators or evaluators to track terminological changes over time. A corpus analysis tool such as WordSmith allows users to consult multiple corpora at once.
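Tracking a term across yearly sub-corpora, as the slide above suggests, might look like the following sketch. The `term_trend` function, the normalization per 1,000 tokens, and the two toy "sub-corpora" are all assumptions made for illustration; they are not real data and the sketch handles single-word terms only.

```python
import re
from collections import Counter

def term_trend(subcorpora, terms):
    """Return, per year, each term's frequency per 1,000 tokens."""
    trend = {}
    for year in sorted(subcorpora):
        tokens = re.findall(r"\w+", subcorpora[year].lower())
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on an empty sub-corpus
        trend[year] = {t: 1000 * counts[t.lower()] / total for t in terms}
    return trend

# Toy texts invented for illustration; a real sub-corpus would hold many documents.
by_year = {
    1998: "the videodisc player reads a videodisc",
    2000: "the dvd player reads a dvd and another dvd",
}

for year, freqs in term_trend(by_year, ["videodisc", "dvd"]).items():
    print(year, {term: round(value, 1) for term, value in freqs.items()})
```

Normalizing by sub-corpus size matters here because the yearly sub-corpora will rarely contain the same number of words.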

36 The Quantity Corpus: pros and cons Pros: The Quantity Corpus is compiled in a semi-automated fashion and can be used by the translator trainer to verify the appropriateness of terminological, phraseological, and stylistic choices made by students. Most corpus analysis software gives users the option of expanding the context to several lines or the complete text. The volume of data makes it possible to spot patterns more easily, to make generalizations, and to provide concrete evidence to support decisions.

37 Cons: Interacting solely with a large electronic corpus can cause the evaluator to lose sight of the fact that translation is a text-based activity. In corpus analysis the focus is on micro-contexts, and the primary power of corpus analysis remains at a sub-text level. In addition, texts are not always readily available in electronic form.

38 4. Inappropriate Corpus This is a corpus containing “inappropriate” parallel texts. Its size varies by subject: for well-established subjects or those of wider interest it will be larger, but it is smaller for very recent subjects. Its purpose is to help the translator trainer trace the source of unsuitable equivalents in students' translations. If a student has used a term which does not appear in the Quality or Quantity Corpus, it can be checked in this corpus.
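The check described on the slide above (look for a student's term in the Quality and Quantity Corpora first, then in the Inappropriate Corpus) could be sketched like this. The function names, the verdict labels, and the three one-sentence "corpora" are hypothetical and serve only to show the order of the lookups.

```python
import re

def locate_term(term, subcorpora):
    """Count whole-word occurrences of `term` in each named sub-corpus."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return {name: len(pattern.findall(text)) for name, text in subcorpora.items()}

def verdict(term, subcorpora):
    """Classify a student's term by where it is attested."""
    hits = locate_term(term, subcorpora)
    if hits.get("quality", 0) or hits.get("quantity", 0):
        return "attested"
    if hits.get("inappropriate", 0):
        return "found only in inappropriate texts"
    return "not found"

# Toy texts invented for illustration only.
evaluation_corpus = {
    "quality": "The scanner uses a charge-coupled device.",
    "quantity": "A flatbed scanner digitizes documents with a charge-coupled device.",
    "inappropriate": "This picture reader turns paper into files.",
}

for term in ["scanner", "picture reader", "image grabber"]:
    print(term, "->", verdict(term, evaluation_corpus))
```

A result of "found only in inappropriate texts" gives the trainer a concrete explanation of where a student's unsuitable equivalent may have come from.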

39 THE END

