
1 Dictionaries for the Human Language Technologies virtual network
Dr Mariëtta Alberts
Focus Area Manager, Standardisation and Terminology Development
Pan South African Language Board (PanSALB)

2 Afrilex, 13 - 15 July 2005, UFS, Bloemfontein
Outline of presentation
Introduction
Reviewing Human Language Technologies
–Scope of HLT
–Potential of HLT
–Multilingualism and HLT
The South African HLT initiative
–History of South African HLT project
–National Facility
–South African HLT model
Terminology Training initiative of PanSALB
Conclusion

3 1. Introduction
South Africa is on the verge of establishing a Human Language Technology (HLT) Centre
The Centre will probably be managed as a national facility
It will provide an appropriate and sustainable infrastructure (virtual or otherwise) conducive to the development and effective management of reusable electronic text and speech resources

4 2. Reviewing Human Language Technologies (HLT)
Human Language Technologies are enabling technologies
They enable human beings to interact with computers by using human language (text and speech)

5 Human Language Technologies range from:
high-level parsing and machine translation
applications in education and training
public service (e-governance and e-commerce applications)
voice-operated educational systems
voice-operated commercial systems that can be used by illiterate people

6 Human Language Technologies:
Provide interfaces that enable spoken human-machine interaction (telephone-based information systems, automated booking systems)
Provide linguistic assistance (spelling and grammar checking)
Provide access to multilingual polythematic information
Empower people to actively participate in the Information Society

7 2.1 The scope of HLT:
Text-based language processing
–Text analysis (e.g. spellcheckers, term extraction, search engines)
–Summarisation
–Text translation
Speech processing
–Speech recognition (e.g. desktop or telephony environment)
–Speech synthesis

8 2.2 Potential of HLT:
Access for all to the information era
Enhanced mother-tongue or first language teaching
Affordable multilingual documents
Improved functionality and quality of languages
Contact with the developing-world context

9 Potential of HLT...
Availability of multilingual words and polythematic terminology: indicator of development
Specialised communication has a central axle or hub in terminology
Standardised terminology contributes to quality of translations, interpreting and communication
Streamlined translation and interpreting services provide competitive advantages

10 2.3 Multilingualism and HLT: South African situation
South Africa has a high illiteracy rate
Only 22% of the citizens can function through the medium of English
A small percentage of South Africans have access to computers; fewer still are IT literate
The divide is even greater between rural and urban areas

11 Effective e-government is necessary (e.g. birth certificates, identity documents, marriage and death certificates, telephone, electricity and water bills, traffic fines, etc.)
All citizens should have access to information in the languages they understand best (e.g. 11 official languages; South African Sign Language; Khoe and San languages)
Government should communicate with citizens in their own languages regarding key services (e.g. health; safety and security; education; postal services; justice (courts); banks (economy); media (electronic and print); labour (jobs); social welfare (pensions); etc.)

12 Language Policy and Legislation
Multilingual policy since 1994 - South African Constitution of 1996 (Act 108 of 1996)
Mechanisms for protecting and promoting linguistic rights were put in place
Section 6 of the South African Constitution specifically mentions the principles of language policy, which take into consideration the multilingual nature of South African society

13 Establishment of PanSALB
The Pan South African Language Board (PanSALB) (Act 59 of 1995) was established:
–to develop, promote and ensure use of South Africa’s eleven official languages, South African Sign Language (SASL) and the Khoe and San languages, and
–to promote respect for other languages used in the country (e.g. heritage languages: Dutch, French, German, Hindi, KiSwahili, Portuguese, Tamil, etc.)

14 PanSALB ensures the implementation of the National Language Policy Framework (NLPF) to ensure access to services to all citizens through:
9 Provincial Language Committees (PLCs)
–Assist Provinces with language policy formulation and implementation
13 National Language Bodies (NLBs)
–Standardisation (e.g. spelling and orthography rules)
–Terminology development
–Dictionary needs (general vocabulary)
–Literacy and media
–Research and Education
11 National Lexicography Units (NLUs)
–Compilation of comprehensive monolingual and other types of dictionaries

15 3. The South African HLT initiative
3.1 History
Lexinet research programme of HSRC (1988) (Wordnet, Termnet, Docnet, Transnet, Ailang, etc.)
PanSALB and DACST (now DAC) initiated the HLT project in 1999
The former Minister of DACST appointed a panel of experts to investigate the establishment of an HLT virtual network
The HLT task team concluded that an HLT National Facility should be established
The developers of the envisaged HLT National Facility should ensure that HLT advance multilingualism in different respects, i.e.:

16 Key government documents in the languages the citizens can understand best
Electronic systems to connect lexicographers and terminologists with other language practitioners
Electronic systems to disseminate lexicographical and terminological data
Electronic systems to connect translators and other language workers with word and term banks
Central government assistance to meet communication needs of all its citizens
Local and provincial governments to serve as focal points of information dissemination (e.g. multipurpose community centres)

17 3. The South African HLT initiative
3.2 National Facility
Purpose of HLT project:
–to fast track the use and development of indigenous languages
–to promote the SA government’s policy of multilingualism
–to facilitate better service delivery for citizens to access or supply information in any of the official languages

18 Basic premises for the development of HLT:
–development and effective management of reusable text and speech resources in all official languages of SA;
–capacity building with respect to research and development in the field of HLT; and
–stimulation of an HLT industry that will provide language-based electronic products which, in turn, will be applicable in all relevant sectors, especially in the government sector.

19 3.3 SA Human Language Technologies Model
The South African HLT model is based on a model being implemented by the European Union (EU)
The EU model is effectively implemented in the EU Framework Programmes (FP 3/4/5/6)
The South African HLT model will grow exponentially as expertise and resources are developed

21 3.3.1 Aims of envisaged HLT virtual network
An e-government process needs to provide citizens with:
–Access to online facilities
–Required and necessary service delivery
–Infrastructure to make it work
Two basic prerequisites are:
–A technical infrastructure (IT access; proven and multipurpose IT systems; online language services)
–Human capital (capacity building, e.g. trained and reskilled language practitioners)

22 3.3.2 Identified needs:
Low general awareness level regarding HLT benefits
Interdisciplinary curricula at tertiary level to advance HLT development
Systematic presentation of short dedicated HLT courses
Theoretical and practical training in the fields of lexicography and terminology
Job creation should be carefully planned
Upgrade and maintain a knowledge base on HLT

23 3.3.3 Proposed three-step strategy for development of HLT model:
Step 1: Applied research and capacity building, production of language resources, development of enabling technologies and of an HLT industry.
Step 2: Development of a legal framework to ensure systematic acquisition, administration and conservation of electronic language resources.
Step 3: Development of an infrastructure to manage the implementation of the proposed HLT model.

25 3.3.4 Role players
Government services: national, provincial and local (e.g. e-government, e-learning, e-commerce, etc.)
Parastatal institutions (e.g. PanSALB)
Private sector
Academia (tertiary education)
Education (primary and secondary education)

26 3.3.5 Progress
Parsing (Zulu and other African languages) by Special Interest Group (SiG), African Languages Association of Southern Africa (ALASA)
Speech recognition (Tourism: pilot booking service)
Amalgamated Banks of South Africa (ABSA) multilingual pilot project: ATM screen prompts and telephone banking prompts in African languages (Zulu, Xhosa and South Sotho)

27 Progress...
TISSA (Telephone Interpreting Service of South Africa) (all ports of entry; health services; police charge offices; etc.)
Spellcheckers: Afrikaans developed by North-West University; African Languages by University of Pretoria/North-West University; future development combined effort
Microsoft human/machine interface: combined effort re terminology development
Afrilingo: e-learning tool for language acquisition (11 official SA languages)

28 Progress...
TshwaneLex: dedicated computer software program for data capturing (lexicography)
11 National Lexicography Units (NLUs) of PanSALB: Monolingual dictionaries for each of the 11 official South African languages
NLUs: Data collection and building of corpora
NLUs: on-line dictionaries (e.g. Afrikaans, Northern Sotho (Sesotho sa Leboa))
TshwaneTerm: dedicated computer software program for data capturing (terminology)??

29 Progress...
National term bank (multilingual, polythematic): Terminology Coordination Section (TCS) of the National Language Service (NLS), Department of Arts and Culture (DAC)
Latin terminology: interactive multilingual e-learning project (PanSALB, CLTAL, Trydian Interactive)
Mathematics on-line dictionary project: South African Multilingual Mathematical Lexicon (SA MML)

30 Lexicographical and Terminological information available on HLT virtual network
SA Government has approved the development of a human language technology (HLT) virtual network
All lexicography and terminology endeavours are to be part of the HLT virtual network
For multilingual words and terms to be available on the HLT virtual network to end-users (subject specialists, students, language practitioners, general public), dictionaries are needed!

31 4. New terminology training initiative from PanSALB:
Members of TCs, NLBs: Guidelines to verify and authenticate terms
Skills development: Language practitioners: terminologists, lexicographers (e.g. NLUs), translators, interpreters, linguists, teachers, journalists, language students, etc.
Skills development: subject specialists
Reskilling: Unemployed language workers

33 [Diagram; labels: Lexicography, School for Languages, Terminology, Statistics, Zoology, Psychology, NLUs, NLBs, PLCs, TCS, NLS, LUs]

34 5. Conclusion:
–Development of skills
–Enhancement of South African languages
–Development of languages into functional languages
–Dissemination of multilingual polythematic (speech and text) information within the South African community
–Better communication among all citizens in different spheres of life
–Improvement of computer literacy

35 “Utilising technology for the development of the South African languages and developing these languages for use with Human Language Technology applications such as spellcheckers, translation memories and speech-recognition systems will enhance the status of the indigenous languages and will result in increased job opportunities in the language field.”
Dr Ben Ngubane (former Minister of Arts, Culture, Science and Technology), 2003

