Free Pascal compiler internationalisation
Rimgaudas Laucius
Institute of Mathematics and Informatics, Vilnius University
Lithuania


Introduction
Institute of Mathematics and Informatics, Informatics Methodology Department
– Software localisation
– Teaching of informatics and programming
– E-learning and standards
– Informatics terminology
Vilnius University
– Localisation course

Localisation in Lithuania
One of the four priorities emphasised in the strategic project for the development of the information society in Lithuania is: “to uphold the inheritance of Lithuanian language and culture implementing the information technologies and telecommunications”

Open Source in Lithuania
– The 2004 study “Open Source in Education” found that integrating open source software into education has a large positive economic and pedagogical effect
– Education requires high-quality, fully localised software
– Open source software is more flexible in terms of localisation

Free Pascal compiler
– Excellent open source compiler
– Works under all widely used operating systems: Windows, Linux and others
– Widely used: it has been used in international, Baltic and national Lithuanian Olympiads in informatics for a few years already
– Replacement for the obsolete Turbo Pascal system in Lithuanian schools

FPS

Compilers’ internationalisation
– Internationalisation is part of the software development process, so the internationalisation of development tools is very important
– Most contemporary software development tools are not internationalised enough
– Although this research was done on the Free Pascal compiler, most of the issues presented are common to most compilers

Programming language standards
– Internationalisation is related to programming language standards
– Pascal programming language standards
– Standards of other languages

Examples of internationalised compilers
– There are not many such examples
– One of the best-known internationalised programming systems is LOGO
– Vector Pascal

Structure of Free Pascal
– Free Pascal is a system made up of the compiler program itself and the run-time library (RTL)
– Compiler and RTL interaction: sometimes, to change the compiler, one also needs to change the RTL

Support of multilingual source code
– This is the first stage of compiler internationalisation
– There are many scripts which require more than an 8-bit character set

UTF-8 implementation
– Unicode ~ UTF-8
– Some utilities used by compilers do not support pure Unicode: Unicode characters may be treated as pairs of 8-bit characters (for example, U+0900 ~ 09 00, i.e. a tab followed by an end-of-string byte)
– Allows step-by-step implementation of lexical extensions
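The U+0900 case can be checked at the byte level. The sketch below is only an illustration in Python, not code from Free Pascal; the helper name `has_ascii_control` is made up for this example. It compares the 16-bit big-endian form of the character with its UTF-8 form and tests whether any byte would look like an ASCII control character to an 8-bit tool.

```python
# Compare how U+0900 is laid out as a 16-bit code unit versus UTF-8.
ch = "\u0900"

utf16 = ch.encode("utf-16-be")  # two bytes: 0x09 (tab) and 0x00 (NUL)
utf8 = ch.encode("utf-8")       # three bytes, all of them >= 0x80


def has_ascii_control(data: bytes) -> bool:
    """Return True if any byte would be read as an ASCII control character."""
    return any(b < 0x20 for b in data)


print(utf16.hex(), has_ascii_control(utf16))
print(utf8.hex(), has_ascii_control(utf8))
```

Because every byte of a multi-byte UTF-8 sequence has its high bit set, an 8-bit tool that only looks for ASCII delimiters passes such sequences through unchanged.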

Lexical extensions
– Strings
– Identifiers
– Directives
– Reserved words
– Operators
– Numbers

Strings
WideString implementation issues
– Compatibility with other systems
– Ambiguity
– Conversions between Unicode and other character sets

Ambiguity example
    procedure go(const s: WideString); begin ... end;
    procedure go(const s: String); begin ... end;
    begin
      Go('Hi');
    end.
Which of the overloaded procedures has to be called?

Unicode support layer
– The Unicode support layer wraps OS APIs in an OS-independent way
– Under Win9x it relies on the Microsoft Layer for Unicode (MSLU)

Identifiers
– Identifiers have to reflect the meaning of an object clearly and be easy to comprehend and memorise
– The best way to support this is to allow identifiers written in the programmer's native language
– Unicode Standard Annex #31: Identifier and Pattern Syntax
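As a rough illustration of an identifier rule in the spirit of UAX #31, the sketch below uses Python's built-in `str.isidentifier()`, which is based on the XID_Start and XID_Continue properties plus the underscore. This is Python's rule, not Free Pascal's, and the sample names and the helper `is_valid_identifier` are invented for the example.

```python
import unicodedata


def is_valid_identifier(name: str) -> bool:
    """Check a candidate name against Python's UAX #31-based identifier rule."""
    # Python normalises identifiers to NFKC before comparing them,
    # so the same normalisation is applied here before the check.
    return unicodedata.normalize("NFKC", name).isidentifier()


# Hypothetical Lithuanian-flavoured names, for illustration only.
for candidate in ["skaičius", "plotis2", "2plotis", "ilgis-1"]:
    print(candidate, is_valid_identifier(candidate))
```

A compiler adopting a similar rule would still have to decide on normalisation and on case-insensitive comparison, which Pascal requires for identifiers.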

Directives
– Names
– Parameters
  – Logical (ON, OFF)
  – Strings ({$warning Possible malfunctioning})
  – File names ({$includepath ..\inc})

Reserved words
Unification myth
– 13 similar programming languages were compared (Algol, Pascal, Modula, Ada, C, Java, …)
– Only ~3% of reserved words are the same
– 56% occur in only one particular language
Unambiguous translation is possible

Example of localised reserved words
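As an illustrative sketch (not the slide's original example), an unambiguous translation of reserved words can be modelled as a one-to-one lookup table applied to tokens. The Lithuanian spellings below are hypothetical and are not taken from the presentation or from any real compiler; `translate_keywords` is likewise an invented helper.

```python
# Hypothetical localised spellings; chosen only for illustration.
LOCALISED_KEYWORDS = {
    "programa": "program",
    "pradžia": "begin",
    "pabaiga": "end",
    "jei": "if",
    "tai": "then",
}

# A one-to-one table keeps the translation unambiguous in both directions.
assert len(set(LOCALISED_KEYWORDS.values())) == len(LOCALISED_KEYWORDS)


def translate_keywords(tokens):
    """Map localised reserved words to canonical ones; keep other tokens."""
    # Pascal reserved words are case-insensitive, so look up the lowercase form.
    return [LOCALISED_KEYWORDS.get(tok.lower(), tok) for tok in tokens]


print(translate_keywords(["pradžia", "x", ":=", "1", "pabaiga"]))
```

The lookup runs on tokens rather than raw text so that string literals and identifiers containing a keyword as a substring are left alone.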

Operators
– Unicode has all the mathematical symbols needed to express mathematical operations
Example:
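As an illustrative sketch (not the slide's original example), a lexer could accept Unicode mathematical symbols as alternative spellings of the ASCII operators. The mapping below is an assumption made for this example, and `to_ascii_operators` is an invented helper.

```python
# Assumed mapping from Unicode symbols to Pascal's ASCII operator spellings.
UNICODE_OPERATORS = {
    "\u2264": "<=",  # LESS-THAN OR EQUAL TO
    "\u2265": ">=",  # GREATER-THAN OR EQUAL TO
    "\u2260": "<>",  # NOT EQUAL TO
    "\u00d7": "*",   # MULTIPLICATION SIGN
}

_TABLE = str.maketrans(UNICODE_OPERATORS)


def to_ascii_operators(source: str) -> str:
    """Replace Unicode operator symbols with their ASCII equivalents."""
    # Simplification: this rewrites the whole text; a real lexer would do the
    # substitution per token so that string literals stay untouched.
    return source.translate(_TABLE)


print(to_ascii_operators("if a \u2264 b then c := a \u00d7 b"))
```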

Numbers
– There are various scripts to express decimal numbers
Example:
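As an illustrative sketch (not the slide's original example), the Unicode character database assigns a decimal value to every decimal digit in every script, so a lexer can read numbers written in different scripts with one routine. The helper `parse_decimal_digits` is invented for this example; `unicodedata.decimal` is the standard library call.

```python
import unicodedata


def parse_decimal_digits(text: str) -> int:
    """Convert a run of decimal digits from any script to an integer."""
    value = 0
    for ch in text:
        # unicodedata.decimal raises ValueError for non-digit characters.
        value = value * 10 + unicodedata.decimal(ch)
    return value


arabic_indic = "\u0661\u0662\u0663"  # Arabic-Indic digits one, two, three
devanagari = "\u0967\u0968\u0969"    # Devanagari digits one, two, three

print(parse_decimal_digits(arabic_indic))
print(parse_decimal_digits(devanagari))
```

A real lexer would probably also reject numbers that mix digits from several scripts, which this sketch does not check.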

Decimal separator
– USA, UK: ‘.’
– Most European countries: ‘,’
– Localising the separator may cause ambiguity; a solution needs to extend the syntax of numbers
– 25,88 – real number
– 25, 88 – two numbers
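One possible reading of the two examples is the rule "a comma directly followed by a digit is a decimal separator; a comma followed by a space separates two numbers". The sketch below assumes exactly that rule, which is an interpretation and not a documented Free Pascal syntax; `split_numbers` is an invented helper.

```python
import re

# Digits, optionally followed by a comma and more digits with no space between.
NUMBER = re.compile(r"\d+(?:,\d+)?")


def split_numbers(text: str) -> list:
    """Return the numeric tokens found in text under the assumed comma rule."""
    return NUMBER.findall(text)


print(split_numbers("25,88"))   # one real number
print(split_numbers("25, 88"))  # two integers
```

The rule makes whitespace significant, so an argument list written without spaces, such as `1,2,3`, would still be read differently from what the author intended.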

Punctuation
– Spaces: general U+0020, non-breaking U+00A0, ideographic U+3000, etc.
– Quotes: “English”, „Lithuanian“, etc.
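A lexer that wants to treat these variants uniformly could normalise them before tokenising. The sketch below is an assumption about how that might look, not something described in the presentation: every character in the Unicode "space separator" category (Zs) becomes U+0020, and a few typographic quotes become a plain ASCII quote. The names `QUOTE_MAP` and `normalise_punctuation` are invented.

```python
import unicodedata

# Typographic quotes mapped to a plain ASCII quote (illustrative choice).
QUOTE_MAP = {
    "\u201c": '"',  # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',  # RIGHT DOUBLE QUOTATION MARK
    "\u201e": '"',  # DOUBLE LOW-9 QUOTATION MARK
}


def normalise_punctuation(text: str) -> str:
    """Map all space separators to U+0020 and typographic quotes to ASCII."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Zs":
            out.append(" ")
        else:
            out.append(QUOTE_MAP.get(ch, ch))
    return "".join(out)


print(normalise_punctuation("\u201eLabas\u201c\u3000x\u00a0y"))
```

Applied blindly, this would also alter the contents of string literals, so a real implementation would restrict it to the parts of the source outside literals.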

Bi-directional text
– Bi-directional text is an issue of text representation, not of the compiler

Unicode file names support
– Handling files requires the OS API, so it has to be done via the RTL’s Unicode support layer
– Compilers have to use MSLU under Win9x

Input/Output
– File input/output requires additional support for Unicode encodings
– The Windows console does not support Unicode
  – It can be replaced, but is that the best solution?

Localisation framework
– Strings and other resources have to be externalised for easy localisation
– Localisation kits have to be prepared
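A minimal sketch of what externalised strings can look like: messages live in per-language catalogues keyed by an identifier, and the program only refers to the key. The catalogue contents, the Lithuanian wording, and the `message` helper are all hypothetical; in practice the catalogues would be loaded from separate resource files rather than defined in code.

```python
# Hypothetical message catalogues; in a real system these would be files.
CATALOGUES = {
    "en": {"syntax_error": 'Syntax error, "{expected}" expected'},
    "lt": {"syntax_error": 'Sintaksės klaida, tikėtasi "{expected}"'},
}

DEFAULT_LOCALE = "en"


def message(locale: str, key: str, **params) -> str:
    """Look up a message by key, falling back to the default locale."""
    catalogue = CATALOGUES.get(locale, {})
    template = catalogue.get(key, CATALOGUES[DEFAULT_LOCALE][key])
    return template.format(**params)


print(message("lt", "syntax_error", expected=";"))
print(message("de", "syntax_error", expected=";"))  # falls back to English
```

Keeping placeholders such as `{expected}` named rather than positional lets translators reorder them to fit the grammar of the target language.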

Questions? Thank you Contact