Introduction to UNICODE (ஒருங்குறி)


Introduction to UNICODE (ஒருங்குறி) T.N.C.Venkata Rangan, Blog: www.venkatarangan.com

Introduction: Computers, at their most basic level, deal only with numbers. They store letters, numerals and other characters by assigning a number to each one. Before Unicode, text was stored in single 8-bit character sets, each limited to a maximum of 256 characters. No single encoding of that kind could contain enough characters to cover all languages, so hundreds of different encoding systems were developed for assigning numbers to characters.
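As a rough illustration of why many 8-bit encodings caused trouble, the sketch below decodes one and the same byte under three legacy code pages. The byte value 0xE9 and the three code pages are arbitrary choices for the example, not taken from the slides.

```python
# One byte value, three legacy 8-bit code pages, three different characters.
raw = bytes([0xE9])

legacy_views = {
    "latin-1": raw.decode("latin-1"),      # Western European
    "cp1251": raw.decode("cp1251"),        # Cyrillic (Windows)
    "iso8859_7": raw.decode("iso8859_7"),  # Greek
}

for codec_name, ch in legacy_views.items():
    print(f"{codec_name}: byte 0xE9 -> {ch!r} (U+{ord(ch):04X})")

# An 8-bit character set has at most 2**8 slots for characters.
MAX_8BIT_CHARACTERS = 2 ** 8
print("slots in an 8-bit character set:", MAX_8BIT_CHARACTERS)
```

Without knowing which code page was used to write the byte, a program cannot tell which of the three characters was meant.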

Encoding methods: ASCII (American Standard Code for Information Interchange); ISCII; TSCII (Thagutharam); TAM and TAB (Govt. of Tamilnadu); Unicode encoding.

Universal Character Encoding

Linguistic Diversity in India: According to Census 2001, India has 122 major languages and 2,371 dialects. One language, many scripts; many languages, one script. Of the 122 languages, 22 are constitutionally recognized. All 22 languages, including Tamil, have been represented and included in Unicode by TDIL, Govt. of India. Unicode has been declared the text encoding standard for all e-governance applications by the Govt. of India.

What is UNICODE? It provides a unique number for every character, regardless of platform, program or language. It is the globalization solution for scripts and languages, and it works in a simple and consistent manner. It is supported by other standards bodies, including ISO, W3C, IETF, ELRA and BIS, and is compatible with ISO 10646. Unicode is an encoding independent of font variations.
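A minimal sketch of the "unique number for every character" idea, using Python's built-in `ord` and the standard `unicodedata` module. The Tamil letter அ is just a sample character chosen for this example.

```python
import unicodedata

letter = "அ"  # sample character chosen for this sketch

# The number Unicode assigns to the character, and its official name.
code_point = ord(letter)
name = unicodedata.name(letter)
print(f"{letter} -> U+{code_point:04X} ({code_point}), {name}")

# The same code point can be serialized to bytes in different ways.
utf8_bytes = letter.encode("utf-8")
utf16_bytes = letter.encode("utf-16-be")
print("UTF-8 bytes: ", utf8_bytes.hex(" "))
print("UTF-16 bytes:", utf16_bytes.hex(" "))

# Decoding either byte sequence gives back the same character.
same_character = utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-be")
print("same character after decoding:", same_character)
```

The number assigned to the character stays the same on every platform; only the byte serialization (UTF-8, UTF-16 and so on) differs.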

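Unicode total numbers: 65,536 in the original 16-bit design, which corresponds to today's Basic Multilingual Plane; the full Unicode code space now spans 1,114,112 code points. About 107,000 characters are assigned (covering 90 scripts). Tamil: numbers 2944 to 3071 (U+0B80 to U+0BFF). Supported by Microsoft (the 'Latha' font), Linux and Apple. Numerous fonts are available, both free and proprietary. Application software is plentiful.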
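The Tamil range 2944 to 3071 quoted on this slide is the Unicode Tamil block, U+0B80 to U+0BFF in hexadecimal. The sketch below checks whether a character falls in that range and also shows that code points beyond 65,535 do not fit in a single 16-bit unit. The helper name `is_tamil` and the sample strings are made up for this example.

```python
TAMIL_BLOCK_START = 0x0B80  # decimal 2944
TAMIL_BLOCK_END = 0x0BFF    # decimal 3071


def is_tamil(ch: str) -> bool:
    """Return True if the single character ch lies in the Tamil block."""
    return TAMIL_BLOCK_START <= ord(ch) <= TAMIL_BLOCK_END


word = "ஒருங்குறி"  # the Tamil word for Unicode, from the slide title
print([f"U+{ord(ch):04X}" for ch in word])
print("all Tamil:", all(is_tamil(ch) for ch in word))
print("'U' is Tamil:", is_tamil("U"))

# A code point above 0xFFFF needs two 16-bit units (4 bytes) in UTF-16.
beyond_16_bit = "\U0001F600"
utf16_length = len(beyond_16_bit.encode("utf-16-be"))
print(f"U+{ord(beyond_16_bit):X} takes {utf16_length} bytes in UTF-16")
```

A range check like this works on individual code points; a single visible Tamil syllable is often made of two code points, a consonant followed by a vowel sign.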

Benefits of Unicode: data exchange; search engines, email and the Internet; searching in other languages; standardization; support services; a wide range of user programs; mobile phones.

Use in educational institutions: Tens of thousands of computers can be made ready for immediate information exchange. The computers in thousands of schools and colleges, and in all universities, can be made suitable for use in Tamil.

Use for the public: On websites, in email and on computers, documents and data created in Tamil can be searched, created and exchanged in Tamil itself. It also becomes possible to define ways of bringing together research, research papers, lessons and all other documents.

Use for the government: Documents created in Unicode can be read without any additional software or special fonts. Unicode provides a way for all Tamil documents to reach future generations safely.