Centre For Indian Language Technology


Font Converters And Syntax Corrector
Hemant Patil, Center For Indian Language Technology, IIT Bombay, Powai
2/28/2019

MARATHI FONTS
Existing Marathi texts and web sites use many different fonts and encodings, e.g. shusha, dvtt-yogesh, akruti, subak, shivaji, devpooja.
No unique standard has been followed.
Encodings of glyphs are different in different fonts.

FONTS (continued…)
e.g. Word: Baart
Shusha: Baart
DvttYogesh: ¦É®iÉ

FONTS (continued…)
Disadvantages:
Search cannot be performed directly.
Information about each font must be stored in the database.
The user may have to use different fonts for typing one search string.

WHY CONVERTERS?
There must be some standard for representation.
Convert a string in a given font encoding to a standard encoding, and search only the standard encoding.
We have chosen ISCII coding as the common representation.
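The convert-then-search idea can be sketched roughly as follows. This is only an illustration, not the project's code: the font names `font_a` and `font_b`, their glyph tables, and the symbolic tokens such as `BHA` (standing in for ISCII codes) are all hypothetical.

```python
# Hypothetical per-font tables: glyph character -> standard symbol.
# The symbols are placeholders for ISCII codes, not real ISCII values.
TABLES = {
    "font_a": {"w": "BHA", "x": "AA", "y": "RA", "z": "TA"},
    "font_b": {"1": "BHA", "2": "AA", "3": "RA", "4": "TA"},
}


def to_standard(text, font):
    """Convert font-encoded text to a delimited string of standard symbols."""
    table = TABLES[font]
    # Leading and trailing "|" make substring matches fall on symbol boundaries.
    return "|" + "|".join(table[ch] for ch in text) + "|"


def search(query, query_font, documents):
    """Return ids of documents containing the query, compared in standard form.

    documents is a list of (doc_id, font, text) tuples.
    """
    needle = to_standard(query, query_font)
    return [doc_id for doc_id, font, text in documents
            if needle in to_standard(text, font)]


docs = [
    ("d1", "font_a", "wxyz"),
    ("d2", "font_b", "1234"),
    ("d3", "font_b", "43"),
]
matches = search("wx", "font_a", docs)
```

Here a query typed in one font also finds the document stored in the other font, because both are compared only after conversion.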

CONVERTERS (continued…)
Conversion from font to ISCII: many-to-one mapping.
Conversion from ISCII to font: one-to-many mapping.
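A minimal sketch of why the two directions differ, assuming a made-up glyph table: the glyph codes `g1` to `g5` and the symbols `KA`, `RA`, `TA` are hypothetical placeholders, not real font or ISCII values. Several glyphs collapse onto one standard symbol, so inverting the table gives a list of candidate glyphs per symbol.

```python
from collections import defaultdict

# Hypothetical font table: glyph code -> standard symbol (many-to-one).
FONT_TO_STANDARD = {
    "g1": "KA",
    "g2": "KA",  # variant glyph of the same letter
    "g3": "RA",
    "g4": "RA",  # variant glyph of the same letter
    "g5": "TA",
}


def invert(table):
    """Build the reverse table: standard symbol -> list of glyphs (one-to-many)."""
    reverse = defaultdict(list)
    for glyph, symbol in table.items():
        reverse[symbol].append(glyph)
    return dict(reverse)


STANDARD_TO_FONT = invert(FONT_TO_STANDARD)


def font_to_standard(glyphs):
    """Each glyph has exactly one standard symbol, so this is a plain lookup."""
    return [FONT_TO_STANDARD[g] for g in glyphs]


def standard_to_font(symbols):
    """Each symbol may have several glyphs; this sketch simply takes the first.

    A real converter would presumably pick the glyph from context.
    """
    return [STANDARD_TO_FONT[s][0] for s in symbols]
```

Converting to the standard form and back does not necessarily reproduce the original glyphs: `["g2", "g4", "g5"]` becomes `["KA", "RA", "TA"]`, which this sketch maps back to `["g1", "g3", "g5"]`.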

NEW CONVERTERS WRITTEN BY US
(Diagram labels: YOGESH, AKRUTI, ISCII, IITK, SHUSHA)

Converters Provided by IIIT, Hyderabad
(Diagram labels: SHREE962, ANKIT, XDVNG, ISCII, MILAP, YOGESH, SHUSHA)

Syntax checkers and correctors
Syntax checker for Akruti_Priya_expanded font
Syntax checker for ISCII
Autocorrector for Akruti_Priya_expanded font