Presentation on theme: "Enabling Mobile & Wireless Handheld Devices with Indian Languages"— Presentation transcript:
1 Enabling Mobile & Wireless Handheld Devices with Indian Languages
Welcome
Swapnil Belhe, Team Lead, C-DAC, GIST, Pune
2 Telecom Subscribers Base in India
Ref: TRAI Press Release, Dec 2011
70% of India's population lives in rural areas, ~800 million [Census 2001]
3 Future Growth
- Language-enabled mobile devices
- Big and easy-to-read displays
- Multi-modal interactions (keyboard, pen, speech, etc.)
- Indian language content and applications
"...What matters most about a new technology is not how it works, but how people use it and the changes it brings about in human lives..." Frances Cairncross
4 CDAC-GIST: IL Display Solution Scalability
Indian Language Ecosystem on All Devices
- Public display systems
- Printers with Indian language support
- Mobiles (feature phones and smartphones)
- Tablets
- Set-top boxes
- Lifts, washing machines, microwave ovens, etc.
5 What is required from Handset?
Mobiles and PDAs are an important means of communication today. We believe the end user expects at least the following text-based Indian language components from a mobile or PDA:
- SMS editing
- Indian language menus
- Phonebook data
- Notes/Notepad/Word
- Indian language based games
- Browser supporting Indian languages
- Multi-modal input, e.g. handwriting
- Text to Speech
- VAS (Value Added Services): alerts, reminders, mandi rates, farming tips, etc.
7 W3C MW4D IG 2009 Recommendations
Targeted at network operators:
- Implementing Unicode support for SMS on all networks
Targeted at handset manufacturers:
- Handsets should be extensible to support external/new character sets and to be usable in all languages of the world
- Handsets should provide software modules such as Text-to-Speech engines to improve accessibility and offer opportunity for a greater support of voice
8 Challenges in Wide Spread usage of Mobiles other than for Voice Calls
9 Mobile Handset Scenario - Past
- Early mobiles had small displays such as 64 x 64, 96 x 72, 128 x 128, etc.
- They had very little memory, measured in kilobytes (KB)
- This was sufficient for languages like English, which need only 52 linear characters
- Many such legacy phones are still in the market
10 Mobile Handset Scenario
- Nowadays, with the advent of better LED and TFT displays, screen sizes have increased to 256 x 256, 512 x 420, 640 x 480
- Available memory is now in megabytes (MB)
- But the number of operating systems (OS) has also grown: Windows Mobile, Symbian, Android, Embedded Linux, etc.
Mobile Handset | Features | IL Communication through
Low End | SMS | Picture SMS
Mid Range | SMS, MMS, WAP, J2ME | SMS, MMS (image)
High End | SMS, MMS, WAP, J2ME, Browser | SMS,
11 Mobile Handset Scenario
Even though memory and display size have increased, seamless support for Indian languages on handsets still requires the following:
- Indian language keyboard for text input
- Rasterizer for displaying text
- Indian language layout engine
- Fonts
- Common storage format
Ideally, all these components should be backward compatible with legacy handsets.
12 Indian Language SMS
Current scenario: on most handsets, Indian language text appears garbled. Only a few compatible handsets display the text properly.
[Figure: Indian Language SMS; Garbled Characters]
13 Most mobile manufacturers support Indian languages, so where is the problem?
- Proprietary picture-SMS-based solutions: these require picture-enabled handsets to display, and the message size is limited to between 72x28 and 72x56, hence only a few words can be sent
- Different SMS encodings
- Keypads with Indian languages are available, but every vendor's keypad differs in:
  - Number of characters on each key
  - Choice of characters placed on different keys
  - Position of characters on a specific key
  - Height and width of characters displayed on the keypad
- Everyone is using different proprietary keyboard layouts
14 Fonts
There are three types of fonts:
- Bitmap fonts (used by low-end handsets)
- TrueType fonts (used by high-end handsets)
- OpenType fonts (currently not widely used)
15 Issues - Fonts
Every handset model differs from the others in terms of:
- Screen resolution
- Screen color depth
- Screen technology (especially display pitch)
- Available memory
Thus a bitmap font designed for one handset model may not be readable on other handsets. Hence the fonts have to be custom designed to the specifications of every model.
For TrueType fonts, the display is governed by the handset's operating system (Symbian, Windows Mobile, Android, etc.) and the availability of an Indic layout mechanism.
16 What is Layout Mechanism?
A layout mechanism allows proper display of Indian language text.
Without a layout mechanism, re-ordering of text will not happen.
[Figure: example of text displayed without re-ordering]
Layout provides basic facilities such as:
- Text re-ordering
- Bold, italics
- Insert, delete characters
- Cursor movement
- Word wrapping
- Text scrolling
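To make "text re-ordering" concrete: in Devanagari the vowel sign I (U+093F) is stored after its consonant but drawn to its left. The sketch below is only an illustration under simplifying assumptions, not a layout engine: the function name `visual_order` is hypothetical, the input is assumed to be a single syllable already in storage order, and only this one vowel sign is handled.

```python
# Minimal sketch of one re-ordering rule, not a full Indic layout engine.
# Assumption: the input is a single Devanagari syllable in logical (storage) order.
PRE_BASE_MATRA = "\u093F"  # DEVANAGARI VOWEL SIGN I, drawn to the left of its consonant


def visual_order(syllable):
    """Return the code points in the left-to-right order they are drawn."""
    if PRE_BASE_MATRA in syllable:
        rest = syllable.replace(PRE_BASE_MATRA, "", 1)
        return PRE_BASE_MATRA + rest
    return syllable


def describe(text):
    """List code points as U+XXXX strings so the order is visible."""
    return [f"U+{ord(ch):04X}" for ch in text]


logical = "\u0915\u093F"  # KA followed by VOWEL SIGN I, as stored
print("stored :", describe(logical))
print("drawn  :", describe(visual_order(logical)))
```

A handset that draws the stored order directly places the vowel sign on the wrong side of the consonant, which is one form of the garbling described earlier.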
17 Other Issues/Challenges for Mobiles
Indic Languages: one script, many languages
Covering only 10 languages may not support all 22 official languages. Some of the languages are written in more than one script, and thus there is a dependency at the implementation level. E.g.:
- Sindhi can be written using Devanagari and Perso-Arabic
- Santhali can be written using Devanagari and Ol Chiki script
- Manipuri can be written using Bangla and Meetei Mayek
18 Other Issues/Challenges for Mobiles
Viewing Indian language websites, sending messages from a PC, and displaying files created on a PC all run into issues such as:
- Lack of ZWJ/ZWNJ support on handsets
19 Explicit Virama: Halant is a dead consonant in the 1st case
20 Half Explicit consonant: Example of usage of ZWJ character
Devanagari example:
क + ् + ष = क्ष
क + ् + ZWJ + ष = क्‍ष
Kannada example:
ಕ + ್ + ಷ = ಕ್ಷ
ಕ + ್ + ZWJ + ಷ = ಕ್‍ಷ
The ZWJ in the above examples prevents the conjunct ligature. In Devanagari it requests the half form of Ka; the explicit Virama form is requested with ZWNJ instead.
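The sequences above can be inspected at the code point level. The following is a small sketch, not handset code: the helper names `code_points` and `strip_joiners` are hypothetical, and the comments describe what the Unicode Standard says each Devanagari sequence requests. What actually appears on screen still depends on the font and layout engine.

```python
import unicodedata

KA, VIRAMA, SSA = "\u0915", "\u094D", "\u0937"
ZWJ, ZWNJ = "\u200D", "\u200C"  # zero width joiner / non-joiner

conjunct = KA + VIRAMA + SSA                 # requests the conjunct ligature
half_form = KA + VIRAMA + ZWJ + SSA          # requests half-form Ka, then Ssa
explicit_virama = KA + VIRAMA + ZWNJ + SSA   # requests Ka with a visible virama


def code_points(text):
    """List each character as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def strip_joiners(text):
    """Simulate a pipeline that drops ZWJ/ZWNJ: the author's intent is lost."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")


for label, seq in [("conjunct", conjunct),
                   ("half form", half_form),
                   ("explicit virama", explicit_virama)]:
    print(label)
    for line in code_points(seq):
        print("   ", line)

# Without joiner support, all three sequences collapse to the same string.
print(strip_joiners(half_form) == conjunct)
print(strip_joiners(explicit_virama) == conjunct)
```

This is why a handset or encoding without ZWJ/ZWNJ support cannot preserve the distinction between these forms when text moves between PC and mobile.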
21 Challenges in Overall Framework
Stakeholders:
• Mobile Subscribers
• Handset Manufacturers
• SMSC Vendors
• Mobile Operators
• Content Providers
22 Need Standardization
- SMS: the 3GPP TS standard for sending and receiving SMS, and its versions, are primarily made for English and European scripts. Work has been started by CeWIT.
- USSD: GSM (ETSI TS ), GSM (ETSI TS ). No work has started for Indian languages.
- CBS: 3GPP TS SABP Standard
- CDMA: TIA/EIA IS-824
The above standards, and many others, describe SMS protocols, trigger alerts, news broadcast, etc. At present these standards do not cover Indian languages.
Supporting Indian language USSD & CBC will allow emergency disaster alerts and other e-governance alerts to be sent or broadcast to handsets.
23 Cell Broadcast (CB)
- It is the most important protocol that is overlooked and urgently requires Indianization.
- Cell Broadcast is a genuine one-to-many, geographically focused messaging service.
- Cell Broadcast is ideal for delivering local or regional information suited to all the people in that area, such as hazard warnings, local weather, health concerns (such as Swine Flu outbreaks), flight or bus delays, tourist information, and parking and traffic information.
- Regardless of network state (congested or not), CB is always available.
- CB is a mature system that has been around for over a decade and is robust enough to support national public warning systems.
- There is no cost to the subscriber to receive the message.
[ref: W3C MW4D 2009]
24 Encoding Scheme
The 3GPP TS GSM standard supports 7-bit default alphabets (and their octets) and UCS2.
Possible schemes include any of the following encodings:
- 7-bit Default GSM Alphabet
- UCS2
- 7-bit EA-ISCII
25 Encoding Scheme for SMS
- The complexity of Indian scripts requires more characters to be entered than English
- 7-bit GSM: supports the Latin character set
- UCS-2: supports all languages of the world
- The cost of an SMS becomes high with UCS-2
But considering its advantages, it should be made mandatory for all service providers and SMSCs to support UCS-2 without escalating the cost, in order to promote the use of Indian languages.
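The cost difference follows from the fixed 140-octet SMS payload: it holds 160 characters at 7 bits each but only 70 at 16 bits each. The sketch below estimates how many SMS parts a message needs. The function names are illustrative, and it assumes the common 6-octet concatenation header for multi-part messages, which leaves 153 (7-bit) or 67 (UCS-2) characters per part.

```python
import math

SMS_PAYLOAD_OCTETS = 140  # user data available in a single SMS
UDH_CONCAT_OCTETS = 6     # assumption: 6-octet concatenation header per part


def ucs2_units(text):
    """Number of 16-bit code units needed for the text."""
    return len(text.encode("utf-16-be")) // 2


def ucs2_segments(text):
    """SMS parts needed when the message is sent as UCS-2."""
    units = ucs2_units(text)
    if units <= SMS_PAYLOAD_OCTETS // 2:                      # 70 per single SMS
        return 1
    per_part = (SMS_PAYLOAD_OCTETS - UDH_CONCAT_OCTETS) // 2  # 67 per part
    return math.ceil(units / per_part)


def gsm7_segments(char_count):
    """SMS parts for char_count basic GSM 7-bit characters (no extension table)."""
    if char_count <= SMS_PAYLOAD_OCTETS * 8 // 7:                      # 160
        return 1
    per_part = (SMS_PAYLOAD_OCTETS - UDH_CONCAT_OCTETS) * 8 // 7       # 153
    return math.ceil(char_count / per_part)


hindi = "\u0928\u092E\u0938\u094D\u0924\u0947 " * 20  # "namaste " repeated 20 times
print(ucs2_units(hindi), "code units ->", ucs2_segments(hindi), "SMS parts")
print(len("namaste " * 20), "Latin chars ->", gsm7_segments(len("namaste " * 20)), "SMS part(s)")
```

Because every consonant, vowel sign and virama is a separate code unit, an Indian language message of the same spoken length typically needs more units than its English counterpart, which compounds the 70-versus-160 limit.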
26 Encoding Scheme
- Efforts are underway to add Indic language enhancements to 3GPP for sending SMS.
- But they do not cover support for the ZWJ and ZWNJ characters. Hence Indian language websites will not be pleasant to read.
- They also do not cover layout of Indian text, which is crucial for common display and common storage of text across all handsets.
- Hence, even if this standard is implemented, it will fall short of the goal of reaching out to more people.
27 Recommended Best Practices For Indian Languages on Mobile
Parameter: Usable Screen Width, 120 pixels minimum
- For Indian language text, to fit a complete valid word within the limited width of 120 pixels, the text height should not exceed 16 pixels.
- With a taller font, the effective width of a word or syllable may exceed 120 pixels, so the complete word or syllable cannot be displayed without panning.
Breaking/Wrapping of the text
- Additional guidelines are to be provided for breaking the text at the word level or at the syllable level. This depends on the font size and display size.
- Guidelines for a hyphenation mechanism are to be provided for breaking words to enable text wrapping.
28 Line Wrapping for Indian Scripts
[Figure panels: Wrong Line Wrapping; Syllable Level Wrapping With Hyphenation; Word Level Wrapping With Hyphenation]
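The difference between wrong and syllable-level wrapping can be sketched in code. This is a simplified illustration, not a recommended algorithm: the names `syllables` and `wrap` are hypothetical, the syllable rule covers only Devanagari (combining marks, virama, ZWJ/ZWNJ), and line width is counted in syllables as a stand-in for measured pixel width.

```python
import unicodedata

VIRAMA = "\u094D"              # Devanagari virama (halant)
JOINERS = {"\u200C", "\u200D"}  # ZWNJ, ZWJ


def syllables(word):
    """Split a Devanagari word into syllable clusters (simplified rule)."""
    clusters = []
    for ch in word:
        attaches = bool(clusters) and (
            unicodedata.category(ch).startswith("M")   # vowel signs, virama, etc.
            or ch in JOINERS
            or clusters[-1][-1] == VIRAMA              # consonant after a virama
            or (clusters[-1][-1] in JOINERS and clusters[-1][-2:-1] == VIRAMA)
        )
        if attaches:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def wrap(text, max_units):
    """Greedy word wrap; overlong words are hyphenated at syllable boundaries."""
    if max_units < 2:
        raise ValueError("max_units must leave room for a syllable and a hyphen")
    lines, current, used = [], "", 0
    for word in text.split():
        parts = syllables(word)
        needed = len(parts) + (1 if current else 0)  # +1 for the space
        if used + needed <= max_units:
            current += (" " if current else "") + word
            used += needed
            continue
        if current:
            lines.append(current)
        while len(parts) > max_units:                # word longer than a line
            lines.append("".join(parts[:max_units - 1]) + "-")
            parts = parts[max_units - 1:]
        current, used = "".join(parts), len(parts)
    if current:
        lines.append(current)
    return lines


print(syllables("\u0915\u094D\u0937\u0924\u094D\u0930\u093F\u092F"))
print(wrap("\u0928\u092E\u0938\u094D\u0924\u0947 \u092D\u093E\u0930\u0924", 4))
```

Breaking at an arbitrary code point instead would separate a vowel sign or virama from its consonant, which corresponds to the "Wrong Line Wrapping" panel.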
29 Guidelines
Cursor Movement
- Guidelines should also be provided for movement of the cursor and deletion of a character/syllable.
- While editing text in Indian languages, the cursor position should change syllable by syllable instead of by individual character/vowel.
Deletion
- While deleting characters from the entered text, deletion should happen syllable by syllable, which reduces the burden of processing and redisplaying a half syllable. The 'Clear' key is to be used to delete the syllable next to the cursor position, and the 'Back' key to delete the syllable just before the cursor position.
URL
- With Indian Language Domain Names (IDN) such as .भारat likely to come, it will be necessary to provide this as a separate key (like .com) while typing in the browser URL bar.
- Also, for handsets sold in India, a .in key should be made available on all handsets.
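The 'Back' and 'Clear' behaviour described above could look roughly like the sketch below. It is an assumption-laden illustration: the function names are hypothetical, the cursor is an index into the stored code points, and the syllable rule is a simplified Devanagari-only one (combining marks, virama, ZWJ/ZWNJ).

```python
import unicodedata

VIRAMA = "\u094D"
JOINERS = {"\u200C", "\u200D"}  # ZWNJ, ZWJ


def joins_previous(text, i):
    """True if text[i] belongs to the same syllable as text[i - 1]."""
    if i <= 0 or i >= len(text):
        return False
    ch, prev = text[i], text[i - 1]
    if unicodedata.category(ch).startswith("M") or ch in JOINERS:
        return True
    if prev == VIRAMA:
        return True
    return prev in JOINERS and i >= 2 and text[i - 2] == VIRAMA


def syllable_start(text, cursor):
    """Index where the syllable ending at the cursor begins."""
    i = cursor - 1
    while i > 0 and joins_previous(text, i):
        i -= 1
    return max(i, 0)


def syllable_end(text, cursor):
    """Index just past the syllable that starts at the cursor."""
    i = cursor + 1
    while i < len(text) and joins_previous(text, i):
        i += 1
    return min(i, len(text))


def back_key(text, cursor):
    """Delete the whole syllable before the cursor; return (text, cursor)."""
    start = syllable_start(text, cursor)
    return text[:start] + text[cursor:], start


def clear_key(text, cursor):
    """Delete the whole syllable after the cursor; return (text, cursor)."""
    end = syllable_end(text, cursor)
    return text[:cursor] + text[end:], cursor


word = "\u0928\u092E\u0938\u094D\u0924\u0947"  # namaste: 6 code points, 3 syllables
print(back_key(word, len(word)))
print(clear_key(word, 0))
```

Moving the cursor left or right by one syllable can reuse `syllable_start` and `syllable_end` directly, so the cursor never lands inside a conjunct.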
30 Very Important points to achieve common Indian Language Support
Indic ecosystem
- All handset manufacturers must use the same fonts, the same layout engine and the same input scheme for all handset models.
- All content providers must use the same encoding scheme for sending/receiving SMSs, perhaps UCS-2, which is a global standard.
- Ideally, all of the above should be implemented by a single entity so that updating and maintenance will be easy, especially since Indian language computing is constantly evolving with the use of new standards like Unicode 5.2, 6.0, 6.1, etc.
31 Regulation & Certification
- The 3GPP specification states that the current work undertaken for including Indian languages in 3GPP TS is not intended to be implemented until a formal request is issued by the relevant national regulatory body.
- There should be an independent verifying and benchmarking agency that can endorse the compatibility of the latest equipment (SMSC, RNC, handsets, etc.) with the prescribed Indian standards.
- The verification and certification agency should have thorough knowledge of Indian language issues (all 22 languages) and a mobile computing background.
32 Challenges
Mobile Subscriber:
- Do I have to change to a mobile which supports IL?
- Is messaging in Indian languages costlier?
Handset Manufacturer:
- Will supporting Indian languages increase our handset cost?
- How do I upgrade our current handsets with IL?
SMSC Vendors:
- If a new encoding scheme comes, do we need to upgrade our SMSCs?
- The SMSC should be able to transcode or transliterate messages if the recipient's handset is incompatible
- Who will validate and certify our upgraded SMSC?
33 Challenges
Mobile Operator:
- Do we have to charge more for IL SMS, or will volumes reduce the cost?
- How do we educate people to use these IL features?
Content Provider:
- How do we make sure that the right content goes to the right customer?
- How do we send the same message to multiple handsets in multiple languages?
34 Challenges
- Currently, persons X and Y who both have Indian language support on their handsets may still be unable to exchange SMSs and instead see boxes on screen. This is because their service providers are different and use different protocols for transmission of SMSs.
- Many proprietary implementations of Indian languages have also hampered growth and seamless use across handsets.
- If a person changes handsets, the new handset may have a different input method, which also discourages Indian language input.