Hindi – Urdu Transliteration issues Rahmat Yousufzai and Amba Kulkarni
Urdu Alphabet ا ب پ ت ث ٹ ج چ ح خ د ذ ڈ ر ز ژ ڑ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن و ہ ھ ء ی ے
Characteristics of Urdu alphabet Most Urdu characters join with the following character to form a single ligature. Example: تکلیف (तक्लीफ). This Urdu word is a combination of five characters: ت ک ل ی ف. Urdu characters typically have different shapes in different positions: initial, medial and final.
Characteristics contd... Some characters do not join with the following character and are written in their full form even in a medial position: ا آ د ذ ڈ ر ز ژ ڑ و. Examples: مالک آم بدلہ تہذیب بڈھا برما بزنس ویژن کڑیل بولنا. Please note that there is no space inside these words, even where a character does not join.
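The joining behaviour described on these two slides can be sketched in code. This is a minimal illustration only, assuming the ten non-joining letters listed above and input without diacritic marks; the name `connected_runs` is hypothetical and does not come from the presentation or from any library.

```python
# Letters that never join to the FOLLOWING letter (from the slide):
# alif, alif madda, dal, zal, ddal, re, ze, zhe, rre, vav
NON_JOINERS = set("\u0627\u0622\u062F\u0630\u0688\u0631\u0632\u0698\u0691\u0648")


def connected_runs(word):
    """Split a word into visually connected runs of letters.

    A run ends after every non-joining letter, because that letter
    is written in full form and the next letter starts a new shape.
    """
    runs, current = [], ""
    for ch in word:
        current += ch
        if ch in NON_JOINERS:
            runs.append(current)
            current = ""
    if current:
        runs.append(current)
    return runs


# "malik": meem + alif | lam + kaf  -> two connected runs, no space between
print(connected_runs("\u0645\u0627\u0644\u06A9"))
# "takleef": no non-joining letter  -> a single connected run
print(connected_runs("\u062A\u06A9\u0644\u06CC\u0641"))
```

A word with several runs still contains no space character, which is why a break between runs cannot be told apart from a missing space by shape alone.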
Hindi Alphabet अ आ इ ई उ ऊ ऋ ए ऐ ओ औ अं अः क ख ग घ ङ च छ ज झ ञ ट ठ ड ढ ण त थ द ध न प फ ब भ म य र ल व ष श स ह
Consonants missing in Hindi ث ، ح ،خ ، ز ، ذ ، ص ، ض ، ط ، ظ ، ع ، غ ، ف ، ق These characters do not exist in Hindi. They are borrowed from Arabic and are used only in words borrowed from Arabic/Persian.
contd2 ث ص: Close to س in Urdu, with a minute difference in pronunciation. In Hindi, स is used for these characters as well as for س. ز ذ ض ظ: In Hindi, ज or ज़ is used for all these characters. However, most of the time ज is written without the dot.
Contd3 ح: Same as ہ in Urdu, with a minute difference in pronunciation. In Hindi, ह is used for both characters. ق: This character is represented by क़. خ: This character is represented by ख़. Normally, however, Hindi writers do not use the dot.
Contd4 ع: This is an Arabic consonant, close to ا (Alif) with some difference in pronunciation, which is transliterated into Hindi as one of the vowels. Example: عمل अमल عالم आलम علاج इलाज عید ईद عمر उम्र عود ऊद عےب ऐब عورت औरत
Contd5 If ع comes in the middle of a word or as its last character, most Urdu speakers pronounce it as Alif, but in poetry special care is taken to pronounce it correctly.
Contd6 ط and ت have similar pronunciations, and in Hindi त is used to represent both characters. ف: This character is the same as F in English. The sound is represented by फ or फ़, but Hindi writers often do not use the dot.
Contd7 غ: This sound also does not exist in Hindi. To represent this character, ग or ग़ is used. However, Hindi writers normally do not use the dot.
Contd8 ژ: This is a Persian character and does not exist in Arabic. In Hindi it is represented by ज or ज़. Example: Television ٹےلی ویژن टेलीविज़न
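The correspondences on the preceding slides amount to a many-to-one table from Urdu consonants to Devanagari, with the dot (nukta) as an optional refinement. The sketch below encodes only the pairs stated on those slides; the names `URDU_TO_DEVANAGARI`, `to_devanagari` and `urdu_sources` are hypothetical, and the code points chosen for the Urdu letters are an assumption about how the text is encoded.

```python
NUKTA = "\u093C"  # Devanagari sign nukta, the "dot" that Hindi writers often omit

# Urdu consonant -> Devanagari letter (with nukta where the slides give one)
URDU_TO_DEVANAGARI = {
    "\u062B": "स",          # se
    "\u0635": "स",          # svad
    "\u0633": "स",          # sin
    "\u0632": "ज" + NUKTA,  # ze
    "\u0630": "ज" + NUKTA,  # zal
    "\u0636": "ज" + NUKTA,  # zvad
    "\u0638": "ज" + NUKTA,  # zoe
    "\u0698": "ज" + NUKTA,  # zhe (Persian)
    "\u062D": "ह",          # badi he
    "\u06C1": "ह",          # gol he
    "\u0642": "क" + NUKTA,  # qaf
    "\u062E": "ख" + NUKTA,  # khe
    "\u0637": "त",          # toe
    "\u062A": "त",          # te
    "\u0641": "फ" + NUKTA,  # fe
    "\u063A": "ग" + NUKTA,  # ghain
}


def to_devanagari(urdu_letter, keep_nukta=True):
    """Map one Urdu consonant; drop the dot to mimic common Hindi usage."""
    dev = URDU_TO_DEVANAGARI[urdu_letter]
    return dev if keep_nukta else dev.replace(NUKTA, "")


def urdu_sources(dev_letter):
    """All Urdu letters that end up as the given undotted Devanagari letter."""
    return [u for u, d in URDU_TO_DEVANAGARI.items()
            if d.replace(NUKTA, "") == dev_letter]


print(to_devanagari("\u0632"))                    # ze with the dot
print(to_devanagari("\u0632", keep_nukta=False))  # ze as usually written
print(urdu_sources("ज"))                          # letters that collapse to one
```

Because several Urdu letters land on the same Devanagari letter, the reverse direction (Hindi to Urdu) cannot be decided from the character alone and would need a lexicon or context.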
Contd9 ھ: This is Do-chashmi He; it gives its sound only when joined with certain characters. An important point to note: Arabic has a character ھ (do-chashmi he) that retains its shape only in the initial and medial positions. In the final position it changes to ہ (gola he). Suggestion: For Urdu we should use ھ with Unicode code point U+06BE.
Contd10 ں: Noon without dot (Noon Ghunna). When this character comes in the middle of a word, the dot is marked. This creates ambiguity, as it can then be read as ن. Example: کانچ چھینٹا
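The suggestion to use U+06BE can be checked mechanically. The sketch below uses Python's standard `unicodedata` module to print the Unicode names of the relevant code points and to locate the generic Arabic heh in a string. Treating U+0647 as the "Arabic" character and U+06C1 as gola he is an assumption based on the Unicode character names, not something stated on the slide; the helper names are hypothetical.

```python
import unicodedata

ARABIC_HEH = "\u0647"     # generic Arabic heh (assumed to be the one to avoid)
DO_CHASHMI_HE = "\u06BE"  # the code point the slide recommends for Urdu
GOL_HE = "\u06C1"         # assumed code point for gola he


def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def arabic_heh_positions(text):
    """Indexes where the generic Arabic heh occurs in the text."""
    return [i for i, ch in enumerate(text) if ch == ARABIC_HEH]


for ch in (ARABIC_HEH, DO_CHASHMI_HE, GOL_HE):
    print(describe(ch))
```

A data-entry check of this kind only reports positions; whether each occurrence should become do-chashmi he or gola he depends on the word.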
Urdu characters borrowed from Hindi ٹ ، ڈ ، ڑ ट, ड, ड़ These characters do not exist in Arabic or Persian. These have been borrowed from Hindi.
Certain Hindi characters representation in Urdu کھ ख, گھ घ, چھ छ, جھ झ, ٹھ ठ, ڈھ ढ, ڑھ ढ़, تھ थ, دھ ध, پھ फ, بھ भ. These Hindi characters are represented in Urdu by adding ھ (Do-Chashmi He) after the base character.
Contd2 رھ ، لھ ، مھ ، نھ ، ںھ، وھ، ےھ Hindi has no specific characters for these combinations. However, the sounds are represented as follows: र्ह, ल्ह, म्ह, न्ह, ँह, ंह, व्ह, ेह
Contd3 Example: تےرھواں ، کولھو ، تمھارا ، ننھا ، اوںھ، وھےل तेरहवाँ, कोल्हू, तुम्हारा, नन्हा, ऊँह, व्हेल. However, in Urdu these words are also written as تےرہواں ، کولہو ، تمہارا, whereas ننھا is written without change.
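Since each aspirated Hindi consonant corresponds to a two-character Urdu sequence, a transliterator has to look one character ahead. The following is a minimal sketch of that lookahead, assuming the eleven pairs from this slide and U+06BE for Do-Chashmi He; `fold_aspirates` is a hypothetical helper that leaves every other character untouched.

```python
DO_CHASHMI_HE = "\u06BE"

# Base Urdu consonant -> Devanagari aspirate used when Do-Chashmi He follows
ASPIRATES = {
    "\u06A9": "ख", "\u06AF": "घ", "\u0686": "छ", "\u062C": "झ",
    "\u0679": "ठ", "\u0688": "ढ", "\u0691": "ढ\u093C",
    "\u062A": "थ", "\u062F": "ध", "\u067E": "फ", "\u0628": "भ",
}


def fold_aspirates(text):
    """Tokenise text, merging 'consonant + Do-Chashmi He' into one aspirate.

    Characters outside the table are passed through unchanged, so a
    Do-Chashmi He after any other letter stays a separate token.
    """
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt == DO_CHASHMI_HE and ch in ASPIRATES:
            tokens.append(ASPIRATES[ch])
            i += 2
        else:
            tokens.append(ch)
            i += 1
    return tokens


print(fold_aspirates("\u0628\u06BE"))  # be + Do-Chashmi He -> one aspirate
```

Sequences whose base letter is not in the table are deliberately left alone here, because they need a conjunct or vowel-sign treatment rather than a single Devanagari letter.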
Ambiguous Characters 1. alif (ا) and Ena (ع): These two characters are pronounced differently, but Urdu speakers do not pay attention to the difference. Example: عام (आम, common), آم (आम, mango)
Contd2 2. sa: se (ث), sIna (س), svAda (ص). se (ث) and svAda (ص) are purely Arabic and Persian characters and are used only in Arabic or Persian words, whereas sIna (س) is used in Hindi, Urdu, Persian and Arabic. All three characters are written as स in Hindi.
Contd3 3. Ta: Te (ت), Toya (ط). Toya (ط) is a purely Arabic and Persian character and is used only in Persian and Arabic words. Both characters are written as त in Hindi.
Contd4 4. he: badI he (ح) and gola he (ہ). badI he (ح) is an Arabic/Persian character. gola he (ہ) is common to Arabic, Persian, Urdu and Hindi. Hindi equivalent for both: ह
Contd5 5. ja: jAla (ذ), je (ز), jvAda (ض), joya (ظ), PArasI je (ژ). The Z sound is not available in Hindi or in almost any other Indian language; j is used instead. The sounds of jAla (ذ), je (ز), jvAda (ض) and joya (ظ) are almost the same, and all are Arabic/Persian characters. PArasI je (ژ) is a purely Persian character and is not available even in Arabic. For all the above characters ज or ज़ is used. Most of the time the dot below is not written.
Contd2 It may be noted that if the character after अं is प, फ, ब, भ or म, then it is pronounced as "अम"; otherwise it is pronounced as "अन". Examples: 1. अंबाला, अंभोज, अंमर انبالہ ، انبھوج ، عنبر 2. चंपा, कंफू, बंबू, गंभीर, संमान 3. अंक, अंग, अंतर, अंधा, अंडा
Contd3 Also, when अं comes as the last character of a word, it gives the sound of "अम". Example: अहं, स्वयं, बालकं اہم ، سویم ، بالکم. Interestingly, the same rule of प, फ, ब, भ, म applies in Urdu also, but the character م is mostly used in proper nouns and English words. Example: अंबानी, अंफल, अंपायर امبانی ، امپھل ، امپائر
Contd4 अः This is also a combination of अ and ः. It does not come as the first character of a word; however, it comes in the middle and at the end. Example: अंतःप्रवेश, अंतःकरण, प्रायः, अतः انتہ پروےش ، انتہ کرن ، پرایہ ، اتہ
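The two rules for the anusvara (before प, फ, ब, भ, म and at the end of a word it sounds like "m", otherwise like "n") are easy to state as code. This is a minimal sketch of exactly those two rules as given on the slides; `anusvara_sounds` is a hypothetical name, and the function reports pronunciation only, not the Urdu spelling.

```python
ANUSVARA = "\u0902"        # Devanagari sign anusvara
LABIALS = set("पफबभम")     # the five letters named on the slide


def anusvara_sounds(word):
    """Return 'm' or 'n' for every anusvara in a Devanagari word."""
    sounds = []
    for i, ch in enumerate(word):
        if ch != ANUSVARA:
            continue
        nxt = word[i + 1] if i + 1 < len(word) else None
        if nxt is None or nxt in LABIALS:
            sounds.append("m")   # word-final, or followed by a labial
        else:
            sounds.append("n")
    return sounds


for w in ("च\u0902पा", "अ\u0902क", "स्वय\u0902"):
    print(w, anusvara_sounds(w))
```

The function returns one entry per anusvara, so a word containing two of them yields a two-element list.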
Contd6 ङ, ञ, ण These characters do not exist in Urdu; ن is used instead. Example: वाङ्मय, पिङ्गला وانگ مے ، پنگلا; चञ्चल ( चञ् चल ), गुञ्जन ( गुञ् जन ) چنچل ، گنجن; कारण, कणक کارن ، کنک
Contd7 ष, श These two characters have almost the same sound, with a minute difference. In Urdu there is only one character, ش, to express this sound. Example: षष्ठी, आकर्षण, पुरुष ششٹھی ، آکرشن ، پرش; शरीर, मुश्किल, किशमिश شریر ، مشکل ، کشمش
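In the Hindi to Urdu direction the same kind of collapse happens: several Devanagari letters share one Urdu letter. The sketch below groups the letters named on these two slides by their Urdu target. Including न alongside ङ, ञ and ण is an addition made here for illustration, and the names `HINDI_TO_URDU` and `collapse_groups` are hypothetical.

```python
NOON = "\u0646"   # Urdu noon
SHEEN = "\u0634"  # Urdu sheen

HINDI_TO_URDU = {
    "ङ": NOON, "ञ": NOON, "ण": NOON,
    "न": NOON,   # plain na, added here for comparison (not on the slide)
    "ष": SHEEN, "श": SHEEN,
}


def collapse_groups(mapping):
    """Group Hindi letters by the single Urdu letter they are written with."""
    groups = {}
    for hindi, urdu in mapping.items():
        groups.setdefault(urdu, []).append(hindi)
    return groups


for urdu, hindi_letters in collapse_groups(HINDI_TO_URDU).items():
    print(urdu, "<-", " ".join(hindi_letters))
```

Every group with more than one member marks a place where Urdu to Hindi transliteration has to choose among candidates.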
Diacritic marks in Urdu Urdu has no matras like those of Hindi. It has some diacritic marks, but they are used only in elementary books.
Contd2 Zabar (ـَ): This is placed above the character to indicate a consonant with अ. Ex: بَ ब. Zer (ـِ): This is placed below the character to indicate the vowel इ or ए. Ex: بِ बि, बे
Contd3 Pesh (ـُ): This is placed above the character and adds the sound of उ to the character on which it is applied. Ex: بُ बु. Jazam (ـْ): Equivalent of the Halant in Devanagari. Ex: بْ ब्, شبْد शब्द
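When the diacritic marks are present, the four marks described so far map directly onto Devanagari signs. The sketch below covers only zabar, zer, pesh and jazam and only the three consonants needed for the slide's examples; the tables and the name `vocalize` are hypothetical, and zer is mapped to इ by default even though the slide notes it can also stand for ए.

```python
# Only the consonants used in the slide's examples
BASE = {"\u0628": "ब", "\u0634": "श", "\u062F": "द"}

MARKS = {
    "\u064E": "",        # zabar: inherent अ, nothing to add in Devanagari
    "\u0650": "\u093F",  # zer  -> vowel sign i (could also be e)
    "\u064F": "\u0941",  # pesh -> vowel sign u
    "\u0652": "\u094D",  # jazam -> halant (virama)
}


def vocalize(text):
    """Convert fully marked Urdu text (tiny alphabet) to Devanagari."""
    out = []
    for ch in text:
        if ch in BASE:
            out.append(BASE[ch])
        elif ch in MARKS:
            out.append(MARKS[ch])
        else:
            raise ValueError(f"character U+{ord(ch):04X} is not in this sketch")
    return "".join(out)


print(vocalize("\u0628\u064F"))              # be + pesh
print(vocalize("\u0634\u0628\u0652\u062F"))  # sheen, be + jazam, dal
```

In running Urdu text the marks are normally absent, so a lookup like this is only usable on fully vocalised input such as elementary primers.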
Contd4 Tashdeed (ـّ): This is used for doubling a consonant, as in ہلّا گلّا، دھبّا हल्ला गुल्ला, धब्बा. Do zabar (ـً): This is placed above the last character Alif (ا) and gives the sound of n. It may be noted that the character just before the Alif should carry Zabar. Example: فوراً फ़ौरन (the character ر carries Zabar, but normally it is not written)
Contd5 Do zer (ـٍ): This is placed below the last character Alif (ا) and gives the sound of n. It may be noted that the character just before the Alif should carry zer, but normally it is not written. Example: نسلاً بعد نسلاٍ नस्लन बाद नस्लिन (the character ل carries Zer). Ulta pesh: This is placed above the character and gives the sound of oo (ऊ). Example: بعدہ مالہ बादहू मालहू
Contd6 Khada zabar (ـٰ): This adds the sound of Alif to the character on which it is applied. It is mostly placed on Choti ye and Badi ye (ی ، ے), and its effect, i.e. the Aa sound, is transferred to the character preceding the Choti ye or Badi ye. It also occurs in the middle of a word, where the Aa sound is added to the character on which it is applied. Example: اعلیٰ ، مصطفےٰ ، ہٰذا आला, मुस्तफ़ा, हाज़ा
Contd7 Other Diacritic marks are pesh and khada zer.
Gender For most words derived from Hindi/Indian languages, the gender does not change between Urdu and Hindi. But for some words borrowed from Arabic or Persian, the gender changes. Examples: vyavastha (feminine), انتظام (masculine); aakarshan (masculine), کشش (feminine); prakash (masculine), روشنی (feminine)
Compound Words In Urdu, two words can be joined together (as in English, where the apostrophe joins words and gives the sense of "of"). In Urdu, the words are joined by "Izaafat". There are three types of Izaafat.
Contd2 1. Zer (ـِ) is added after the first word, and then the second word is written. Example: دردِ دل. Most importantly, there has to be a space after the zer; otherwise the words may join together and become difficult to read correctly. Example: شاخِ گل. If the space is not given, the word will appear like this: شاخِگل
Contd3 2. If the last character of the first word is "he" or "choti ye", then Hamza is added after the first word. Example: نغمۂ آب، گرمیٔ عشق
Contd4 3. If the last character of the first word is alif or wav then "Hamza Badi ye" is added after the first word. Example: اداۓ خاص، بوۓ گل
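The three Izaafat rules depend only on the last character of the first word, so they can be sketched as a small function. This is an illustration under stated assumptions: hamza is encoded as the combining U+0654, "Hamza Badi ye" as the single code point U+06D3, "he" as U+06C1 and "choti ye" as U+06CC. The slides do not specify code points, and `izafat` is a hypothetical name.

```python
ZER = "\u0650"            # rule 1
HAMZA_ABOVE = "\u0654"    # rule 2 (assumed encoding of the added hamza)
HAMZA_BADI_YE = "\u06D3"  # rule 3 (assumed encoding of "Hamza Badi ye")

HE, CHOTI_YE = "\u06C1", "\u06CC"
ALIF, VAV = "\u0627", "\u0648"


def izafat(first, second):
    """Join two non-empty words with the Izaafat marker chosen by rule 1-3.

    A space is always kept after the marker so the words do not fuse.
    """
    last = first[-1]
    if last in (HE, CHOTI_YE):
        marker = HAMZA_ABOVE
    elif last in (ALIF, VAV):
        marker = HAMZA_BADI_YE
    else:
        marker = ZER
    return first + marker + " " + second


print(izafat("\u062F\u0631\u062F", "\u062F\u0644"))  # dard + dil, rule 1
print(izafat("\u0628\u0648", "\u06AF\u0644"))        # bu + gul, rule 3
```

Keeping the space inside the function reflects the point made under rule 1: without it the two words would be shaped as one.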
Compound words without Izaafat In Hindi some words are normally written together, for example isaka, usaka, Taajmahal, but in Urdu they are written separately: اس کا، اس کا، تاج محل. It may be noted that if Tajmahal is written together, it becomes unreadable: تاجمحل
Typographical errors 1. In Urdu, when choti ye (ی) or badi ye (ے) comes as the last character of a word, it retains its original shape. But in the middle of a word the two are difficult to tell apart, because their appearance is the same in handwritten text and in word-processing packages such as Inpage. Unicode badi ye does not join with the following character. Example: کےلا کیلا
Contd2 In all Urdu packages this will be written as کیلا, which creates ambiguity and makes machine transliteration problematic.
Contd3 2. Certain Urdu characters do not join with the following character: ا،آ،د، ذ، ڈ، ر، ز، ژ، ڑ، و. Data entry operators often omit the space between two words when the first ends in one of these characters, because it is difficult for them to notice that the words have run together. As a result, the machine takes the two words as one and fails to process them.
Contd4 3. The diacritic marks are usually omitted in Urdu, and hence there is no visible difference between इस (اس) and उस (اس).
Ambiguity हवा ہوا (noun), हुआ ہوا (verb). In Urdu both words are written in the same way, but their meanings are different. In Hindi the spellings are also different.
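Both kinds of ambiguity above come down to one unmarked Urdu spelling standing for several Hindi words. The sketch below strips the short-vowel marks from a word and then looks up the Hindi candidates in a tiny table built from the two examples on these slides. The set of code points treated as diacritics (U+064B to U+0652 plus U+0670) and the names `strip_diacritics`, `CANDIDATES` and `hindi_candidates` are assumptions made for illustration.

```python
# Short-vowel and related marks: fathatan ... sukun, plus superscript alef
HARAKAT = {chr(c) for c in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text):
    """Remove the diacritic marks, as ordinary Urdu writing does."""
    return "".join(ch for ch in text if ch not in HARAKAT)


# Unmarked Urdu spelling -> Hindi words it may stand for (from the slides)
CANDIDATES = {
    "\u0627\u0633": ["इस", "उस"],
    "\u06C1\u0648\u0627": ["हवा", "हुआ"],
}


def hindi_candidates(urdu_word):
    """All Hindi readings recorded for a (possibly marked) Urdu word."""
    return CANDIDATES.get(strip_diacritics(urdu_word), [])


with_zer = "\u0627\u0650\u0633"   # alif + zer, sin
with_pesh = "\u0627\u064F\u0633"  # alif + pesh, sin
print(strip_diacritics(with_zer) == strip_diacritics(with_pesh))
print(hindi_candidates("\u06C1\u0648\u0627"))
```

Choosing among the candidates would need context such as part of speech, which is outside the scope of this sketch.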