2 Largely from: Developing Orthographies for Unwritten Languages, Michael Cahill and Keren Rice, eds.
3 What is orthography?
- A system for representing language in written form
  - Graphemes (individual characters)
  - Word breaks
  - Punctuation
  - Diacritics
  - Rules for splitting and hyphenation
  - Spelling
In language technologies, it can be perfectly acceptable to use an arbitrary intermediate symbol set (like Arpabet) because “only the computer knows.” With orthography development, we are entering the complex domain of human learning and identity.
4 Increased attention to orthographies
- Financial
  - Funding connected to literacy
- Humanitarian
  - UNESCO mother tongue education
- Technological
  - Unicode / font support
  - Cell phones, smart phones, messaging, etc.
5 An effective orthography is…
- Linguistically sound
- Acceptable to all stakeholders
- Usable
6 Acceptability: Governmental
- Are there national policies?
  - Tone markings disallowed in Ghana
  - Roman-based orthographies must draw from an official unified alphabet in Cameroon
- Is approval required?
  - CAR must have all new orthographies approved by the national government
  - Ethiopia has several possible agencies to seek approval from
7 Acceptability: Sociolinguistic
- Which dialect to use
  - Unilectal: which one to use? Prestige? Size? Age?
  - Multilectal: combine elements, but whose speech does it represent?
  - Differentiation: allow for levels of standardization
- Relationship with other languages
  - Sometimes desirable to look like another language
    - Similarity with a familiar or prestigious language
  - Sometimes desirable not to look like another
    - Motivated by rivalry, identity
- Choice of scripts
  - Cyrillic vs. roman for Serbian/Croatian
  - Arabic vs. roman for Tuareg
Fun fact: the Koni language of northern Ghana has an h/ng contrast: /HH AO G UH/ - /NG AO G UH/ ‘woman’
8 Usability: Learning
- Underrepresentation/Overrepresentation
  - Fewer/more graphemes than phonemes
- Transfer to major languages
  - Tension between literacy and identity
- Readability
  - Not too many similar characters
  - Consider fonts (sans serif easier to read)
- Testing, testing, testing
Underrepresentation example: using a 5-grapheme set for a rich vowel system; ignoring tone
Overrepresentation example: qu/c/k for /k/
How to decide on /CH/ in Ghana… <ch, c, ts, tSH, tsch, ky>
<p, d, b, q> share the same shape: how to explain that the letter is different if it is rotated? Not intuitive.
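The under/overrepresentation distinction on this slide can be checked mechanically once a draft orthography is written down as a phoneme-to-grapheme table. The following is a minimal sketch, not a tool from the source: the function name `representation_report` and the `SPELLINGS` table are hypothetical, and the inventory is invented for illustration (a seven-vowel system written with five vowel letters, plus the qu/c/k example for /k/).

```python
from collections import defaultdict

# Hypothetical draft orthography: phoneme -> set of graphemes used to write it.
# Seven vowel phonemes share five vowel letters; /k/ has three spellings.
SPELLINGS = {
    "i": {"i"},
    "e": {"e"},
    "ɛ": {"e"},
    "a": {"a"},
    "ɔ": {"o"},
    "o": {"o"},
    "u": {"u"},
    "k": {"qu", "c", "k"},
}


def representation_report(spellings):
    """Return (overrepresented, underrepresented).

    overrepresented:  phoneme -> sorted graphemes, where one phoneme
                      has more than one spelling.
    underrepresented: grapheme -> sorted phonemes, where one grapheme
                      stands for more than one phoneme.
    """
    phonemes_per_grapheme = defaultdict(set)
    for phoneme, graphemes in spellings.items():
        for grapheme in graphemes:
            phonemes_per_grapheme[grapheme].add(phoneme)

    over = {p: sorted(gs) for p, gs in spellings.items() if len(gs) > 1}
    under = {
        g: sorted(ps) for g, ps in phonemes_per_grapheme.items() if len(ps) > 1
    }
    return over, under


if __name__ == "__main__":
    over, under = representation_report(SPELLINGS)
    print("Overrepresented phonemes:", over)
    print("Underrepresented graphemes:", under)
```

A report like this only flags candidates for discussion. Whether a given mismatch is a problem is still a question for testing with readers, as the slide stresses.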
9 Usability: Production
- Unicode compliance
- Font rendering
- Non-digital printing (custom typewriter keys!)
- Entry method (taps, strokes, multi-step)
Multi-step: use Japanese as an example of first selecting the onset, then swiping to get the vowel, finally selecting from a kanji list
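One practical side of Unicode compliance is knowing whether a proposed grapheme exists as a single precomposed code point or has to be built from a base letter plus combining marks, since the latter depends more heavily on font rendering and input methods. The sketch below uses Python's standard `unicodedata` module. The helper name `describe_grapheme` and the two sample graphemes are illustrative assumptions, not examples from the source.

```python
import unicodedata


def describe_grapheme(grapheme):
    """Summarize how a candidate grapheme is encoded in Unicode."""
    nfc = unicodedata.normalize("NFC", grapheme)  # composed where possible
    nfd = unicodedata.normalize("NFD", grapheme)  # fully decomposed
    return {
        "nfc_length": len(nfc),
        "nfd_length": len(nfd),
        # True if a combining mark survives composition, i.e. no single
        # precomposed code point exists for this grapheme.
        "needs_combining_marks": any(unicodedata.combining(ch) for ch in nfc),
        "code_points": [f"U+{ord(ch):04X}" for ch in nfc],
    }


if __name__ == "__main__":
    # e + combining acute accent: composes to one code point.
    print(describe_grapheme("e\u0301"))
    # open o + combining tilde: stays as base letter plus mark.
    print(describe_grapheme("\u0254\u0303"))
```

Graphemes that need combining marks are still valid Unicode. The check simply shows where rendering and keyboard entry deserve extra testing.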
10 Usability: Teaching
How to get speakers to use the orthography?
- Phonemic awareness
- Teaching materials and instruction
- Motivation/opportunity to write
- Formative feedback loop
The orthography is only useful if people use it!
11 Word boundaries
Many languages are not written with much white space
- Orthographers often intuitively follow a system they are familiar with
- Purpose is to help beginning and fluent readers read with ease
- Some factors to consider:
  - Syllable structure
  - Movability
  - Separability
  - Conceptual unity
  - Pronounceability in isolation
  - …
12 Is Standardization Necessary?
- Pros
  - Streamlines language planning
  - Easier to generate teaching/learning materials
  - Basis for a body of literature
  - Efficient in case of critical endangerment
- Cons
  - How to choose?
  - Basis for judgments of intellect/ignorance
  - Obscures diversity in the language
  - Less relevant in the digital age
Notes:
- At the very least, language developers should not feel rushed to publish a standardized orthography: take time to test and build consensus
- Competing orthographies are a source of tension in communities that have better things to do
- Do language communities see lack of standardization as problematic?
- Who benefits from standardization?
- Language documentation context vs. language reform context
- European languages standardized organically, over centuries
- Community ownership is essential; it can’t be seen as top-down
13 Orthography Diplomacy
- Linguist’s tendency is toward systematic, logical, efficient design
- Not always compatible with community needs
  - Non-fluent speakers in teaching roles
  - Increasingly strong transfer wishes/influences
  - Specialized symbols, unfamiliar distinctions are just hard to learn
- Pomo: “Indian phonics”
Pomo: a researcher observed an instructor (an elder) “translating” the designed orthography into something more familiar. “Neither she nor anyone else in the class understood the specialized symbols or knew the pronunciation even of familiar symbols.”
Example: xó.mča -> home ca
Also: Inupiaq post-it notes in the office
14 Criteria for a new writing system
- Maximum motivation for the learner
- Maximum representation of speech
- Maximum ease of learning
- Maximum transfer
- Maximum ease of reproduction
(Smalley 1963)
15 Bias of familiarity
Both linguists and non-linguists have it
- Makes each group potentially blind to the preferences/intuitions of the other group
- Especially: we can fail to recognize that non-linguists / L2 learners of a minority language have different transfer issues than we do
- Don’t overestimate the ease of learning of phonetically-based alphabets!
16 Bias of familiarity
For example…
- Students may not be at all proficient in use of the IPA even after a semester-long course
- Even proficient users will transcribe differently depending on whether they are native or non-native speakers: we are coming from different phonological systems
17 Assignments
- A Yanesha’ Alphabet for the Electronic Age (Mary Ruth Wise)
- Kurtöp Orthography Development in Bhutan (Gwendolyn Hyslop)
- Case Studies of Orthography Decision Making in Mainland Southeast Asia (Larin Adams)
Note: Yanesha’ is a language of Peru