ZERO PRONOUN RESOLUTION IN JAPANESE Jeffrey Shu Ling 575 Discourse and Dialogue.


1 ZERO PRONOUN RESOLUTION IN JAPANESE Jeffrey Shu Ling 575 Discourse and Dialogue

2 Review of Zero Pronouns
• (Pro)nouns often dropped if pragmatically/semantically inferable from context
• General preference for names/nouns rather than pronouns in polite/formal speech (except 1st person and demonstrative pronouns)
• Grammatical markers removed along with nouns
• Pronominal referent may or may not appear in discourse prior to appearance of zero pronoun

3 Overview of Objectives
• Find general syntactic rules that can be used to identify the presence of zero pronouns (ZPs)
• Find syntactic/semantic/pragmatic clues to identify referents for ZPs, or determine appropriate pronoun/noun insertion if the referent is not explicitly found in the textual context
• Determine priority in case of conflict

4 Syntactic Identification of Zero Pronouns
• Determining presence of zero pronoun fairly straightforward for subject/topic, objects
• Determine syntactic argument structure of verb
• Identify if noun exists to fill arguments (simple task with grammatical markers)
    • Sometimes grammatical markers not used in casual speech
• Not so straightforward for other nouns (e.g. locatives)
• Might not be important if unstated (e.g. “I’ll go to the store” vs. “I’ll go”). Usually has precedent.
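The identification step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `VERB_FRAMES` table, the `TOPIC_FILLS` shortcut, and the function name `find_zero_pronouns` are hypothetical and not from the presentation. A real system would take argument frames from a lexicon and would also have to handle the unmarked nouns of casual speech.

```python
# Hypothetical argument frames: the case particles each verb expects.
# が (ga) marks the subject, を (o) the direct object, に (ni) the recipient/goal.
VERB_FRAMES = {
    "食べる": ["が", "を"],        # taberu "eat": subject, object
    "あげる": ["が", "を", "に"],  # ageru "give": subject, object, recipient
    "寝る": ["が"],                # neru "sleep": subject only
}

# Simplifying assumption: a topic marked with は (wa) fills the subject slot.
# In real text the topic can also stand for an object or another role.
TOPIC_FILLS = {"は": "が"}


def find_zero_pronouns(verb, marked_nouns):
    """Return the case slots of `verb` that no noun in the clause fills.

    marked_nouns is a list of (noun, particle) pairs found in the clause.
    Each unfilled slot is a candidate zero pronoun.
    """
    present = set()
    for _noun, particle in marked_nouns:
        present.add(TOPIC_FILLS.get(particle, particle))
    return [slot for slot in VERB_FRAMES[verb] if slot not in present]


# "りんごを食べる" (eat an apple): the subject is unstated.
print(find_zero_pronouns("食べる", [("りんご", "を")]))
```

With the clause above, only the が slot is reported as missing; adding a topic such as ("私", "は") leaves nothing unfilled.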

5 Findings: Some Generalizations from Semantics/Syntax
• In consecutive statements with ZPs, the topic/subject is often the same
• Many verbs have a semantic preference for certain types of nouns (e.g. animate vs. inanimate) which can narrow possibilities
• Imperatives are always 2nd person (unless speaker is talking to him/herself)
• Imperatives rarely have an explicit subject/topic
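The animacy generalization could be applied as a simple candidate filter, roughly as below. The `PREFERENCES` table, the `Candidate` class, and `filter_by_animacy` are hypothetical names introduced for illustration; the presentation does not describe an implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    animate: bool


# Hypothetical selectional preferences: (verb, role) -> required animacy.
PREFERENCES = {
    ("話す", "subject"): True,   # hanasu "speak" prefers an animate subject
    ("読む", "object"): False,   # yomu "read" prefers an inanimate object
}


def filter_by_animacy(verb, role, candidates):
    """Keep only candidates whose animacy matches the verb's preference.

    If no preference is recorded for (verb, role), all candidates are kept.
    """
    wanted = PREFERENCES.get((verb, role))
    if wanted is None:
        return list(candidates)
    return [c for c in candidates if c.animate == wanted]


candidates = [Candidate("田中さん", True), Candidate("本", False)]
print([c.text for c in filter_by_animacy("話す", "subject", candidates)])
```

The filter only narrows the candidate set; it does not choose among the candidates that remain.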

6 Findings: Conversations
• 1P (1st person) by far most commonly used pronoun
• If ZP in statement is 1P, usually has an explicit precedent
• If 2P, it may or may not have an explicit precedent
• Question/answer format generalizations
• Subject/topic of question usually 2/3P
    • If 1P, often directly stated
• Subject of answer almost always opposite pronoun of question (2/3P vs. 1P), same referent (i.e. same person)
• 3P both simplest and potentially most complicated
• 3P personal pronouns (e.g. he/she) very rarely used
• If 3P ZP, almost always preceded by explicit reference to name/noun
• Gender indeterminate (possible problem for MT)
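One way to read the question/answer generalization is that the referent stays fixed while the grammatical person is recomputed from the new speaker's point of view. The sketch below takes that reading. The function names are hypothetical, and the treatment of third parties (who stay 3P for both speakers) is an assumption rather than something the slide states.

```python
def person_of(referent, speaker, addressee):
    """Grammatical person of `referent` from the speaker's point of view."""
    if referent == speaker:
        return 1
    if referent == addressee:
        return 2
    return 3


def resolve_answer_subject(question_subject, asker, answerer):
    """Guess the zero subject of an answer.

    Assumes the answer's subject is the same entity as the question's
    subject, and returns (referent, person as seen by the answerer).
    """
    person = person_of(question_subject, speaker=answerer, addressee=asker)
    return question_subject, person


# A asks B about B ("you" for A), so B answers about "I".
print(resolve_answer_subject("B", asker="A", answerer="B"))  # ('B', 1)
```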

7 Findings: Domain Specific
• Formal situations usually dictate specific conventions to be followed
• Representatives almost never use 1P singular or 3P personal pronouns
• 2P ZPs can be inferred from domain context (a corporate statement would probably be addressing its customers or investors)

8 Findings: Domain Specific
• Reference articles (e.g. Wikipedia)
• While ZPs are commonly used, the antecedent is usually found no more than a few sentences prior
• ZPs in consecutive sentences usually have same referent
• If no clear antecedent can be found, subject of article is often a reasonable assumption
• 1st and 2nd person virtually non-existent, can be safely eliminated as possibilities
    • Exceptions: reader-addressed texts (e.g. reference guides)
• News articles follow similar conventions
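The reference-article heuristics amount to a short backward search with a fallback. The following is a minimal sketch under several assumptions: `resolve_in_article` is a hypothetical name, the window of 3 sentences is an arbitrary stand-in for "a few sentences", and taking the first noun of the nearest sentence is a placeholder for whatever ranking a real system would use.

```python
def resolve_in_article(sentence_nouns, zp_index, article_subject, window=3):
    """Guess the antecedent of a zero pronoun in a reference article.

    sentence_nouns[i] lists the candidate nouns found in sentence i.
    zp_index is the index of the sentence containing the zero pronoun.
    Searches up to `window` preceding sentences, nearest first, and falls
    back to the article's subject when nothing is found.
    """
    start = max(0, zp_index - window)
    for i in range(zp_index - 1, start - 1, -1):
        if sentence_nouns[i]:
            return sentence_nouns[i][0]
    return article_subject


nouns = [["富士山"], [], [], [], []]
print(resolve_in_article(nouns, 2, "日本"))  # antecedent two sentences back
print(resolve_in_article(nouns, 4, "日本"))  # out of window: article subject
```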

9 Findings: Domain Specific
• Visual media (e.g. comics, TV, etc.)
• Most problematic
• Referent very often in visual context with no textual context
• Heavy reliance on visuals results in not only (pro)noun dropping, but “anything” dropping (including verbs), losing syntactic information
• Quite possibly impossible to resolve without human intervention
• Purely textual works (e.g. novels) usually have enough information

10 Priority
• Domain has highest priority in all situations
• Rules for separate domains largely mutually exclusive
• Failure to determine reasonable pronouns can lead to misinformation
• Japanese particularly sensitive to social context
    • Generic pronoun insertion may be highly inappropriate
• Simple domain information can be extremely valuable
• Unless ruled out by domain, general conversation rules may be applicable to many different media

11 Priority
• Pragmatics/semantics > syntax
• Ex: Question/answer conventions are of greater relevance than the assumption that the subject of the previous sentence is the same as the subject of the current sentence
• Ex: The semantic preference of verbs for certain types of nouns is of greater relevance than the fact that a ZP fills a particular grammatical role (e.g. the naïve assumption that direct objects tend to be inanimate)
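The ordering from the two Priority slides (domain first, then pragmatics/semantics, then syntax) can be expressed as an ordered rule chain in which the first rule that produces a guess wins. Everything below is a hypothetical sketch: the rule functions, the dictionary keys describing a ZP, and the returned labels are illustrative assumptions, not part of the presentation.

```python
def domain_rule(zp):
    # Domain: in a corporate statement, an unstated addressee is probably
    # the company's customers or investors.
    if zp.get("domain") == "corporate" and zp.get("role") == "addressee":
        return "customers/investors"
    return None


def question_answer_rule(zp):
    # Pragmatics: the answer to a 2P question is likely about the speaker.
    if zp.get("answers_second_person_question"):
        return "1P (speaker)"
    return None


def same_subject_rule(zp):
    # Syntax: fall back to the subject of the previous sentence, if known.
    return zp.get("previous_subject")


# Highest priority first.
RULES = [
    ("domain", domain_rule),
    ("pragmatics", question_answer_rule),
    ("syntax", same_subject_rule),
]


def resolve_with_priority(zp, rules=RULES):
    """Return (rule_name, guess) from the first rule that gives a guess."""
    for name, rule in rules:
        guess = rule(zp)
        if guess is not None:
            return name, guess
    return None, None


zp = {"answers_second_person_question": True, "previous_subject": "田中さん"}
print(resolve_with_priority(zp))  # the pragmatic rule outranks the syntactic one
```

Returning the rule name alongside the guess makes it easy to see which level of the hierarchy decided a given case.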

12 An Idea
• Instead of inserting “best guess” pronouns, provide a selection of best candidates in text for user to disambiguate
• In current MT systems that insert generic pronouns, users have to “interpret” (guess) what is really meant anyway
• Insertion of pronouns is never 100% certain
• Some media (visual) require human intervention
• Insertion of pronouns/nouns can lead to misinformation, faux pas, and a sense that the system is unreliable
• It is much faster to pick from a set of provided candidates than to guess whether the pronoun is right or wrong and then go back to try to figure out what is going on
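The candidate-selection idea might look like this in output: instead of one inserted pronoun, the top few candidates are shown inline for the user to choose from. The scores, the bracket notation, and the function name `render_candidates` are all assumptions made for illustration.

```python
def render_candidates(scored, top_n=3):
    """Format the highest-scoring candidates as an inline choice list.

    scored maps each candidate pronoun/noun to a hypothetical confidence
    score; the top_n candidates are shown best first.
    """
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return "[" + " | ".join(name for name, _ in ranked) + "]"


# Hypothetical scores for the unstated subject of "... went to the store".
choices = render_candidates({"he": 0.2, "I": 0.5, "you": 0.3}, top_n=2)
print(choices + " went to the store.")  # [I | you] went to the store.
```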

