Computational Models of Discourse Analysis. Carolyn Penstein Rosé, Language Technologies Institute / Human-Computer Interaction Institute

Warm-Up Discussion
What is the distinction between personality, identity, and perspective?
Does the distinction matter computationally?
How do they relate to one another as lenses for understanding social media data?
What do we take from today’s readings for assignment 4?
(Diagram labels: Personality, Identity, Perspective)

Student Comment
At first the paper did not seem related to our task of identifying gender, but perhaps this paper shows that the way we see ourselves is extremely consistent. No matter how you ask the question, a subject will always give you an honest answer as to how they see themselves. This could mean that no matter how hard we try, we will sooner or later embed signals into our blog posts that indicate our perceived gender.

Student Comment
It seems that the importance of "spiritual self" in presentation is the most important takeaway from this paper. 96% of users attempt to describe themselves with aspects of their "spiritual self" (i.e., perceived abilities). So focusing on these instead of the material or the social might be better (although it's possible that a particular gender uses one of these sub-types significantly more than another, which could also be handy, but we don't have that information).
Is this personality or identity?
How would you expect it to relate to other online behavior?

Semester Review

Semester in Review
Unit 1: Theoretical Foundation
Unit 2: Linguistic Structure
Unit 3: Sentiment
Unit 4: Identity and Personality
Unit 5: Social Positioning
In each Unit:
  Readings from Discourse Analysis and Sociolinguistics
  Readings from Language Technologies
  Hands-on assignment
Implementation and corpus-based experiment
Competitive error analysis
Student Presentations

Building Tasks
According to Gee’s theory, whenever we speak or write, we are constructing 7 areas of reality.
What we build: Significance, Practices, Identities, Relationships, Politics, Connections, Sign systems and knowledge
How we build them: Social languages, Socially situated identities, Discourses, Conversations, Figured worlds, Intertextuality

What we Build
Significance: things and people made more or less significant through the text
Practices: ritualized activities and how they are being enacted through the text (for example, lecturing or mentoring)
Identities: manner in which things and people are being cast in a role through the text
Relationships: style of social relationship, like level of formality
Politics: how “social goods” are being distributed, who is responsible for the flow, where it is going
Connections: connections and disconnections between things and people, e.g., what ideas are related, how things are causally connected, what is affecting what
Sign Systems and Knowledge: languages, social languages, and ways of knowing; what ways of communicating and knowing are treated as standard and acceptable in the context, e.g., that you’re expected to speak in English in class

Imagine an environmentalist commercial
Discourse: Environmentalism
Conversation: Global Warming
Discourse: StatusQuo
Socially Situated Identity: Environmentalist
Social Language: Liberal rhetoric
Figured World: Expected structure of Conservationist Commercial
Form-Function Correspondence: Range of meanings for the word “sustainability”
Situated Meaning: Meaning of “sustainability” in the commercial

Computationalizing Gee?
Challenge: not variationist
Form-function correspondences can be modeled naturally through rules
Cells of table like feature extractors?
Social Languages like topic models?
Figured worlds related to “social causality”
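One way to picture the claim that form-function correspondences can be modeled through rules is the minimal Python sketch below. It is an illustration only, not a method from the lecture or from Gee: the meaning labels, the cue words, and the function name `situated_meaning` are all hypothetical. A form maps to a range of potential meanings, and a simple rule picks the situated meaning whose cue words overlap most with the surrounding context.

```python
# Hypothetical form-function table: each form maps to a range of potential meanings.
# The labels below are invented for illustration only.
FORM_FUNCTION = {
    "sustainability": ["environmental", "economic"],
}

# Hypothetical context cues that favor each potential meaning.
CONTEXT_CUES = {
    "environmental": {"climate", "planet", "emissions"},
    "economic": {"profit", "growth", "budget"},
}


def situated_meaning(form, context_tokens):
    """Pick the potential meaning whose cue words overlap most with the context.

    Returns None when the form is unknown or no cue word appears in the context.
    """
    candidates = FORM_FUNCTION.get(form, [])
    context = set(context_tokens)
    best_meaning, best_score = None, 0
    for meaning in candidates:
        score = len(CONTEXT_CUES.get(meaning, set()) & context)
        if score > best_score:
            best_meaning, best_score = meaning, score
    return best_meaning


if __name__ == "__main__":
    print(situated_meaning("sustainability", ["protect", "the", "planet"]))
```

A rule table like this captures only the mapping from form to a fixed set of functions; the contextualized side of the theory is reduced here to keyword overlap, which is exactly the kind of simplification the slide flags as a challenge.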

Metafunctions

What is a system?

Computationalizing SFL?
See Elijah’s ACL paper!
We had to REALLY simplify to get there
Not clear how to do that for Heteroglossia yet

Computational Techniques
Text entailment / similarity measures / paraphrase / constraint relaxation
Topic models
Machine Learning Techniques: bootstrapping, HMMs, other statistical modeling techniques
Basic features: unigrams, bigrams, POS bigrams, acoustic and prosodic features (speech)
Created features: dictionaries, templates, syntactic dependency relations
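As a rough illustration of the "basic" and "created" features listed above, here is a minimal Python sketch that extracts unigram and bigram counts plus one dictionary feature. The tokenizer is deliberately crude, and the word list `HEDGE_LEXICON` and the feature names are assumptions made for this example, not resources from the course. POS bigrams and dependency features would need a tagger or parser and are not covered.

```python
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes (a crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams, joined with underscores."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical dictionary for a "created" feature; any word list could be used here.
HEDGE_LEXICON = {"maybe", "perhaps", "possibly", "might", "probably"}


def extract_features(text, lexicon=HEDGE_LEXICON):
    """Build a sparse feature dict: unigram counts, bigram counts, and a lexicon rate."""
    tokens = tokenize(text)
    features = {}
    for n, prefix in ((1, "uni"), (2, "bi")):
        for gram in ngrams(tokens, n):
            key = f"{prefix}={gram}"
            features[key] = features.get(key, 0) + 1
    # Dictionary feature: fraction of tokens that appear in the lexicon.
    hits = sum(1 for t in tokens if t in lexicon)
    features["lexicon_rate"] = hits / len(tokens) if tokens else 0.0
    return features


if __name__ == "__main__":
    for name, value in sorted(extract_features("Maybe the wrench might break").items()):
        print(name, value)
```

The resulting dictionaries can be fed to any learner that accepts sparse features; the point of the sketch is only to show how cheap surface features and a hand-built dictionary sit side by side in one representation.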

Basic Aspects of Discourse Structure are Easiest to Model
Turn taking
Topic segments
Speech acts (at least direct ones)
More recent computational work focuses on more challenging “discoursey” problems like sentiment and stance
Some recent work on metaphors (related to frames), but not applied to discourse level problems

Problems
Labels in public datasets don’t necessarily match the theory
  Computational approaches embody variationist assumptions, but much of the theory is grounded in a more contextualized view of meaning making
Lack of a fully satisfying operationalization of style (style is hard to separate from content)
  Grammatical metaphor and other indirect strategies
  Same effect can be achieved in so many ways; each technique only captures one slice, so you’re always just grasping a glimpse of what’s there
Overfitting spurious correlations
  “Subpopulations” leading to problems with generalization
  Similar variation arising due to numerous different factors (gender, age, SES)
  Features at too low a level: words serving multiple purposes simultaneously

Engagement and Personality?

How would you expect an Engagement style analysis to relate to personality? What effect would you expect to see on conversations? Are these necessarily connected?

Freshman Engineering Study
131 freshman engineering students worked in groups of 3 or 4 to design a better wrench
  Applying principles related to stress and leverage
Procedure
  Tutorial on computer aided engineering
  Pretest
  Collaborative design activity
  Posttest
  Questionnaire

Heteroglossia Manipulation

Social Manipulation

Tutor Agent Design
Architecture diagram components: RequestDetector, T.TakingCoordinator, TutoringManager, OutputCoordinator, ConcertChatActor, ConcertChatListener, ConcertChat Server, TutoringActor, PromptingActor, MessageFilter, DiscourseMemory, PresenceFilter, PromptingManager, IntroductionsActor, IntroductionsManager, ActivityDetector, AnnotationFilter, PlanExecutor, SocialController, ProgressDetector
Example interaction:
Tutor: One last thing on this topic, Does more (or less) stress in a wrench make it easier to use?
S95: no change?
Tutor: You are correct about that.. Stress doesn't determine ease of use.
Tutor: It's the moment achieved by the wrench that determines the ease of use.
S89: yay!
Tutor: It's good to have your contributions in the discussion Jackie :-)
Tutor: Go team :-)
S89: Go team yay
Tutor: I am happy to be working with our team
S89: Me too
S95: whoa the bot knows my name
Tutor: Unfortunately maximum stress (12800) in our design1 is way above the maximum allowed stress (i.e. 8750)
Tutor: This wrench cannot be safely used!
Kumar, R. & Rosé, C. P. (2011). Architecture for building Conversational Agents that support Collaborative Learning. IEEE Transactions on Learning Technologies, special issue on Intelligent and Innovative Support Systems for Computer Supported Collaborative Learning.

Results on Breadth of Coverage of Design Space
Significant main effect of Heteroglossia on number of ideas mentioned
  Heteroglossia was better than Monoglossia and Neutral
Significant interaction
  In the Social condition, Monoglossia was worse than the other two

Results on Perception
Students were significantly happier with the interaction in the Heteroglossia condition than Neutral, with Monoglossia in the middle
Students liked the Heteroglossic and Monoglossic agents better than the Neutral agent
Students in the Heteroglossia condition felt marginally more successful than students in the Monoglossia condition
No effect on Personality indicators such as Pushy, Wishy Washy, etc.
Does that mean that impression of personality and how you feel about an interaction with someone are not linked?

Student Comment I would also note that English is a very gender neutral language, so gender performativity is harder to classify.

Engagement
Already established: Positioning a proposition
  But can it also be primarily positioning between people?
  Patterns of positioning propositions as having the same or different alignment between speaker and hearer could do this
Is positioning in communication always positioning by means of propositional content?

Connection between Heteroglossia and Attitude
But is this really different from a disclaim?
And is this really different from a proclaim?

Hedging and Occupation?
And as such, I believe hedging is a much more effective tool in showing generational or occupational differences rather than gender differences. For example, teenagers often use verbs such as 'like' and 'all' to report speech: he was all 'that's stupid' and then he was like 'but I'm stupid too'. The occupational differences I would attribute to the differences between people who need exact values as opposed to people who can accept generalizations or approximations.
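The reported-speech pattern in the comment above could be turned into a template feature of the kind listed under created features. The Python sketch below is one assumed way to do it: the regular expression `QUOTATIVE` and the function name `quotative_counts` are hypothetical, and the pattern requires an opening quotation mark after 'like' or 'all', so it will miss unquoted reported speech (for instance in speech transcripts).

```python
import re
from collections import Counter

# Hypothetical template: a form of "be", then quotative "like" or "all",
# then an opening quotation mark. The quote requirement trades recall for precision,
# so "we were all happy" is not counted.
QUOTATIVE = re.compile(r"""\b(?:am|is|are|was|were)\s+(like|all)\s+['"]""", re.IGNORECASE)


def quotative_counts(text):
    """Count quotative uses of 'like' and 'all' that introduce quoted speech."""
    return Counter(match.group(1).lower() for match in QUOTATIVE.finditer(text))


if __name__ == "__main__":
    sample = "he was all 'that's stupid' and then he was like 'but I'm stupid too'"
    print(quotative_counts(sample))
```

Counts like these could be compared across age or occupation groups, but the template only captures one surface realization of the phenomenon.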

Questions?
