
Towards Identifying Unresolved Discussions in Student Online Forums Jihie Kim, Jia Li, and Taehwan Kim Information Sciences Institute/ University of Southern.


1 Towards Identifying Unresolved Discussions in Student Online Forums Jihie Kim, Jia Li, and Taehwan Kim Information Sciences Institute/ University of Southern California

2 Pedagogical Discourse Jihie Kim/USC-ISI 2 “Talk to as many other people as possible. CS is learned by talking to others, not by reading, or so it seems to me now.” -- Advice from an undergraduate computer science student

3 Discussion Board and Corpora
- 15 semesters running…
- CS and Engineering courses
- Undergrad/Graduate
- USC/Non-USC
- Almost 800 students
- Over 8000 messages
Extensible open-source discussion board (phpBB) serves as a platform for bridging ISI research and USC teaching practice

4 Student Messages in an Undergraduate Operating Systems Course
Text is incoherent and ungrammatical.
Problem description: Non-factoid questions are difficult to identify, dependent on context, and may include multiple sentences or paragraphs. Answers require explanations.

5 Thread Length Distribution
[Figure: two thread-length histograms, one with data from an undergraduate CS course and one with data from a graduate CS course; axes: # of threads, # of messages]
- Threads are often very short, many consisting of only 1-2 messages
- Students jump into programming details without understanding the larger picture or related concepts
- TAs and instructors are not always available to fully guide interactions
=> Need for Discussion Assessment and Scaffolding

6 PedDiscourse Research
Discussion Assessment
- Which discussions need instructor attention?
- Who is asking and answering questions?
- What topics are discussed when?
Discussion Scaffolding
- Promote reflection
- Promote collaboration among students

7 Modeling discussion threads
Individual messages
- Topic, quantity
Relations among messages
- Responses/replies
- Roles that a message plays
Discussion threads
- Thread lengths and quantity
- Discussion topic
- Discussion focus
- …
Related course data
- Notes, web pages, readings
- Assignments and projects
- ...

8 Discussion Assessment
Which discussions need instructor attention?
- Identify the roles that individual messages play (question, answer, acknowledgement, etc.)
- Analyze patterns of message roles
- Find discussion threads without an answer to the initial question

9 Roles of individual messages
Use Searle’s theory of Speech Acts (Searle, 1969) to model threaded discussions
Speech Acts (SAs)
- Choose SAs to use: Question (QUES), Answer or Suggestion (ANS-SUG), Correction or Objection (Neg-Ack), …
- An SA provides the relationship between a pair of messages
- A pair of messages in a thread can carry multiple SAs
- A single message can be related (via SAs) to multiple messages

10 Speech Acts (SAs) in a discussion thread
Posters shown on the slide: S1, S2, S1, S3
SA labels shown on the slide: QUES; ISSUE, QUES; ANS-SUG
Message text shown on the slide:
- "I am still confused. I understand it is in the same address space as the parent process, where do we allocate the 8 pages of mem for it? And how do we keep track of.....? … I am sure it is a simple concept that I am just missing."
- "read the student documentation for the Fork syscall"
- "The Professor gave us 2 methods for forking threads from the main program. One was The other was to When you fork a thread where does it get created and take its 8 pages from? Do you have to calculate......? If so how? Where does it store its PCReg ? Any suggestions would be helpfule."
- "If you use the first implementation...., then you'll have a hard limit on the number of threads....If you use the second implementation, you need to.... Either way, you'll need to implement the AddrSpace::NewStack() function and make sure that there is memory available."

11 Speech Act categories explored
Code 1 (code: name)
- QUES: Question
- ANNO: Announcement
- CANS: Complex Answer
- SANS: Simple Answer
- SUG: Suggest
- ELAB: Elaborate
- CORR: Correct
- OBJ: Object
- CRT: Criticize
- SUP: Support
- ACK: Acknowledge
- COMP: Complement
Code 2: POS, NEUT, NEG
Code 3: QUES, ANNO, ANS-SUG, ELAB, POS-ACK, NEG-ACK
Kappa values shown (in slide order, next to the labels Code 1, Code 2, Code 3): 0.70, 0.54, 0.58
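The kappa values on this slide measure agreement between annotators. The slide does not say which kappa statistic was used, so the following is only a minimal sketch that assumes Cohen's kappa for two annotators who each assign one label per message. The function name `cohen_kappa` and the toy label lists are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: one label per item; the slides do not state which
    kappa variant the authors computed.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations for six messages (not data from the study).
annotator_1 = ["QUES", "ANS-SUG", "QUES", "POS-ACK", "ANS-SUG", "QUES"]
annotator_2 = ["QUES", "ANS-SUG", "ANS-SUG", "POS-ACK", "ANS-SUG", "QUES"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

For these toy lists the annotators agree on 5 of 6 messages, chance agreement is 13/36, and kappa works out to 17/23 (about 0.74).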

12 Current Speech Act Categories
Table columns: SA Category, Description, kappa, Distribution (% in corpus); only the first two columns are present here.
- QUES: A question about a problem, including a question about a previous message
- ANS-SUG: A simple or complex answer to a previous question; a suggestion or advice
- ISSUE: A report of a misunderstanding, unclear concepts, or issues in solving problems
- Pos-ACK: An acknowledgement, compliment, or support in response to a previous message
- Neg-ACK: A correction or objection (or complaint) to/on a previous message

13 Data cleaning and pre-processing
- Discussion data
  - Noisy, incoherent
  - High variation: messages may contain answers or suggestions in the form of questions
  - Informal dialect used by students
- Data pre-processing: tokenization, stemming, and other filtering steps (e.g. removing programming code within messages, pluralized words, etc.)
- Data categorization: transform/replace commonly occurring words and word sequences with categories
  - Apostrophe words ('re, 've, 'm, …)
  - Technical terms within messages are replaced by TECH_TERM (drawn from commonly used technical terms in the course)
  - Pronouns are not replaced (“you can” in ANS vs. “I can”)
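The pre-processing steps listed on this slide could be sketched roughly as follows. This is not the authors' pipeline: the term list `TECH_TERMS`, the code-line heuristic `CODE_LINE`, and the choice to expand apostrophe forms into full words are all assumptions made for illustration, and stemming is left out.

```python
import re

# Hypothetical list of course-specific technical terms.
TECH_TERMS = {"fork", "syscall", "addrspace"}
# Assumption: apostrophe forms are normalized to full words.
APOSTROPHE_FORMS = {"'re": "are", "'ve": "have", "'m": "am"}
# Crude assumed heuristic: a line ending in ';', '{' or '}' is program code.
CODE_LINE = re.compile(r"[;{}]\s*$")
# Tokens: words, apostrophe suffixes, and the question mark (a useful cue).
TOKEN = re.compile(r"[a-z_]+|'[a-z]+|\?")


def preprocess(message):
    """Drop code-like lines, tokenize, and map terms to categories."""
    prose = [ln for ln in message.splitlines() if not CODE_LINE.search(ln)]
    text = " ".join(prose).lower()
    tokens = []
    for tok in TOKEN.findall(text):
        if tok in APOSTROPHE_FORMS:
            tokens.append(APOSTROPHE_FORMS[tok])
        elif tok in TECH_TERMS:
            tokens.append("TECH_TERM")
        else:
            # Pronouns such as "i" and "you" are deliberately kept as-is.
            tokens.append(tok)
    return tokens


tokens = preprocess("I'm lost.\nint x = 0;\nWhere does Fork store it?")
```

In this example the code line is dropped, `'m` becomes `am`, and `Fork` becomes `TECH_TERM`, while the pronoun `i` is kept because pronouns distinguish answers ("you can") from other messages ("I can").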

14 Features for SA Classification
- F1: Cue phrases and their positions (e.g. the position of “Thank”)
- F2: Message position
- F3: Previous message information
- F4: Poster class
- F5: Poster change
- F6: Message length
Example TBL rules
- IF cue-phrase = {What} & {“?”} => QUES
- IF cue-phrase = {“yes you can”} & poster-info = Instructor & post-length = Medium => ANS
- IF cue-phrase = {“yes”} & cue-position = CP_BEGIN & prev-SA = QUES => ANS
- IF cue-phrase = {“not know”} & poster-info = student & poster-change = YES => ISSUE
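To make the rule format concrete, here is a small sketch that applies the four example rules above, in order, to a single message. It only applies hand-written rules; it does not perform Transformation-Based Learning, which would learn such rules from annotated data. The `Message` fields, the word-count thresholds in `length_bucket`, and the first-match-wins ordering are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    text: str
    poster_class: str        # "instructor" or "student" (F4)
    poster_changed: bool     # poster differs from the previous message (F5)
    prev_sa: Optional[str]   # SA label of the previous message, if any (F3)


def length_bucket(text):
    """Hypothetical length buckets (F6); the real thresholds are not given."""
    n = len(text.split())
    if n < 10:
        return "short"
    if n < 50:
        return "medium"
    return "long"


def classify(msg):
    """Apply the four example rules in order; return None if none fires."""
    text = msg.text.lower().lstrip()
    if "what" in text and "?" in text:
        return "QUES"
    if ("yes you can" in text and msg.poster_class == "instructor"
            and length_bucket(msg.text) == "medium"):
        return "ANS"
    # cue-position = CP_BEGIN is read here as "message starts with the cue".
    if text.startswith("yes") and msg.prev_sa == "QUES":
        return "ANS"
    if ("not know" in text and msg.poster_class == "student"
            and msg.poster_changed):
        return "ISSUE"
    return None


question = Message("What does Fork return?", "student", False, None)
answer = Message("Yes, allocate the stack first.", "student", True, "QUES")
issue = Message("I do not know where the stack goes.", "student", True, "ANS")
labels = [classify(m) for m in (question, answer, issue)]
```

The three toy messages are labelled QUES, ANS, and ISSUE respectively; a message that matches no rule, such as a bare "Thanks!", gets `None`.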

15 SA Classification Results
Table columns: SA Category; Support Vector Machine (SVM): Precision, Recall, F score; Transformation-Based Learning (TBL): Precision, Recall, F score
Table rows: QUES, ANS, ISSUE, Pos-ACK, Neg-ACK
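The three metrics reported per SA category can be computed as in the sketch below. It assumes that "F score" means the balanced F1 measure and that each message carries a single label, neither of which the slide states; the function name and the toy labels are hypothetical and are not results from the study.

```python
def precision_recall_f(gold, predicted, category):
    """Per-category precision, recall, and F1 (assumed balanced F score)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == category and p == category)
    fp = sum(1 for g, p in pairs if g != category and p == category)
    fn = sum(1 for g, p in pairs if g == category and p != category)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical gold and predicted labels for five messages.
gold = ["QUES", "ANS", "ANS", "ISSUE", "QUES"]
predicted = ["QUES", "ANS", "QUES", "ISSUE", "QUES"]
ques_scores = precision_recall_f(gold, predicted, "QUES")
```

For the toy data, QUES is predicted three times and is correct twice, so precision is 2/3, recall is 1.0, and F1 is 0.8.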

16 Profiling discussion threads with SAs
(Q1) Were all questions answered? (Y/N)
(Q2) Were there any issues or confusion? (Y/N)
(Q3) Were those issues or confusions resolved? (Y/N)

17 Thread classification with SA classifiers
- Feature Set 1: whether there was an [SA] in the thread
- Feature Set 2: whether the last message in the thread included [SA]
(Q1) Were all questions answered? (Y/N)
(Q2) Were there any issues or confusion? (Y/N)
(Q3) Were those issues or confusions resolved? (Y/N)
(a) SVM classification results with human-annotated SAs. Columns: Precision, Recall, F score. One value per row is present: Q1 0.93, Q2 0.93, Q3 0.89
(b) SVM classification results with system-generated SAs. Columns: Precision, Recall, F score
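The two feature sets above can be turned into a binary feature vector per thread, as in this sketch. The representation of a thread as a list of label sets, the feature names, and the example thread are assumptions; on the slide these features feed an SVM, which is not reproduced here.

```python
SA_CATEGORIES = ["QUES", "ANS-SUG", "ISSUE", "Pos-ACK", "Neg-ACK"]


def thread_features(thread):
    """Build Feature Set 1 and Feature Set 2 for one thread.

    `thread` is a list of sets of SA labels, one set per message,
    in posting order (an assumed representation).
    """
    anywhere = set().union(*thread)
    last = thread[-1] if thread else set()
    features = {}
    for sa in SA_CATEGORIES:
        # Feature Set 1: does the SA occur anywhere in the thread?
        features["has_" + sa] = int(sa in anywhere)
        # Feature Set 2: does the last message carry the SA?
        features["last_has_" + sa] = int(sa in last)
    return features


# Hypothetical thread: a question, an answer, then a follow-up issue.
example_thread = [{"QUES"}, {"ANS-SUG"}, {"ISSUE", "QUES"}]
features = thread_features(example_thread)
```

A thread whose last message still carries QUES or ISSUE, as in this example, is the kind of case the Q1 and Q3 classifiers are meant to flag as unresolved.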

18 Direct thread classification without SA classifiers
- F1’: cue phrases and their positions (last message or not) in the thread
(Q1) Were all questions answered? (Y/N)
(Q2) Were there any issues or confusion? (Y/N)
(Q3) Were those issues or confusions resolved? (Y/N)
(a) With SAs. Columns: Precision, Recall, F score. One value is present: Q1 0.86
(b) Direct classification. Columns: Precision, Recall, F score

19 Summary and Discussion
Identifying unresolved discussions
- Discerning speech acts (SAs) in student online discussions
- Classifying discussion threads with SAs as features
- Comparing SA-based classification and direct thread classification with phrase features
- SA-based features may help in some difficult cases, e.g. longer threads in which more than one question is raised

20 Related Work
- Pedagogical/tutorial dialogue: instructional discourse modeling (Yuan et al., 2008; Graesser et al., 2005; McLaren et al., 2007; Boyer et al., 2008; Fossati 2008; Litman et al., 2003)
- Dialogue modeling in messages or blogs (e.g. AAAI 2008 workshop on Enhanced Messaging): speech acts, requests and commitments
- Handling noisy data and high variance in text (Knoblock et al., 2007)
- Course topic and task modeling using information extraction techniques (Roy et al., 2008; Jovanovic et al., 2006)
- Tracing student e-learning activities (Israel and Aiken, 2007; Dringus and Ellis, 2005)

21 Ongoing Work: Discussion Assessment
- Discussion thread pattern and phase analysis
  - Question, understanding, solving, and closing
- Discussion topic analysis
  - Coherency of discussion topics
- Student profiling
  - Information providers (peer mentors) vs. information seekers
  - Information flow and influence network among participants
- Use of workflows (distributed systems) for large-scale assessment
  - E.g. participation changes over several semesters

22 Supported by the National Science Foundation (NSF). More details available at

