Understanding User Intents in Online Health Forums

Presentation transcript:

Understanding User Intents in Online Health Forums
Thomas Zhang, Jason H.D. Cho, Chengxiang Zhai
Department of Computer Science, University of Illinois at Urbana-Champaign
5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, Newport Beach, California, 22nd September 2014

One thing I would like to make clear from the beginning is that we’re focusing on determining the intent of the first posts in threads. Obviously in a thread you can have many posts, but we’re only looking at the original thread post here.

Online Health Forums
Purpose: To provide a convenient platform to facilitate discussion among patients and professionals.
Huge user base, and still growing! In 2011, 80% of all web users searched for health information online, of which 6% participated in health-related discussions.
Forums contain valuable information: rich, often first-hand experiences.

Deficiencies of Forums
Threads are scattered: similar questions are asked again and again.
Keyword search is inadequate: finding several keyword matches in a thread does not necessarily mean that the thread is relevant.

Similar threads should be grouped together to facilitate easy retrieval. Users have no knowledge of existing content on forums, and thus feel the need to post questions that may already have been answered in another thread. Keyword searching is insufficient because we don’t really know what the user is asking about. A match for a medication in two threads does not mean that one thread is relevant to the other: one user could be asking about its side effects, while another may be asking about its effectiveness.

Examples: A post about cholinergic urticaria from April 2004 received its 3rd and final reply a week later. A post from March 2012 had no replies as of July 2014.

Applications of Intents
Improving thread retrieval, e.g. a thread whose original post matches both the keywords and the intent specified by the user is more likely to be helpful.
Filtering threads, e.g. to treat a condition, only look at posts asking about treatment.
Understanding user behavior in forums, i.e. users of different forums have different intents.

This Paper
Introduces the problem of identifying user intents in health forums as a classification problem.
Derives the first taxonomy of user intents.
Designs a set of novel features for use with machine learning to solve the problem.
Creates the first dataset for evaluation, and conducts experiments to make empirical findings.

Roadmap Problem formulation Intent taxonomy derivation Methodology Support vector machines Hierarchical classification Feature design Evaluation Dataset Experiments Results Intents in MedHelp forums Wrap-up

Problem Formulation
Given $O$, an original thread post from our dataset $D$ with intent $c_i$ from a taxonomy of user intents $C = \{c_1, \dots, c_k\}$.
Denote $S = \{s_1, \dots, s_n\}$ as the sentence representation of $O$.
Classify $O$ as some $c_j \in C$ using $S$ as evidence.
$O$ is correctly classified if and only if $j = i$.

Taxonomy Derivation
No taxonomy exists for health forum intents. Solution: create our own!
First, reduce the ten generic questions most commonly asked by doctors (Ely et al., 2000) into three intent classes. These classes match the intents of users who search for health information online (Choudhury et al., 2014).
Next, introduce two additional intent classes that are specific to health forum posts.

Taxonomy
Manage: How should I manage or treat condition X?
Cause: What is the cause of symptom/physical/test finding X?
Adverse: Can drug or treatment X cause adverse finding Y?
Combo: Combination (at least two of the first three)
Story: Storytelling, news, sharing or asking about experience, soliciting support, or others

Where are we? Problem formulation Intent taxonomy derivation Methodology Support vector machines Feature Selection Hierarchical classification Evaluation Dataset Experiments Results Intents in MedHelp forums Wrap-up

Support Vector Machines (SVM) Main idea: Learn a hyperplane from examples to separate them into two classes Use learned hyperplane to classify unseen examples Capable of non-linear and multiclass classification Shown to have good performance on high dimensional data
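As a rough illustration of the main idea, the sketch below learns a separating hyperplane by sub-gradient descent on the regularized hinge loss and then uses it to classify unseen examples. It is a minimal sketch under stated assumptions, not the classifier used in the talk: the function names, the learning rate, the regularization strength, the toy data, and the omission of a bias term (so the hyperplane passes through the origin) are all choices made here for brevity.

```python
def train_linear_svm(examples, labels, lam=0.01, eta=0.1, epochs=50):
    """Fit a weight vector w by sub-gradient descent on
    lam/2 * ||w||^2 + max(0, 1 - y * (w . x)).

    Assumption: no bias term, so the hyperplane passes through the origin.
    Labels must be +1 or -1.
    """
    w = [0.0] * len(examples[0])
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # The regularization term shrinks w on every step.
            w = [wi * (1.0 - eta * lam) for wi in w]
            # Examples that violate the margin add a hinge-loss step.
            if margin < 1.0:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Classify an example by the side of the hyperplane it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Toy, linearly separable 2-D data (hypothetical).
X = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0),
     (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
y = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(X, y)
print(predict(w, (4.0, 1.0)), predict(w, (-1.0, -4.0)))
```

Non-linear and multiclass classification, which the slide mentions, would need kernels and a one-vs-rest or one-vs-one scheme on top of this; neither is shown here.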

Post Representation
How should we represent posts? SVMs require examples to be represented as a vector of features.
What are features? Some measurable property of the observed data.
How should we select them?

Feature Selection A good feature should be: Generic enough to be found in many posts Sufficiently discriminative for different intents

Solution: Patterns!
A pattern is a sequence of (possibly non-contiguous) tokens that represents a recurring text pattern in sentences.
Very generic: lowercasing, stemming, POS tagging, UMLS semantic group tagging.
Very discriminative: “What could X be…?” signifies Cause intent, but “What does X do…?” signifies Manage intent.

We want patterns to match regardless of case or word form, and in some cases we want to replace certain nouns or verbs with specific placeholders. We also want patterns to distinguish between different intent classes, which means that each pattern in our set should, for the most part, belong to a single class. It is difficult to satisfy both properties. For example, Story posts have large variations in content, so it is hard to find generic patterns that match content across many posts. Recall that the main purpose of knowing intents is to better match users with information. Users with Story intent are storytelling, sharing experiences, or soliciting support, none of which can be answered by other threads.

Pattern Types
Each pattern falls under one of four types:
LSP: Lowercased + stemmed tokens only, e.g. “…what can caus…”
POSP: LSP + POS tags, e.g. “…how to <VERB>…”
SGP: LSP + semantic group tags, e.g. “…if <CHEM> works…”
ALL: All types of tokens and tags, e.g. “…<CHEM> make <PRP> feel…”
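One way to picture how such patterns could be matched is as an ordered subsequence test over a sentence whose tokens each carry several forms (stem, POS tag, semantic group tag). The sketch below is a guess at the mechanics, not the authors' implementation: the function name, the greedy matching with unlimited gaps, and the hand-written stems and tags are all assumptions.

```python
def matches(pattern, sentence):
    """Return True if `pattern` occurs in `sentence` as an ordered,
    possibly non-contiguous, subsequence.

    `pattern` is a list of strings (stems or tags such as "<VERB>").
    `sentence` is a list of sets; each set holds every form under which
    one token may match (its stem plus any tags assigned to it).
    """
    position = 0
    for forms in sentence:
        if position < len(pattern) and pattern[position] in forms:
            position += 1
    return position == len(pattern)


# Hypothetical pre-processed sentence for "How to treat hives quickly?"
# The stems and tags are written by hand here; in practice they would come
# from a stemmer, a POS tagger, and a semantic group tagger.
sentence = [
    {"how"},
    {"to"},
    {"treat", "<VERB>"},
    {"hive", "<DISO>"},
    {"quick", "<ADV>"},
]

print(matches(["how", "to", "<VERB>"], sentence))  # POSP-style pattern
print(matches(["how", "<DISO>"], sentence))        # tokens with a gap
print(matches(["<DISO>", "how"], sentence))        # wrong order
```

Each matched pattern would then become one binary feature of the post.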

UMLS Semantic Groups
MetaMap labels text phrases with semantic group labels from the UMLS Metathesaurus.

MetaMap is a piece of software developed by the National Library of Medicine (NLM) to map biomedical terms to the Unified Medical Language System’s Metathesaurus concepts and their associated semantic types and group names. The table on this slide shows the semantic type and group names that are considered by our program. During preprocessing, whenever we see a term with one of these types, we replace it with its corresponding semantic group name to facilitate pattern matching.
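The replacement step can be pictured with a toy lookup table standing in for MetaMap. This is only an illustration: the dictionary, its entries, and the function name are hypothetical, and real MetaMap output is far richer than a term-to-tag mapping.

```python
# Hypothetical stand-in for MetaMap output: term -> semantic group tag.
# Real MetaMap maps phrases to Metathesaurus concepts; this lookup table
# only illustrates the replacement step.
TOY_SEMANTIC_GROUPS = {
    "benadryl": "<CHEM>",
    "zoloft": "<CHEM>",
    "hives": "<DISO>",
}


def tag_semantic_groups(tokens, lexicon=TOY_SEMANTIC_GROUPS):
    """Replace each known term with its semantic group tag.

    Tokens that are not in the lexicon are returned unchanged.
    """
    return [lexicon.get(token.lower(), token) for token in tokens]


print(tag_semantic_groups("I wonder if Benadryl works for hives".split()))
```

After this step, a pattern such as “…if <CHEM> works…” can match posts about any drug rather than one specific drug name.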

Caveat Patterns possess limitations Difficult to achieve good coverage without sacrificing discriminative properties Impossible to extract for posts with large content variations (e.g. Story posts) However, we still want complete coverage of our dataset!

Solution: Hierarchical Classification!
Two cascading SVM classifiers:
The first uses binary pattern features (Pattern SVM).
The second uses unigram features with TF-IDF weighting (Word SVM).
Complete coverage allows comparison with the unigram baseline.
Flowchart: Input Post, then Match ≥ 1 pattern? If yes, Pattern SVM; if no, Word SVM. Either way, Output Class.

TF-IDF is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus.
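The cascade and the TF-IDF weighting can be sketched as follows. This is a minimal sketch under stated assumptions: the two classifiers are placeholder callables rather than trained SVMs, the function and variable names are invented here, and the TF-IDF variant (raw term count times log(N / df)) is one common choice, since the talk does not say which variant was used.

```python
import math
from collections import Counter


def tfidf(documents):
    """Weight each unigram by term frequency times inverse document frequency.

    `documents` is a list of token lists. Assumption: the plain variant
    tf * log(N / df).
    """
    n_docs = len(documents)
    doc_freq = Counter(token for doc in documents for token in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            token: count * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return vectors


def classify(post_tokens, patterns, pattern_classifier, word_classifier):
    """Route a post through the two-stage cascade.

    `patterns` maps a pattern name to a predicate over the token list.
    Both classifiers are placeholders for trained SVMs.
    """
    matched = {name for name, test in patterns.items() if test(post_tokens)}
    if matched:
        return pattern_classifier(matched)  # binary pattern features
    return word_classifier(post_tokens)     # unigram fallback


# Hypothetical one-pattern setup with stub classifiers.
patterns = {"what_can_caus": lambda tokens: "caus" in tokens}
pattern_clf = lambda matched: "Cause"
word_clf = lambda tokens: "Story"
print(classify(["what", "can", "caus", "hive"], patterns, pattern_clf, word_clf))
print(classify(["my", "stori"], patterns, pattern_clf, word_clf))
```

Because every post falls through to the word classifier when no pattern matches, the cascade always produces a label, which is what makes the comparison with a unigram baseline possible.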

Where are we? Problem formulation Intent taxonomy derivation Methodology Support vector machines Hierarchical classification Feature design Evaluation Dataset Experiments Results Intents in MedHelp forums Wrap-up

Dataset No labeled dataset exists, since this is a new problem So we create our own! 1,192 original HealthBoards posts, evenly divided among four topics: allergies, breast cancer, depression, and heart disease Ideally want more posts, but labeling is expensive Why the four topics?

Dataset Labeling
Labeling done by two CS students:
Substantial* agreement with medical students (κ = 0.67)
Substantial* agreement between themselves (κ = 0.665, 74.67% of labels match)
Combo posts labeled by a third CS student according to their underlying classes.
A Combo post is predicted correctly if a classifier outputs one of its class labels.
*Per Landis and Koch, 1977

We explored the feasibility of having medical students label our dataset, but they were too busy. What we ended up doing was to have them label a small number of posts, and compare the agreement between their labels and those of the CS students. Using Fleiss’ kappa, we find that the labelers actually have substantial agreement with the medical students. Next, we used Cohen’s kappa to measure the agreement between the two CS students, and found that they had substantial agreement as well. This gives us confidence that the labels are reasonably consistent. Finally, we have a third CS student label all the Combo posts with their underlying classes. These labels will be used during evaluation.
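For reference, Cohen's kappa for two annotators compares the observed agreement with the agreement expected by chance. The sketch below shows the computation on made-up labels; the function name and the example labels are assumptions, and the labels are not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Undefined (division by zero) when chance agreement is exactly 1,
    i.e. both annotators use a single identical label throughout.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both annotators pick the same
    # class if each labels independently with their own class frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four posts.
annotator_1 = ["Manage", "Manage", "Cause", "Cause"]
annotator_2 = ["Manage", "Manage", "Cause", "Adverse"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

On the Landis and Koch scale, values from 0.61 to 0.80 are read as substantial agreement, which is the band the reported values fall into.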

Experiments
What is the best performing set of patterns? Try different type combinations of patterns.
How does hierarchical classification compare with the baseline? Five-fold cross validation (CV).
Does performance suffer if we train on posts from three topics and test on the fourth? Four-fold forum CV.

We first want to use an SVM to classify our dataset using different combinations of pattern features to see which combination gives the best result. Once we know which set of patterns performs best, we’ll use that same set in hierarchical classification. We will perform both standard 5-fold cross validation and 4-fold forum cross validation and compare the results to the baseline, which is simply an SVM using unigram word features.
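The two evaluation schemes differ only in how posts are assigned to folds. The sketch below contrasts them; the function names and the miniature topic list are assumptions, and shuffling or stratification, which a real experiment would likely add, is left out.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) for standard k-fold CV.

    Items are dealt to folds round-robin; no shuffling is done here.
    """
    for fold in range(k):
        test = [i for i in range(n_items) if i % k == fold]
        train = [i for i in range(n_items) if i % k != fold]
        yield train, test


def forum_splits(topics):
    """Yield (held_out_topic, train_indices, test_indices), holding out
    every post of one topic per fold."""
    for held_out in sorted(set(topics)):
        test = [i for i, t in enumerate(topics) if t == held_out]
        train = [i for i, t in enumerate(topics) if t != held_out]
        yield held_out, train, test


# Hypothetical miniature dataset: two posts per topic.
topics = ["allergies", "breast cancer", "depression", "heart disease"] * 2
for held_out, train, test in forum_splits(topics):
    print(held_out, "train:", len(train), "test:", len(test))
```

With four topics, forum CV yields four folds, and the classifier never sees a post from the held-out topic during training.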

Selecting a Pattern Set
As we add more pattern types to our feature space, we see that the total number of posts that get matched increases. We decide to use the 6th feature space for hierarchical classification.

$$P = \frac{\mathit{Cor.}}{\mathit{Tot.}}, \qquad R = \frac{\mathit{Cor.}}{|M| + |C| + |A|}, \qquad F_1 = \frac{2PR}{P + R}$$
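The three measures can be computed directly from counts. The sketch below assumes one reading of the slide's symbols, which the transcript does not define: Cor. as the number of posts classified correctly, Tot. as the number of posts the classifier labeled, and M, C, and A as the sets of Manage, Cause, and Adverse posts. The function name and the example counts are hypothetical.

```python
def precision_recall_f1(correct, classified, n_manage, n_cause, n_adverse):
    """Compute P, R, and F1 from raw counts.

    Assumed reading of the slide: `correct` (Cor.) is the number of posts
    classified correctly, `classified` (Tot.) is the number of posts the
    classifier labeled, and the recall denominator is the number of
    Manage, Cause, and Adverse posts in the data.
    """
    precision = correct / classified if classified else 0.0
    relevant = n_manage + n_cause + n_adverse
    recall = correct / relevant if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical counts, not results from the talk.
p, r, f1 = precision_recall_f1(30, 40, 20, 20, 20)
print(p, r, round(f1, 3))
```

Under this reading, a classifier that labels only the posts matching a pattern can score high precision while its recall stays low, because unmatched posts still count in the recall denominator.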

CV Takeaways
Overall improvement is underwhelming. Why?
Patterns generalize well across forum topics.
Patterns give high precision but low recall. Why is this acceptable?
Patterns reach the labeling agreement upper bound.
[Tables: Hierarchical Classification Performance; Word Classifier (Baseline) Performance]

The Pattern SVM reaches the labeling agreement upper bound. Low recall is acceptable because we would in general prefer to classify the intents of fewer posts with high accuracy than more posts with lower accuracy.

Intents in MedHelp Forums We applied our Pattern SVM to 61,225 MedHelp posts split across allergies, breast cancer, depression, and heart disease Cause is popular. Manage is majority in depression. Depression has lots of Adverse posts. Allergies contain smaller ratio of Cause to Manage posts.

Concluding Remarks
Introduced the new problem of forum post intent analysis.
Designed the first taxonomy and dataset for classification.
Proposed a novel set of pattern features for SVMs.
Showed that patterns give high classification precision while generalizing well across forums.

Knowledge of intents has positive ramifications from an information retrieval perspective, for example in search and recommendation: it reduces the search space, allows forums to recommend relevant threads, and potentially yields more accurate search results. Patterns give precision close to the 75% agreement upper bound. The classifier can be used to classify posts from topics never seen in training.

Future Work
1. Administer a study of health forum user intents
2. Expand the pattern feature set to improve recall
3. Handle classification of Story posts
4. Identify all intents from Combo posts
5. Further evaluation with larger datasets

Item 1 would give a more accurate taxonomy; items 2-4 would give more accurate intent analysis. These extensions will give us a more accurate representation of the intent distributions across different forum topics.

Thank you! Questions? Comments?