The classification problem (Recap from LING570)
LING 572, Fei Xia, Dan Jinguji
Week 1: 1/10/08
Outline
Probability theory
The classification task
=> Both were covered in LING570 and are therefore part of the prerequisites.
Probability theory
Three types of probability
Joint probability: P(x, y) = the probability of x and y occurring together.
Conditional probability: P(x | y) = the probability of x given a specific value of y.
Marginal probability: P(x) = the probability of x, summed over all possible values of y.
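As a concrete illustration, all three quantities can be computed from one small joint-distribution table. The sketch below uses made-up numbers and hypothetical variable names, purely for illustration:

```python
# A toy joint distribution P(x, y) over x in {rain, sun} and y in {cold, warm}.
# The probabilities here are invented for illustration.
joint = {
    ("rain", "cold"): 0.3,
    ("rain", "warm"): 0.1,
    ("sun",  "cold"): 0.2,
    ("sun",  "warm"): 0.4,
}

def marginal_x(x):
    """P(x): sum the joint probability over all values of y."""
    return sum(p for (xi, yi), p in joint.items() if xi == x)

def conditional(x, y):
    """P(x | y) = P(x, y) / P(y)."""
    p_y = sum(p for (xi, yi), p in joint.items() if yi == y)
    return joint[(x, y)] / p_y

print(marginal_x("rain"))           # 0.3 + 0.1 = 0.4
print(conditional("rain", "cold"))  # 0.3 / 0.5 = 0.6
```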
Common tricks (I): marginal prob from joint prob
P(x) = Σ_y P(x, y)
Common tricks (II): chain rule
P(x1, x2, …, xn) = P(x1) P(x2 | x1) … P(xn | x1, …, xn-1)
Common tricks (III): Bayes rule
P(x | y) = P(y | x) P(x) / P(y)
Common tricks (IV): independence assumptions
A and B are conditionally independent given C:
P(A | B, C) = P(A | C)
P(A, B | C) = P(A | C) P(B | C)
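These identities are easy to sanity-check numerically. The following sketch verifies Bayes rule on a made-up joint distribution (the distribution and helper names are invented, not part of the lecture):

```python
# Numeric check of Bayes rule: P(x|y) = P(y|x) P(x) / P(y),
# on an invented joint distribution over (x, y).
joint = {
    ("rain", "cold"): 0.3,
    ("rain", "warm"): 0.1,
    ("sun",  "cold"): 0.2,
    ("sun",  "warm"): 0.4,
}

def p_x(x):
    return sum(p for (xi, _), p in joint.items() if xi == x)

def p_y(y):
    return sum(p for (_, yi), p in joint.items() if yi == y)

def p_x_given_y(x, y):
    return joint[(x, y)] / p_y(y)

def p_y_given_x(y, x):
    return joint[(x, y)] / p_x(x)

lhs = p_x_given_y("rain", "cold")
rhs = p_y_given_x("cold", "rain") * p_x("rain") / p_y("cold")
assert abs(lhs - rhs) < 1e-12  # Bayes rule holds
```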
Classification problem
Definition of a classification problem
Task:
– C = {c1, c2, …, cm} is a finite set of pre-defined classes (a.k.a. labels or categories).
– Given an input x, decide on its category y.
Multi-label vs. single-label problem:
– Single-label: each x is assigned exactly one class.
– Multi-label: an x can have multiple labels.
Multi-class vs. binary classification problem:
– Binary: |C| = 2.
– Multi-class: |C| > 2.
Conversion to a single-label binary problem
Multi-label → single-label:
– If the labels are unrelated, we can convert a multi-label problem into |C| binary problems: e.g., does x have label c1? Does it have label c2? … Does it have label cm?
Multi-class → binary:
– We can convert a multi-class problem into several binary problems; we will discuss this in Week #6.
=> We will focus on the single-label binary classification problem.
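The multi-label-to-binary conversion above can be sketched in a few lines. The instances and class names below are invented for illustration; each class c gets its own binary dataset answering "does x have label c?":

```python
# Convert a multi-label dataset into |C| binary datasets (one per class).
# The documents and labels here are invented for illustration.
C = ["c1", "c2", "c3"]

# Each instance pairs an input with its (possibly multiple) labels.
multilabel_data = [
    ("doc_a", {"c1", "c3"}),
    ("doc_b", {"c2"}),
    ("doc_c", {"c1"}),
]

def to_binary_problems(data, classes):
    """One binary dataset per class: the label is +1 if c applies, else -1."""
    return {c: [(x, 1 if c in labels else -1) for x, labels in data]
            for c in classes}

binary = to_binary_problems(multilabel_data, C)
print(binary["c1"])  # [('doc_a', 1), ('doc_b', -1), ('doc_c', 1)]
```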
Examples of classification tasks
Text classification
Document filtering
Language/author/speaker identification
Word sense disambiguation (WSD)
PP attachment
Automatic essay grading
…
Sequence labeling tasks
Tokenization / word segmentation
POS tagging
Named-entity (NE) detection
NP chunking
Parsing
Reference resolution
…
We can use classification algorithms + beam search.
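A minimal sketch of the "classification + beam search" idea: a local classifier scores each tag for each position given the previous tag, and beam search keeps only the top-k partial tag sequences. The tag set, words, and scores below are invented stand-ins for a real trained classifier:

```python
# Sequence labeling via local classification scores + beam search.
# local_score is a toy stand-in for a trained classifier; its numbers
# are invented for illustration.
TAGS = ["D", "N", "V"]

def local_score(prev_tag, word, tag):
    """Hypothetical log-score of `tag` for `word` given the previous tag."""
    emit = {("the", "D"): 2.0, ("dog", "N"): 2.0, ("barks", "V"): 2.0}
    trans = {("D", "N"): 1.0, ("N", "V"): 1.0}
    return emit.get((word, tag), 0.0) + trans.get((prev_tag, tag), 0.0)

def beam_search(words, beam_size=2):
    """Keep the `beam_size` best partial tag sequences at each position."""
    beam = [([], 0.0)]  # (tag sequence so far, cumulative score)
    for word in words:
        candidates = []
        for tags, s in beam:
            prev = tags[-1] if tags else "<s>"
            for t in TAGS:
                candidates.append((tags + [t], s + local_score(prev, word, t)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_size]
    return beam[0][0]

print(beam_search(["the", "dog", "barks"]))  # ['D', 'N', 'V']
```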
Steps for solving a classification problem
Split the data into training/test/validation sets
Data preparation
Training
Decoding
Postprocessing
Evaluation
The three main steps
Data preparation: represent the data as feature vectors.
Training: a trainer takes the training data as input and outputs a classifier.
Decoding: a decoder takes a classifier and the test data as input, and outputs the classification results.
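The trainer → classifier → decoder shape can be sketched end to end. The "learning algorithm" below is deliberately trivial (majority class), just to show the interfaces; the data and function names are invented:

```python
from collections import Counter

# Trainer -> classifier -> decoder, with majority-class as a trivial
# stand-in for a real learning algorithm. Data is invented.

def train(training_data):
    """Trainer: takes labeled (x, y) training data, outputs a classifier."""
    majority = Counter(y for _, y in training_data).most_common(1)[0][0]
    def classifier(x):
        # A classifier maps an input x to (class, score) pairs.
        return [(majority, 1.0)]
    return classifier

def decode(classifier, test_data):
    """Decoder: takes a classifier and test data, outputs predicted classes."""
    return [max(classifier(x), key=lambda cs: cs[1])[0] for x in test_data]

clf = train([("d1", "c2"), ("d2", "c2"), ("d3", "c1")])
print(decode(clf, ["d4", "d5"]))  # ['c2', 'c2']
```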
Data
An instance: an (x, y) pair
Labeled data: y is known
Unlabeled data: y is unknown
Training/test data: a set of instances
Data preparation: creating an attribute-value table

        f1     f2     f3     …    fK      Target
  d1    yes    1      no     …    -1000   c2
  d2
  d3
  …
  dn
Attribute-value table
Each row corresponds to an instance.
Each column corresponds to a feature.
A feature type (a.k.a. a feature template): w-1
A feature: w-1=book
Binary features vs. non-binary features
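As a small illustration of instantiating the w-1 template (the previous word) into binary features, the sketch below uses invented tokens and a hypothetical helper name; it produces one row of the attribute-value table as a feature dictionary:

```python
# Instantiate feature templates for one instance, producing one row of
# the attribute-value table. Tokens and helper names are invented.

def extract_features(tokens, i):
    """Binary features for the token at position i: w-1 (previous word)
    and w0 (current word) templates."""
    return {
        "w-1=" + (tokens[i - 1] if i > 0 else "<s>"): 1,
        "w0=" + tokens[i]: 1,
    }

row = extract_features(["a", "book", "review"], 2)
print(row)  # {'w-1=book': 1, 'w0=review': 1}
```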
The training stage
Three types of learning:
– Supervised learning: the training data is labeled.
– Unsupervised learning: the training data is unlabeled.
– Semi-supervised learning: the training data consists of both labeled and unlabeled data.
We will focus on supervised learning in LING 572.
The decoding stage
A classifier is a function f: f(x) = {(ci, scorei)}.
Given the test data, a classifier "fills out" a decision matrix:

        d1     d2     d3     …
  c1    0.1    0.4    0      …
  c2    0.9    0.1    0      …
  c3    …
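Filling out the decision matrix can be sketched as follows. The scores reuse the slide's example values where shown (0.1/0.4/0 for c1, 0.9/0.1/0 for c2); the c3 row and the classifier itself are invented for illustration:

```python
# A toy classifier returning a score per class, used to fill out a
# decision matrix (rows = classes, columns = test instances).
# Scores are invented, echoing the slide's example values.
def classifier(x):
    scores = {
        "d1": {"c1": 0.1, "c2": 0.9, "c3": 0.0},
        "d2": {"c1": 0.4, "c2": 0.1, "c3": 0.5},
        "d3": {"c1": 0.0, "c2": 0.0, "c3": 1.0},
    }
    return scores[x]

test_data = ["d1", "d2", "d3"]
classes = ["c1", "c2", "c3"]

# Decision matrix: matrix[c][x] is the classifier's score of class c for x.
matrix = {c: {x: classifier(x)[c] for x in test_data} for c in classes}

# Single-label decoding: pick the highest-scoring class per instance.
predictions = {x: max(classifier(x), key=classifier(x).get) for x in test_data}
print(predictions)  # {'d1': 'c2', 'd2': 'c3', 'd3': 'c3'}
```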
Important tasks (for you) in LING 572
Understand various learning algorithms.
Apply the algorithms to different tasks:
– Convert the data into an attribute-value table:
  Define feature types
  Perform feature selection
  Convert an instance into a feature vector
– Choose an appropriate learning algorithm.
Summary: important concepts in a classification task
– Instance: an (x, y) pair; y may be unknown
– Labeled data, unlabeled data
– Training data, test data
– Feature, feature type/template
– Feature vector
– Attribute-value table
– Trainer, classifier
– Training stage, test stage