Cost-Sensitive Learning for Large-Scale Hierarchical Classification of Commercial Products
Jianfu Chen, David S. Warren
Stony Brook University


1 Cost-Sensitive Learning for Large-Scale Hierarchical Classification of Commercial Products
Jianfu Chen, David S. Warren, Stony Brook University

2 Classification is a fundamental problem in information management.
Content → Spam / Ham
Product description → UNSPSC taxonomy:
Segment: Food Beverage and Tobacco Products (50); Vehicles and their Accessories and Components (25); Office Equipment and Accessories and Supplies (44)
Family: Marine transport (11); Motor vehicles (10); Aerospace systems (20)
Class: Product and material transport vehicles (16); Passenger motor vehicles (15); Safety and rescue vehicles (17)
Commodity: Limousines (06); Automobiles or cars (03); Buses (02)

3 How should we design a classifier for a given real world task?

4 Method 1. No design
Training set → f(x) → test set
Try off-the-shelf classifiers: SVM, logistic regression, decision tree, neural network, ...
Implicit assumption: we are trying to minimize error rate or, equivalently, maximize accuracy.

5 Method 2. Optimize what we really care about
What is the classifier used for? How do we evaluate its performance according to our interests?
Quantify what we really care about.
Optimize what we care about.

6 Hierarchical classification of commercial products
Textual product description → UNSPSC taxonomy:
Segment: Food Beverage and Tobacco Products (50); Vehicles and their Accessories and Components (25); Office Equipment and Accessories and Supplies (44)
Family: Marine transport (11); Motor vehicles (10); Aerospace systems (20)
Class: Product and material transport vehicles (16); Passenger motor vehicles (15); Safety and rescue vehicles (17)
Commodity: Limousines (06); Automobiles or cars (03); Buses (02)

7 Product taxonomy helps customers find desired products quickly.
Facilitates exploring similar products
Helps product recommendation
Facilitates corporate spend analysis
Example: Looking for gift ideas for a kid? Toys & Games: dolls, building toys, puzzles, ...

8 We assume misclassification of products leads to revenue loss.
Example: textual product description of a mouse
Taxonomy: Product → ... → Desktop computer and accessories → mouse, keyboard, ...; Product → ... → pet → ...
Correctly classified: realize an expected annual revenue
Misclassified: lose part of the potential revenue

9 What do we really care about? A vendor’s business goal is to maximize revenue, or equivalently, minimize revenue loss

10 Observation 1: the misclassification cost of a product depends on its potential revenue.

11 Observation 2: the misclassification cost of a product depends on how far apart the true class and the predicted class are in the taxonomy.
Example: textual product description of a mouse
Taxonomy: Product → ... → Desktop computer and accessories → mouse, keyboard, ...; Product → ... → pet → ...

12 The proposed performance evaluation metric: average revenue loss
[Equation not preserved in this transcript; surviving labels: d(y, y') and "revenue loss of product x"]
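The formula on this slide did not survive extraction, so the following is only a minimal sketch of what an average-revenue-loss metric could look like. It assumes each class is a root-to-leaf path in the 4-level taxonomy, that d(y, y') counts the levels from the leaf up to the lowest common ancestor, and that the fraction of revenue lost grows linearly with d. The function names, the linear `loss_ratio`, and the example codes and revenues are all hypothetical, not taken from the talk.

```python
from typing import Sequence, Tuple

Path = Tuple[str, ...]  # root-to-leaf codes: (segment, family, class, commodity)


def tree_distance(y: Path, y_pred: Path) -> int:
    """Levels from the leaf up to the lowest common ancestor (0 if identical)."""
    common = 0
    for a, b in zip(y, y_pred):
        if a != b:
            break
        common += 1
    return len(y) - common


def loss_ratio(d: int, depth: int = 4) -> float:
    """ASSUMPTION: the lost fraction of revenue is linear in the distance."""
    return d / depth


def average_revenue_loss(revenues: Sequence[float],
                         y_true: Sequence[Path],
                         y_pred: Sequence[Path]) -> float:
    """Mean over products of revenue * loss_ratio(d(y, y'))."""
    total = sum(v * loss_ratio(tree_distance(y, yp))
                for v, y, yp in zip(revenues, y_true, y_pred))
    return total / len(revenues)


# Illustrative paths; only the car/bus commodity codes follow the slides.
car = ("25", "10", "15", "03")
bus = ("25", "10", "15", "02")
food = ("50", "00", "00", "00")  # hypothetical leaf in another segment

revenues = [100.0, 200.0, 400.0]  # made-up revenues
avg = average_revenue_loss(revenues, [car, car, car], [car, bus, food])
print(avg)
```

Under these assumptions, a product filed under a sibling commodity loses a quarter of its revenue, while one filed under a different segment loses all of it.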

13 Learning: minimizing average revenue loss
Minimize a convex upper bound

14 Multi-class SVM with margin re-scaling
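The slide's equation is not in the transcript. As a sketch of the standard margin re-scaling idea, the per-example loss below is the maximum over candidate classes y' of Δ(y, y') + w_y'·x − w_y·x, which upper-bounds the loss Δ(y, ŷ) of the actual prediction ŷ whenever Δ(y, y) = 0. The weight layout, the helper names, and the toy numbers are assumptions for illustration, not the authors' implementation.

```python
from typing import Callable, List, Sequence


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def margin_rescaled_hinge(W: List[Sequence[float]],
                          x: Sequence[float],
                          y: int,
                          delta: Callable[[int, int], float]) -> float:
    """max over y' of delta(y, y') + w_y'.x - w_y.x.

    Requires delta(y, y) == 0, so the y' == y term is 0 and the loss is >= 0.
    """
    score_true = dot(W[y], x)
    return max(delta(y, yp) + dot(W[yp], x) - score_true
               for yp in range(len(W)))


# Toy example: 3 classes, 2 features, true class 0.
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
x = [2.0, 0.0]


def zero_one(y: int, yp: int) -> float:
    return 0.0 if y == yp else 1.0


def costly(y: int, yp: int) -> float:
    # Hypothetical costs: confusing class 0 with class 2 is expensive.
    if y == yp:
        return 0.0
    return 3.0 if yp == 2 else 1.0


loss_01 = margin_rescaled_hinge(W, x, 0, zero_one)
loss_cost = margin_rescaled_hinge(W, x, 0, costly)
print(loss_01, loss_cost)
```

With the 0-1 cost the example is already separated by a sufficient margin; with the larger cost on class 2, the same scores incur a positive loss, so training would push that class further away.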

15 Multi-class SVM with margin re-scaling: plug in any loss function
0-1: error rate (standard multi-class SVM)
VALUE: product revenue
TREE: hierarchical distance
REVLOSS: revenue loss
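One way to read the four variants is as interchangeable loss functions passed to the same learner. The sketch below is an interpretation of the table labels only: the exact definitions used in the talk are not shown in the transcript, and the linear revenue-loss ratio, the function names, and the example values are assumptions.

```python
from typing import Callable, Dict, Tuple

Path = Tuple[str, ...]  # root-to-leaf codes in the taxonomy


def tree_distance(y: Path, yp: Path) -> int:
    """Levels from the leaf up to the lowest common ancestor."""
    common = 0
    for a, b in zip(y, yp):
        if a != b:
            break
        common += 1
    return len(y) - common


def loss_zero_one(v: float, y: Path, yp: Path) -> float:
    return 0.0 if y == yp else 1.0


def loss_value(v: float, y: Path, yp: Path) -> float:
    # Any mistake costs the product's full revenue v.
    return 0.0 if y == yp else v


def loss_tree(v: float, y: Path, yp: Path) -> float:
    # Cost depends only on the hierarchical distance.
    return float(tree_distance(y, yp))


def loss_revloss(v: float, y: Path, yp: Path) -> float:
    # ASSUMPTION: lost revenue = v * (distance / depth of the tree).
    return v * tree_distance(y, yp) / len(y)


LOSSES: Dict[str, Callable[[float, Path, Path], float]] = {
    "0-1": loss_zero_one,
    "VALUE": loss_value,
    "TREE": loss_tree,
    "REVLOSS": loss_revloss,
}

car = ("25", "10", "15", "03")
bus = ("25", "10", "15", "02")
results = {name: fn(200.0, car, bus) for name, fn in LOSSES.items()}
print(results)
```

Every function returns 0 for a correct prediction, which is what margin re-scaling needs from a plugged-in loss.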

16 Dataset
UNSPSC (United Nations Standard Products and Services Code) dataset
data source: multiple online marketplaces oriented toward DoD and Federal government customers (GSA Advantage, DoD EMALL)
taxonomy structure: 4-level balanced tree (UNSPSC taxonomy)
#examples: 1.4M
#leaf classes: 1073
Product revenues are simulated: revenue = price * sales

17 Experimental results Average revenue loss (in K$) of different algorithms

18 What’s wrong? Revenue loss ranges from a few K to several M

19 Loss normalization
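The normalization formula on this slide is not in the transcript. The sketch below shows one plausible scheme, offered purely as an assumption: linearly rescale the non-zero losses into a fixed range [1, max_scaled] so that raw losses spanning thousands to millions of dollars produce margins on a comparable scale. The function name and the `max_scaled` parameter are hypothetical.

```python
from typing import List, Sequence


def normalize_losses(losses: Sequence[float],
                     max_scaled: float = 10.0) -> List[float]:
    """Linearly map positive losses into [1, max_scaled]; zeros stay zero.

    ASSUMPTION: this is an illustrative scheme, not the talk's formula.
    """
    positive = [l for l in losses if l > 0]
    if not positive:
        return [0.0 for _ in losses]
    lo, hi = min(positive), max(positive)
    span = hi - lo
    scaled = []
    for l in losses:
        if l <= 0:
            scaled.append(0.0)   # correct predictions keep zero loss
        elif span == 0:
            scaled.append(1.0)   # all positive losses are equal
        else:
            scaled.append(1.0 + (l - lo) / span * (max_scaled - 1.0))
    return scaled


raw = [0.0, 2000.0, 3000000.0, 1001000.0]  # made-up losses in dollars
scaled = normalize_losses(raw)
print(scaled)
```

Keeping the smallest non-zero loss at 1 preserves a unit margin for every mistake, as in the standard multi-class SVM, while the upper bound caps how strongly a few very high-revenue products can dominate the objective.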

20 Final results
Average revenue loss (in K$) of different algorithms
7.88% reduction in average revenue loss!

21 Conclusion
Performance evaluation metric: What do we really care about for this task? Minimize error rate? Minimize revenue loss?
Model + tractable loss function: How do we approximate the performance evaluation metric to make it tractable?
Optimization: Find the best parameters.
Regularized empirical risk minimization
A general method: multi-class SVM with margin re-scaling and loss normalization

22 Thank you! Questions?

