1 Learning Techniques for Big Data Redundant features – Group lasso – Feature selection.

Presentation transcript:

1 Learning Techniques for Big Data
Redundant features
– Group lasso
– Feature selection

2 Learning Techniques for Big Data
Online Learning for Group Lasso
Scenario: huge data sets whose grouped features arrive sequentially
How can the decision function be learned adaptively?

3 Learning Techniques for Big Data
Solutions and Properties
– Objective function
– Algorithm: three main steps
– Closed-form solution
– Theoretical guarantee
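The closed-form solution itself is not reproduced in this transcript. As a minimal sketch, assuming a dual-averaging style online update with group soft-thresholding (the function name, the √t/γ step-size schedule, and all parameters are illustrative assumptions, not necessarily the deck's exact algorithm):

```python
import numpy as np

def group_soft_threshold(g_bar, groups, lam, gamma, t):
    """One closed-form online group-lasso update (dual-averaging sketch).

    g_bar  : running average of (sub)gradients seen so far, shape (d,)
    groups : list of index arrays, one per feature group
    lam    : group-lasso regularization strength
    gamma  : step-size scaling constant
    t      : current round (1-based)
    """
    w = np.zeros_like(g_bar)
    scale = np.sqrt(t) / gamma  # assumed sqrt(t)/gamma schedule
    for idx in groups:
        g = g_bar[idx]
        norm = np.linalg.norm(g)
        if norm > lam:  # group survives shrinkage
            w[idx] = -scale * (1.0 - lam / norm) * g
    return w
```

Each round the averaged gradient is shrunk groupwise: an entire feature group is zeroed when its gradient norm falls below λ, which is what makes the learned model sparse at the group level.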

4 Learning Techniques for Big Data
Redundant features
– Group lasso
– Feature selection
Insufficient labeled data
– Multi-task learning
– Unsupervised learning

5 Learning Techniques for Big Data
Online Learning for Multi-task Feature Selection (aMTFS)
Problems and Motivation
– Learning multiple related tasks simultaneously to improve performance
– Redundant or irrelevant features exist
– Data arrive sequentially
Challenges
– How to adaptively update the models while selecting the important features?
– Any theoretical guarantee?

6 Learning Techniques for Big Data
Solution and Properties
– Objective function
– Algorithm: three main steps
– Closed-form solution
– Theoretical guarantee
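Analogously, the multi-task closed-form step can be sketched as a row-wise ℓ2,1 shrinkage on a shared weight matrix (one row per feature, one column per task), so that a feature is kept or dropped jointly across all tasks. Again, the function name and the √t/γ schedule are illustrative assumptions:

```python
import numpy as np

def l21_row_threshold(G_bar, lam, gamma, t):
    """Closed-form row-wise l2,1 shrinkage for online multi-task
    feature selection (dual-averaging style sketch).

    G_bar : (d, T) running average of per-task gradients
    Returns W with an entire feature row zeroed when its
    cross-task gradient norm falls below lam.
    """
    scale = np.sqrt(t) / gamma
    norms = np.linalg.norm(G_bar, axis=1, keepdims=True)      # (d, 1)
    shrink = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return -scale * shrink * G_bar
```

Because the norm is taken across the task dimension, a feature is selected only if it is useful for the tasks jointly; this is the mechanism behind selecting "important features" while updating all task models adaptively.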

7 Learning Techniques for Big Data
Redundant features
– Group lasso
– Feature selection
Insufficient labeled data
– Multi-task learning
– Unsupervised learning
Complicated decision function
– Multiple kernel learning: level-method speedup, generalization

8 Learning Techniques for Big Data
Sparse Generalized Multiple Kernel Learning
Data characteristics
– Multi-source
– Heterogeneous
Labeled data: Horse/Donkey
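To make the multi-source idea concrete, here is a hedged sketch of a decision function built from a sparse convex combination of heterogeneous base kernels. Kernel ridge regression stands in for the deck's actual learner, and the kernel choices, the fixed β weights, and all function names are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Gaussian kernel between row-vector sets X (n,d) and Y (m,d)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def linear_kernel(X, Y):
    return X @ Y.T

def mkl_predict(X_train, y, X_test, beta, reg=1e-2):
    """Decision function with a sparse convex kernel combination.

    beta : nonnegative weights summing to 1; a zero entry
           effectively drops that kernel (and its data source).
    """
    kernels = [rbf_kernel, linear_kernel]
    K = sum(b * k(X_train, X_train) for b, k in zip(beta, kernels))
    K_test = sum(b * k(X_test, X_train) for b, k in zip(beta, kernels))
    # kernel ridge regression on the combined kernel
    alpha = np.linalg.solve(K + reg * np.eye(len(y)), y)
    return K_test @ alpha
```

In a full MKL method, β would itself be learned with a sparsity-inducing penalty, so that uninformative sources (kernels) receive zero weight; here β is fixed only to keep the sketch short.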

9 Learning Techniques for Big Data
Redundant features
– Group lasso
– Feature selection
Insufficient labeled data
– Multi-task learning
– Unsupervised learning
Complicated decision function
– Multiple kernel learning: level-method speedup, generalization
Volume data
– Online learning
Related work
– ICML'10, CIKM'10-11, IJCNN'10, IEEE TNN'11, ACM TKDD'13