TED Talks – A Predictive Analysis Using Classification Algorithms


Paulami, School of Information Sciences, University of Illinois at Urbana-Champaign

Introduction: TED talks are a rich source of knowledge and ideas on topics such as technology, entertainment, design, and academic research, presented by distinguished speakers.

Aim:
- Predict the number of views of a talk.
- Analyze the overall reaction to the talks based on user comments.

Dataset: The dataset contains details of TED talks from 2006 to 2017.

Visualizations:
Fig 1: Top Ten Speakers
Fig 2: Number of Views on Talks per year
Fig 3: Number of Talks per year

Findings:
- The number of views and the number of comments were correlated: talks with more views also had more comments.
- Most talks were on technology, and very few were on innovation.
- The number of talks increased over the years.
- The top ten speakers were mainly authors and motivational speakers.

Analysis:
Accuracy percentage for predictions:
Table 1: Number of views
Table 2: Reactions to talks
Precision Recall Curve:
Fig 6: Precision Recall Curve for predicting number of views
Fig 7: Precision Recall Curve for predicting reaction of talks
Confusion Matrix:
Fig 4: Number of views

Conclusions and Future Work: We implemented and tested five classification models. Logistic regression performed well in predicting the number of views of a talk, and the random forest algorithm gave the best accuracy for predicting the reaction to a talk. The same approach could be extended to datasets from media, advertising, and other online platforms to predict user reviews.

Data Cleaning and Pre-processing:
- Removed special characters such as '$', '/', and '^' from the data.
- Converted dates to a consistent ddmmyyyy format.
- Binarized the counts to separate high and low views: 1 for values above the median, -1 for values below the median.
- Removed outliers from the data.
- Categorized words into positive, negative, and neutral ratings.
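Three of the cleaning steps above can be sketched in a few lines of Python. This is only an illustration, not the poster's actual code: the function names, the assumed input date format (`%Y-%m-%d`), and the choice to label values exactly equal to the median as -1 are all assumptions. Outlier removal and word categorization are left out because the poster does not state the rules used.

```python
import re
from datetime import datetime
from statistics import median


def strip_special(text):
    """Remove the special characters named on the poster: '$', '/' and '^'."""
    return re.sub(r"[$/^]", "", text)


def to_ddmmyyyy(date_text, source_format="%Y-%m-%d"):
    """Re-express a date string as ddmmyyyy.

    The source format is an assumption; the poster does not say how
    dates were stored before cleaning.
    """
    return datetime.strptime(date_text, source_format).strftime("%d%m%Y")


def binarize_by_median(values):
    """Label each value 1 if it is above the median, otherwise -1.

    Values equal to the median are labeled -1 here, which is an
    assumption: the poster only covers 'more than' and 'less than'.
    """
    mid = median(values)
    return [1 if v > mid else -1 for v in values]


if __name__ == "__main__":
    # Made-up view counts, used purely to exercise the helpers.
    sample_views = [120, 4500, 980, 15000, 300]
    print(strip_special("a$b/c^d"))
    print(to_ddmmyyyy("2006-06-27"))
    print(binarize_by_median(sample_views))
```

Splitting at the median yields two classes of roughly equal size, which keeps a plain accuracy figure meaningful for the high/low views prediction.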
Fig 5: Confusion matrix for reaction to talks
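The accuracy percentages, confusion matrices, and precision-recall figures listed on the poster all come from comparing predicted labels with actual labels. The sketch below shows how such numbers could be computed for the 1 / -1 labels described under data cleaning. It is a minimal illustration using made-up labels, not the poster's results, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, fn, tn) for two equal-length label sequences."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted high, actually high
        elif p == positive:
            fp += 1  # predicted high, actually low
        elif a == positive:
            fn += 1  # predicted low, actually high
        else:
            tn += 1  # predicted low, actually low
    return tp, fp, fn, tn


def summarize(actual, predicted, positive=1):
    """Compute accuracy, precision and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Illustrative labels only: 1 = high views, -1 = low views.
    actual = [1, 1, -1, -1, 1, -1]
    predicted = [1, -1, -1, 1, 1, -1]
    print(confusion_counts(actual, predicted))
    print(summarize(actual, predicted))
```

A precision-recall curve repeats this calculation at many decision thresholds on a classifier's scores, so each point on the curve is one precision and recall pair of the kind computed here.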

