BIA 660 Web Analytics - Midterm Akshta Chougule Hao Han Di Huo Xi Lu Laura Sills Bank Of America.


1 BIA 660 Web Analytics - Midterm. Akshta Chougule, Hao Han, Di Huo, Xi Lu, Laura Sills. Bank of America

2 Business Problem Customer Strategy: grow the customer base by forming lifelong banking relationships with young adults. Current Account Demographics Report shows: ●fewer new student accounts ●an increase in account cancellations by the young adult demographic. Impact: losing market share to other banks

3 Business Questions ●What is Bank of America’s reputation with this age group: do they like Bank of America or not? ●How does Bank of America compare to other banks? ●Are customers in this demographic group unhappy with the bank’s services? ●Are there banking products that customers in this group want but Bank of America does not offer?

4 Source of Information Online social media sites are a good source for comments from this age group

5 YouTube Statistics ● More than 1 billion unique users monthly ●Nielsen ratings show that YouTube reaches more US adults ages than any other cable network

6 Demographics of Reddit chart/277513/

7 What do People Think About Banks?

8
Topic                    Reddit  YouTube  Twitter
mortgage                 5%      6%       30%
loan                     5%      13%      0%
fraud                    6%      7%       0%
insurance                1%      2%       0%
branch                   3%      1%       0%
hours                    2%      1%       0%
account                  19%     16%      20%
overdraft                8%      1%       0%
bailout                  1%      6%       0%
fee                      18%     11%      20%
customer                 13%     8%       0%
representative / teller  7%      18%      20%
[credit] union           10%     7%       10%
computer                 1%               0%
CEO                      2%               0%

9 Data Gathering and Validation Use Python to obtain comments from web ●Crawling Reddit ●API for Twitter ●API for YouTube
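As a rough illustration of the Reddit crawling step, the sketch below pulls comment text out of a thread that has already been downloaded and parsed. It assumes Reddit's public JSON listing layout (comments as `t1` nodes with a `body` field and nested `replies`); the slides do not say which format the team actually crawled. The function name `extract_comment_bodies` and the sample thread are hypothetical.

```python
def extract_comment_bodies(node):
    """Recursively collect comment text from a parsed Reddit-style listing."""
    bodies = []
    if isinstance(node, list):
        for item in node:
            bodies.extend(extract_comment_bodies(item))
    elif isinstance(node, dict):
        kind = node.get("kind")
        data = node.get("data", {})
        if kind == "Listing":
            bodies.extend(extract_comment_bodies(data.get("children", [])))
        elif kind == "t1":  # "t1" marks a comment (assumed layout)
            bodies.append(data.get("body", ""))
            # Replies are a nested listing, or an empty string when absent.
            bodies.extend(extract_comment_bodies(data.get("replies", "")))
    return bodies


# Made-up thread for illustration only; not data from the project.
sample_thread = [
    {"kind": "Listing", "data": {"children": [
        {"kind": "t3", "data": {"title": "Which bank for students?"}},
    ]}},
    {"kind": "Listing", "data": {"children": [
        {"kind": "t1", "data": {
            "body": "The overdraft fee is too high.",
            "replies": {"kind": "Listing", "data": {"children": [
                {"kind": "t1", "data": {
                    "body": "I switched to a credit union.",
                    "replies": "",
                }},
            ]}},
        }},
        {"kind": "more", "data": {"count": 12}},
    ]}},
]

comments = extract_comment_bodies(sample_thread)
```

Posts (`t3`) and "load more" stubs are skipped, so only comment bodies reach the later cleansing step.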

10 Data Cleansing and Exploration ●Delete incomplete comments, extra whitespace, punctuation, and stopwords ●Explore the data using Python to analyze the frequency of words in the comments in order to identify “key words” related to banking ●Word scan confirmed the key words
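The cleansing and word-frequency steps above can be sketched in a few lines of standard-library Python. This is a minimal sketch, not the team's script: the stopword list is a tiny hypothetical one, "incomplete" is approximated as "empty", and the sample comments are made up.

```python
import string
from collections import Counter

# Tiny illustrative stopword list (assumption); a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "my", "to", "and", "of", "on", "i", "me", "for"}


def clean_comment(text):
    """Lowercase, strip punctuation, collapse whitespace, drop stopwords."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [word for word in text.split() if word not in STOPWORDS]


def word_frequencies(comments):
    """Count how often each cleaned word appears across all comments."""
    counts = Counter()
    for comment in comments:
        if not comment or not comment.strip():
            continue  # treat empty comments as incomplete and skip them
        counts.update(clean_comment(comment))
    return counts


# Made-up comments for illustration only.
sample_comments = [
    "The overdraft FEE is outrageous!!",
    "   ",
    "Charged me a fee on my account, again.",
]
freqs = word_frequencies(sample_comments)
top_word = freqs.most_common(1)
```

Sorting the counter with `most_common` is one way to surface candidate key words such as "fee" or "account" for manual review.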

11 Gathering data from Twitter ●Technique: Twitter API ●Amount of tweets: BOA KB Citibank KB Chase KB ●Time span: 1 week ●Type of Data: Tweet text Tweet created_at Geocode

12 Data Processing ●Two libraries: positive & negative ●Score each tweet
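One common way to score a tweet against two word lists is to count positive matches and subtract negative matches. The slide does not give the exact scoring rule, so the formula, the word lists, and the sample tweets below are all assumptions for illustration.

```python
# Hypothetical word lists; the project's actual libraries are not shown in the slides.
POSITIVE_WORDS = {"great", "helpful", "love", "easy", "thanks"}
NEGATIVE_WORDS = {"fee", "worst", "hate", "fraud", "rude"}


def score_tweet(text):
    """Return (number of positive words) - (number of negative words)."""
    words = [w.strip(".,!?:;\"'()#@") for w in text.lower().split()]
    positives = sum(w in POSITIVE_WORDS for w in words)
    negatives = sum(w in NEGATIVE_WORDS for w in words)
    return positives - negatives


# Made-up tweets for illustration only.
tweets = [
    "Love the new app, so easy!",
    "Worst bank ever. Another hidden fee.",
    "Visited the branch today",
]
scores = [score_tweet(t) for t in tweets]
```

Under this rule a score above zero suggests a favorable tweet, below zero an unfavorable one, and zero a neutral or unmatched one.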

13 Tweets by Location

14 Data Processing ● Summary for BOA tweets: ●Good or bad? Min.1st Qu.MedianMean3rd Qu.Max
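The column labels on this slide (Min., 1st Qu., Median, Mean, 3rd Qu., Max) resemble the output of R's `summary()`. A comparable summary can be sketched in Python with the standard `statistics` module; the `inclusive` quantile method is chosen here on the assumption that it matches R's default interpolation, and the input scores are made up.

```python
import statistics


def six_number_summary(scores):
    """Min, quartiles, mean, and max of a list of tweet scores."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),
        "q1": q1,
        "median": median,
        "mean": statistics.mean(scores),
        "q3": q3,
        "max": max(scores),
    }


# Made-up scores for illustration only; not the BOA results.
summary = six_number_summary([-2, -1, 0, 0, 1, 3])
```

A mean or median below zero would point toward "bad" overall sentiment, and above zero toward "good".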

15 Competitor Analysis

16 Distribution for tweets’ score Mean: BOA: Citi bank: Chase:

17 Two Sample T-test Null hypothesis: true difference in means is equal to 0 Alpha=0.1 ●BOA and Citi bank: p-value = < 0.1 ●Citi bank and Chase: p-value = < 0.1 ●BOA and Chase p-value = > 0.1
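For readers who want to reproduce this kind of comparison, here is a minimal sketch of a two-sample test on score lists. It assumes Welch's unequal-variance form and, to stay within the standard library, approximates the p-value with the normal distribution, which is only reasonable for large samples such as a week of tweets. The function name and sample scores are hypothetical.

```python
import math
import statistics


def welch_t_test(sample_a, sample_b):
    """Welch's t statistic, degrees of freedom, and an approximate p-value."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    se_sq_a = statistics.variance(sample_a) / len(sample_a)
    se_sq_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(se_sq_a + se_sq_b)
    # Welch-Satterthwaite degrees of freedom.
    dof = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (len(sample_a) - 1) + se_sq_b ** 2 / (len(sample_b) - 1)
    )
    # Two-sided p-value; the normal curve stands in for the t distribution
    # (an approximation that is acceptable only when samples are large).
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(t_stat)))
    return t_stat, dof, p_value


# Made-up score samples for illustration only.
scores_a = [1, 2, 0, 1, 2, 1, 0, 1] * 5
scores_b = [0, -1, 1, 0, -1, 0, 1, 0] * 5

ALPHA = 0.1  # significance level used on the slide
t_stat, dof, p_value = welch_t_test(scores_a, scores_b)
reject_null = p_value < ALPHA
```

When `reject_null` is true, the difference in mean score between the two banks is treated as statistically significant at the chosen alpha.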

18 Gathering data from YouTube ● Techniques: BeautifulSoup g.data ●Amount for general analysis: 3097

19
Topic                    Reddit  YouTube  Twitter
mortgage                 5%      6%       30%
loan                     5%      13%      0%
fraud                    6%      7%       0%
insurance                1%      2%       0%
branch                   3%      1%       0%
hours                    2%      1%       0%
account                  19%     16%      20%
overdraft                8%      1%       0%
bailout                  1%      6%       0%
fee                      18%     11%      20%
customer                 13%     8%       0%
representative / teller  7%      18%      20%
[credit] union           10%     7%       10%
computer                 1%               0%
CEO                      2%               0%

20 YouTube data for each category ● Training data: 600 ●Loan: 2430 ●Account: 2700 ●Service: 520

21 Naive Bayes Classification Algorithm A naive Bayes classifier assumes that the presence or absence of a particular feature is unrelated to the presence or absence of any other feature, given the class variable.
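Written out, the independence assumption lets the class posterior factor into per-feature terms. The symbols below are introduced here for illustration and do not appear in the slides: $C$ is the class and $x_1, \dots, x_n$ are the features (for example, words in a comment).

```latex
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C),
\qquad
\hat{C} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

The predicted class $\hat{C}$ is simply the one with the largest product of prior and per-feature likelihoods.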

22 Naive Bayes Classification Algorithm Splitting the dataset into training and test data (Manual rating of comments) ●Training (400) ●Testing (200) ●Predicting (5700)
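The train/test/predict workflow can be sketched with a small multinomial naive Bayes classifier. This is an illustrative sketch, not the team's implementation: the slides do not say which library or variant was used, add-one smoothing is an assumption, and the labels and toy comments are made up. In the project's setup the manually rated comments would be split 400 for training and 200 for testing.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """labeled_docs: list of (list_of_words, label) pairs. Returns the model counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(model, words):
    """Return the label with the highest log posterior for a list of words."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_log_prob = None, -math.inf
    for label, doc_count in class_counts.items():
        log_prob = math.log(doc_count / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in words:
            # Add-one smoothing keeps unseen words from zeroing the product.
            count = word_counts[label][word] + 1
            log_prob += math.log(count / (total_words + len(vocabulary)))
        if log_prob > best_log_prob:
            best_label, best_log_prob = label, log_prob
    return best_label


def accuracy(model, labeled_docs):
    """Share of held-out documents whose predicted label matches the manual one."""
    correct = sum(predict(model, words) == label for words, label in labeled_docs)
    return correct / len(labeled_docs)


# Toy, made-up data standing in for the manually rated comments.
training = [
    (["overdraft", "fee", "unfair"], "negative"),
    (["rude", "teller", "fee"], "negative"),
    (["helpful", "teller", "thanks"], "positive"),
    (["easy", "account", "setup"], "positive"),
]
testing = [
    (["unfair", "fee"], "negative"),
    (["helpful", "setup"], "positive"),
]

model = train_naive_bayes(training)
test_accuracy = accuracy(model, testing)
```

Once accuracy on the held-out set looks acceptable, the same `predict` call can be applied to the remaining unlabeled comments.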

23 Primary Categories of Customer Complaints

24 Accuracy of Classification ● Mortgage: 64.5% ●Accounts: 58.7% ●Service: 68.4%

25 Mortgage

26 Account

27 Service

28 Thank you!

