Team #6: Bill Cheng, Sabina Del Rosso, Stephen Hom, Omede Firouz, Stacy Hsueh, Wei Jiang, Thoranis Karnasuta. Social Networking Analytics for Calbee (SNAC)


1 Social Networking Analytics for Calbee (SNAC)
Team #6: Bill Cheng, Sabina Del Rosso, Stephen Hom, Omede Firouz, Stacy Hsueh, Wei Jiang, Thoranis Karnasuta
Professor Ken Goldberg. IEOR 115. December 9, 2011.
CLIENT | EER/SCHEMA | NORMALIZATION | QUERIES | DATABASE

2 Client Background: Calbee San Francisco
CALBEE, Inc. is one of the largest snack companies in Japan
– Company based on the premise of good health
Calbee, San Francisco is the company’s first US-based flagship store
– Founded in early 2011
– Located in Westfield Mall
Active in social media
– Website, Facebook, Twitter
Image from calbeeshop.com

3 Current Infrastructure
Social media hits are not currently tracked on any site
Point of Sale is used for sales data and employee clock-ins
Image from http://www.unrealstudio.com

4 Database Objectives
Handle future expansion into e-commerce
Increase social media marketing in targeted demographics
View the effect of promotions on sales and social media to better tailor future promotions
Provide a foundation to maximize profits
– Logistics management using integer programming
– Data mining and machine learning to predict sales

5 EER Diagram

6 Relational Design Schema (46 relations)
Promotion/Sales/Retail: Relations Numbered 0-9
1. PRODUCT(ProdID, Name, IsSour, IsSweet, IsSalty, IsSavory, ManufCost, RetailPrice)
2. PURCHASE(PurchaseID, ProdID^1, PromoID^3a, CustID^6a, StoreID^4a, EmpID^5a, Timestamp, ipAddress)
3a. PROMOTION(PromoID, PromoCode, StoreID, StartDate, EndDate, Discount)
3b. PROMOTION_SPREAD_VIA_TWITTER(PromoID^3a, TweetID^10c)
3c. PROMOTION_SPREAD_VIA_F(PromoID^3a, F_CID^11c)
3d. PROMOTION_SPREAD_VIA_G+(PromoID^3a, G_CID^12c)
3e. PROMOTION_SPREAD_VIA_S(PromoID^3a, S_DID^13c)
3f. PROMOTION_SPREAD_VIA_B(PromoID^3a, BPost_ID^14a)
3g. PROMOTION_INFO_VIA_W(PromoID^3a, url^15)
4a. STORE(StoreID, AddressNo, StreetName, City, Country, ZipCode, PhoneNo)
4b. STORE_CARRIES(StoreID^4a, ProdID^1, Stock)
5a. EMPLOYEE(EmpID, LName, FName, Position, FavProdID^1, StoreID^4, AddressNo, StreetName, City, State, Country, ZipCode, SSN)
5b. EMPLOYEE_IS_FRIEND(EmpID^5a, T_UID^10a, F_UID^11a, G_UID^12a, S_UID^13a)
5c. EMPLOYEE_IS_CUSTOMER(EmpID^5a, CustID^6a)
6a. CUSTOMER(CustID, LName, FName, AddressNo, StreetName, City, State, Country, ZipCode, FavProd^1, BirthDate)
6b. CUSTOMER_IS_FRIEND(CustID^6a, T_UID^10a, F_UID^11a, G_UID^12a, S_UID^13a)
8a. PRODUCT_AD(P_Ad_ID, ProductID^1, DateBeginAd, DateEndAd, F_or_G_Ad)
8b. STORE_AD(S_Ad_ID, Store_ID, DateBeginAd, DateEndAd, F_or_G_Ad)
8c. F_P_AD_CLICKED(P_Ad, F_UID, Timestamp, ipAddress)
8d. G_P_AD_CLICKED(P_Ad_ID, G_UID, Timestamp, ipAddress)
8e. F_S_AD_CLICKED(S_Ad_ID, F_UID, Timestamp, ipAddress)
8f. G_S_AD_CLICKED(S_Ad_ID, G_UID, Timestamp, ipAddress)

7 Relational Design Schema Cont.
Social Media: Relations Numbered 10-19
10a. T_USER(T_UID, T_Username, Fname, Lname, City, State, BirthDate, Email)
10b. T_FOLLOWING(T_UID^10a, Follower_T_UID^10a, DateBeganFollowing)
10c. TWEET(TweetID, T_UID^10a, Auth_T_UID^10a, TextStr, Timestamp)
11a. F_USER(F_UID, Fname, Lname, City, State, BirthDate, Email)
11b. F_FRIENDS(F_UID^11a, Friend_F_UID^11a, DateBecameFriends)
11c. F_COMMENT(F_CID, Auth_F_UID^11a, On_F_CID^11c, TextStr, Timestamp)
11d. F_LIKE(F_CID^11c, F_UID^11a, Timestamp)
12a. G_USER(G_UID, Fname, Lname, City, State, BirthDate, Email)
12b. G_FRIENDS(G_UID^12a, Friend_G_UID^12a, DateBecameFriends)
12c. G_COMMENT(G_CID, Auth_G_UID^12a, On_G_CID^12c, TextStr, Timestamp)
12d. G_LIKE(G_CID^12c, G_UID^12a, Timestamp)
13a. S_USER(S_UID, Fname, Lname, City, State, BirthDate, Email)
13b. S_FOLLOWING(S_UID^13a, Follower_S_UID^13a, DateBeganFollowing)
13c. S_DISCOVERY(S_DID, S_UID^13a, url, Timestamp)
13d. S_REVIEW(S_DID^13c, S_UID^13a, TextStr, Like/Dislike, Timestamp)
14a. BLOG_POST(url, BPost_ID, Author_Emp_ID^5a, TextStr, Timestamp)
14b. BLOG_COMMENT(BComment_ID, url, BPost_ID^14a, TextStr, Timestamp, ipAddress)
14c. ASSOCIATE_IP_T(T_UID^10a, Timestamp, ipAddress)
14d. ASSOCIATE_IP_FB(F_UID^11a, Timestamp, ipAddress)
14e. ASSOCIATE_IP_G(G_UID^12a, Timestamp, ipAddress)
14f. ASSOCIATE_IP_S(S_UID^13a, Timestamp, ipAddress)
15. MAIN_WEBSITE(url, link_to_html_file, Timestamp)

8 Relational Design Schema Cont.
Other Data: Relations Numbered 20-29
20a. GOOGLE_TREND(GT_ID, word, city, country, day, hits)
20b. RELATED_TREND(word, Related_Prod_ID^1)

9 Access Table Relationships

10 Access Table Relationships Cont.

11 Normalization Analysis: 1NF
Removal of a multi-valued attribute (flavor):

12 Normalization Analysis: 2NF
Removal of a partial FD: {PromoID} → {PromoCode, StoreID, StartDate, EndDate, Discount}

13 Normalization Analysis: 3NF
Removal of a transitive FD: {T_UID} → {T_Username, Fname, Lname, City, State, BirthDate, Email}

14 Normalization Analysis: BCNF
Removal of an FD with a non-superkey attribute on the LHS: {PromoCode} → {StartDate, EndDate, Discount}
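A functional dependency such as the ones on the normalization slides can be spot-checked against sample rows. The sketch below is illustrative only: the helper name fd_holds and all row values are assumptions, not part of the SNAC database. A check on sample data can refute an FD but cannot prove that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical PROMOTION rows; every value here is invented for illustration.
promotions = [
    {"PromoID": 1, "PromoCode": "FALL10", "StoreID": 1,
     "StartDate": "2011-10-01", "EndDate": "2011-10-15", "Discount": 0.10},
    {"PromoID": 2, "PromoCode": "FALL10", "StoreID": 2,
     "StartDate": "2011-10-01", "EndDate": "2011-10-15", "Discount": 0.10},
    {"PromoID": 3, "PromoCode": "SNACK20", "StoreID": 1,
     "StartDate": "2011-11-01", "EndDate": "2011-11-07", "Discount": 0.20},
]

# {PromoCode} -> {StartDate, EndDate, Discount}: the FD named on the BCNF slide
code_fd = fd_holds(promotions, ["PromoCode"], ["StartDate", "EndDate", "Discount"])
# {PromoCode} -> {StoreID}: refuted by the sample, since one code runs in two stores
store_fd = fd_holds(promotions, ["PromoCode"], ["StoreID"])
print(code_fd, store_fd)
```

In this sample, PromoCode determines the dates and discount but is not a key, which is the situation a BCNF decomposition addresses.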

15 Query 1: Popular Product Stock
Find the most talked-about products in a city and their quantities (stock). This will help us determine which products to move around to balance inventories in anticipation of sales increases. Data can be exported to a solver to solve a shipment problem.

16 Query 1: SQL
SELECT Product.ProdID, Product.ProdName,
       (SELECT COUNT(F_Comment.F_CID)
        FROM F_Comment
        WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*') AS Hits,
       Store.City, Store_Carries.StoreID AS Store, Store_Carries.Stock AS Stock
FROM Product, Store, Store_Carries
WHERE Product.ProdID = Store_Carries.ProdID
  AND Store.StoreID = Store_Carries.StoreID
ORDER BY Product.ProdName;

17 Query 1: Output

18 Query 1: Data Analysis
We have a list of stores and their stock of different products
– Transportation problem to encourage similar levels of stock
– Minimize shipments, shipping costs, etc.
Subject to:
– No outliers (stores with low stock)
– Possible shipment constraints
– Possible traffic constraints
– Etc.
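One way to write the stock-balancing problem above as an integer program for a single product is sketched below. All symbols are assumptions made for illustration and are not taken from the team's AMPL model: x_{ij} is the number of units shipped from store i to store j, c_{ij} is the per-unit shipping cost, s_i is the current stock of store i (the Stock column from Query 1), and m is the minimum acceptable stock per store.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i}\sum_{j \ne i} c_{ij}\, x_{ij} \\
\text{subject to}\quad
& s_i + \sum_{j \ne i} x_{ji} - \sum_{j \ne i} x_{ij} \ge m
  && \text{for every store } i \\
& \sum_{j \ne i} x_{ij} \le s_i
  && \text{for every store } i \\
& x_{ij} \in \mathbb{Z}_{\ge 0}
  && \text{for every pair } i \ne j
\end{aligned}
```

The first constraint is the "no outliers" condition, and the second keeps a store from shipping more than it holds. The model is feasible only if total stock is at least m times the number of stores. Shipment or traffic limits would enter as upper bounds on individual x_{ij}.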

19 Query 1: Data Analysis (AMPL)

20 Query 1: Data Analysis

21 Query 2: Promo Social Networking
Consider a promotion. Compare product social network comments in a given city two weeks before, during, and two weeks after a promotion to judge its effectiveness. Order by return on investment.

22 Query 2: SQL
SELECT Promotion.PromoID,
       (SELECT COUNT(*)
        FROM F_Comment, Product
        WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*'
          AND Product.ProdID = Promotion.ProdID
          AND F_Comment.Timestamp < Promotion.StartDate
          AND F_Comment.Timestamp > Promotion.StartDate - 14) AS HitsBefore,
       (SELECT COUNT(*)
        FROM F_Comment, Product
        WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*'
          AND Product.ProdID = Promotion.ProdID
          AND F_Comment.Timestamp < Promotion.EndDate
          AND F_Comment.Timestamp > Promotion.StartDate) AS HitsDuring,
       (SELECT COUNT(*)
        FROM F_Comment, Product
        WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*'
          AND Product.ProdID = Promotion.ProdID
          AND F_Comment.Timestamp < Promotion.EndDate + 14
          AND F_Comment.Timestamp > Promotion.EndDate) AS HitsAfter,
       (SELECT SUM(Promotion.Discount*Product.RetailPrice)
        FROM Product) AS PromoCost
FROM Promotion
ORDER BY PromoCost;
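The three time windows in Query 2 can be illustrated outside the database. The sketch below is a rough Python equivalent for one product and one promotion. The function name promo_hits, the product name, and all comment data are hypothetical. Lowercasing both strings is an assumption meant to approximate a case-insensitive LIKE match.

```python
from datetime import datetime, timedelta


def promo_hits(comments, product_name, start, end, window_days=14):
    """Count comments mentioning product_name before, during, and after a promotion."""
    window = timedelta(days=window_days)
    counts = {"before": 0, "during": 0, "after": 0}
    name = product_name.lower()
    for text, ts in comments:
        if name not in text.lower():
            continue  # comment does not mention the product
        # Strict inequalities mirror the < and > comparisons in the SQL
        if start - window < ts < start:
            counts["before"] += 1
        elif start < ts < end:
            counts["during"] += 1
        elif end < ts < end + window:
            counts["after"] += 1
    return counts


# Hypothetical comments as (TextStr, Timestamp) pairs; all values are invented.
comments = [
    ("Love SampleSnack!", datetime(2011, 9, 25)),
    ("samplesnack is great", datetime(2011, 10, 5)),
    ("Trying SampleSnack today", datetime(2011, 10, 10)),
    ("More SampleSnack please", datetime(2011, 10, 20)),
    ("Nice store", datetime(2011, 10, 6)),
    ("SampleSnack long ago", datetime(2011, 8, 1)),
]

hits = promo_hits(comments, "SampleSnack",
                  start=datetime(2011, 10, 1), end=datetime(2011, 10, 15))
print(hits)
```

As in the SQL, a comment stamped exactly at the start or end of the promotion falls into none of the three windows.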

23 Query 3: Priority Customers
Use friendship data to rate friends by how many recommendations they have made. Determine how many of a person's friends became friends with us after that person did. In this way, we identify possible priority customers of Calbee to target for special advertisements and promotions.

24 Query 3: SQL
SELECT F.F_UID,
       (SELECT COUNT(*)
        FROM F_Friends AS F2
        WHERE F2.F_UID = F.F_UID
          AND EXISTS (SELECT F3.DateBecameFriends
                      FROM F_Friends AS F3
                      WHERE F3.Friend_F_UID = 1
                        AND F3.F_UID = F2.Friend_F_UID
                        AND F3.DateBecameFriends > F.DateBecameFriends)) AS friendCount
FROM F_Friends AS F
WHERE F.Friend_F_UID = 1;
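The counting logic of Query 3 can be sketched in Python as follows. The function name referral_counts and the sample rows are hypothetical. The sketch assumes, as the query appears to, that F_UID 1 is the company's own account and that each friendship is stored as a directed row (F_UID, Friend_F_UID, DateBecameFriends).

```python
from datetime import date


def referral_counts(friendships, brand_uid=1):
    """For each friend of the brand, count their friends who joined the brand later."""
    # Date on which each user became friends with the brand account
    joined = {u: d for (u, f, d) in friendships if f == brand_uid}
    counts = {u: 0 for u in joined}
    for (u, f, _) in friendships:
        # u is a brand friend, f is one of u's friends who joined the brand after u
        if u in joined and f in joined and joined[f] > joined[u]:
            counts[u] += 1
    return counts


# Hypothetical F_FRIENDS rows; all values are invented for illustration.
friendships = [
    (2, 1, date(2011, 1, 10)),
    (3, 1, date(2011, 2, 1)),
    (4, 1, date(2011, 3, 1)),
    (5, 1, date(2011, 1, 5)),
    (2, 3, date(2010, 6, 1)),
    (2, 4, date(2010, 7, 1)),
    (2, 5, date(2010, 8, 1)),
    (3, 4, date(2010, 9, 1)),
]

friend_counts = referral_counts(friendships)
print(friend_counts)
```

Here user 2 is credited with users 3 and 4, who joined later, but not with user 5, who joined earlier.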

25 Query 3: Data Analysis

26 Query 4: Google Trends and Stocks
Determine priority stores that do not stock products they should, as determined by Google Trends word popularity. For a given Google Trends word, find the top 5 cities in which the word was most searched in 2011. Then, find stores in those cities and the related products they do not stock. This will help us identify how to improve inventory.

27 Query 4: SQL
SELECT Store.StoreID AS Store, Store.City AS City, Product.ProdID AS Prod
FROM Store, Store_Carries, Product
WHERE Store.City IN (SELECT TOP 5 Google_Trend.City
                     FROM Google_Trend
                     WHERE Google_Trend.Word = 'test'
                     GROUP BY Google_Trend.City
                     ORDER BY SUM(Google_Trend.Hits) DESC)
  AND Store_Carries.StoreID = Store.StoreID
  AND Store_Carries.ProdID = Product.ProdID
  AND Store_Carries.Stock = 0
  AND Product.ProdID IN (SELECT Related_Trend.Related_Prod_ID
                         FROM Related_Trend
                         WHERE Related_Trend.Word = 'test');
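The selection described for Query 4 can be sketched in Python: total the hits per city for a trend word, keep the top cities, then list related products with zero stock in stores located there. The function name understocked and every value in the sample data are hypothetical.

```python
from collections import Counter


def understocked(trend_rows, related, stock, store_city, word, top_n=5):
    """Return sorted (store, product) pairs with zero stock in the top trend cities."""
    totals = Counter()
    for w, city, hits in trend_rows:
        if w == word:
            totals[city] += hits  # sum hits per city across all days
    top_cities = {city for city, _ in totals.most_common(top_n)}
    products = related.get(word, set())
    return sorted(
        (store, prod)
        for (store, prod), qty in stock.items()
        if qty == 0 and prod in products and store_city[store] in top_cities
    )


# Hypothetical data; all values are invented for illustration.
trend_rows = [  # (word, city, hits)
    ("chips", "San Francisco", 120),
    ("chips", "Tokyo", 300),
    ("chips", "Oakland", 40),
    ("chips", "San Francisco", 80),
    ("tea", "Oakland", 500),
]
related = {"chips": {101, 102}}          # word -> related product IDs
store_city = {1: "San Francisco", 2: "Oakland", 3: "Tokyo"}
stock = {(1, 101): 0, (1, 102): 5, (2, 101): 0, (3, 102): 0, (3, 103): 0}

gaps = understocked(trend_rows, related, stock, store_city, "chips", top_n=2)
print(gaps)
```

Like the SQL, this only reports products that have a stock record with a quantity of zero. A product with no record at all for a store would not appear.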

28 Query 5: Social Network and Purchases
Gather Social Networking, Google Trend, and Purchase data over time to formulate predictive models. For a given product, find the number of social network hits, the related trend-word hits, and the number of purchases of that product for a given city on a given day. In this way, we can use social network 'buzz' and trend data to predict purchases as a function of time and city. Order by product, then timestamp.

29 Query 5: SQL
SELECT Product.ProdID, Product.ProdName, Purchase.Timestamp,
       (SELECT COUNT(F_Comment.F_CID)
        FROM F_Comment
        WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*'
          AND F_Comment.Timestamp = Purchase.Timestamp) AS SocialNetworkHits,
       (SELECT SUM(Google_Trend.hits)
        FROM Google_Trend
        WHERE Google_Trend.word = Product.ProdName
          AND Google_Trend.Timestamp = Purchase.Timestamp) AS TrendHits
FROM Product, Purchase
WHERE Purchase.ProdID = Product.ProdID
ORDER BY Purchase.ProdID, Purchase.Timestamp;

30 Query 5: Output

31 Query 5: Data Analysis

32 Query 5: Data Analysis
Social media networking, Google Trends, and Purchases data used predictively
– Group into weekly vectors
– Extract significant data using Principal Component Analysis to project onto 2 dimensions
– Cluster data using K-Means
→ See if we can predict future sales using machine learning
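The clustering step can be sketched with a small pure-Python K-Means (Lloyd's algorithm) on points that are assumed to have already been projected onto 2 dimensions. The function name kmeans, the deterministic choice of the first k points as starting centroids, and the sample points are all assumptions for illustration and do not reflect the team's actual analysis.

```python
def kmeans(points, k, iterations=20):
    """Cluster 2-D points with Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [points[i] for i in range(k)]  # deterministic start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        labels = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                            + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append((sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster where it was
        if updated == centroids:
            break  # converged
        centroids = updated
    return labels, centroids


# Hypothetical weekly vectors after projection onto 2 dimensions; values are invented.
weekly_points = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]

labels, centroids = kmeans(weekly_points, k=2)
print(labels)
```

Starting from the first k points keeps the sketch reproducible. A random or k-means++ start is the more common choice in practice.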

33 Query 5: Data Analysis

34 Login Interface
Employees log in here:

35 Switchboard
Allows employees to insert data in forms or run a selected query

36 Forms: New Employee
Enter information on new Calbee employees

37 Questions?
Thank you!

