Sparse Real Estate Ranking with Online User Reviews and Offline Moving Behaviors Yanjie Fu Hui Xiong, Yu Zheng, Yong Ge, Zijun Yao, Yanchi Liu, Jing Yuan.


1 Sparse Real Estate Ranking with Online User Reviews and Offline Moving Behaviors Yanjie Fu Hui Xiong, Yu Zheng, Yong Ge, Zijun Yao, Yanchi Liu, Jing Yuan Rutgers, the State University of New Jersey ICDM2014@Shenzhen, China

2 Agenda
• Background and Motivation
• Problem Statement
• Methodology
• Evaluation
• Conclusions

3 Why Housing Matters
• A house can make extra money as an investment option.
• A house is a sign of your ability to settle down and support a family.
• Without a house, you might… "Darling, I am so sorry. You have no house." "Young man, go home, and buy a new house."
• With a house, you might… "Yes. I do!"

4 Real Estate Investment Value
• Market value
  - The price at which an estate would trade in the marketplace
• Investment value
  - The growth potential of the resale value
  - The motivation to enter the estate market
• We don't predict future price! We predict investment potential!

5 Quantifying Investment Value
• Investment value is measured as the estate's investment return rate over a given market period
• Preparing the ground-truth investment values of estates for training data
  - Identify the rising and the falling market period of Beijing
  - Calculate the investment return of each estate during the rising and the falling market period
  - Grade the estates (5 > 4 > 3 > 2 > 1) by finding inflection points
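
The return-rate and grading steps on this slide can be sketched as follows. This is a minimal illustration, assuming the return rate is the relative price change over the period; the slide does not give the formula or describe how the inflection points are found, so the cut points, prices, and the names `return_rate` and `grade` are hypothetical.

```python
from bisect import bisect_right


def return_rate(start_price, end_price):
    """Relative price change over a market period (assumed definition)."""
    if start_price <= 0:
        raise ValueError("start_price must be positive")
    return (end_price - start_price) / start_price


def grade(rate, cut_points):
    """Map a return rate to a grade in 1..len(cut_points)+1.

    cut_points must be sorted ascending; a higher rate gives a higher grade.
    """
    return bisect_right(cut_points, rate) + 1


# Hypothetical (start, end) prices per estate for one market period.
prices = {
    "estate_a": (20000.0, 26000.0),
    "estate_b": (30000.0, 30600.0),
    "estate_c": (25000.0, 23750.0),
}
# Hypothetical cut points standing in for the inflection points.
cuts = [-0.02, 0.0, 0.05, 0.15]

rates = {name: return_rate(*p) for name, p in prices.items()}
grades = {name: grade(r, cuts) for name, r in rates.items()}
print(grades)
```

Four cut points give the five grades named on the slide; in practice they would come from the inflection-point analysis rather than being fixed by hand.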

6 Existing Housing Analysis Methods
• Housing indexes
  - FHFA/OFHEO, S&P/Case-Shiller Indices, FNC Residential Price Index
  - Suited to regional housing analysis rather than to a specific house
• Financial time series analysis
  - Trend, periodicity, and volatility of housing price time series
  - Noisy: speculative demand and government policies affect prices
• Correlating estate value with static statistics of urban infrastructure
  - The number of, and distance to, bus stops, subway stations, road network entries, and POIs
  - Physical facilities have both positive and negative effects: train stations bring noise and pollution and degrade estate value
  - Lacks dynamics and can hardly reflect the changing pulse of a city

7 What Better Reflects Estate Investment Value?
• Consumer psychology
• People's opinions and estate investment value
  - If people have a better opinion of an estate, the demand for this estate is higher and its investment value will be higher
• Uncovering people's opinions of an estate from user-generated, estate-related dynamic data

8 Online User Reviews
• Sources: Yelp, Yahoo Local Listing, Foursquare, Google Local for Business
• Show the explicit opinions of mobile users about places surrounding an estate

9 Offline Moving Behaviors (1)
• Cell-tower traces, taxicab GPS traces, check-in traces, bus traces
• Compared with static statistics of urban infrastructure, these data better sense the dynamic pulse of the city

10 Offline Moving Behaviors (2)
[Figure panels: taxi drop-off points, bus drop-off points, check-ins]
• Taxi transits
  - Fast and expensive
  - Central business district and financial areas
• Bus transits
  - Slow and cheap
  - Information technology and education areas
• Check-ins
  - Walking portion of mobility
  - Areas full of attractions, entertainment, and POIs
• Encode the static statistics of urban infrastructure
• Reflect the implicit "opinions" of mobile users about a neighborhood

11 Problem Definition
• Given
  - Estates with locations and historical prices
  - Online user reviews (ratings and comments for business venues and points of interest)
  - Offline moving behaviors (taxi traces, bus traces, mobile check-ins)
• Objective
  - Rank estates based on their investment values
• Core tasks
  - Extract discriminative features that reflect residents' opinions of estates
  - Learn an estate ranking predictor by combining a pairwise ranking objective and a sparsity regularization

12 Methodology Overview

13 Features from Online User Reviews
• Overall Satisfaction
• Service Quality
• Environment Class
• Consumption Cost
• Functionality Planning

14 Features from Offline Moving Behaviors
• Taxi Arriving Volume
• Taxi Leaving Volume
• Taxi Transition Volume
• Taxi Driving Velocity
• Taxi Commute Distance
• Bus Arriving Volume
• Bus Leaving Volume
• Bus Transition Volume
• Bus Stop Density
• Popularity of Check-ins
• Topic Profile of Check-ins
• Propagating word-of-mouth from POIs to the neighborhood
• Textual profiling from words to topics
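
As an illustration of how a volume feature such as Taxi Arriving Volume might be computed, the sketch below counts drop-off points that fall within a neighborhood radius of an estate. The slides do not describe the extraction procedure, so the great-circle distance, the function names, and the sample coordinates are assumptions; the 0.75 km default follows the radius recommended later in the talk.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def arriving_volume(estate, drop_offs, radius_km=0.75):
    """Count drop-off points within radius_km of the estate.

    estate is a (lat, lon) pair; drop_offs is an iterable of (lat, lon) pairs.
    """
    lat0, lon0 = estate
    return sum(
        1 for lat, lon in drop_offs
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    )


# Hypothetical estate location and taxi drop-off points.
estate = (39.900, 116.400)
drop_offs = [(39.900, 116.400), (39.905, 116.400), (39.920, 116.400)]
volume = arriving_volume(estate, drop_offs)
print(volume)
```

The same counting pattern would apply to leaving volumes (pick-up points) and to bus records, with only the input point set changing.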

15 Predicting Estate Investment Value
[Diagram: features of user reviews, taxi trajectories, smart card transactions, and check-ins feed into the estate investment value]

16 Modeling Ranking Objective
• Prediction accuracy
  - Maximizing the likelihood ≈ minimizing the squared loss
• Ranking consistency
  - A ranked list of estates, e.g., a > b > c > d, is viewed as a directed graph
  - Nodes are real estates
  - A directed edge A → B means that A ranks higher than B
  - Our model generates edges with a certain probability
  - Maximizing the likelihood ≈ minimizing the ranking loss of the graph-based ranking structure
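
The two loss terms can be sketched as below. This is a hedged illustration, not the authors' exact model: it assumes a linear scorer and assumes that the probability of an edge i → j is the logistic sigmoid of the score difference, so the negative log-likelihood of the edges becomes a pairwise logistic loss. All function names and the toy data are hypothetical.

```python
import math


def predict(w, x):
    """Linear score of one estate (assumed model form)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def squared_loss(w, X, y):
    """Prediction-accuracy term: sum of squared errors."""
    return sum((yi - predict(w, xi)) ** 2 for xi, yi in zip(X, y))


def pairwise_ranking_loss(w, X, y):
    """Ranking-consistency term.

    For every edge i -> j (y[i] > y[j]) add -log(sigmoid(f_i - f_j)),
    which equals log(1 + exp(-(f_i - f_j))).
    """
    loss = 0.0
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] > y[j]:
                diff = predict(w, X[i]) - predict(w, X[j])
                loss += math.log1p(math.exp(-diff))
    return loss


# Toy data: one feature that is perfectly aligned with the grades.
X = [[3.0], [2.0], [1.0]]
y = [3.0, 2.0, 1.0]
w_good, w_bad = [1.0], [-1.0]

print(squared_loss(w_good, X, y), pairwise_ranking_loss(w_good, X, y))
print(squared_loss(w_bad, X, y), pairwise_ranking_loss(w_bad, X, y))
```

A weight vector that orders the estates correctly yields a smaller pairwise loss than one that reverses the order, which is the behavior the ranking-consistency term is meant to reward.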

17 Incorporating Sparsity Regularization into Estate Ranking
• A large number of estate-related features are extracted
  - Features are correlated and redundant
  - A small number of good features can determine the ranking of estates by investment value
• Two steps in the classic method
  - Select features
  - Fit the ranking model on the selected features
  - The selected feature subset may not be optimal for ranking because the two steps are modeled separately
• Combining sparsity and ranking in a unified model
  - Enforce sparse representations during learning by setting some feature weights to zero, which also helps avoid overfitting
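
The slide does not say how the weights are driven to zero. One common way to obtain exact zeros during learning is an L1 penalty optimized by proximal gradient descent with soft-thresholding (ISTA); the sketch below swaps in that technique on a plain squared loss purely for illustration and should not be read as the authors' formulation. The names `soft_threshold` and `ista` and the toy data are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(X, y, lam, step, iters=500):
    """Minimize 0.5 * ||Xw - y||^2 + lam * ||w||_1 by proximal gradient steps."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(iters):
        residuals = [
            sum(wj * xj for wj, xj in zip(w, row)) - yi
            for row, yi in zip(X, y)
        ]
        # Gradient of the smooth part: X^T (Xw - y).
        grad = [
            sum(r * row[j] for r, row in zip(residuals, X))
            for j in range(n_features)
        ]
        w = [
            soft_threshold(wj - step * gj, step * lam)
            for wj, gj in zip(w, grad)
        ]
    return w


# Toy data: the first feature explains y, the second is small noise.
X = [[1.0, 0.1], [2.0, -0.1], [3.0, 0.1], [4.0, -0.1]]
y = [1.0, 2.0, 3.0, 4.0]
w = ista(X, y, lam=0.5, step=0.02)
print(w)
```

Because feature selection happens inside the optimization, the surviving features are chosen with respect to the same objective that fits the model, which is the point the slide makes against the two-step approach.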

18 Solving the Ranking Objective
• Log of the posterior
• Maximum a posteriori (MAP) estimation
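
The equations on this slide did not survive in the transcript. As a reminder of the general shape of a MAP estimate only, and not the authors' exact formulation, the log-posterior splits into log-likelihood terms plus a log-prior. The symbols are assumptions: $w$ for the feature weights, $X$ for the features, $Y$ for the graded investment values, and $E$ for the pairwise ranking edges, with $Y$ and $E$ assumed conditionally independent given $w$ and $X$.

```latex
\hat{w}
  = \arg\max_{w} \; \log P(w \mid Y, E, X)
  = \arg\max_{w} \; \Big[ \log P(Y \mid w, X) + \log P(E \mid w, X) + \log P(w) \Big]
```

Under this reading, the first term corresponds to prediction accuracy, the second to ranking consistency, and the prior over $w$ is where a sparsity-inducing choice would enter.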

19 Experimental Data
• Beijing real-world data
  - Beijing estate data: 2851 estates with transaction records from 04/2011 to 09/2012, covering a falling market (04/2011 to 02/2012) and a rising market (02/2012 to 09/2012)
  - Beijing taxi traces
  - Beijing smart card transactions
  - Beijing check-ins
  - Beijing business reviews

20 Evaluation Methods and Metrics
• Baseline algorithms
  - MART: a boosted tree model, specifically a linear combination of the outputs of a set of regression trees
  - RankBoost: a boosted pairwise ranking method, which trains multiple weak rankers and combines their outputs into the final ranking
  - LambdaMART: the boosted tree version of LambdaRank, which is based on RankNet
  - Coordinate Ascent: uses domination loss and applies coordinate descent for optimization
  - FenchelRank: designed for solving the sparse ranking problem with an L1 constraint
• Evaluation metrics
  - Normalized Discounted Cumulative Gain (NDCG)
  - Precision
  - Recall
  - Kendall's Tau Coefficient
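
Two of the listed metrics can be sketched as follows. The slides do not state which variants were used, so this assumes the exponential-gain form of NDCG and the tau-a form of Kendall's coefficient (ties count as neither concordant nor discordant); the function names and sample grades are hypothetical.

```python
import math
from itertools import combinations


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items (exponential gain)."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances_in_predicted_order, k):
    """DCG of the predicted order divided by the DCG of the ideal order."""
    ideal = dcg_at_k(sorted(relevances_in_predicted_order, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(relevances_in_predicted_order, k) / ideal


def kendall_tau(a, b):
    """Kendall's tau-a between two score lists of equal length."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two lists of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical true grades of estates, listed in the order a model ranked them.
ranked_grades = [5, 3, 4, 1, 2]
print(ndcg_at_k(ranked_grades, k=3))
print(kendall_tau([5, 4, 3, 2, 1], ranked_grades))
```

NDCG@k emphasizes the top of the list, while Kendall's tau scores the whole ordering, which is why the talk reports both when comparing top-k and overall ranking quality.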

21 Correlation Analysis
• If the heterogeneity of functional planning is too high or too low, the house will be low-valued
• If the commute distance is short, the house is close to important places and business venues

22 Feature Evaluation on Different Sources
• Business reviews and check-ins perform better than taxi and bus traces
  - Check-ins and reviews represent the attending phase
  - Taxi and bus traces represent the moving phase
• Taxi features perform better than bus features in the falling market
  - Taxi mobility represents white-collar workers and business people
  - Bus mobility represents the middle classes

23 Feature Evaluation on Different Radiuses
• We recommend setting the neighborhood radius to 0.75±0.25 km, rather than too short or too long (2 km)
• 0.75 km is not only a comfortable walking distance to bus and taxi stops, but also sufficient to capture the outdoor activities of estate neighborhoods

24 Model Evaluation
• Our method and FenchelRank achieve the best and second-best ranking accuracy in top-k ranking
• Our method keeps a balance between top-k and overall ranking

25 Conclusions
• High-value house discovery
  - A novel geo-business problem
• Investment-value-based real estate ranking
  - Features from online user reviews capture explicit opinions about POIs near an estate
  - Features from offline moving behaviors capture implicit geographical preferences of mobile users
  - Real estate ranking combines prediction accuracy, ranking consistency, and sparsity regularization
• Benefits
  - Online user reviews and offline moving behaviors sense people's up-to-date geo-preferences toward real estates better and more cheaply
  - All aspects of the feature engineering are of interest to industry
  - Joint modeling of prediction accuracy, ranking consistency, and sparsity regularization in a unified framework

