At Mekansal İşler, we build spatially enabled decision support platforms that improve our customers' operations in a variety of areas, including real estate, insurance, customer analytics and location feasibility.
One question we recently asked ourselves was whether we could implement an algorithm, and the software to support it, that would improve the quality and speed of real-estate appraisal.
Supported by the Ministry of Science, Industry and Technology of the Republic of Turkey, we started working on this problem. Our approach made extensive use of the spatial attributes of properties, and we wanted to measure neighbourhood effects on market prices. Official boundaries would not serve this purpose, as they often ignore the socio-economic character of the physical environment. We therefore needed to identify the socio-economic sub-regions of the space.
Running a single linear regression over the whole solution space would produce higher error rates, because certain factors, such as aspect, have different effects in different parts of the physical geography. For example, facing north would have a negative effect in Ümraniye, while it would have a clearly positive effect in Kandilli because of the Bosphorus view. Since our dataset does not contain every possible causal factor, we wanted to minimize prediction error by running the regression in smaller regions. To find meaningful smaller regions, we set out to create sub-neighbourhoods with distinct economic values. Using our learning sample of around 12,000 houses in Kadıköy, the clustering algorithm created 50 economically distinct sub-neighbourhoods, as shown in Map-1.
As Map-1 shows, certain regions could not be assigned to any cluster. These are parks, large public service areas, or regions for which we have no sample data.
Once the sub-neighbourhoods were built, we started enriching our market dataset with spatial attributes.
For example, we calculated spatial attributes for each of the roughly 12,000 properties in the sample. Some of them are listed below:
- Distance to coast
- Distance to closest shopping mall
- Distance to closest bank
- Distance to closest primary school
- Distance to closest market
- Distance to closest subway station
- Distance to closest bus stop
- Distance to closest park
- Distance to closest restaurant, bar or pub
- Distance to closest ferry station
- Distance to closest movie theater
- Distance to closest metrobus station
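As a rough illustration of how a "distance to closest" attribute can be derived, the sketch below computes the great-circle distance from a home to the nearest point in a list of candidates. It is not the pipeline described here: the function names and the coordinates are hypothetical, and it assumes points of interest are stored as latitude/longitude pairs. Distance to coast would need a line geometry rather than points, which this sketch does not cover.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_to_closest(home, points):
    """Distance in metres from a home to the nearest (lat, lon) in points."""
    return min(haversine_m(home[0], home[1], lat, lon) for lat, lon in points)


# Hypothetical coordinates, for illustration only.
home = (40.9900, 29.0300)
subway_stations = [(40.9910, 29.0300), (41.0000, 29.0500)]
print(round(distance_to_closest(home, subway_stations)), "m to closest station")
```

A linear scan like this is fine for a few thousand homes; a spatial index would be the usual choice for larger datasets.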
Using the physical and spatial attributes of our learning sample, we fitted a regression model for each sub-neighbourhood in Kadıköy. The models showed that the factors affecting prediction success varied markedly from one sub-neighbourhood to another. For example, building age was among the most significant factors in 80% of the sub-neighbourhoods, whereas distance to shopping malls was significant in only 42% of them.
Tests were carried out on around 3,000 homes, using their physical and spatial attributes without knowledge of their market values. The average R² score of the regression models across the 50 sub-neighbourhoods was 65%, while the average prediction error for home values was 18%.
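The idea of fitting one model per sub-neighbourhood can be sketched as follows. This is a simplified illustration, not the model used in the project: it fits ordinary least squares on a single attribute (building age) instead of the full attribute set, and the region ids and prices are made up.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Assumes xs contains at least two distinct values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_per_region(rows):
    """Fit one (intercept, slope) pair per region from (region, x, y) rows."""
    grouped = defaultdict(lambda: ([], []))
    for region, x, y in rows:
        grouped[region][0].append(x)
        grouped[region][1].append(y)
    return {region: fit_line(xs, ys) for region, (xs, ys) in grouped.items()}


# Made-up rows: (sub-neighbourhood id, building age in years, price per m2)
rows = [
    ("A", 0, 5000), ("A", 10, 4500), ("A", 20, 4000),
    ("B", 0, 3000), ("B", 10, 2900), ("B", 20, 2800),
]
models = fit_per_region(rows)
for region, (intercept, slope) in sorted(models.items()):
    print(region, round(intercept), round(slope, 1))
```

In this toy data, building age costs region A five times as much per year as region B, which a single pooled model would average away.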
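For readers unfamiliar with the two figures, the sketch below shows how an R² score and an average prediction error can be computed on a held-out set. It assumes that "average prediction error" means the mean absolute percentage error, which the text does not state explicitly, and the numbers are invented for illustration.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mean_abs_pct_error(actual, predicted):
    """Mean of |actual - predicted| / actual, as a fraction (0.18 means 18%)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Invented market values and predictions for four homes.
actual = [100, 200, 300, 400]
predicted = [110, 190, 330, 360]
print(f"R2 = {r2_score(actual, predicted):.3f}")
print(f"MAPE = {mean_abs_pct_error(actual, predicted):.2%}")
```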
Real Life Applications
Our approach lets us improve results for real-life applications in several ways. For example, because we know the error rate of each sub-neighbourhood, we can apply sub-neighbourhood-based margins to our initial results in order to improve future predictions. Our clustering algorithm also lets us identify spatial patterns that emerge over time. We will therefore be able to identify regions such as Fikirtepe as real data starts flowing into the system.
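One possible reading of sub-neighbourhood-based margins is to turn each point prediction into a value range whose width depends on the error rate observed in that sub-neighbourhood. The sketch below illustrates that reading only. The function name, the fallback rate and the per-region error rates are all hypothetical.

```python
def price_range(prediction, region, error_rates, default_rate):
    """Return (low, high) bounds around a prediction using the region's error rate.

    error_rates maps a sub-neighbourhood id to its observed error as a fraction.
    default_rate is used for regions with no recorded error rate.
    """
    rate = error_rates.get(region, default_rate)
    return prediction * (1 - rate), prediction * (1 + rate)


# Hypothetical error rates per sub-neighbourhood.
error_rates = {"A": 0.10, "B": 0.25}

low, high = price_range(500_000, "A", error_rates, default_rate=0.18)
print(f"A: {low:,.0f} to {high:,.0f}")

low_c, high_c = price_range(500_000, "C", error_rates, default_rate=0.18)
print(f"C (no history, default margin): {low_c:,.0f} to {high_c:,.0f}")
```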
Mekansal İşler supports the center in its technology-related projects. aARREC and Mekansal İşler will start joint research projects in the near future.
Please contact Mekansal İşler for further details.