All Roads Lead to Third Mainland Bridge

Atonye Nyingifa
8 min read · Mar 22, 2021
Photo by Chuks Ugwuh on Unsplash

As someone who has lived in 5 houses over the past 10 years, I’m no stranger to the Nigerian real estate market. All our rentals have been within 15 minutes of each other, but price-wise, they’ve been miles apart. So whether you’re looking for a family home or moving out of your parents’ house into a studio apartment (terminology alert: hereafter referred to as a self-contained, in Nigerian lingo), you’re probably on the lookout for a property that has the features you care most about, while keeping costs as low as reasonably possible.

In this article, I’d like to walk through the whole gamut of data analysis steps on rental property listings in Lagos State, Nigeria, including data scraping, feature creation, outlier removal, and exploratory data analysis.

  1. Getting the Data: Real data is messy! I’ve heard or read that phrase a couple hundred times while learning data science, and it never meant anything to me until I started working on my own projects. Unless the data you need is already neatly arranged and curated, or comes from somewhere like Kaggle or UCI, you’re probably going to need to scrape websites (legally) for it, then clean it until it makes some modicum of sense. To get data for the houses, I used Python’s Beautiful Soup and Requests libraries to scrape 100 pages of 21 properties each from Nigeria Property Centre, which has over 20,000 listings for houses in Lagos alone. The full code can be found here. After scraping for a little while and removing all the duplicate properties (I see you, sponsored posts and “premium” listings), I ended up with just over 3800 properties, of which 88% were located on Lagos Island.
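The scrape-then-deduplicate loop can be sketched roughly as follows. This is not the code linked above: to keep the sketch runnable with no installs and no network access, it parses two inline HTML snippets with the standard library's html.parser instead of Beautiful Soup and Requests, and the markup (the `listing` class, the `data-ref` attribute, the `h4` title) is invented for illustration. The real site's markup will differ.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collects listings from <div class="listing" data-ref="..."> blocks.

    The class name and the data-ref attribute are hypothetical markup.
    """

    def __init__(self):
        super().__init__()
        self.listings = []
        self._current_ref = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "listing":
            self._current_ref = attrs.get("data-ref")
        elif tag == "h4" and self._current_ref is not None:
            self._in_title = True

    def handle_data(self, data):
        # Only text inside a listing's <h4> is treated as the title.
        if self._in_title:
            self.listings.append({"ref": self._current_ref, "title": data.strip()})

    def handle_endtag(self, tag):
        if tag == "h4":
            self._in_title = False
        elif tag == "div":
            self._current_ref = None


def parse_page(html):
    """Return the listings found on one results page."""
    parser = ListingParser()
    parser.feed(html)
    return parser.listings


def dedupe(listings):
    """Keep the first occurrence of each property reference."""
    seen, unique = set(), []
    for item in listings:
        if item["ref"] not in seen:
            seen.add(item["ref"])
            unique.append(item)
    return unique


# Two made-up result pages; the "sponsored" listing A1 appears on both.
PAGE_1 = """
<div class="listing" data-ref="A1"><h4>3 bedroom flat, Lekki</h4></div>
<div class="listing" data-ref="B2"><h4>Self-contained, Yaba</h4></div>
"""
PAGE_2 = """
<div class="listing" data-ref="A1"><h4>3 bedroom flat, Lekki</h4></div>
<div class="listing" data-ref="C3"><h4>2 bedroom flat, Ikeja</h4></div>
"""

all_listings = []
for page_html in (PAGE_1, PAGE_2):
    all_listings.extend(parse_page(page_html))

unique_listings = dedupe(all_listings)
print(len(all_listings), "scraped,", len(unique_listings), "unique")
```

Deduplicating on a stable identifier such as the property reference, rather than on the title, avoids dropping two different houses that happen to share a headline.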
Property Distribution in Dataset — created using ChartStudio by ANyingifa

For greater visual effect, and because it’s fun to do, I created a cluster map of the properties using the Folium package (click on each cluster or zoom in to drill down, and hover over an individual property to see its details).

Folium Clustermap of House Rent Prices and Areas — created by ANyingifa

2. Feature Creation: Using the Geopy package’s geocoder (with my billable, i.e. not free, Google Maps API key) and the Bing Maps Distance Matrix (with my very much free API key lol), I translated the listing address strings into latitude and longitude pairs, then used those to calculate the distance from each address to a static point: in this case, the Third Mainland Bridge (the great Lagos Island/mainland divider). I also created a transformer class using sklearn’s BaseEstimator and TransformerMixin classes to make a tidy, reproducible, one-and-done data transformer that parses the house descriptions into features I could later analyze. After applying the transformer, I ended up with 20-odd features for each house (provided they were mentioned in the house description).
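As a rough illustration of the distance feature, the sketch below computes a straight-line (great-circle) distance with the haversine formula. This is a stand-in, not the Bing Maps Distance Matrix call described above, which may well return road distances instead. The bridge reference point and the two sample coordinates are approximate values assumed for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Approximate midpoint of the Third Mainland Bridge (assumed for illustration).
THIRD_MAINLAND_BRIDGE = (6.500, 3.401)


def haversine_km(point_a, point_b):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lon1 = map(math.radians, point_a)
    lat2, lon2 = map(math.radians, point_b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Rough, illustrative coordinates; real ones would come from the geocoder.
sample_addresses = {"Yaba": (6.510, 3.371), "Ajah": (6.470, 3.585)}

distances = {
    name: haversine_km(coords, THIRD_MAINLAND_BRIDGE)
    for name, coords in sample_addresses.items()
}
for name, km in distances.items():
    print(f"{name}: {km:.1f} km from the bridge")
```

A straight-line distance ignores the lagoon and the road network, so it will understate real travel distance, but it needs no API key and is enough to rank areas by proximity.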

df.columns

Index(['Listing Title', 'Latitude/Longitude', 'Flat', 'Distance to 3ML',
       'Link', 'Area', 'Locality', 'Property_Ref', 'Price', 'Island',
       'Days_Since_Added', 'Days_SinceUpdated', 'Type', 'Bedrooms',
       'Bathrooms', 'Toilets', 'Parking_Spaces', 'Total_Area',
       'Covered_Area', 'Serviced', 'Furnishing', 'Newly_Built', 'Pool',
       'Gym', 'Shared', 'Multiple_Units', 'Description_Len'],
      dtype='object')
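To give a feel for how columns like these can be pulled out of free-text descriptions, here is a minimal sketch of the parsing step. The transformer described above subclasses sklearn's BaseEstimator and TransformerMixin; this sketch only shows the kind of logic such a transformer might wrap, written as a plain function with no sklearn dependency. The regex patterns, keywords and sample description are all assumptions, not the rules actually used.

```python
import re


def parse_description(text):
    """Extract a few listing features from a free-text description."""
    lowered = text.lower()

    def count_of(noun_pattern):
        # First number that directly precedes the noun, e.g. "3 bedroom".
        match = re.search(r"(\d+)\s*" + noun_pattern, lowered)
        return int(match.group(1)) if match else None

    return {
        "Bedrooms": count_of(r"bed(?:room)?s?"),
        "Bathrooms": count_of(r"bath(?:room)?s?"),
        "Toilets": count_of(r"toilets?"),
        # Naive keyword flags: 1 if the word appears, 0 otherwise.
        "Serviced": int("serviced" in lowered),
        "Newly_Built": int(bool(re.search(r"newly\s+built|brand\s+new", lowered))),
        "Pool": int("pool" in lowered),
        "Gym": int("gym" in lowered),
        "Description_Len": len(text),
    }


sample_description = (
    "Newly built serviced 3 bedroom flat with 3 bathrooms, 4 toilets, "
    "swimming pool and gym."
)
features = parse_description(sample_description)
print(features)
```

Keyword flags this simple will misfire on phrases like "not serviced", so a real parser needs more care; a count the description never mentions comes back as None rather than a guessed value.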

3. Getting Rid of Outliers: This step was done in tandem with the exploratory data analysis. First, I removed all items that had multiple units (e.g. 16 units of 3 bedroom flats), along with all items whose type was “Block of Flats”. Looking through the prices of the remaining houses, I still saw houses as expensive as 1-10 billion naira (remember this is for rent, not to buy lol), so I knew something was up. Cross-checking a few of the pricier properties, I saw that they were houses for sale that were erroneously listed for rent, and in some cases, typos where the agent had added an extra three zeros (pretty please let that mistake be made in my paycheck just once).

I couldn't possibly check all the prices manually and I didn't want to use an arbitrary cut-off price, so instead I grouped the houses by Area and number of Bedrooms, and removed all entries that were above 1.5 times the inter-quartile range for those groups (using pandas’ groupby feature, this was one line of code). The graph below shows the result of removing those outliers: prices are now a lot more reasonable, with the most expensive property going for 40 million naira per annum and the least expensive for 60k (remember these are for all properties, regardless of the number of bedrooms or utilities).
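A per-group outlier filter along those lines can be sketched with pandas as below. It assumes the conventional Tukey rule, dropping prices above Q3 + 1.5 × IQR within each (Area, Bedrooms) group. The exact cut-off used for the article is not spelled out, and the prices in the toy table are made up.

```python
import pandas as pd


def remove_price_outliers(df, group_cols=("Area", "Bedrooms"), k=1.5):
    """Drop rows whose Price is above Q3 + k * IQR of their own group."""
    prices = df.groupby(list(group_cols))["Price"]
    # transform() broadcasts each group's quantile back onto its rows.
    q1 = prices.transform(lambda s: s.quantile(0.25))
    q3 = prices.transform(lambda s: s.quantile(0.75))
    upper_fence = q3 + k * (q3 - q1)
    return df[df["Price"] <= upper_fence]


# Toy data: one Lekki listing carries a sale-sized price by mistake.
listings = pd.DataFrame({
    "Area": ["Lekki"] * 6 + ["Ikorodu"] * 3,
    "Bedrooms": [3] * 9,
    "Price": [3_000_000, 3_500_000, 4_000_000, 4_500_000, 5_000_000,
              4_000_000_000, 400_000, 450_000, 500_000],
})

cleaned = remove_price_outliers(listings)
print(len(listings), "->", len(cleaned), "rows")
```

Because the fence is computed per group, a 5 million naira flat in Lekki survives while the same price in Ikorodu would not, which is the point of grouping before filtering.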

Box Plots of House Prices, before and after outlier removal — by ANyingifa

4. Exploratory Data Analysis: Time for the good stuff: exploring the variables in the dataset, their statistics, and their correlations with other variables. The figure above already dealt with the price distribution, so I’ll start with the null values present in my dataset.

df_no_outliers.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3629 entries, 1 to 4072
Data columns (total 29 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   Listing Title       3629 non-null   object
 1   Latitude/Longitude  3629 non-null   object
 2   Flat                3629 non-null   object
 3   Distance to 3ML     3629 non-null   float64
 4   Link                3629 non-null   object
 5   Area                3629 non-null   object
 6   Locality            3629 non-null   object
 7   Property_Ref        3629 non-null   object
 8   Price               3629 non-null   int64
 9   Island              3629 non-null   int64
 10  Days_Since_Added    3629 non-null   int64
 11  Days_SinceUpdated   3629 non-null   int64
 12  Type                3629 non-null   object
 13  Bedrooms            3629 non-null   int64
 14  Bathrooms           3629 non-null   float64
 15  Toilets             3629 non-null   float64
 16  Parking_Spaces      2314 non-null   object
 17  Total_Area          443 non-null    float64
 18  Covered_Area        383 non-null    float64
 19  Serviced            3629 non-null   int64
 20  Furnishing          3629 non-null   int64
 21  Newly_Built         3629 non-null   int64
 22  Pool                3629 non-null   int64
 23  Gym                 3629 non-null   int64
 24  Shared              3629 non-null   int64
 25  Multiple_Units      3629 non-null   object
 26  Description_Len     3629 non-null   int64
 27  Latitude            3629 non-null   float64
 28  Longitude           3629 non-null   float64
dtypes: float64(7), int64(12), object(10)
memory usage: 979.6+ KB

As you can see, a good number of properties are missing the number of parking spaces, and most don’t have details on the area of the property. It’s customary to drop rows with N/A values if only a few values are missing, or to drop the entire feature (i.e. column) if many values are missing and cannot be safely approximated. I decided to drop the Total Area and Covered Area columns when choosing features for my model.
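To put numbers on that, the short sketch below turns the non-null counts from the info() output into missing shares. The 50% drop threshold is an illustrative assumption, not a rule taken from the analysis.

```python
# Non-null counts copied from the info() output (3629 rows in total).
TOTAL_ROWS = 3629
non_null_counts = {
    "Parking_Spaces": 2314,
    "Total_Area": 443,
    "Covered_Area": 383,
}

# Assumed cut-off: drop a column when more than half its values are missing.
DROP_THRESHOLD = 0.5

missing_share = {
    column: 1 - count / TOTAL_ROWS for column, count in non_null_counts.items()
}
columns_to_drop = [
    column for column, share in missing_share.items() if share > DROP_THRESHOLD
]

for column, share in missing_share.items():
    print(f"{column}: {share:.1%} missing")
print("Drop:", columns_to_drop)
```

On these counts, roughly 36% of parking values are missing, against about 88% and 89% for the two area columns, which is why the area columns are the ones that go.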

Next, I explored average rental prices on the Lagos mainland vs the Island, grouped by Area.

Island Rental Prices — Chart created by ANyingifa using ChartStudio

The most expensive property in the dataset is an 8-bedroom house in Lekki, listed as “perfect for corporate offices”. However, consistently and overwhelmingly, Ikoyi comes out on top as prime real estate among Island areas. Rent prices there are markedly higher than for any other properties on either the mainland or the Island, accounting for the maximum price per number of bedrooms for every property type (where there was an Ikoyi property available). For instance, the average 1 bedroom flat in Ikoyi in the dataset costs an eye-watering 3.16 million naira, compared to Lekki’s 1.3M, Victoria Island’s 1.8M, and Ajah’s 682k.

Mainland Rental Prices — Chart created by ANyingifa using ChartStudio

For Mainland properties, the pricier areas include Ikeja, Magodo, Gbagada, Ogudu, Ilupeju, and Maryland, with localities such as Ikeja GRA, Allen, Onigbongbo, and Alausa having some of the most expensive houses. Ikorodu properties are the cheapest, especially those in Odogunyan, Igbogbo, and Adamo. To compare, a 3 bedroom flat in Ikorodu costs about 439k on average, while in Ikeja it costs more than 5 times as much, at 2.4M.

Rent Prices — Island vs Mainland per No. of Bedrooms (Limited to 6 beds or less)

For every bedroom count, Island properties are on average more expensive than mainland properties. The difference is most obvious for 2-3 bedroom houses, which are over 60% more expensive on the Island, but it seems to shrink as the number of rooms increases (although the comparable mainland properties are located in the mainland’s most expensive area, Ikeja GRA). Hovering over the outliers for each bedroom count, it’s clear that Ikoyi prices are outliers even compared to other Island prices!

To wrap up my exploratory data analysis, I’ll look at the relationship each variable has with the price of the house, using a heatmap to display how correlated the variables are. A positive correlation means two variables generally follow the same trend, i.e. when one goes up, so does the other. With a negative correlation, when one goes up, the other goes down (think Michael Scott’s happiness and Toby Flenderson’s existence). The closer the value is to 1 or -1, the stronger the correlation.
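For the curious, here is a minimal sketch of how a single cell of such a heatmap can be computed, assuming the heatmap shows Pearson correlation coefficients (the default method of pandas' DataFrame.corr). The toy values are made up and are not figures from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up toy values for five houses.
bedrooms = [1, 2, 3, 4, 5]
price_millions = [0.7, 1.2, 2.0, 2.4, 3.5]
distance_km = [30, 22, 18, 9, 5]

print(f"bedrooms vs price: {pearson(bedrooms, price_millions):+.2f}")
print(f"distance vs price: {pearson(distance_km, price_millions):+.2f}")
```

In this toy data, price rises with bedrooms (a coefficient close to +1) and falls with distance (a negative coefficient), the same two directions discussed in the takeaways.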

Some takeaways are:

  • House price is most correlated with the number of bedrooms and bathrooms, which are also highly correlated with each other. That makes sense: bigger houses have more bedrooms and more bathrooms (possibly en-suite rooms).
  • For all properties, distance to the Third Mainland Bridge has a weakly negative correlation of -0.35 with the price of houses, suggesting that on both sides of the bridge, properties closer to it, such as those in Yaba (mainland) and Ikoyi (Island), tend to have higher prices than those in areas much farther away.
  • Indoor pools and gyms obviously cost you a pretty penny, and are positively correlated with whether a property is serviced. Serviced apartments seem not to affect the price as much, but that’s because listings state the rent first, before service, caution and other such charges are tacked on separately.
  • Finally, a few recommendations for my compatriots. If you’re looking to move out into a one-bedroom house, and perhaps work on the Island but want to live on the mainland to save costs, consider looking for properties in Soluyi, Bariga, Gbagada, Fola-Agoro, Abule-Oja, or Akoka, as they are a good combo of shorter distances to the Third Mainland Bridge and moderate prices.
  • If you’re determined to be on the Island, Onosa, Bogije, Olokonla, Oke-ira, and Awoyaya may be your best bet for a cheaper Island one-bedroom house. I encourage you to explore the dataset yourself for your particular needs and to visit the Nigeria Property Centre website, as they have a crop of good analytics and past trends of houses in most areas.
  • As house hunting is a deeply personal process, and features that are very important to one person may not be as important to another, I’ve included a final heatmap of the average house properties by locality, normalized over the average properties of the entire dataset. A “high” on this heatmap means that properties in that locality exhibit higher values/proportions of the feature than the average over the entire dataset (e.g. properties in Nicon Town were added on average 200 days ago, compared to the dataset-wide average of 57 days since added, so that rectangle is yellow, i.e. high). Where no property exists in that locality for comparison, the rectangle is greyed out. If the locality was not provided, I defaulted to the values for Area.
Heatmap across Island Localities
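The normalization behind a heatmap like this can be sketched as below. It assumes "normalized" means dividing each locality's average by the dataset-wide average, so a value above 1 reads as "high". That interpretation, the helper name and the toy rows are all assumptions, and the numbers are made up.

```python
from collections import defaultdict
from statistics import mean


def normalized_locality_means(rows, feature, group_key="Locality"):
    """Each locality's mean of `feature`, divided by the overall mean."""
    overall_mean = mean(row[feature] for row in rows)
    values_by_locality = defaultdict(list)
    for row in rows:
        values_by_locality[row[group_key]].append(row[feature])
    return {
        locality: mean(values) / overall_mean
        for locality, values in values_by_locality.items()
    }


# Made-up toy rows; the real table has one row per listing.
toy_rows = [
    {"Locality": "Nicon Town", "Days_Since_Added": 190},
    {"Locality": "Nicon Town", "Days_Since_Added": 210},
    {"Locality": "Ikate", "Days_Since_Added": 20},
    {"Locality": "Ikate", "Days_Since_Added": 36},
]

ratios = normalized_locality_means(toy_rows, "Days_Since_Added")
for locality, ratio in ratios.items():
    print(f"{locality}: {ratio:.2f}x the dataset average")
```

Applied to the figures quoted above, Nicon Town's 200 days against a dataset average of 57 days gives a ratio of about 3.5, which is why that cell shows up as high.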

I would have loved to get a lot more features to analyze, for example flooding rates per area, average electricity provision per day, area crime rate, and a lot more on the actual condition of the houses themselves, and to see how those could have fed into a model to predict house prices, but that data was not readily available (and believe me, I checked). I do however hope this has been helpful in understanding the current state of the market, and that it in some way guides your decision-making about your rental home.

If you have gotten this far… then you might as well connect with me on LinkedIn or Instagram


Atonye Nyingifa

Data Analyst, Storyteller, Unrequited Love Poems Writer, African, Avid Binger of the Office and World Traveler (once the panini is over :) ))