Nice‐to‐have: Project idea list
These are projects I haven't been able to do yet because I'm still trying to add more data sources.
- Land value to density map
- Land value to equity map
It probably ties into #9, but it would be good to find a way to compare land independent of zoning
Might be worth reading
- Zoning Effect on house prices, RBA. On page 3 the authors explicitly say this is not a problem they solve, and point to this paper instead: Urban Structure and Housing Prices: Some Evidence from Australian Cities
So far I've identified a possible means to accomplish this on page 8 of the Zoning Effect paper. I'll look into how to apply it later.
What I really want is a means to compare the value of land on a site independently of the size of the site it is attached to. That should help make visualisations within an LGA of where people want to live, based on the assumption that the land value has priced all preferences in.
Here are the sites in the CBD ranked by $ per sqm; look at the area of these sites.

While there are some data issues in the valuer general data, when you sort the sites within Sydney by land value per sqm, it seems like smaller sites tend to rank higher on a per sqm basis.
I think this is because there's a marginal rate at which land increases in value, where the next meter will be worth less than the last. As you increase the size of the land more types of projects become viable, but any meter you add after that does nothing for the projects that were already viable without those extra meters.
If we had the shapefile for every lot this wouldn't actually be a problem, but because we are aggregating by meshblock, a single Telstra phone booth can inflate the aggregated value of the meshblock if the aggregation is something like max or mean.
If you want to compare the value of land in Sydney independently of the size of the lot, and you're grouping by things like meshblock, aggregations like max or mean will be heavily skewed by a random 1x1 Telstra phone booth, which is the case in a few places in the CBD.
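A quick sketch of that skew, using made-up numbers. The `sites` list and the `aggregate_per_sqm` helper are hypothetical and not taken from the notebooks. It compares the mean and max of the per sqm rates against a pooled ratio (total value over total area), which is one possible way to give every meter the same weight regardless of the lot it belongs to.

```python
from statistics import mean

# Hypothetical (land_value_in_dollars, area_in_sqm) pairs for one meshblock.
# The last entry stands in for a tiny 1 sqm site such as a phone booth.
sites = [
    (2_000_000, 500.0),
    (3_000_000, 1_000.0),
    (1_200_000, 400.0),
    (50_000, 1.0),
]


def aggregate_per_sqm(sites):
    """Return a few candidate $ per sqm aggregates for a group of sites."""
    rates = [value / area for value, area in sites]
    total_value = sum(value for value, _ in sites)
    total_area = sum(area for _, area in sites)
    return {
        # Every site counts equally, so the 1 sqm site dominates.
        "mean": mean(rates),
        "max": max(rates),
        # Every sqm counts equally, so the 1 sqm site barely matters.
        "area_weighted": total_value / total_area,
    }


result = aggregate_per_sqm(sites)
for name, rate in result.items():
    print(f"{name}: {rate:,.0f} $ per sqm")
```

With these numbers the three large sites sit between 3,000 and 4,000 $ per sqm, yet the mean and max are pulled far above that by the single 1 sqm site, while the pooled ratio stays in the same range as the large sites.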
It would be nice to be able to weigh each meter independently of the size of the lot it's actually attached to, but maybe that is fanciful.
So far I've included something like this in my land value aggregations, but it's fairly arbitrary and dishonest
CASE
WHEN p.area < 10 THEN 10
ELSE p.area
END
Note, this isn't used in the data ingestion process but instead in the other notebooks where I've been trying to visualise the data.
- This will allow for the creation of a visualisation of different LGAs that shows where the most valuable land is without being skewed by small sites.
- Maybe get a distribution of land values in an area by doing the following for each site.
With that population of land values you can see the distribution of land values. I'm honestly unsure what the most sensible way to do this is...
def marginal_value_of_land(valuation, nth_meter):
    # The paper says this
    #   log(sale price) = c + b log(land area) + aX + e
    #
    # it would be neat to do something like
    #   (b log(nth_meter)) - (b log(nth_meter - 1))
    #
    # I don't even know if it makes sense to do that...
    # Possibly useless. maybe this is more reasonable
    #   b log(1)
    pass

def population_of_all_land_values(valuations):
    for v in valuations:
        for nth_meter in range(0, v.sqm_area):
            yield marginal_value_of_land(v, nth_meter)
- It's possible this methodology for comparing land by these aggregations is flawed and I should look at other methodologies.
- It's possible there's some kind of coefficient you can figure out from hedonic pricing models?
- This is only a problem because I'm aggregating multiple properties by meshblock; if I had the shapefiles for the actual properties it wouldn't be.
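One way the per-meter idea in the list above could be made concrete, as a sketch only. It assumes land value scales with area as `area ** b`, which is what the paper's log-log relation implies if everything else is held fixed. It also assumes the same relation can be applied to a valuer general valuation rather than a sale price, and that areas are whole numbers of sqm. The exponent `b` has to come from a fitted model; the value used below is a placeholder, and `nth_meter_value` is a hypothetical name.

```python
def nth_meter_value(total_value, area_sqm, nth_meter, b):
    """Share of total_value attributed to the nth sqm (1-indexed).

    Assumes value(area) = k * area ** b, with k chosen so that the
    whole site is worth total_value. This is an assumption, not
    something the paper states for individual valuations.
    """
    if not 1 <= nth_meter <= area_sqm:
        raise ValueError("nth_meter must be between 1 and area_sqm")
    k = total_value / area_sqm ** b
    # Value of the first n meters minus value of the first n - 1 meters.
    return k * (nth_meter ** b - (nth_meter - 1) ** b)


def population_of_all_land_values(valuations, b):
    """Yield one value per sqm for every (total_value, area_sqm) pair."""
    for total_value, area_sqm in valuations:
        for n in range(1, area_sqm + 1):
            yield nth_meter_value(total_value, area_sqm, n, b)


# Placeholder exponent; a real value would come from the fitted model.
values = list(population_of_all_land_values([(1000.0, 4)], b=0.5))
print(values)
```

The per-meter values for a site always add back up to its total valuation, and with `b` below 1 each extra meter is worth less than the one before it. Whether this is a sensible way to build the distribution is still an open question.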
It's entirely possible I'm looking at this all wrong. I think it's best to first establish a better understanding of the nature of things before proposing a fix. Let's see what the research says about it.
Consider reading the RBA paper on Zoning Effect
-
20240912
- page 2, there's immediate mention of a "marginal value of land"
- page 5, mentions it's worth noting since lands may be deflated as sometimes land owners despite the high valuations of their land by not lower.
- page 8 (web link), here marginal value of land is explicitly mentioned.
- this relation was shown
log(sale price) = c + b log(land area) + aX + e
- Is it possible to use this with substitution to get the marginal value of land?
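On the substitution question, here is a sketch of what the relation implies. It assumes the logs are natural logs and that the other characteristics X and the error term e are held fixed while land area changes. P stands for sale price and A for land area; both symbols are shorthand introduced here rather than notation taken from the paper.

```latex
\begin{align*}
\log P &= c + b \log A + aX + e \\
P &= \exp(c + aX + e)\, A^{b} \\
\frac{\partial P}{\partial A} &= b \exp(c + aX + e)\, A^{b-1} = b\,\frac{P}{A}
\end{align*}
```

Under those assumptions the marginal value of one more sqm is b times the average price per sqm, so b can be read as the elasticity of price with respect to land area.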
It would be great to have some helper functions to render the zoning field in the mesh blocks. It's not as accurate as actual planning data from each state, but it could do in the meantime. And I'm sure the matplotlib code can be reused for the actual zoning stuff.
The mb_cat field in mesh blocks isn't great, but it's kind of fun to do. But the long-term solution is figuring out how to map the zoning from the valuer general data onto these shapefiles, or whichever shapefiles we end up using for properties.
- #13
- #8
You could do something like this. Fetch data like this
SELECT mb.mb_cat, mb.geometry as geom
FROM non_abs_main_structures.lga_2024 lga
RIGHT JOIN abs_main_structures.meshblock mb ON ST_Intersects(mb.geometry, lga.geometry)
WHERE lga.lga_name ILIKE 'Sydney'
AND (ST_Area(ST_Intersection(lga.geometry, mb.geometry)) / ST_Area(mb.geometry)) > 0.1
Then use helpers like this to render the legend and plot.
from collections import defaultdict

from matplotlib.patches import Patch

# Fill colour for each mb_cat value.
mb_cat_facecolor = {
    'Commercial': '#0000ff',
    'Residential': '#ff9999',
    'Education': '#66ff66',
    'Hospital/Medical': '#9933ff',
    'Industrial': '#ffff00',
    'Parkland': 'white',
    'Water': 'white',
    'Transport': 'white',
    'Other': 'white',
}

# Outline colour for each mb_cat value (None leaves matplotlib's default).
mb_cat_edgecolor = {
    'Commercial': None,
    'Residential': None,
    'Education': 'white',
    'Hospital/Medical': 'white',
    'Industrial': 'black',
    'Parkland': '#00ff00',
    'Water': '#0000ff',
    'Transport': '#ff0000',
    'Other': 'black',
}

# Hatch pattern per category; categories not listed get no hatch.
mb_cat_hatch = defaultdict(lambda: None, {
    'Education': '//',
    'Hospital/Medical': '//',
    'Parkland': '//',
    'Water': '//',
    'Transport': '//',
    'Other': '//',
})

# Assumption: mb_cat_alpha is used below but was never defined in the
# original snippet, so every category defaults to fully opaque here.
mb_cat_alpha = defaultdict(lambda: 1.0)


def render_zones(ax, df, col):
    # Plot each category separately so it gets its own styling.
    cats = list(df[col].unique())
    for c in cats:
        df[df[col] == c].plot(
            ax=ax,
            facecolor=mb_cat_facecolor[c],
            edgecolor=mb_cat_edgecolor[c],
            hatch=mb_cat_hatch[c],
            alpha=mb_cat_alpha[c],
        )


def render_zoning_legend(ax, df, col):
    # Build one legend patch per category present in the data.
    cats = list(df[col].unique())
    ax.legend(
        handles=[
            Patch(label=c,
                  facecolor=mb_cat_facecolor[c],
                  edgecolor=mb_cat_edgecolor[c],
                  alpha=mb_cat_alpha[c],
                  hatch=mb_cat_hatch[c])
            for c in cats
        ],
        title='Zones',
        loc='lower left',
        fontsize='large',
        title_fontsize='x-large',
    )
