-
Notifications
You must be signed in to change notification settings - Fork 0
Data: Qualities of NSW Spatial LOT Layer
Note they way they format lot is like so
-
1234/1234/DP12345
- Note deposit plans are prefixed with
DP
, which we do not do innsw_lrs.parcel
.- Perhaps we should tho, the reason we haven't was largely based on it never really appearing in the land value data
- Or Perhaps we shouldn't as our format is also used by the NSW LRS.
- Note deposit plans are prefixed with
-
1234//DP12345
- note they don't drop the
/
when there is no section
- note they don't drop the
-
//SP12345
- note this how the specify a strata lot.
Then there are other shapes like this, these appear inconsistent with the other shapes.

There seems to be at most 2 of each lot_id_string
, based on this query:
Query | result |
---|---|
|
![]() |
This is what I saw when I looked at these lots
field | before | after | description |
---|---|---|---|
{modified,start,created}_date & last_updated
|
2019-03-26 02:21:46 | 2025-01-14 23:18:46 | clearly one piece of data was newer than the other |
shape_area | 13259.99992434575 |
13260.000083243887 |
this looks like different rounding? |
shape_length | 612.9018429031083 |
612.9018432012755 |
this also looks like different rounding? |
-
1//DP71431
, the changes are identical to the ones I mentioned above. -
2//DP246207
, same
lot | before | after |
---|---|---|
1//DP1087606 | ![]() |
![]() |
1//DP71431 |
![]() |
![]() |
After checking the 2 above rows, I stopped taking screenshots as it really seemed like each of these were just updates with different rounding.
I think these are just same pieces of information re-entered and they just have a bit of floating point variation. I figured we can test this by seeing variation between the new and old values, however after testing it with the below query it turns out there is none trivial variance in some cases.
Query | result |
---|---|
|
![]() |
Perhaps these are subdivisions?
I ran this query and there appears to be 106 examples of lot ids with variance greater than 0
select COUNT(*)
from (
select lot_id_string, (MAX(shape_area) - MIN(shape_area)) AS area_variance
from nsw_spatial_lppt_raw.lot_feature_layer
group by lot_id_string
order by 2 DESC) t
where area_variance > 0;
Here's how I choose to render everything
import geopandas as gpd
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
df = gpd.read_postgis("""
select l.*, ROUND(v.area_variance::numeric, 2) as area_variance
from (
select lot_id_string, (MAX(shape_area) - MIN(shape_area)) AS area_variance
from nsw_spatial_lppt_raw.lot_feature_layer
group by lot_id_string
order by 2 DESC) v
left join nsw_spatial_lppt_raw.lot_feature_layer l using (lot_id_string)
where area_variance > 0
order by area_variance DESC, l.last_update DESC
""", db.engine(), geom_col='geometry')
lot_id_strings = df['lot_id_string'].unique().tolist()
resolution = 8
plot_cols, plot_rows = 11, 11
for offset in range(0, len(lot_id_strings), plot_cols * plot_rows):
figsize = (plot_cols * resolution, plot_rows * resolution)
fig, ax = plt.subplots(plot_cols, plot_rows, figsize=figsize)
for subplot in range(0, min(plot_cols * plot_rows, len(lot_id_strings) - offset)):
ax_col = subplot % plot_cols
ax_row = subplot // plot_rows
sub_ax = ax[ax_row][ax_col]
lot_id = lot_id_strings[offset + subplot]
sub_df = df[df['lot_id_string'] == lot_id].reset_index(drop=True)
sub_df.iloc[[0]].plot(ax=sub_ax, color='none', edgecolor='red', hatch='///')
sub_df.iloc[[1]].plot(ax=sub_ax, color='none', edgecolor='blue', hatch='\\\\\\')
sub_ax.set_title(lot_id)
sub_ax.set_xlabel(f"Variance {sub_df.loc[0, 'area_variance']}", loc='left')
sub_ax.legend(handles=[
Patch(facecolor='white', edgecolor='red', label=sub_df.loc[0, 'start_date']),
Patch(facecolor='white', edgecolor='blue', label=sub_df.loc[1, 'start_date']),
], loc='lower right')
plt.show()
And this is the result.
It's only really the top rows that have any significant difference. The rest seem to be very minor differences.
I think I'll just opt for the latest row.
Do we want more than 1 table? Should we restructure how we're currenting storing lots to reflect the fact that land is either part of a deposit lot or a strata plan with any number of lots?