Performance enhancements #33

Closed
wants to merge 5 commits into from

Collaborator

@mem48 mem48 commented Feb 1, 2024

@Robinlovelace some work in progress

The goal is faster performance on large datasets.

Changes are:

  1. New function points_to_od_maxdist that uses nngeo to get the nearest neighbours rather than creating the full matrix. Also adds support for projected coordinates and the ability to look for the nearest X regardless of distance. This could be useful when mixing rural and urban areas, where, say, 5000 m is a long way to go for a shop in an urban area but a short distance in a rural area. It should be much faster for very large numbers of origins and destinations, and less likely to run out of memory.
     10,000 * 10,000 LSOAs with max_dist = 5000 took 29 seconds
     35,672 * 35,672 LSOAs with max_dist = 5000 took 4.9 minutes
  2. Tweaked si_calculate and si_predict to use data.table and avoid copying data where possible. This is a slight breaking change, as constraint_production now needs to be a quoted character string. I couldn't figure out the dplyr syntax, so suggestions for a fix are welcome.
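
As a rough illustration of the nearest-neighbour approach in the first change, the sketch below builds an OD table directly from the output of nngeo::st_nn, so only pairs within the distance cap are ever stored. It is not the code of points_to_od_maxdist: the toy coordinates, the object names (origins, destinations, od_sketch) and the EPSG:27700 CRS are assumptions made for the example.

```r
library(sf)
library(nngeo)

# Toy points in a projected CRS (British National Grid, metres); assumed data
origins <- st_as_sf(
  data.frame(id = c("o1", "o2"), x = c(0, 10000), y = c(0, 0)),
  coords = c("x", "y"), crs = 27700
)
destinations <- st_as_sf(
  data.frame(id = c("d1", "d2", "d3"), x = c(1000, 3000, 20000), y = c(0, 0, 0)),
  coords = c("x", "y"), crs = 27700
)

# For each origin, find up to k nearest destinations no further than maxdist
nn <- st_nn(origins, destinations, k = 2, maxdist = 5000, progress = FALSE)

# Flatten the neighbour list into one row per origin-destination pair
od_sketch <- data.frame(
  O = origins$id[rep(seq_along(nn), lengths(nn))],
  D = destinations$id[unlist(nn)]
)
```

Leaving maxdist at its default of Inf while keeping a fixed k would correspond to the "nearest X regardless of distance" case.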

Example

nrow(od)
[1] 3627616
t1 = Sys.time()
od_res = si_calculate(
   od,
   fun = gravity_model,
   constraint_production = "origin_all",
   d = distance_euclidean,
   m = origin_all,
   n = destination_all,
   beta = 0.9
)
t2 = Sys.time()
difftime(t2, t1)
Time difference of 0.576057 secs
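
For readers unfamiliar with what a call like the one above computes, here is a base R sketch of a production-constrained gravity model on a toy OD table. The decay form in gravity_sketch, the toy values and the rescaling step are assumptions for illustration, not the internals of si_calculate.

```r
# Toy OD table; column names mirror the example above, values are made up
od_toy <- data.frame(
  O = c("a", "a", "b", "b"),
  D = c("x", "y", "x", "y"),
  distance_euclidean = c(1000, 2000, 3000, 1000),
  origin_all = c(100, 100, 50, 50),
  destination_all = c(10, 20, 10, 20)
)

# Hypothetical gravity function: exponential decay with distance in km
gravity_sketch <- function(beta, d, m, n) {
  m * n * exp(-beta * d / 1000)
}

# Unconstrained interaction for every OD pair
od_toy$unconstrained <- gravity_sketch(
  beta = 0.9,
  d = od_toy$distance_euclidean,
  m = od_toy$origin_all,
  n = od_toy$destination_all
)

# Production constraint: rescale so flows from each origin sum to origin_all
origin_total <- ave(od_toy$unconstrained, od_toy$O, FUN = sum)
od_toy$interaction <- od_toy$unconstrained / origin_total * od_toy$origin_all

# Check: total outflow per origin matches the constraint column
tapply(od_toy$interaction, od_toy$O, sum)
```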

@Robinlovelace
Owner

New function points_to_od_maxdist that uses nngeo to get the nearest neighbours

Big 👍 to use of nngeo.

@Robinlovelace
Owner

Also adds support for projected coordinates and the ability to look for the nearest X regardless of distance.

Shouldn't that functionality be in the {od} package? It should be easy to upstream.

@Robinlovelace
Owner

10,000 * 10,000 LSOAs with max_dist = 5000 took 29 seconds
35,672 * 35,672 LSOAs with max_dist = 5000 took 4.9 minutes

🚀

@mem48
Collaborator Author

mem48 commented Feb 1, 2024

Also adds support for projected coordinates and the ability to look for the nearest X regardless of distance.

Shouldn't that functionality be in the {od} package? It should be easy to upstream.

Possibly, yes. I was working here to get things working, but the file could easily be moved.

Owner

@Robinlovelace Robinlovelace left a comment

The GitHub Actions checks are not passing for some reason. I'm happy with the ideas and code here, but it needs to be tidied up and must pass the checks before merging.

@Robinlovelace
Copy link
Owner

This belongs upstream: itsleeds/od#18
