-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance/optimization opportunities #13
Comments
One lookup instead of two is in the class of constant time improvements and would probably result in some very marginal gains, if any. I would skip optimizing this kind of thing we have datasets with measurable performance strain. Leveldb (and IndexedDB using leveldb) is all in-process, so extra lookups are much faster than with a database that uses a remote backend like an external relational database would be. Also, the performance profile of leveldb can be quite counter-intuitive with the aggressive caching it does. One benefit to having separate structures for each index is that we can modify the indexes more easily in the future and. I think one thing that can be done here to avoid the extra reference lookups is to use a multi-dimensional interval tree for ways and relations and points would live in a kdb-tree. Queries on these two structures would spawn in parallel for any bounding box query. quadtiles are simple, but have balancing problems and generalizing updates is very complicated. We don't need to do any of this because kdb trees have much better asymptotic guarantees and we don't need to worry about how to build an adaptive bounding volume hierarchy. I think a much simpler way to accomplish multiple zoom levels would be to skim off certain tags into a separate kdb index that will be used at higher zoom levels. We can pass complicated ways and relations into simplify-geometry. Then instead of generating static tiles, you can quickly generate an ad-hoc view of the data that is relevant to the zoom level. Tiles are a manifestion of a few different dynamics: the data structures commonly used to store geospatial data such as quadtiles or rtrees, and also the typical read/write profile of a map server. Map services have many orders of magnitude more reads than writes, so it makes sense that they should want to cache all database queries into static files. However, if we have an editor where reads and writes are somewhere near the same order of magnitude, that tradeoff stops making as much sense. If we have an editor running on a mobile device, a whole new tradeoff emerges: minimizing CPU load and background processing to save on battery life. Building static tiles will probably involve lots of extra background processing and cache invalidation and I don't think that will scale very well for low-powered devices. With dynamic queries using the existing spatial indexes, we can generate the tile formats that different existing viewers expect without needing all the extra processing and complexity of toolchains designed around high read/write ratios for hosted services. If those queries are too slow, we can put them in an LRU cache them which ought to give the same performance gains as tiles without all the additional moving parts. |
As ever learning a lot from these answers to my many questions... Re. simplification, also relevant to digidem/osm-p2p-server#6: presimplification algorithms from d3.topojson and geojson-vt, although for rivers and "natural" lines I always find visvalingam simplification to give better results, implemented in JS in Mapshaper |
We're probably too early to start thinking about optimizations, but these ideas/questions were on my mind so leaving this here for discussion.
The text was updated successfully, but these errors were encountered: