Prakhar Uniyal (@PrakharUniyal)

Design your own Google Map

Introduction

I started planning my google map application by looking at the UI design of the real one. A lot of basic stuff can be figured out just by taking a close look at their interface. For example, each instance of app usage is directed as a reference to a specific location on the map which is given by x, y and z co-ordinates. From there on I have used my reasoning and ideas to come up with a half baked implementation of a similar application for pathfinding in real world.

Stored Data

We know that the best practice to store geographical maps and other such kinds of data is to model them as a graph data structure. But the graph data structure has constraints of its own and the most important factor in determining those constraints is the selection of node structure for the graph.

In the most basic terms, the challenge here is pathfinding between two locations on map in a given time limit, so let's considering making every location as a node of our graph. From analysis of UI, we can see that google maps provides a precision of upto 7 decimal places in location co-ordinates which is "good for much surveying and is near the limit of what GPS-based techniques can achieve"[1] but it means that our graph will have to be a matrix with around 10⁹ rows and columns, thus 10¹⁸ nodes. As far as I know, no pathfinding algorithm can work in a decent time limit without wasting a ridiculous amount of memory space for this kind of graph so this will definitely not work.

An improvisation would be to consider blocks of fixed and larger surface area as nodes and collectively store their information. As a lot of area in these blocks will have sparse information linked to it, it will increase our space efficiency.

Now this brings us to an important conclusion. The main utility of this interface is not to find path between every location on earth but to find paths between those locations which:

Have a real path between them.
Have a good probabiltiy of being travelled to and fro by a large number of users.

So this means that the most important part of our graph needs to be the roads(or routes in general, if considering air and sea travel also). If we can successfully store the information of these roads and use it to solve the challenge of pathfinding then we are pretty much done.

The sum of length of roads across the planet is around 64 million kilometres[2] which means that if we consider roads as a chain of 100 metre road sections then we can easily store them as a graph. Each node will be road object which will have the following attributes:

class road
{
    vector<roadSection*> roadSectionChain;
    vector<road*> connectedRoads;
}

class roadSection
{
    location* sectionLocation;
    float trafficFactor;
}

Here connectedRoads is the list of roads that are connected to the given road at one of its two endpoints. Normally the number of connected roads at one endpoint would vary from 2 to 7. Obviously, collecting all this data would require help from local workforce and authorities. Also we will need to map all the roadsections to their corresponding road.

The second one of the two utility points discussed earlier requires us to further optimize the user experience. Normally user provides the name of locations to and from which he has travel and not just the name of roads. So there need to be landmark objects which are accessible via the roads and can be used as references for the locations to be travelled between. A detailed layout for these objects would look like this:

// Large areas such as country, states, cities, blocks
class region
{
    string name;
    string detail;
    vector<imageObj*> imgList;
    location* geoloc;

    roadSection* roadAccessPoint;

    region(...);
}

// Offices, hospitals, banks, parks, markets, tourist spots
class place :: public region
{
    float rating;
    business_t* businessType;
    string address;
    contact* contactInfo;
    vector<review*> reviews;
    timing* timingData;

    // Road section which will be considered for pathfinding
    roadSection* roadAccessPoint;

    place(...);
    vector<business*> findNearby(business_t* businessType);
}

This looks fine, but what if the user wants to find path from his current location which may not be one of the landmarks that we have record of. In that case, we will need to find the closest roadSection to him which can easily be done using a binarysearch function on the list of all roadSection locations.

So this is basically everything that we need to represent the map as a graph data struture and find paths between certain locations. This approach is heavily inspired from the SIF and GDF geospatial data exchange formats[3].

Apart from this, the application may require a lot more data such as terrain information and satellite images to enhance the user experience.

Preprocessing

Now that we have our graph it is time to start experimenting with the various pathfinding algorithms. Most of the google searches would give the name of A*[4]. However, going deeper one will find that A* algorithms actually take a lot of space for execution as they have to store all the nodes. However a workaround for this can be to determine the distance between start and end nodes. As it is evident, the number of roads connecting cities and states is relatively sparse as compared to an intra city road network. We can achieve this optimization by determining an approximation of path length and working with roads classified for that length only.

Another great advantage can be of the low variability in traffic factor of roads. Traffic data is determined through sensors and local data annually[5] and it remains predicatable for an appreciable amount of time so most frequently used routes may be stored to predetermine the best route among gives routes at a given time in the year to provide quick results for cases where these routes are treated as a sub path.

Computing Quickest/Shortest Path

As already discussed, we will use the A* algorithm for pathfinding whose performance is determined by a heuristic function. Usually people take the manahattan distance/diagonal distance between next node(n) and end node as the heuristic function so we will also do the same.

If we do not use any heuristics, the performance of our application would certainly degrade however it will still be better than the Dijkstra algortihm[6].

For rerouting in case of detour we may need something like the ARA* algorithm to provide a temporary less optimal path which can then be replaced by a solution provided by the original A* algorithm for the rest of the path.

Efficiency and Feasibility

Theoretically, our application may perform well and good as all it has been developed keeping all the space-time constraints in mind. However, the real world implication will need much more optimizations and enhancements to provide a not bad user experience.

References

[1] Answer to "Measuring accuracy of latitude and longitude?" by whuber

[2] List of countries by road network size

[3] Map database management

[4] Which path finding algorithm is used by google maps

[5] How does google maps predict traffic

[6] Dijkstra’s Algorithm versus Uniform Cost Search

Conclusion

Google maps is very complex application which stands on the shoulders of great computer scientists who developed the algorithms and technologies that it uses. It is no simple task to compete with what they have done yet it is this spirit that fuels the thrust to find awesome solutions to such complex real world problems.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

B18128.md

B18128.md

Prakhar Uniyal (@PrakharUniyal)

Design your own Google Map

Table of Contents

Introduction

Stored Data

Preprocessing

Computing Quickest/Shortest Path

Efficiency and Feasibility

References

Conclusion

Files

B18128.md

Latest commit

History

B18128.md

File metadata and controls

Prakhar Uniyal (@PrakharUniyal)

Design your own Google Map

Table of Contents

Introduction

Stored Data

Preprocessing

Computing Quickest/Shortest Path

Efficiency and Feasibility

References

Conclusion