diff --git a/content/algorithms/Dijkstra/Dijkstra.md b/content/algorithms/Dijkstra/Dijkstra.md deleted file mode 100644 index 96a66f1d..00000000 --- a/content/algorithms/Dijkstra/Dijkstra.md +++ /dev/null @@ -1,241 +0,0 @@ ---- -jupyter: - jupytext: - text_representation: - extension: .md - format_name: markdown - format_version: '1.3' - jupytext_version: 1.13.8 - kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Shortest path with Dijkstra's algorithm - - -When it comes to finding the shortest path for a weighted graph, Dijkstra's algorithm has always been everyone's favourite. In this notebook, we'll get to know how it works and is implemented. Shortest path problem is a graph problem where the aim is to find a path between 2 nodes having the minimum distance covered. - - -## Shortest Path Problem - - -Let's say you want to travel from Delhi(DEL), India to London(LCY), UK via flights that have various routes with different stops, namely, Frankfurt(FRA), Zurich(ZRH), Amsterdam(AMS), Geneva(GVA) and Dublin(DUB). Now, you want to find the shortest path as you are in a hurry and want to get to London as soon as possible.
-An important thing to know is that any subpath from C $\rightarrow$ E of the shortest path A $\rightarrow$ E is also the shortest path from node C to node E. That means not only one will get the shortest path from Delhi to London but also to other stops from Delhi. - -ASSUMPTIONS -- Distance taken is imaginary. -- No waiting time at airports. -- The shortest distance in this problem means shortest time costing. -- Speed is considered to be uniform -- Scale : 1 unit = 1000kms - -So, the following directed graph describes all paths available with the distance between them. - -```python -#importing libraries -import networkx as nx - -flight_path= nx.DiGraph() - -flight_path.add_nodes_from(['DEL', 'AMS', 'DUB', 'ZRH', 'FRA', 'LCY']) - -flight_path.add_weighted_edges_from([('DEL', 'ZRH', 5), ('DEL', 'FRA', 6), ('DEL', 'DUB', 7), ('ZRH', 'LCY', 6), - ('FRA', 'LCY', 3), ('AMS', 'LCY', 5),('DUB', 'LCY', 4), ('AMS', 'FRA', 1), - ('DUB', 'AMS', 2), ('ZRH', 'GVA', 3), ('GVA', 'LCY', 1)]) - -pos= nx.planar_layout(flight_path) - -# drawing customised nodes -nx.draw(flight_path, pos, with_labels=True, node_size=1300, node_color='maroon', font_color='white') - -# adding edge labels -nx.draw_networkx_edge_labels(flight_path, pos, edge_labels = nx.get_edge_attributes(flight_path, 'weight')); -``` - -## Dijkstra's Algorithm - -Dijkstra's algorithm is used to find the shortest path between nodes or commonly from one source node to every other node in the graph, where edge weight represents the cost/ distance between 2 nodes in the case of a weighted graph. It can work with both directed and undirected graphs, but it is not suitable for graphs with NEGATIVE edges.
-The time complexity of Dijkstra's algorithm is $O(V^{2})$ with a simple array, but with a minimum priority queue it comes down to $O(V + E\text{ log } V)$. - -### Algorithm - -1. Convert your problem into a graph equivalent. -2. Create a list of unvisited vertices. -3. Assign the source node a distance (cost) of 0 and every other node a distance of infinity. -4. For every unvisited neighbour of the current node, calculate the distance through the current node. -5. The new distance is calculated as `minimum(current distance, distance of previous node + edge weight)`. -6. When all the neighbours have been considered, remove the current node from the unvisited list and select the unvisited node with the minimum distance as the next current node. -7. Repeat from step 4 until no unvisited nodes remain. -8. At the end, every node is labelled with its minimum distance from the source. - - -Let's look at the example of the directed graph shown above. Before moving forward, keep the following in mind: in the following graphs, an edge weight is the distance between 2 nodes, black edges are unvisited, red edges are being traversed, and green edges have been visited. Let's begin! - - -According to Dijkstra's algorithm, -- First, assign every stop (node) an infinite distance except the source node (DEL in this case, as the path starts from Delhi), which is assigned a value of 0. The distances to the other nodes are not yet known, so the maximum possible value is assigned. (fig. 1) -- Dijkstra's algorithm is greedy: at every step it selects the unvisited node with the minimum distance. Initially that is DEL, with a distance of 0 units. -- The next step is to traverse the neighbours of DEL and update their distances, as shown in fig. 2. 
While updating the distance, always keep in mind that the updated distance should be `minimum(current distance, distance of previous node + edge weight)`. Like, - - DUB : `min(infinity, 7) = 7` - - FRA : `min(infinity, 6) = 6` - - ZRH : `min(infinity, 5) = 5` - - -![Figure 1&2](Graphs/figure1_2.png "Step 1 and Step 2") - - -- Now, pick the next unvisited node with the minimum distance value. ZRH has the minimum distance (5 units), so it's time to update its neighbour's (LCY, GVA) distance.(fig. 3) - - LCY : `min(infinity, 5+6) = 11` - - GVA : `min(infinity, 5+3) = 8` -- Similar to the previous step, the next unvisited node with minimum distance is FRA (6 units).Hence, update its neighbours. (fig. 4) - - AMS : `min(infinity, 6+1) = 7` - - LCY : `min(11, 6+3) = 9` - - -![Figure 3&4](Graphs/figure3_4.png "Figure 3 and Figure 4") - - -- Here, 2 nodes are left with minimum distance (7 units), AMS and DUB. So, let's update their neighbours one by one.(fig. 5) - - DUB : - - AMS : `min(7, 7+2) = 7` - - LCY : `min(9, 7+4) = 9` - - AMS: - - LCY : `min(9, 7+5) = 9` -- Among the last 2 nodes, our destination is LCY. So, last update is for GVA's neighbour. - - LCY : `min(9, 8+1) = 9` - -Figure 6 shows the final graph with shortest distance to each node from DEL(source node) and it comes out that the shortest distance to LCY from DEL is 9 units which have 2 paths:
-- (DEL $\rightarrow$ FRA $\rightarrow$ LCY)
-- (DEL $\rightarrow$ ZRH $\rightarrow$ GVA $\rightarrow$ LCY) - -So, one can take either of these paths to reach London as soon as possible. When several shortest paths tie, as in this situation, Dijkstra's algorithm does not guarantee the one with the fewest edges; the implementation below keeps the first shortest path it finds, which here is DEL $\rightarrow$ FRA $\rightarrow$ LCY. - - -![Figure 5&6](Graphs/figure5_6.png "Figure 5 and Figure 6") - - -## NetworkX Implementation - - -The previous example had only a few nodes and was not complicated, but real-life problems can involve many nodes, so proper records need to be maintained. Let's implement the example with NetworkX. - -```python -#importing required libraries -from heapq import heappush as push -from heapq import heappop as pop -from itertools import count - -''' -So, the graph 'flight_path' is already defined with all nodes and edges. Now, before implementing the algorithm, -one first needs to make the data available in the proper format to access. - -The first thing to do is to have a dictionary with every node as keys and (connecting node, weight) as values so that one can -traverse easily. -''' -flight_succ = flight_path._succ if flight_path.is_directed() else flight_path._adj - -# we need to extract the distance between 2 nodes from the graph and for that we need to define weight function -def _weight_function(G, weight): - return lambda u, v, data: data.get(weight, 1) - -weight = _weight_function(flight_path, "weight") - -''' -The next step is to define various dictionaries to store and track all nodes path. 
-''' -dist = {} # dictionary of final distances -seen = {} # dictionary of visited nodes with recent shortest distance -paths= {} # dictionary to store path list - -# fringe is heapq with 3-tuples (distance,c,node) -# use the count c to avoid comparing nodes (may not be able to) -c = count() -fringe = [] - -# we want to find the shortest distance from DEL to LCY -source='DEL' -target='LCY' -paths[source]=[source] - -# Now, as I said earlier we'll assign the source node 0 value -seen[source] = 0 -push(fringe, (0, next(c), source)) - -''' -It's time to start traversing the graph starting from source node. -''' - -while fringe: - (d, _, v) = pop(fringe) # d will store the distance of the node and v will store the node name - if v in dist: - continue # already searched this node. - dist[v] = d - if v == target: - break - - # traversing the neighbours of the node - for u, e in flight_succ[v].items(): - distance = weight(v, u, e) - - if distance is None: - continue - # vu_dist stores the total distance from source node to u. Like, if v=ZRH and u= GVA, then vu_dist = 8 - vu_dist = dist[v] + distance - - ''' - If u is already in dist then there can be 2 cases, either the graph has negative cycle or there might - be another shortest path to u. - ''' - if u in dist: - u_dist = dist[u] - if vu_dist < u_dist: - raise ValueError("Contradictory paths found:", "negative weights?") - - # updating the new shortest distance and adding the next node to visit - elif u not in seen or vu_dist < seen[u]: - seen[u] = vu_dist - push(fringe, (vu_dist, next(c), u)) - if paths is not None: - paths[u] = paths[v] + [u] - -# printing the distance and path from the source node 'DEL' to target node 'LCY' -print(dist[target], paths[target]) -``` - -Don't worry, you don't need to write all this code again and again. NetworkX got you covered!! 
So, NetworkX provides a lot of functions with the help of which one can actually find the [shortest path](https://networkx.org/documentation/stable/reference/algorithms/shortest_paths.html) based on their needs. - -All functions using dijkstra's algorithm are similar, but, for this example the most suitable is [single_source_dijkstra()](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.weighted.single_source_dijkstra.html#networkx.algorithms.shortest_paths.weighted.single_source_dijkstra). It comes out that this function actually gives the same output as the one calculated in the above example. - -```python -nx.single_source_dijkstra(flight_path, 'DEL', target='LCY', weight="weight") -``` - -## Applications of Dijkstra's Algorithm - -It is used as a part of applications to find the shortest path if required. There are other factors as well which are considered in every application while implementing Dijkstra's algorithm. Like, -- In special drones or robots for delivery service, it is used as a part to identify the shortest path possible. -- One of the most common use case is Google Maps. It helps to find the best route possible in shortest time. -- In social media applications, for smaller graphs it can be used effectively to suggest the "people you may know" section. -- As the above example, it can be used in a software which calculates and informs the estimate arrival time, best route etc. of a flight to a user. -- It is used in IP routing to find Open shortest Path First. -- It is used in the telephone network. - - -## Advantages and Disadvantages of Dijkstra's Algorithm - -ADVANTAGES - - Once it is carried out, we can find the shortest path to all permanently labelled node. - - Only one diagram is enough to reflect all distances/paths. - - It is efficient enough to use for relatively large problems. 
- -DISADVANTAGES -- It cannot handle negative edge weights; with them it can return a path that is not actually the shortest. -- It is a greedy algorithm that always expands the closest unvisited node, so it searches in every direction from the source and can spend time on short edges that lead away from the target. - - -#### Reference - -Shivani Sanan, Leena Jain, Bharti Kappor (2013). (IJAIEM) "Shortest Path Algorithm"
-https://www.ijaiem.org/volume2issue7/IJAIEM-2013-07-23-079.pdf diff --git a/content/algorithms/Dijkstra/Graphs/figure1_2.png b/content/algorithms/Dijkstra/Graphs/figure1_2.png deleted file mode 100644 index bc64151f..00000000 Binary files a/content/algorithms/Dijkstra/Graphs/figure1_2.png and /dev/null differ diff --git a/content/algorithms/Dijkstra/Graphs/figure3_4.png b/content/algorithms/Dijkstra/Graphs/figure3_4.png deleted file mode 100644 index 974eeed5..00000000 Binary files a/content/algorithms/Dijkstra/Graphs/figure3_4.png and /dev/null differ diff --git a/content/algorithms/Dijkstra/Graphs/figure5_6.png b/content/algorithms/Dijkstra/Graphs/figure5_6.png deleted file mode 100644 index 78097855..00000000 Binary files a/content/algorithms/Dijkstra/Graphs/figure5_6.png and /dev/null differ diff --git a/content/algorithms/MST Prim/MST Prim.md b/content/algorithms/MST Prim/MST Prim.md new file mode 100644 index 00000000..ab64a0d2 --- /dev/null +++ b/content/algorithms/MST Prim/MST Prim.md @@ -0,0 +1,346 @@ +--- +jupyter: + jupytext: + text_representation: + extension: .md + format_name: markdown + format_version: '1.3' + jupytext_version: 1.13.8 + kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Minimum Spanning Tree using Prim's Algorithm + + +We all have travelled by road from one city to another. But, ever wondered how they decided where to create the route and what path to choose? If one will get the job to connect 5 cities via road, then, the naive approach will be to start connecting from one city and continue doing that until one covers all destinations. But, that's not the optimal solution, not only one should cover all nodes but also at the lowest cost possible (minimum length of the road) and for that Minimum Spanning Trees are constructed using Prim's and Kruskal's algorithm. + +When each and every node of a graph is connected to each other without forming any cycle, it is known as the Spanning Tree. 
A connected graph $G$ with $n$ nodes has spanning trees, each with $n$ nodes and $n-1$ edges. As its name indicates, the Minimum Spanning Tree is the spanning tree with the smallest total edge weight among all spanning trees.
Let's look at the following example to understand this better. + + +## Minimum Spanning Tree Problem + + +Suppose there are 5 cities (A, B, C, D and E) that need to be connected by road. There can be more than one way to connect the cities, but our goal is to find the set of roads with the shortest total distance. + +ASSUMPTION: Distance taken is imaginary. + +The following graph depicts our situation in this case. + +```python +# importing libraries +import networkx as nx +import matplotlib.pyplot as plt + +roads = nx.Graph() + +# adding weighted edges +roads.add_weighted_edges_from( + [ + ("A", "B", 1), + ("A", "C", 7), + ("B", "C", 1), + ("B", "D", 4), + ("B", "E", 3), + ("C", "E", 6), + ("D", "E", 2), + ] +) + +# layout of the graph +position = {"A": (0.5, 2), "B": (0, 1), "C": (1, 1), "D": (0, 0), "E": (1, 0)} +pos = nx.spring_layout( + roads, pos=position, weight="weight", fixed=["A", "B", "C", "D", "E"] +) +fig = plt.figure(figsize=(5, 5)) + +# drawing customised nodes +nx.draw( + roads, + pos, + with_labels=True, + node_size=900, + node_color="#DF340B", + font_color="white", + font_weight="bold", + font_size=14, + node_shape="s", +) + +# adding edge labels +nx.draw_networkx_edge_labels( + roads, + pos, + edge_labels=nx.get_edge_attributes(roads, "weight"), + font_size=12, +); + +``` + +Now, in order to find the minimum spanning tree, this notebook will cover Prim's algorithm. +Let's understand it in detail. + + +## Prim's Algorithm +Prim's algorithm uses a greedy approach to find the minimum spanning tree. That means, in each iteration it finds the minimum-weight edge leaving the growing spanning tree and adds it to the tree. + +The time complexity of Prim's algorithm is $O((V+E) \text{ log} V)$ when a binary-heap priority queue is used, because every vertex and every edge triggers a constant number of queue operations, each taking logarithmic time. + +Algorithm Steps +1. Select any arbitrary node as the root node and add it to the tree. 
Spanning tree will always cover all nodes so any node can be a root node. +2. Select the node having the minimum edge weight among the outgoing edges of the nodes present in the tree. Ensure the node is not already present in the spanning tree. +3. Add the selected node and edge to the tree. +4. Repeat steps 2 and 3 until all nodes are covered. +5. The final graph will represent the Minimum Spanning Tree + + +### Example solution +Let's get back to the example and find its minimum spanning tree using Prim's algorithm. Before moving forward, here are a few notations that one should remember: +- Red nodes represent unvisited vertices while green nodes represent visited vertices. +- Edges of minimum spanning tree are represented in purple color. + +```python +# converting graph to dictionary +road_list = roads._adj + +# infinity is assigned as the maximum edge weight + 1 +inf = 1 + max([w['weight'] for u in road_list.keys() for (v,w) in road_list[u].items()]) + +# initialising dictionaries +(visited, distance, TreeEdges) = ({}, {}, []) +``` + +### Step 1 +Suppose the road construction starts from city A, so A is the source node. The distance of other cities is assumed to be unknown, so all other visited vertices are marked as 'not visited' and the distance as infinite (which equals 8 in this case). + +```python +# assigning infinite distance to all nodes and marking all nodes as not visited +for v in road_list.keys(): + (visited[v], distance[v]) = (False, inf) # false indicates not visited +visited['A'] = True +distance['A']=0 + +# plotting graph +# Nudge function is created to show node labels outside the node +def nudge(pos, x_shift, y_shift): + return {n: (x + x_shift, y + y_shift) for n, (x, y) in pos.items()} + +pos_nodes = nudge(pos, 0.025, 0.16) # shift the layout +fig= plt.figure(figsize=(5, 5)) + +# assigning green color to visited nodes and red to unvisited. 
+node_colors = ["#4EAD27" if visited[n] == True else "#DF340B" for n in visited] +labels = {v:distance[v] for v in distance} + +# plotting the base graph +nx.draw( + roads, + pos, + with_labels=True, + node_size=900, + node_color=node_colors, + font_color="white", + font_weight="bold", + font_size=14, + node_shape="s", +) +# adding node labels +nx.draw_networkx_labels( + roads, + pos= pos_nodes, + labels= labels, + font_size= 14, + font_color='blue' +) +# adding edge labels +nx.draw_networkx_edge_labels( + roads, + pos, + edge_labels=nx.get_edge_attributes(roads, "weight"), + font_size=12, +); +``` + +### Step 2 +Now, the next step is to assign distances to A's neighbouring cities and the distance is equal to the edge weight. This needs to be done in order to find the minimum spanning tree. The distance will be updated as `minimum(current weight, new edge weight)`.
+Here, the following nodes will get updated: +- B : `min(1, 8) = 1` +- C : `min(7, 8) = 7` + +```python +# updating weights of A's neighbour +for (v, w) in road_list["A"].items(): + distance[v] = w["weight"] +# plotting graph +fig = plt.figure(figsize=(5, 5)) + +node_colors = ["#4EAD27" if visited[n] == True else "#DF340B" for n in visited] +labels = {v: distance[v] for v in distance} + +# plotting the base graph +nx.draw( + roads, + pos, + with_labels=True, + node_size=900, + node_color=node_colors, + font_color="white", + font_weight="bold", + font_size=14, + node_shape="s", +) +# adding node labels +nx.draw_networkx_labels( + roads, pos=pos_nodes, labels=labels, font_size=14, font_color="blue" +) +# adding edge labels +nx.draw_networkx_edge_labels( + roads, + pos, + edge_labels=nx.get_edge_attributes(roads, "weight"), + font_size=12, +); + +``` + +### Step 3 & Step 4 +After updating the distance in the previous step, it's time to find the next node with the minimum distance. To do this, iterate across the neighbours of the visited nodes and find the node with the minimum distance. Then update its distance and mark it as visited. 
+ +```python +# initialising the required dictionaries for plotting graphs +visited_list = [] +distance_list = [] +edge_list = [] + +# iterating through every node's neighbour +for i in road_list.keys(): + (mindist, nextv) = (inf, None) + for u in road_list.keys(): + for (v, w) in road_list[u].items(): + d = w["weight"] + + # updating the minimum distance + if visited[u] and (not visited[v]) and d < mindist: + (mindist, nextv, nexte) = (d, v, (u, v, d)) + if nextv is None: # all nodes have been visited + break + visited[nextv] = True + visited_list.append(visited.copy()) + # adding the next minimum distance edge to the spanning tree + TreeEdges.append(nexte) + edge_list.append(TreeEdges.copy()) + + # updating the new minimum distance + for (v, w) in road_list[nextv].items(): + d = w["weight"] + if not visited[v]: + distance[v] = min(distance[v], d) + distance_list.append(distance.copy()) + +``` + +Let's understand each iteration and plot the graph! + +Figure 1
+B has the minimum distance of 1 unit from node A and is therefore added to the spanning tree. The next step is to update the distances of B's neighbours, as follows: +- A : Already visited +- C : `min(7, 1)` = 1 +- D : `min(8, 4)` = 4 +- E : `min(8, 3)` = 3 + +Figure 2
+The next node with the minimum distance is C, with a distance of 1 unit, so C is now added to the spanning tree and its neighbours are updated: +- A : Already visited +- B : Already visited +- E : `min(3, 6)` = 3 + +Figure 3
+Among the last 2 nodes, E has the minimum distance of 3 units. So, E will get added to the spanning tree, and its neighbours will get updated: +- B : Already visited +- C : Already visited +- D : `min(4, 2)` = 2 + +Figure 4
+The final node D, with a distance of 2 units, is connected to the minimum spanning tree. This figure illustrates the final Minimum Spanning Tree of the example. + +```python +fig, axes = plt.subplots(2, 2, figsize=(15,15)) +c=0 +for v,d,ax,edges in zip(visited_list,distance_list,axes.ravel(), edge_list): + c+=1 + ax.set_title("Figure "+str(c), fontsize=16) + node_colors = ["#4EAD27" if v[n] == True else "#DF340B" for n in v] + labels = {k:d[k] for k in d} + nx.draw( + roads, + pos, + with_labels=True, + node_size=900, + node_color=node_colors, + font_color="white", + font_weight="bold", + font_size=14, + node_shape="s", + ax= ax +) + nx.draw_networkx_edges( + roads, + pos, + edgelist=edges, + width=3, + edge_color="#823AAF", + ax=ax, +) + + nx.draw_networkx_labels( + roads, + pos= pos_nodes, + labels= labels, + font_size= 14.5, + font_color='blue', + ax= ax +) + nx.draw_networkx_edge_labels( + roads, + pos, + edge_labels=nx.get_edge_attributes(roads, "weight"), + font_size=12, + ax=ax +); + +``` + +### Step 5 +The final output of the program is stored as a list of tuples in `TreeEdges` as shown below. + +```python +print(TreeEdges) +``` + +## NetworkX Implementation + +The above code is a basic implementation of Prim's algorithm with a time complexity of $O(VE)$, which can be improved to $O((V+E) \text{ log} V)$ with the help of priority queues. Here's the good part: with the help of NetworkX functions, one can get the $O((V+E) \text{ log} V)$ version without writing the whole code.
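
The priority-queue improvement mentioned above can be sketched in a few lines. This is a minimal illustration, not the NetworkX implementation: the function name `prim_mst` and the plain dictionary `roads_adj` (the example graph written as `{node: {neighbour: weight}}`) are assumptions introduced here so that the block runs on its own with only the standard library. It also assumes node labels are comparable (strings here), because ties in weight are broken by comparing the heap tuples.

```python
import heapq


def prim_mst(adjacency, root):
    """Sketch of Prim's algorithm using a binary heap with lazy deletion.

    `adjacency` is assumed to be a dict of the form {node: {neighbour: weight}}.
    Returns the tree edges as (u, v, weight) tuples in the order they are added.
    """
    visited = {root}
    tree_edges = []
    # Heap entries are (weight, u, v), so the cheapest edge leaving the tree pops first.
    heap = [(w, root, v) for v, w in adjacency[root].items()]
    heapq.heapify(heap)
    while heap and len(visited) < len(adjacency):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # stale entry: v joined the tree through a cheaper edge
        visited.add(v)
        tree_edges.append((u, v, w))
        # Offer every edge from the newly added node to nodes still outside the tree.
        for x, wx in adjacency[v].items():
            if x not in visited:
                heapq.heappush(heap, (wx, v, x))
    return tree_edges


# The road network from the example, written out by hand (hypothetical helper data).
roads_adj = {
    "A": {"B": 1, "C": 7},
    "B": {"A": 1, "C": 1, "D": 4, "E": 3},
    "C": {"A": 7, "B": 1, "E": 6},
    "D": {"B": 4, "E": 2},
    "E": {"B": 3, "C": 6, "D": 2},
}

mst_edges = prim_mst(roads_adj, "A")
print(mst_edges)
print("total weight:", sum(w for _, _, w in mst_edges))
```

Each edge is pushed onto the heap at most twice and each heap operation is logarithmic, which is where the improved bound comes from. On the example graph the sketch picks A-B, B-C, B-E and E-D, for a total of 7 units, the same tree as the step-by-step walkthrough.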
+NetworkX provides various [Tree](https://networkx.org/documentation/stable/reference/algorithms/tree.html#) functions to perform difficult operations, and [minimum_spanning_tree()](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.tree.mst.minimum_spanning_tree.html#networkx.algorithms.tree.mst.minimum_spanning_tree) is one of them. Not only this, you can also find the maximum spanning tree with the help of the [maximum_spanning_tree()](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.tree.mst.maximum_spanning_tree.html#networkx.algorithms.tree.mst.maximum_spanning_tree) function.
+The following code uses the NetworkX function and produces the same tree as our code. + +```python +MST= nx.minimum_spanning_tree(roads, algorithm= 'prim') +print(sorted(MST.edges(data=True))) +``` + +## Applications of Prim's Algorithm +- It is used in approximation algorithms for the travelling salesman problem. +- As said earlier, it is used to plan road and rail networks connecting cities. +- Path finding algorithms used in artificial intelligence. +- Cluster analysis +- Game development +- Maze generation + +There are many other similar applications. Whenever there's a need to find a cost-effective way to connect nodes (of any kind), there's a high chance of Prim's algorithm playing a role in the solution. + + +## Reference +R. C. Prim "Shortest connection networks and some generalizations." The Bell System Technical Journal, Volume: 36, Issue: 6, (Nov. 1957): 1389-1401
+https://ia800904.us.archive.org/18/items/bstj36-6-1389/bstj36-6-1389.pdf diff --git a/content/algorithms/index.md b/content/algorithms/index.md index e20f18ff..1a94f454 100644 --- a/content/algorithms/index.md +++ b/content/algorithms/index.md @@ -10,5 +10,5 @@ maxdepth: 1 assortativity/correlation dag/index flow/dinitz_alg -Dijkstra/Dijkstra +MST Prim/MST Prim ``` diff --git a/site/index.md b/site/index.md index 5d60e453..0ae60dc3 100644 --- a/site/index.md +++ b/site/index.md @@ -38,6 +38,5 @@ maxdepth: 1 content/algorithms/index content/generators/index content/exploratory_notebooks/facebook_notebook - ```