Start repl with lein repl
.
(def random-graph (make-graph 10 20))
(shortest-path random-graph (first (keys random-graph))
(last (keys random-graph)))
; => list of nodes which is the shortest path
(eccentricity random-graph (first (keys random-graph)))
; => number expressing eccentricity for `first` vertex in random-graph
(radius random-graph) ; => minimal eccentricity
(diameter random-graph) ; => maximal eccentricity
(radius random-graph ecc-distance-fn)
; => maximal eccentricity with weight calculated by distance
- Graph doesn't have to be strongly connected; based on the test description:
(shortest-path random-graph (first (keys random-graph)) (last (keys random-graph)) ; => list of nodes which is the shortest path by edge weight between the 2 nodes, or no path if one does not exist.
-
Weight <= Integer/MAX_VALUE
, or we could have long overflows. It could be fixed by using operators that support big integers (like+'
,inc'
, etc). -
There is only one direct path from one vertex to another.
-
Zero weight are possible, but not negative weights.
-
No OOM protection is needed. OOM will not be possible for graphs with less than 6000 vertices and 6_000_000 edges (for
Xmx
= 2048mb). -
This is expected to be used only from the REPL. So assertions are good enough to handle illegal arguments.
- G, g — graph;
- e — edges;
- v — vertices;
- w — weight;
- adj — adjacent, adjacency.
Taking into account the assumptions
above, graph is represented as:
{:1 {:2 10, :3 15}
:2 {:3 12}
:3 {}}
Meaning there is an edge from 1 to 2 with cost 10, from 1 to 3 with cost 15, from 2 to 3 with cost 12.
This representation allows to query graph with effectively O(1)
. Example:
(contains (get G :1) :2) ; => true
Expected time: O(v log v)
; worst case: O(v^3)
, where v — count of vertices
(Paper);
Expected time: O(v^2)
; O(v)
in best case, where v — count of vertices.
Random weights are generated in place for performance reason. It's also possible
to generate a graph without weights and then extend it (O(e)
) as a third step.
Asymptotically it would be the same, but would work slower.
Another approaches is to extend one of the methods described in these papers:
- The Random Plots Graph Generation Model for Studying Systems with Unknown Connection Structures
- Efficient and simple generation of random simple connected graphs with prescribed degree sequence
Using an optimized implementation without infinities in the initial queue,
having O(E * Log V)
.
Practical optimizations and infinite graphs.
Eccentricity is based on distance functions. There are two of them:
ecc-count-edges-dist-fn
— count of edges.ecc-distance-dist-fn
— sum of edges weights.
Memoization is used with eccentricities-memo
function to avoid recalculations
for radius and diameter when working with one graph.
Cache contains 2 slots, because we have now only 2 weight/distance calculation functions. So function may be invoked with two different combinations for one graph.
1. Extend the graph definition to include a weight between graph edges`
It should be between graph vertices