-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: NumberOfTrianglesMetric computes wrong results in some cases #3
Comments
We did some bugfixes for our approaches, where we encountered the following problems, which seem to be the same problems for the edge iterator approach, as of the results of the according (extended) test (testOnSimpleSelfloopTriangle() We use a very simple graph containing one selfloop. simple_selfloop_triangle.txt We think that there is a check missing, so that v1 < v2 < v3 to ensure that no triangle is counted multiple times. Another problem we had with our approaches is that a self loop together with one additional neighbor is wrongly counted as a triangle, which seems to be also the case for this approach. |
… for the DatasetCreator class. (#3)
The main difference between the triangles that are calculated in Lemming and the triangle statistics from Stanford for the email-Eu-core network is the definition of a triangle. Lemming is counting edge triangles while the statistics from Stanford seem to use node triangles. Using the simple loop example referenced above, we have 3 nodes and 5 edges (
If we count node triangles (as for the statistics from Stanford), the example is easy because there is only one of them (the only one possible): However, in Lemming we would count 2 triangles: Implementing the (simpler) node triangle counting (as done in https://github.com/dice-group/Lemming/blob/master/src/main/java/org/aksw/simba/lemming/metrics/single/SingleValueMetric.java), the result matches the number of triangles listed on the webpage of Stanford. |
The current NumberOfTriangleMetric (https://github.com/dice-group/Lemming/blob/master/src/main/java/org/aksw/simba/lemming/metrics/single/NumberOfTrianglesMetric.java#L41-L75) (edge iterator approach) does not compute the correct amount of triangles on all graphs.
We extended the current unit tests in our fork and apart from more simple examples, we added a test case based on a graph (see https://snap.stanford.edu/data/email-Eu-core.html) from the SNAP dataset. The actual test implementation can be found here:
https://github.com/BlackHawkLex/Lemming/blob/master/src/test/java/org/aksw/simba/lemming/metrics/single/triangle/AbstractNumberOfTrianglesMetricTest.java
For running the tests associated with the NumberOfTriangleMetric in this fork, the following test case has to be run:
https://github.com/BlackHawkLex/Lemming/blob/master/src/test/java/org/aksw/simba/lemming/metrics/single/triangle/EdgeIteratorNumberOfTrianglesMetricTest.java
The algorithm outputs a triangle count of 1439598, although the graph actually contains 105461 triangles according to the SNAP graph detail page: https://snap.stanford.edu/data/email-Eu-core.html.
Furthermore, from what we have tested so far it seems that it also outputs wrong counts for most on the graphs on the SemanticWebDogFood assuming that our implementation of the "Forward" algorithm is correct (which according to our test cases seems to be the case). In order to compare the results of the edge-iterator approach and our forward approach, one can add the EdgeIteratorNumberOfTrianglesMetric and the ForwardNumberOfTriangleMetric to the metrics list in the EvaluationRunner (https://github.com/BlackHawkLex/Lemming/blob/master/src/main/java/org/aksw/simba/lemming/tools/EvaluationRunner.java) and use the
getSemanticDogFoodGraphs()
method to choose the right graphs for running an evaluation.The text was updated successfully, but these errors were encountered: