In the collection of datasets that we use for our benchmarking process, we find some that do not represent a hard problem and exhibit the following properties when solved:
- all tuners reach the same result;
- the result is too perfect;
- there is no significant improvement as a result of tuning.
We still keep these datasets in our benchmarking process and test them every release, because they give us a useful testing opportunity. In the future we will incorporate a test that looks for changes or deviations from these expected results, and if we see a significant difference we will have to examine what changed in the release. These anomalies, which can occur both in our own tuners and in the tuners of an external library, help us detect whether there have been changes to our code or to an external library used by the tuners.
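As a rough illustration of the kind of check we have in mind, here is a minimal sketch. The names (`EXPECTED_RESULTS`, `run_benchmark`, the tolerance value) are hypothetical and not part of the current codebase:

```python
# Hypothetical regression check: flag a release in which the results on the
# "easy" datasets deviate from the values every tuner has historically reached.
EXPECTED_RESULTS = {
    # dataset name -> score we expect all tuners to reach
    "easy_dataset_a": 1.00,
    "easy_dataset_b": 0.98,
}

TOLERANCE = 0.01  # deviation beyond this is considered significant


def check_easy_datasets(run_benchmark):
    """run_benchmark(dataset) is assumed to return the best score found."""
    deviations = {}
    for dataset, expected in EXPECTED_RESULTS.items():
        score = run_benchmark(dataset)
        if abs(score - expected) > TOLERANCE:
            deviations[dataset] = (expected, score)
    return deviations
```

If such a check returned a non-empty dictionary for a release, that would be the signal to examine what changed in our code or in an external library.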
We also use them to check that we have integrated another library correctly, since we expect the same number of ties with the other tuners over 100 iterations.
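The tie count itself could be computed along these lines; again, this is only a sketch and the function and argument names are assumptions:

```python
def count_ties(scores_by_tuner, n_iterations=100):
    """scores_by_tuner maps tuner name -> list of best scores, one per iteration.

    Returns how many iterations ended in a tie between all tuners, which we
    expect to stay stable across releases for the easy datasets.
    """
    ties = 0
    for i in range(n_iterations):
        iteration_scores = {name: scores[i] for name, scores in scores_by_tuner.items()}
        if len(set(iteration_scores.values())) == 1:
            ties += 1
    return ties
```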
We will formalize this process in the near future.