Thank you for your great resource.
I have started using your benchmark, but I have some queries about the final test accuracies that are returned for some models.
I'm using the nasbench_only108.tfrecord file and have checked that its sha256sum is correct.
A few models that I find are reported to have final test accuracies of ~10% are as follows (I give the hash returned by nasbench.hash_iterator()):

01bcceabc42489b3af4b4496e333a86e
003d4f8e9c5a066d7b248230d8a4fcb5
However, when I train them for a few epochs with a constant learning rate, just to check whether the test accuracy really should be essentially random, I normally get > 40% validation accuracy, so a chance-level test accuracy doesn't seem right to me. (Obviously the test accuracy isn't the validation accuracy and I'm not using the same training procedure, but I wouldn't expect such different results between the two.)
I get my test accuracies from the NASBench API as follows:
```python
from nasbench.api import NASBench, ModelSpec

# Load the 108-epoch table (nasbench_only108.tfrecord, as above).
nasbench = NASBench('nasbench_only108.tfrecord')

for unique_hash in nasbench.hash_iterator():
    matrix = nasbench.fixed_statistics[unique_hash]['module_adjacency']
    operations = nasbench.fixed_statistics[unique_hash]['module_operations']
    spec = ModelSpec(matrix, operations)
    # query() returns the metrics of one of the repeated training runs.
    data = nasbench.query(spec)
    acc = 100. * data['test_accuracy']
```
Is this correct?
If they are incorrect, is there some systematic cause that would let me know which models' test accuracies I can trust and which I can't?
Apologies if I've misunderstood something.
Thanks again
As far as I understand it, this happens sometimes with TensorFlow when the learning rate is borderline too high for the selected model.
Training can then end up constantly overshooting local optima because of the large step size, and some TensorFlow-internal safeguards essentially make the model output predictions like a dummy classifier.
Usually this behavior is also quite stochastic, meaning it might only appear in one of the 3 runs.
For this reason, some work simply disregards architectures that achieve <80% final accuracy as noise.
However, it appears that the models you listed could not be trained in any of the 3 runs, i.e. across 3 different random initializations.
Also, the models were trained with cosine learning-rate decay, which should prevent this kind of training problem in the first place.
It is probably an issue with the TPU v2 architecture.
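For completeness, a minimal sketch of how one could inspect all 3 training runs for the hashes listed above, rather than the single run sampled by query(). It assumes the same `nasbench` object as in the snippet above; `get_metrics_from_hash` exposes the per-run statistics that `query()` samples from:

```python
# Sketch: look at every training run for the suspect hashes, assuming
# `nasbench` was loaded from nasbench_only108.tfrecord as in the snippet above.
suspect_hashes = [
    '01bcceabc42489b3af4b4496e333a86e',
    '003d4f8e9c5a066d7b248230d8a4fcb5',
]

for h in suspect_hashes:
    _, computed_stat = nasbench.get_metrics_from_hash(h)
    runs = computed_stat[108]  # one dict per training run (3 repeats at 108 epochs)
    test_accs = [run['final_test_accuracy'] for run in runs]
    print(h, ['%.4f' % a for a in test_accs])
```

If all 3 entries sit around 0.10, the collapse is consistent across the repeated runs rather than a fluke of the run sampled by query(); this is also a simple place to apply the <80% filter mentioned above before any further analysis.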