Upgrade to tensorflow 2.4.0 #635
base: master
Conversation
Force-pushed from b2d4ae9 to 30f3edc.
Does this change upgrade all Python scripts to run with TensorFlow 2.4? I also need to upgrade the Python scripts to run with TensorFlow 2.9, because that is the only version that can be installed on my MacBook M1 Pro. If all the Python scripts are upgraded to TensorFlow 2.4 or 2.9, it gives me a chance to see whether we can use the Apple Neural Engine to speed up KataGo by converting a TensorFlow network to Apple's Core ML model on a MacBook M1 Pro.
I don't know, just test it on Windows 11, which only supports up to 2.4.
That's OK. I found a method to convert a TensorFlow 1.x network to Apple's Core ML model on Google Colab, so I don't have a strong requirement to upgrade to TensorFlow 2.x now.
How did you do that? Can you share the model?
See my recent release page: https://github.com/ChinChangYang/KataGo/releases/tag/v1.11.0-coreml3 |
@ChinChangYang Thank you!
The M1 series has 16 Neural Engine cores, but you can only set two threads in KataGo because of a limitation in the code. However, you can modify the code to run more threads on the Neural Engine. The two threads in KataGo run two identical Core ML models, and Core ML then decides which compute units to use for each operation in the neural network. These compute units can be the CPU, GPU, or Neural Engine (NE). In my experience, KataGo's Core ML backend only selects the CPU or NE, and I've noticed that the NE is kept busy by the two threads running Core ML. It's up to you whether to try more threads and see if that improves performance.
So this could use CUDA 11.0 + tensorflow_gpu 2.4.0, which makes synchronous training on a single machine possible.