Skip to content

Collaborative documentation for and from Jean Zay users. Official Jean Zay documentation is here: http://www.idris.fr/eng/jean-zay/

License

Notifications You must be signed in to change notification settings

mypey/jean-zay-doc

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Why this doc?

We are researchers and engineers in AI (very vague term but oh well ...) who have managed to get access to Jean Zay and think this can be a very useful cluster for your AI research.

At the time of writing (end January 2020), the GPU part of Jean Zay is very much underused and we think a user-contributed documentation could help people navigating the access procedure and knowing a few necessary tips and tricks to be productive on such a cluster.

This is supposed to be a collaborative doc, if you spot errors or things that could be improved, open an issue or even better a Pull Request (PR)!

Content

  • Access procedure. The access procedure for Jean Zay will take roughly 3 weeks (add 1-2 monts on top of that if you are not French for background security checks). It does seem long but it is definitely worth it.
  • Tips and tricks
  • Limitations

In the medium term, more material could be added to discuss tips and tricks, limitations, work-arounds, etc ... on Jean Zay. In particular, feel free to share tutorials, tools and scripts to help users have a more productive use of the Jean Zay cluster, e.g.:

  • how to make your code use checkpointing to be able to get long running processing despite the 20 hour wall time limit;
  • how to make sure your code can leverage the hardware optimally (e.g. with mixed precision and tensorcores);
  • how to make sure that your processing is not limited by suboptimal data access patterns on the disks or inefficient pre-processing on the CPUs;
  • how to do efficient hyper-parameter tuning at scale;
  • how to synchronize you code between local computer and the cluster.

Useful links

Hardware: http://www.idris.fr/eng/jean-zay/cpu/jean-zay-cpu-hw-eng.html

Doc: http://www.idris.fr/eng/jean-zay/

Doc in French (more accurate sometimes): http://www.idris.fr/jean-zay/

Email for Jean Zay user support: [email protected].

Generic advice

  • There is a big cultural gap between traditional HPC (High Performance Computing) users and AI users. For example, most traditional "serious" HPC clusters do not have access to the internet, yes you have read this correctly, people in traditional HPC do not need internet access to work on their problems.
  • So far every interaction we have had with Jean Zay user support has been very positive. Even if there may be some frustration (on both sides), try to be both pedagogical and constructive when you send an email to [email protected].

About

Collaborative documentation for and from Jean Zay users. Official Jean Zay documentation is here: http://www.idris.fr/eng/jean-zay/

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published