Container for building sphinx-based documentation #7

Open
billsacks opened this issue Oct 23, 2020 · 18 comments
Labels
enhancement New feature or request

Comments

@billsacks
Member

This is not at all high priority, but I wanted to mention this as something to think about. It just occurred to me that it could be very helpful if we had a Docker container with the software prerequisites needed for building the sphinx-based documentation that we use throughout CESM. At least in CTSM, we are asking scientists both inside NCAR and externally to build the documentation themselves, and getting all the tools installed can be a major barrier for them.

The installation requirements are documented here https://github.com/ESCOMP/CTSM/wiki/Directions-for-editing-CLM-documentation-on-github-and-sphinx. Essential prerequisites are sphinx, our sphinx theme, latexmk and maybe rst2pdf, and git-lfs. I don't understand containers well enough to know if there would be issues with any of this (I'm especially thinking about whether there are issues with making git-lfs in a container play nicely with your system's git and any git repositories that exist outside the container), but we could hopefully at least make it much easier to get an environment set up to build the documentation, even if we can't get 100% of the way there.

Another benefit of this is that we could ensure that we all use the same version of sphinx, which would avoid the annoying thing that happens now, where building with a different sphinx version leads to changes in all of the generated html pages.
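As an illustration of the version-consistency point, here is a minimal Python sketch of a check that a build script could run before invoking Sphinx. The function name `sphinx_version_matches` and the pinned version string in the usage note are hypothetical; nothing in this thread says which Sphinx version the container ships.

```python
from importlib import metadata


def sphinx_version_matches(required, installed=None):
    """Return True if the installed Sphinx version equals `required`.

    `installed` can be passed explicitly (e.g. for testing); otherwise it is
    looked up from the package metadata of the current Python environment.
    """
    if installed is None:
        try:
            installed = metadata.version("sphinx")
        except metadata.PackageNotFoundError:
            # Sphinx is not installed in this environment at all.
            return False
    return installed == required
```

A wrapper could call, say, `sphinx_version_matches("3.2.1")` (an assumed version, for illustration only) and warn the user before a mismatched build rewrites every generated html page.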

@briandobbins @mvertens @mnlevy1981 @negin513 @wwieder

@billsacks billsacks added the enhancement New feature or request label Oct 23, 2020
@mvertens
Collaborator

mvertens commented Oct 23, 2020 via email

@wwieder

wwieder commented Oct 23, 2020 via email

@briandobbins
Collaborator

Hi all,

  Installing these is pretty trivial; I just did a quick trial and was able to already do it.   It does add just under 400MB of packages (fonts, largely) to the image, so you could have separate containers for 'using' CESM (no documentation tools) and another for 'development' (including those tools), but frankly, I think a simple one-stop-shop is better, especially given the community nature of the model.  We could even add a tutorial on adding documentation if people feel it helps.

  In my opinion, the primary question in terms of installation is whether we want this in the 'base' image (which all the ESCOMP images build off of - it includes MPI, HDF, NetCDF, etc), or the CESM image?  If other ESCOMP models use Sphinx, we probably want it in the base.  If not, we probably want it in a CESM image.  I'd like to get Ben's ( @bekozi ) take on this.  What does ESMF do for documentation?

  As for git issues, I don't expect any, unless we get severe version mismatches between the containerized and native versions of git, and people swap back and forth between the two, but I'll set up a test container and walk through the documentation instructions myself as a quick test. Maybe one or two of us can get together next week and try it out too?

@billsacks
Member Author

Cool, thanks @briandobbins ! Good point that a one-stop-shop may be better here.

For CTSM purposes (and I'm guessing similarly for other components), it doesn't really make sense to get the whole of CESM, but I also guess there isn't a huge downside to that if it is easier to maintain or use that way. Many of the ESCOMP models do use sphinx, so my opinion is that it would make sense to put this in the base image, but I don't know enough about this to have a very good sense. I'm curious if there's a third option of having an image that extends the base ESCOMP image and adds this documentation stuff, and then the CESM image could extend that further (like multiple levels of inheritance)? But I could also see that maybe being too complex.

I should note that I don't think most components need latexmk and rst2pdf: CTSM needs these because of the way we handle equations and also because we generate pdf as well as html documentation.

I'd be happy to spend some time with you next week trying it out, if you don't mind holding my hand through the process.

@briandobbins
Collaborator

briandobbins commented Oct 23, 2020 via email

@bekozi
Contributor

bekozi commented Oct 23, 2020

Quoting @briandobbins: "What does ESMF do for documentation?"

ESMPy uses sphinx. The base image recipe is here: https://github.com/ESCOMP/ESCOMP-Containers/blob/master/ESMF/doc/esmpy-doc-base/Dockerfile

The actual documentation is built in this recipe: https://github.com/ESCOMP/ESCOMP-Containers/blob/master/ESMF/doc/esmpy-doc/Dockerfile

It's a little strange because the ESMPy documentation requires an ESMF build. This could be hacked by just providing an esmf.mk file, but it seemed better just to build ESMF once (generally) and be done with it.

For reference, the ESMF CI builds the above containers and publishes the docs:

This approach allows for Docker layer caching in the base build.

The ESMF doc recipes, which do not use sphinx, are located in the same folder.

Ping @rsdunlapiv @him-28

@briandobbins
Collaborator

A brief update -- I've added the Sphinx (and related) utilities to the base-centos8 image, used to build CESM, CESM-Lab and ESMF. The current ESMPy images seem to use an Ubuntu base instead, so they're unaffected. This seemed like a good approach if we expect future ESCOMP containers will also use Sphinx, since it ensures the 'base' image has all the common libraries / tools (now, MPI, compilers, NetCDF, HDF5, PNetCDF, Sphinx, etc), and the 'application' images are simpler.

That said, I did also build an 'escomp/sphinx' image that just has the tools needed for documentation, and thus is a fair bit smaller. I think it'll take some feedback from potential users to see what the best approach is - eg, if people are using the CESM container (for example) to run / develop AND do documentation, maybe the 'sphinx' container is superfluous. But if they're using larger systems like Cheyenne for runs, but doing documentation on their laptops, then perhaps the Sphinx one is useful.

As such, I'm going to leave this issue open for now, but did want to provide that update. I'll revisit this in the coming weeks.

The CTSM wiki page on using the containers for documentation is here, for anyone interested:
https://github.com/ESCOMP/CTSM/wiki/Directions-for-editing-CLM-documentation-on-github-and-sphinx-(container-method)

@billsacks
Member Author

Thanks so much @briandobbins ! I'm a little confused about this (maybe just because I'm a noob when it comes to the use of containers): If sphinx is part of the base image, then what's the need for the 'escomp/sphinx' image? Could someone just get 'escomp/base' and get everything they need? Or is there some reason why that doesn't work (or isn't a preferred way of working)?

If just getting 'escomp/base' doesn't make sense, then I also don't have a great sense right now of the best way to go here – suggesting escomp/sphinx or the one-stop-shopping escomp/cesm. I think the relevant questions are: (1) how hard it is to maintain the separate 'escomp/sphinx' image, and (2) what fraction of people who want the sphinx image are also going to want to run CESM on the same machine where they are building the documentation. I don't have any sense of either of these. I'm happy to defer to your intuition, or to leave things unsettled for now and return to this question later, once we have a better sense of what users are doing.

@briandobbins
Collaborator

briandobbins commented Oct 27, 2020

@billsacks

Good question - the basic difference is that, broadly speaking, it's like asking for a hammer. I can give you just the hammer (escomp/sphinx), or I can give you a toolbox with a hammer, a screwdriver, and some nails (escomp/cesm.. or maybe escomp/base-centos8, but more on that in a minute). If you KNOW all you need is the hammer, maybe that's a better solution, since it's smaller. But if you might need other tools, the 'toolbox' version is better. In this case, the 'base' and 'cesm' images contain things people don't need for documentation, like GNU compilers, MPI, NetCDF, HDF5, PNetCDF, etc. Does this hurt in any way? Not really, it just adds some size to the image, but it goes against the principle of simple, focused tools.

As for escomp/base-centos8 (there is no straight 'base', in case we later do bases on other distros... but maybe that should be reassessed too), the (minor) issue there is we aren't yet creating the 'user' account, so the mapping of directories upon running is slightly different. I'd left that off originally because inside the CESM container (for example), we make the 'user' account a member of the 'ncar' group, and figured other applications might do their own. But now, I'm thinking maybe we make an 'escomp' group, add the user during the 'base' install, and make the 'base' image into escomp/base, and tag versions (eg, 'escomp/base:centos8').

This would give us more consistency ('escomp' for the user's group, paths are always /home/user), a simpler naming convention ('escomp/base') for the base container, and eliminate the need to maintain an extra container (sphinx). It does still give you the whole 'toolbox', at a cost of a few hundred extra megabytes, but that seems relatively minor. Ultimately I expect most users will download the application containers, as opposed to the 'base', but at this point it's a moot point since both are maintained anyway.

Thanks for the feedback - I hope that made sense. I'll start updating things, and will send you fixes to the documentation soon, too.

@billsacks
Member Author

I'm not following all of the details, but it sounds like you have a good path forward. I had been confused, I guess: I was thinking that the base image was the minimal set, and then everything else just added to that. I'm still somewhat confused, but since it sounds like you have a path forward that you're happy with, I'm happy to go along with that. (I really don't know enough to have opinions of my own at this point.)

@briandobbins
Collaborator

Basically, we had a 'tree' originally - the 'base' image, on which CESM, CESM-Lab and ESMF containers were built. The 'Sphinx' container was a whole new tree, with no relation to the base image, since the 'minimal set' for building the documentation doesn't include the compilers, NetCDF libraries, etc.

I think doing away with that, and just including things in the base (your original understanding, I think), is a better path forward after seeing your feedback. So from now on, the 'base' image will indeed include Sphinx, and we'll have a single hierarchy again. Easier to maintain and understand, at the minor expense of a slightly larger download.

@billsacks
Member Author

Ah, got it, now I understand. Yes, your proposed path forward is what I had understood all along, and this makes sense to me.

@billsacks
Member Author

@briandobbins The cime and cesm documentation have an extra requirement:

pip install sphinxcontrib-programoutput

Can you please add that to the base image when you get a chance?
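For context, installing the package only makes the extension available; a Sphinx project normally also enables it in its `conf.py`. The fragment below is a minimal sketch of that step. The cime and cesm `conf.py` files are not shown in this thread, so any other extensions they list are omitted here.

```python
# conf.py (fragment): enable the extension installed by
# `pip install sphinxcontrib-programoutput`.
extensions = [
    "sphinxcontrib.programoutput",
]
```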

@briandobbins
Collaborator

briandobbins commented Oct 29, 2020 via email

@billsacks
Member Author

billsacks commented Oct 29, 2020

Thanks @briandobbins . No rush.

I ran into another issue, though it may not impact many people beyond me: I use git worktrees heavily in my development workflow. This becomes a problem in the CTSM documentation build, because this build invokes a git lfs pull. In a git worktree, the .git directory is replaced with a text file giving the absolute path to the parent git repository, e.g.:

$ cat .git
gitdir: /Users/sacks/ctsm/ctsm0/.git/worktrees/ctsm5

So when trying to execute a git command from within the Docker image, I get a message like this:

fatal: not a git repository: /Users/sacks/ctsm/ctsm0/.git/worktrees/ctsm5

This is because in Docker-land, this path doesn't exist.

From browsing some StackOverflow suggestions, I came up with the solution of doing this from inside my Docker shell:

sudo mkdir -p /Users/sacks
sudo ln -s /home/user /Users/sacks/ctsm

This seems to work, but I wanted to check with you:

  • Most importantly, do you see any issues that this might cause?

  • Secondarily, do you see off-hand a better way to solve this problem?

By the way, I am going down the path of building the docker run command into a python script to do the build, rather than suggesting use of an interactive docker terminal session. I am able to leverage a python tool I had already built to wrap the documentation build process, and I think this will make things easier for users. And issues like the above can be handled programmatically rather than needing to run those commands manually all the time.
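A rough Python sketch of how the worktree workaround above could be handled programmatically. This is not the code from the wrapper script; the helper names `read_gitdir` and `symlink_commands` are hypothetical, and the container-side path `/home/user` is taken from the commands shown above.

```python
import os
import shlex


def read_gitdir(dot_git_path):
    """Return the path named by a 'gitdir:' .git file, or None.

    In an ordinary clone, .git is a directory. In a git worktree it is a
    text file whose first line looks like 'gitdir: /abs/path/...'.
    """
    if os.path.isdir(dot_git_path):
        return None  # ordinary clone, nothing to work around
    with open(dot_git_path) as f:
        first_line = f.readline().strip()
    prefix = "gitdir:"
    if not first_line.startswith(prefix):
        return None
    return first_line[len(prefix):].strip()


def symlink_commands(host_mount, container_mount="/home/user"):
    """Build the shell commands that make `host_mount` resolve in the container.

    The result creates the parent directory of the host path and then
    symlinks the host path to the container's mount point.
    """
    parent = os.path.dirname(host_mount.rstrip("/"))
    return "sudo mkdir -p {} && sudo ln -s {} {}".format(
        shlex.quote(parent),
        shlex.quote(container_mount),
        shlex.quote(host_mount),
    )
```

With the paths from this comment, `symlink_commands("/Users/sacks/ctsm")` produces the same two commands as above, joined with `&&` so they can be prepended to the build command inside the container.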

@briandobbins
Collaborator

@billsacks

Ah, this is a good question. Before looking at potential ways to make it easier, I'd love to get a sense from you of how common that use case is. In short, it seems like there are two approaches:

  1. What you did (user customizes the environment), then saves it. You can 'save' your modified image via something like:
docker commit <container ID> cesm

If you're unsure of the container ID, you can get it while the container is running via 'docker ps'. Note that the above line would save the modified container as a new local Docker image called 'cesm', as opposed to the 'escomp/cesm-2.2' one you're running, so in the future you'd want to run your local one... otherwise those changes won't persist!

I definitely don't see that causing any issues, no. Conceivably, if you had symlinks or references outside that 'ctsm' directory, that would be a problem, but I don't think that's likely?

  2. Offer more flexibility on what we map the user's home directory to

In this case, instead of mapping whatever directory you provide to /home/user, we allow you to specify what to map it to - in this case, /Users/sacks/ctsm. I feel like this is more dangerous because it removes the commonality of always knowing what a container user's home directory is (/home/user), and may run into issues if someone specifies something weird, like /var. Can we do it? Probably, but unless this is a very common use case, I think the slight customization of the Docker environment by the user is better. We can even post documentation on how to do this for expert users.

Chances are there are other approaches too, but that's the best off the top of my head. When I get some time, I'll try to duplicate this and look into ways of dealing with it.

@billsacks
Member Author

Thanks a lot for your thoughts @briandobbins . I think there's no need for you to spend more time on this git worktree-related question for now. My takeaway is that it's safe to put the mkdir & ln commands in the docker run command in my python wrapper script for now, and we can revisit this if it's causing more general issues down the line.

@billsacks
Member Author

@briandobbins - Okay, I have added capability for using docker in ESMCI/doc-builder#3 . No need to look through that unless you're interested, but I thought I'd point out a few key elements in case it helps for future tooling:

  • This automatically determines an appropriate docker mount point via python's os.path.commonpath to find a common parent of the specified documentation build and source directories.

  • As we discussed above, this creates the symlinks needed for this to work with git worktrees.

  • The final docker run command looks like this (where the name is generated using a random string):

docker run --name build_docs_xifyxald --volume /Users/sacks/ctsm:/home/user --workdir /home/user/ctsm5/doc --rm escomp/base /bin/bash -c "sudo mkdir -p /Users/sacks && sudo ln -s /home/user /Users/sacks/ctsm && make BUILDDIR=../../ctsm-docs/versions/master -j 4 html"

  • I created a signal handler in the python so that SIGINT calls docker kill before exiting. Otherwise, doing a Ctrl-C during the documentation build left the docker process still running.
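
The elements listed above can be sketched in Python roughly as follows. This is an illustrative reconstruction under stated assumptions, not the code in ESMCI/doc-builder#3: all function names are hypothetical, and only `os.path.commonpath`, the `docker run` options, and the `docker kill` on SIGINT come from the description above. It assumes POSIX-style paths.

```python
import os
import random
import signal
import string
import subprocess
import sys


def docker_mount_point(build_dir, source_dir):
    """Common parent of the build and source directories (the host mount)."""
    return os.path.commonpath(
        [os.path.abspath(build_dir), os.path.abspath(source_dir)]
    )


def random_name(prefix="build_docs_", length=8):
    """Container name with a random lowercase suffix."""
    suffix = "".join(random.choice(string.ascii_lowercase) for _ in range(length))
    return prefix + suffix


def docker_run_command(mount, workdir_on_host, image, shell_command, name):
    """Argument list for `docker run`, mapping `mount` to /home/user."""
    container_workdir = os.path.join(
        "/home/user", os.path.relpath(workdir_on_host, mount)
    )
    return [
        "docker", "run", "--name", name,
        "--volume", "{}:/home/user".format(mount),
        "--workdir", container_workdir,
        "--rm", image, "/bin/bash", "-c", shell_command,
    ]


def run_with_sigint_cleanup(command, name):
    """Run `command`; on Ctrl-C, kill the named container before exiting."""
    def handler(signum, frame):
        # Without this, the container could keep running after Ctrl-C.
        subprocess.run(["docker", "kill", name], check=False)
        sys.exit(1)

    previous = signal.signal(signal.SIGINT, handler)
    try:
        return subprocess.run(command, check=False).returncode
    finally:
        # Restore whatever SIGINT handler was installed before.
        signal.signal(signal.SIGINT, previous)
```

For the paths in this comment, `docker_mount_point("/Users/sacks/ctsm/ctsm-docs/versions/master", "/Users/sacks/ctsm/ctsm5/doc")` gives `/Users/sacks/ctsm`, and passing `/Users/sacks/ctsm/ctsm5/doc` as `workdir_on_host` yields the `--workdir /home/user/ctsm5/doc` seen in the command above. Building the command as a list avoids an extra layer of shell quoting on the host side.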


5 participants