Git Notes Submodules
It is often the case that two or more projects need to use or link to some other project or database. In the case of computational models you could imagine linking to a math library, for example. In the "firemodels" organization, we maintain two different models, each requiring the same set of experimental data for validation. It would be handy for both the CFAST and FDS projects to link to one set of data instead of maintaining two copies. Further, it would be beneficial for many other groups to maintain their own experimental databases, which we could all link to in a network. This could eventually achieve the long-sought goal of a well-maintained experimental fire database.
Git has a way to achieve these goals via Submodules. In this wiki we will try to boil down all the documentation and online chatter into some specific commands that will work with the firemodels repos.
The basic idea is not too difficult to understand. Somewhere within a given repo (the "super project") we will define a directory that will contain another repo (the "submodule"). At the root level of the super project we will have a new file called .gitmodules that stores some information about all the submodules in the super project, like the relative path and the URL of the submodule repo.
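To make the role of .gitmodules concrete, here is a minimal sketch that builds a throwaway super project and submodule entirely on local disk and prints the resulting file. The directory names ("data-repo", "super") are hypothetical stand-ins, not the firemodels repos, and the protocol.file.allow setting is only there because the demo uses a local path as the submodule URL.

```shell
#!/bin/sh
# Sketch only: local stand-ins for a data repo and a super project.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A tiny repo that plays the role of the experimental database.
git init -q "$work/data-repo"
git -C "$work/data-repo" symbolic-ref HEAD refs/heads/master
echo "data" > "$work/data-repo/README.md"
git -C "$work/data-repo" add README.md
git -C "$work/data-repo" commit -q -m "Initial data"

# The super project, with the data repo added as a submodule.
git init -q "$work/super"
git -C "$work/super" symbolic-ref HEAD refs/heads/master
# protocol.file.allow is needed only because the URL here is a local path.
git -C "$work/super" -c protocol.file.allow=always \
    submodule --quiet add "$work/data-repo" Submodules/macfp-db

# .gitmodules now holds one stanza with the path and URL of the submodule.
cat "$work/super/.gitmodules"
```

The printed file should contain a `[submodule "Submodules/macfp-db"]` stanza with a `path` entry and a `url` entry; in the real exp repo the URL would be the SSH address of the MaCFP repo rather than a local path.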
For the time being, we are using the UMD_Line_Burner validation case as a test bed for submodules. This data is going to be maintained by the MaCFP (Measurement and Computation of Fire Phenomena) Working Group. Here is a link to the GitHub repo for the MaCFP database. We are going to store the submodule under a new directory called Submodules/macfp-db/ within the firemodels/exp repo.
Note: Skip to the end for a summary of workflow for continuous integration (firebot).
How you initially set up the submodule depends on whether you are cloning the firemodels/exp repo for the first time or whether you are adding the submodule to the existing repo.
NOTE: SSH keys will be required because that is how we initially cloned the submodule and that is how the URL is stored in the .gitmodules file. Instructions for generating SSH keys are given here. Make sure to select your specific platform in the upper-left. If using SmartGit on Windows, you may need to skip Step 3 (reported by Jason Floyd).
If you are starting a brand new FDS repo, after you have forked the repo on GitHub, you can simply add the option --recursive to your git clone command:
$ git clone --recursive git@github.com:<username>/exp.git exp
...
Submodule path 'Submodules/macfp-db': checked out 'b06340dae72dfa9643a3b4230a022d02128c4ee2'
At the end of the cloning process you will have checked out a certain commit of the submodule. Change directories into the submodule repo directory and you will see content.
$ cd Submodules/macfp-db/
$ ls -1a
.
..
Buoyant_Plumes
Extinction
Gaseous_Pool_Fires
.git
.gitignore
LICENSE
Liquid_Pool_Fires
README.md
Utilities
Wall_Fires
If you are the first to add the submodule to the project, follow the instructions in Pro Git on adding submodules. One of the tricks is that you need to be at the root level of the firemodels/exp repository. Then, suppose we are adding the MaCFP submodule for the first time. We would do
$ git submodule add git@github.com:MaCFP/macfp-db.git Submodules/macfp-db
...
If your firemodels/exp repo already exists, then when you do a git remote update the new submodule directory and .gitmodules file will be fetched, and you can merge these into your repo as usual.
$ git remote update
...
$ git merge firemodels/master
...
But you will be disappointed when you cd to Submodules/macfp-db/. It will be empty.
From the top level of your super project, you need to run
$ git submodule update --init --recursive
...
Now cd to the submodule directory and the database will be there.
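The empty-directory behavior can be reproduced safely on local disk. The sketch below uses hypothetical stand-in repos ("data-repo" for the database, "central" for the central super project, "exp" for your clone), clones without --recursive, and then counts the entries in the submodule directory before and after the init step. The protocol.file.allow setting is only needed because the demo URLs are local paths.

```shell
#!/bin/sh
# Sketch only: all repo names and paths are local stand-ins.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the database repo.
git init -q "$work/data-repo"
git -C "$work/data-repo" symbolic-ref HEAD refs/heads/master
echo "data" > "$work/data-repo/README.md"
git -C "$work/data-repo" add README.md
git -C "$work/data-repo" commit -q -m "Initial data"

# Stand-in for the central super project, with the submodule committed.
git init -q "$work/central"
git -C "$work/central" symbolic-ref HEAD refs/heads/master
git -C "$work/central" -c protocol.file.allow=always \
    submodule --quiet add "$work/data-repo" Submodules/macfp-db
git -C "$work/central" commit -q -m "Add submodule"

# A plain clone (no --recursive) leaves the submodule directory empty.
git clone -q "$work/central" "$work/exp"
before=$(ls -A "$work/exp/Submodules/macfp-db" | wc -l | tr -d ' ')

# From the top level of the super project, initialize and fetch the submodule.
cd "$work/exp"
git -c protocol.file.allow=always submodule --quiet update --init --recursive
after=$(ls -A Submodules/macfp-db | wc -l | tr -d ' ')

echo "entries before init: $before, after init: $after"
```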
READ THIS FIRST If you are not the one making the initial changes to the submodules then you can skip down to the section "Submodule Updates from the Central Repo".
In the case of experimental databases, the submodules will likely not need to be updated that often. But eventually the data may change or more experiments may be added and so periodically we will update the submodules.
When the remote submodule repo gains new commits, nothing happens to our super project until we fetch the change. Things can get confusing here, especially since the behavior can depend on the version of Git you are using. First, we will show how to navigate around using Git 1.7 and earlier and then we will show a simplification that can be used with Git 1.8 and later.
The way to wrap your head around what is happening is to imagine that the submodule is, in some ways, its own independent repo. If you simply cd into the macfp-db submodule directory, for example, and do a git branch you will see
$ cd Submodules/macfp-db/
$ git branch
* master
Notice that this is a different branch than the master branch of the firemodels/exp repo!
Now if you do a git branch -a to see the remotes you get
$ git branch -a
* master
remotes/origin/HEAD -> origin/master
remotes/origin/master
So, the master branch for the macfp-db submodule is tracking the master branch of the "origin" repo on the MaCFP GitHub site. As usual, then, you can simply do
$ git remote update
Fetching origin
remote: Counting objects: 3, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From github.com:MaCFP/macfp-db
c0ef816..a6e3063 master -> origin/master
to fetch any changes. If you see any changes have been fetched, you can merge these into your master branch via
$ git merge origin/master
Updating c0ef816..a6e3063
Fast-forward
README.md | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
In this example, there has been a change to the README.md file on origin/master. Doing a status from the present directory returns
$ git status -uno
# On branch master
nothing to commit (use -u to show untracked files)
Now, do a log report so we know the SHA-1 of the latest commit (which we will use later).
$ git log --oneline
a6e3063 Update README.md
c0ef816 Update README.md
Here is where things may get confusing. Cd back to the top level of the super project repo and do a status:
$ cd ../../
$ git status -uno
# On branch development
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: Submodules/macfp-db (new commits)
#
no changes added to commit (use "git add" and/or "git commit -a")
This message is warning you that there are commits to the submodule beyond the commit your repo is pointing to. The practical result of this is that, since your origin (forked repo on GitHub) is up-to-date with your local repo, if someone were to fork your repo from GitHub and clone it with --recursive they would not get the latest commit of the submodule.
Let's see how Git knows there are "new commits". From the top level (where you are) do a submodule status:
$ git submodule status
+a6e3063923213eac4221b8d3db7c77532d15ea2c Submodules/macfp-db (heads/master)
Notice that this hash matches the latest commit from the log output above. Now let's see what the master branch thinks the submodule is pointing to.
$ git ls-tree master:Submodules
160000 commit c0ef8166eca8bbddef1c84bc040dc6ed3a1edd0d macfp-db
Notice that the master branch is pointing to the prior commit. So, to make our git status command happy we need to add and commit this change.
$ git add Submodules/macfp-db
$ git commit -m "Submodules: new commit"
[master 11e6148] Submodules: new commit
1 files changed, 1 insertions(+), 1 deletions(-)
Now look at the tree again.
$ git ls-tree master:Submodules
160000 commit a6e3063923213eac4221b8d3db7c77532d15ea2c macfp-db
The commits are up-to-date. But now your local repo is one commit ahead of your origin. So do
$ git push origin master
Counting objects: 7, done.
Delta compression using up to 12 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (4/4), 367 bytes, done.
Total 4 (delta 2), reused 0 (delta 0)
To git@github.com:rmcdermo/exp.git
8debd8f..11e6148 master -> master
and submit a pull request to the central repo.
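The two hashes compared above (the commit recorded in the super project versus the commit checked out in the submodule) can also be read directly with git rev-parse. The following is a minimal sketch on throwaway local repos with hypothetical names; it advances the stand-in database repo by one commit, merges it into the submodule, and shows that the recorded pointer only catches up after the add and commit.

```shell
#!/bin/sh
# Sketch only: local stand-ins, not the real firemodels or MaCFP repos.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

git init -q "$work/data-repo"
git -C "$work/data-repo" symbolic-ref HEAD refs/heads/master
echo "data" > "$work/data-repo/README.md"
git -C "$work/data-repo" add README.md
git -C "$work/data-repo" commit -q -m "Initial data"

git init -q "$work/super"
git -C "$work/super" symbolic-ref HEAD refs/heads/master
git -C "$work/super" -c protocol.file.allow=always \
    submodule --quiet add "$work/data-repo" Submodules/macfp-db
git -C "$work/super" commit -q -m "Add submodule"

# The database repo moves forward by one commit.
echo "more" >> "$work/data-repo/README.md"
git -C "$work/data-repo" commit -q -am "Update README.md"

# Fetch and merge inside the submodule, as in the walkthrough above.
cd "$work/super"
git -C Submodules/macfp-db fetch -q origin
git -C Submodules/macfp-db merge -q origin/master

recorded=$(git rev-parse HEAD:Submodules/macfp-db)        # what the super project points to
checked_out=$(git -C Submodules/macfp-db rev-parse HEAD)  # what the submodule is actually on
echo "recorded:    $recorded"
echo "checked out: $checked_out"

# Record the new submodule commit in the super project.
git add Submodules/macfp-db
git commit -q -m "Submodules: new commit"
recorded_after=$(git rev-parse HEAD:Submodules/macfp-db)
echo "recorded after commit: $recorded_after"
```

Before the add and commit the two hashes differ, which is exactly the condition that git status reports as (new commits); afterwards they agree.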
Below, we will discuss what to do when you are not the project member who originally updated the submodule. But first let's look at a couple of alternate ways of doing the submodule update.
There is a handy submodule command called foreach that can be used from the top level of the super project to issue commands to ALL the submodules at once. In our example submodule, the repo name is "origin" and the branch is "master" (I suspect these conventions have materialized to facilitate this foreach command!). The equivalent to cd'ing into each submodule directory and issuing the fetch and merge commands would be (remember to be at the top level of the super project)
$ git submodule foreach git remote update
$ git submodule foreach git merge origin/master
The foreach command basically visits each checked-out submodule listed in .gitmodules and runs the command that follows "foreach" inside that submodule repo.
With Git 1.8 and later there is a somewhat more intuitive way to update the remote submodules. Again, from the top level of the super project do
$ git submodule update --remote
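As a hedged illustration of what this command does, the sketch below runs it against throwaway local repos (hypothetical names again). With Git's default settings the submodule is checked out at the fetched commit as a detached HEAD rather than merged into its local master branch, and the super project still reports the submodule as modified until the new pointer is added and committed.

```shell
#!/bin/sh
# Sketch only: local stand-ins; protocol.file.allow is needed just for local-path URLs.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

git init -q "$work/data-repo"
git -C "$work/data-repo" symbolic-ref HEAD refs/heads/master
echo "data" > "$work/data-repo/README.md"
git -C "$work/data-repo" add README.md
git -C "$work/data-repo" commit -q -m "Initial data"

git init -q "$work/super"
git -C "$work/super" symbolic-ref HEAD refs/heads/master
git -C "$work/super" -c protocol.file.allow=always \
    submodule --quiet add "$work/data-repo" Submodules/macfp-db
git -C "$work/super" commit -q -m "Add submodule"

# The database repo moves forward by one commit.
echo "more" >> "$work/data-repo/README.md"
git -C "$work/data-repo" commit -q -am "Update README.md"
upstream_head=$(git -C "$work/data-repo" rev-parse HEAD)

# One command fetches the submodule remote and checks out its latest commit.
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet update --remote
sub_head=$(git -C Submodules/macfp-db rev-parse HEAD)

# The super project now sees the submodule as modified until we add and commit.
changes=$(git status --porcelain)
echo "submodule at: $sub_head"
echo "pending in super project: $changes"
```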
Suppose you are not the project member who first updated the submodule. In your daily routine you come in and do a fetch (remote update) and merge
$ git remote update
...
$ git merge firemodels/master
...
Now do a status.
$ git status -uno
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: Submodules/macfp-db (new commits)
no changes added to commit (use "git add" and/or "git commit -a")
We are back to our (new commits) problem! But it turns out that we now have the opposite problem from the one above, when we were the original submodule updater. First, check the tree to see where it is pointing.
$ git ls-tree master:Submodules/
160000 commit 99f3e550c931a8211cc2c8654a601a7702eea470 Submodules/macfp-db
Now do a submodule status to see where that is.
$ git submodule status
+c0ef8166eca8bbddef1c84bc040dc6ed3a1edd0d Submodules/macfp-db (heads/master-1-gc0ef816)
The commit hashes are different, but we really can't tell who is ahead of whom until we fetch the submodule. So, go ahead and do
$ git submodule foreach git remote update
Entering 'Submodules/macfp-db'
Fetching origin
$ git submodule foreach git merge origin/master
Entering 'Submodules/macfp-db'
Updating c0ef816..99f3e55
Fast-forward
README.md | 1 -
1 file changed, 1 deletion(-)
Now do a status.
$ git status -uno
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit (use -u to show untracked files)
If the gods are good, everything should be in sync!
The firebot script used for FDS continuous integration should behave like the previous section "Submodule Updates from the Central Repo". Firebot runs (at the top level of the fds repo)
$ git submodule foreach git remote update
$ git submodule foreach git merge origin/master
If there has been a change to the submodule repo, it will be fetched and merged.
However, if one of the owners of the fds repo has not already committed the submodule changes to the central repo then firebot will see the modified: Submodules/macfp-db (new commits) status message mentioned above. Here, firebot's working tree is actually ahead of the central repo. In this case, firebot should send a WARNING email to the FDS developers to let them know about the new commits to the submodule repo. Once the FDS developers manually update the central repo, the "new commits" status message will go away for firebot.
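One way such a check could be scripted is sketched below. This is not firebot's actual code: the helper name out_of_sync_submodules and the local stand-in repos are assumptions made for illustration. It relies on the fact that git submodule status prefixes a line with "+" when the checked-out submodule commit differs from the one recorded in the super project.

```shell
#!/bin/sh
# Sketch only: hypothetical helper and local stand-in repos, not firebot itself.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

git init -q "$work/data-repo"
git -C "$work/data-repo" symbolic-ref HEAD refs/heads/master
echo "data" > "$work/data-repo/README.md"
git -C "$work/data-repo" add README.md
git -C "$work/data-repo" commit -q -m "Initial data"

git init -q "$work/super"
git -C "$work/super" symbolic-ref HEAD refs/heads/master
git -C "$work/super" -c protocol.file.allow=always \
    submodule --quiet add "$work/data-repo" Submodules/macfp-db
git -C "$work/super" commit -q -m "Add submodule"

# The database repo moves forward, but nobody has updated the super project yet.
echo "more" >> "$work/data-repo/README.md"
git -C "$work/data-repo" commit -q -am "Update README.md"

# Stand-in for the fetch and merge that firebot runs in each submodule.
git -C "$work/super/Submodules/macfp-db" fetch -q origin
git -C "$work/super/Submodules/macfp-db" merge -q origin/master

# Hypothetical helper: print the paths of submodules flagged with a leading "+".
out_of_sync_submodules() {
    git -C "$1" submodule status | awk '/^\+/ { print $2 }'
}

stale=$(out_of_sync_submodules "$work/super")
if [ -n "$stale" ]; then
    echo "WARNING: submodule commits not yet recorded in the super project: $stale"
fi
```

In a real CI script the echo would be replaced by whatever notification mechanism is in use; the awk filter assumes submodule paths contain no spaces.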