rosdistro_build_cache should have a max age option for the source repos cache #66
For release repositories the distribution file contains an exact version. Until that changes, the cache entry is valid and doesn't have to be rechecked, which is why each cache update takes only a few seconds. For a source entry the repository state can change at any time, so the script must query the state of the repo every time, and each cache update will take quite some time. Tools like rosinstall_generator should therefore perhaps clone the exact hash used for building the cache; otherwise there will be inconsistencies between the cached information and the cloned repos. Besides that I don't see any other problem. I would suggest implementing the new option (to build a separate cache), then gaining experience by using it and seeing whether anything unforeseen happens. |
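To illustrate the point about cloning the exact hash, here is a minimal sketch of turning cached source entries into rosinstall entries pinned to a commit. The shape of `cached_repos` (keys `type`, `url`, `commit`) and the function name are assumptions made for this example, not the actual cache format; only the rosinstall entry layout (`local-name`, `uri`, `version`) follows the existing file convention.

```python
def pinned_rosinstall_entries(cached_repos):
    """Build rosinstall entries pinned to the commit recorded in the cache.

    `cached_repos` is a hypothetical mapping:
        {repo_name: {'type': 'git', 'url': ..., 'commit': ...}}
    """
    entries = []
    for name in sorted(cached_repos):
        repo = cached_repos[name]
        entries.append({
            repo['type']: {
                'local-name': name,
                'uri': repo['url'],
                # Pin to the cached commit instead of the moving devel branch.
                'version': repo['commit'],
            }
        })
    return entries
```

Pinning `version` to the cached commit rather than the branch name is what keeps the cloned workspace consistent with the manifests stored in the cache.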
Another way that we're looking at to tackle this (and by extension, the business of creating reproducible nightlies) is by a Would be happy to discuss further how this workflow would go and how we might be able to collaborate on it. (For now, our plan had been to build all/most of this as in-house tooling, but if a script like |
Just looking at the tools we already have and what they do:
If we ignore efficiency for a second your workflow is almost covered by:
The remaining step would be to use the resulting set of repos to generate a manifest cache. This cache is slightly different than the standard rosdistro cache since it contains manifest files for packages the distribution file doesn't know about: unlike release entries, source entries do not specify the package names contained in the repo. And since the package names and their locations within a repo are unknown, I think this step always requires a (shallow) clone of the repos in question. So I think the next step would be to write a script which generates a cache based on the manifest files found in a workspace which contains a set of cloned repos. Does that sound reasonable? |
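The workspace scan proposed above could look roughly like the following sketch. It walks a directory of cloned repos, finds every package.xml, and records the package name with its path relative to the workspace. The function name and the returned mapping are assumptions for illustration; this is not an existing rosdistro API.

```python
import os
import xml.etree.ElementTree as ET


def find_package_paths(workspace_root):
    """Map each package name to the relative directory holding its package.xml."""
    packages = {}
    for dirpath, dirnames, filenames in os.walk(workspace_root):
        # Skip hidden directories such as .git.
        dirnames[:] = [d for d in dirnames if not d.startswith('.')]
        if 'package.xml' not in filenames:
            continue
        root = ET.parse(os.path.join(dirpath, 'package.xml')).getroot()
        name = root.findtext('name')
        if name:
            packages[name.strip()] = os.path.relpath(dirpath, workspace_root)
        # Assumes packages are not nested, so stop descending here.
        dirnames[:] = []
    return packages
```

A cache builder could then read each manifest found this way and store its contents under the package name, independent of which repo it came from.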
I fixed the above usage to use the correct |
I'd really like to have the tag information all the way back at the master rosdistro level, though, rather than only at the rosinstall (a derived entity). There are a variety of reasons for this, but a big one is giving us the ability to +1 it— like, I want all the sources from that nightly, plus this one slight change (for example, a set of PR branches). I also don't care for the snapshot definition being only a rosinstall file + cache, since it's not as clear how that would be stored, whereas a git tag of the rosdistro repo is an extremely natural and obvious means of storing it. And, of course, efficiency. Doing So yes, it would be reasonable to generate the cache from an existing workspace, but if we have a snapshotted/frozen rosdistro anyway, and we already have a suite of functions capable of quickly fetching |
I don't understand what you mean by "master distro level". On one hand you mention it being updated automatically on e.g. a nightly basis. On the other hand you want the ability to +1 those. (Maybe we should talk via Hangout to figure out the exact needs / goals?) Since the source repos don't have to be on GitHub the tree API can only be an optimization. The script needs to be able to work for arbitrary repos (even non-git). |
I'm conflating two related use cases. The first is freezing the rosdistro for the purposes of cutting an overall release of the software stack— that's where the +1 builds are most critical. The second is generating nightly builds. +1 builds of the nightly may be important, but in that case, it's more about easy reproducibility. The developer debugging a failed nightly shouldn't be casting around for a generated rosinstall file that was stashed somewhere, they should be checking out a tag from the rosdistro, and pointing rosinstall_generator at that. Re: Github. Yup— it's just that the current Github manifest provider won't work in a source branch cache builder, since it depends on the package.xml being the root. To respond to this implementation proposal more directly:
This feels brittle. There would be a I'd really rather generate a source distro cache directly from the distribution.yaml itself, even if that does necessitate a shallow clone of every repo whose devel branch has changed. |
From a phone discussion between @dirk-thomas, @jjekircp, and myself on March 2:
I think that mostly covers it, at least with respect to changes in this repo. |
Regarding the content of the source cache: I think it might be helpful to also store the relative path of each package in the repo. |
Is the thinking there that it assists with future package.xml grabs? I'm unsure what's gained, since you always need to clone the whole repo anyway in order to check for new packages. That said, relative path could be helpful for a future rosinstall_generator/wstool arrangement capable of extracting subdirectories. In any case, I think I'm thinking of a format something like
Here there is provision for storing multiple versions, but my assumption is that the initial implementation would store only the newest, and it would be up to a policy on the consuming end to either a) trust that, or b) check each one and freshen as required. |
The proposed format looks good to me. |
Yes, I thought it might be helpful to pull newer manifests with the knowledge of the path. |
One potential argument for including both caches in the same file, and having switches on rosdistro_build_cache to control which are included: A massive amount of de-duplication can occur completely for free via yaml references, eg:
Result:
EDIT: Looks like it's not completely for free— the above code is benefitting from the python compiler noticing that those two strings are the same and aliasing them to each other. In the real world, it would be like this:
|
This ticket could be closed in the sense that the feature is now available (we've been using it for the better part of a year), however it would be great to have the source cache max age option completed, as discussed in #84 (diff). That would enable the source cache to be turned on for the main buildfarm, which would significantly broaden the availability and exposure of the feature. |
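As a rough idea of what a max age option could mean in practice, the sketch below treats a source cache entry as fresh until it is older than a configured number of seconds, and only lists the stale ones for rechecking. The per-repo timestamp mapping and both function names are hypothetical; the behaviour discussed in #84 may differ.

```python
import time


def is_entry_fresh(cached_at, max_age_seconds, now=None):
    """Return True if an entry cached at `cached_at` (epoch seconds) is young enough."""
    if now is None:
        now = time.time()
    return (now - cached_at) <= max_age_seconds


def repos_to_recheck(cache_timestamps, max_age_seconds, now=None):
    """Return the sorted names of repos whose cache entry exceeded the max age."""
    return sorted(
        name for name, cached_at in cache_timestamps.items()
        if not is_entry_fresh(cached_at, max_age_seconds, now)
    )
```

With such a check, only the repos returned by `repos_to_recheck` would need their remote state queried on a given cache update.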
I updated the title to cover the remaining task. |
Via #65 (comment):
The biggest issue I see here is that unlike with the GBP release branches, it's harder to tell if a devel branch cache is stale or not. The best thing you could do would be cache the commit hash, and then use git ls-remote to determine if the branch has moved on from that point.
Apart from that, there are two major ways I see this going:
rosdistro_build_cache gets a switch like --from-source, and the resulting cache format is unchanged. Has the disadvantage of needing to build a separate cache for source, but the advantage is that other tooling like rosinstall_generator is unchanged.
Thoughts?
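The staleness check described in the quoted comment can be sketched as follows: cache the commit hash, then compare it with what `git ls-remote` reports for the devel branch. The function names are made up for this example, and the check only applies to git repos; other VCS types would need their own query.

```python
import subprocess


def parse_ls_remote(output, branch):
    """Extract the commit hash for refs/heads/<branch> from `git ls-remote` output."""
    ref = 'refs/heads/' + branch
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] == ref:
            return parts[0]
    return None  # branch not found on the remote


def remote_branch_hash(url, branch):
    """Query the remote for the current tip of `branch` without cloning."""
    output = subprocess.check_output(
        ['git', 'ls-remote', url, 'refs/heads/' + branch],
        universal_newlines=True)
    return parse_ls_remote(output, branch)


def is_cache_stale(cached_hash, url, branch):
    """True if the branch tip differs from the hash stored in the cache."""
    return remote_branch_hash(url, branch) != cached_hash
```

Since `git ls-remote` only exchanges ref information, this avoids a clone for repos whose branch has not moved; a (shallow) clone is then needed only for the stale ones.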