Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symlink node_modules/.cache to tempdir #661

Conversation

c0d1ngm0nk3y
Copy link
Contributor

@c0d1ngm0nk3y c0d1ngm0nk3y commented Apr 12, 2024

Fixes paketo-buildpacks/builder-jammy-base#375

Summary

The folder node_modules/.cache is written during runtime (e.g. by eslint) this does not work

  • With jammy-builder since run und build user are different
  • with read-only file-systems

So this pr always creates the .cache folder as a symlink to the tempdir and the helper will create the needed dir for the link to be valid.

Use Cases

As brought up in paketo-buildpacks/builder-jammy-base#375 already a very simple react app will not be able to be built with buildpacks since it contains eslint.

Details

Note: The pr looks larger than it actually is. Most of the changes come from the integration folder where a simple react was added.

After creating the mode_modules, the .cache folder is removed and a symlink to /tmp/node_modules_cache is added instead. This makes sure that the run user has the permission to add files in that directory. Since the build and the run user have different uids per default, this was not the case before.

In addition /tmp/node_modules_cache has to be created at launch time since the symlink would be broken otherwise. For this, the existing helper setup_symlinks also creates this folder.

Checklist

  • I have viewed, signed, and submitted the Contributor License Agreement.
  • I have linked issue(s) that this PR should close using keywords or the Github UI (See docs)
  • I have added an integration test, if necessary.
  • I have reviewed the styleguide for guidance on my code quality.
  • I'm happy with the commit history on this PR (I have rebased/squashed as needed).

@c0d1ngm0nk3y
Copy link
Contributor Author

@paketo-buildpacks/nodejs-maintainers Before I polish the pr (e.g. adding tests), could someone have a look if this is a valid way to go? At least it works :)

@c0d1ngm0nk3y c0d1ngm0nk3y marked this pull request as ready for review April 26, 2024 12:52
@c0d1ngm0nk3y c0d1ngm0nk3y requested a review from a team as a code owner April 26, 2024 12:52
@c0d1ngm0nk3y
Copy link
Contributor Author

@paketo-buildpacks/nodejs-maintainers Could someone have a look?

@c0d1ngm0nk3y c0d1ngm0nk3y added the semver:minor A change requiring a minor version bump label May 17, 2024
@c0d1ngm0nk3y c0d1ngm0nk3y force-pushed the temp_node_modules_cache_dir branch 4 times, most recently from 41387e6 to 3367c6c Compare June 21, 2024 12:48
@c0d1ngm0nk3y
Copy link
Contributor Author

@paketo-buildpacks/nodejs-maintainers Could you provide some feedback? One specific question: Why are ALL integration tests run with this ubi builder from paketocommunity? This forces every test to use the extension, right? Doesn't feel right that there is no test without extension anymore tbh.

@c0d1ngm0nk3y c0d1ngm0nk3y force-pushed the temp_node_modules_cache_dir branch 2 times, most recently from a00f64a to 329db12 Compare July 10, 2024 12:43
@mhdawson
Copy link
Member

mhdawson commented Jul 11, 2024

This forces every test to use the extension, right? Doesn't feel right that there is no test without extension anymore tbh.

I don't think this is the case, you can see that the integration tests are run twice. Once with jammy and once with ubi. Those with ubi use the extenstion and those with jammy don't

@github-actions
Test Pull Request / Integration Tests with Builders (paketocommunity/builder-ubi-buildpackless-base:latest) (pull_request) Successful in 13m
Details
@github-actions
Test Pull Request / Integration Tests with Builders (paketobuildpacks/builder-jammy-buildpackless-full) (pull_request) Successful in 9m
Details

@c0d1ngm0nk3y
Copy link
Contributor Author

I don't think this is the case, you can see that the integration tests are run twice. Once with jammy and once with ubi.

From my understanding, that is only partially correct. All tests have to use the extension in order to also run with the ubi builder, right? Like here

Tbh, I do not really now what this ubi is about, so I might be wrong with that.

The second point, is it really worth/needed to run all tests with ubi?

@pacostas
Copy link
Contributor

pacostas commented Jul 15, 2024

The ubi extension is added conditionally

if builder.BuilderName == "paketocommunity/builder-ubi-buildpackless-base:latest" {
settings.Extensions.UbiNodejsExtension.Online, err = buildpackStore.Get.
Execute(settings.Config.UbiNodejsExtension)
Expect(err).ToNot(HaveOccurred())
}
, so in case the builder is not the ubi one, the extension falls back to an empty string and as a result is not included during the build process. It could be refactored to be more clear by adding an array instead of having the names of the buildpacks or extensions. This way, someone could easily hypothesize that the array might have zero or more buildpacks or extensions, thus there might be no extensions or buildpacks during the build process.

@mhdawson
Copy link
Member

mhdawson commented Sep 5, 2024

I bit of info I think on the cache option - https://docs.npmjs.com/cli/v10/using-npm/config#cache, not much.

Would be interesting to see if can be set in .npmrc

@mhdawson
Copy link
Member

mhdawson commented Sep 9, 2024

Ok so the cache option I mentioned in the last post seems to be unrelated.

There is a cache option for eslint which is --cache-location and I've not been able to find a corresponding environment variable.

Unfortunately some of the dependencies seem to set that directly:

cnb@8fa6e49119d5:/workspace$ grep -r "cache-location" *
node_modules/eslint/lib/options.js: * @property {string} cacheFile Path to the cache file. Deprecated: use --cache-location
node_modules/eslint/lib/options.js:                description: "Path to the cache file. Deprecated: use --cache-location"
node_modules/eslint/lib/options.js:                option: "cache-location",
node_modules/file-entry-cache/package.json:    "eslint": "eslint --cache --cache-location=node_modules/.cache/ 'cache.js' 'test/**/*.js' 'perf.js'",
node_modules/fill-range/package.json:    "lint": "eslint --cache --cache-location node_modules/.cache/.eslintcache --report-unused-disable-directives --ignore-path .gitignore .",
node_modules/flat-cache/package.json:    "eslint": "eslint --cache --cache-location=node_modules/.cache/ ./src/**/*.js ./test/**/*.js",
node_modules/picomatch/package.json:    "lint": "eslint --cache --cache-location node_modules/.cache/.eslintcache --report-unused-disable-directives --ignore-path .gitignore .",

After some googling, I also found this package which seems to be widely used for finding the right place to write cache files - https://www.npmjs.com/package/find-cache-dir. It seems to use node_modules/.cache by default and also has an environment variable that can be used to override but I don't think that helps us with the modules that specify the cache location as shown above.

@mhdawson
Copy link
Member

mhdawson commented Sep 9, 2024

I tried chaning the instances I showed above manually to /workspace, but none of those were the cause.

Looking through other packages in node_modules results in find a bunch more places in bable, webpack etc that default to using the node_modules/.cache as a default location.

Sooo, I've not convinced myself that the use of node_modules/.cache as a default is widespread enough that we should ensure it is writable by since its referenced in https://www.npmjs.com/package/find-cache-dir but also because it seems to be hard coded into packages like webpack and babel.

@c0d1ngm0nk3y my only remaining question is if there is a risk that by removing the cache we might slow down startup of some applications because they benefit from data in the cache created during the build phase? If that might be the case then we might want an environment variable that can be used to turn the new behaviour off?

@c0d1ngm0nk3y
Copy link
Contributor Author

c0d1ngm0nk3y commented Sep 11, 2024

@c0d1ngm0nk3y my only remaining question is if there is a risk that by removing the cache we might slow down startup of some applications because they benefit from data in the cache created during the build phase? If that might be the case then we might want an environment variable that can be used to turn the new behaviour off?

I cannot rule this out completely. What would you call this environment variable? Something like BP_KEEP_BUILD_CACHE? What would be the default? The problem would be that it has to be either or. So either you freeze the cache at build time or you make it writable at runtime.

Update: I would probably lean to BP_KEEP_NODE_BUILD_CACHE and let the default be false. What I am not sure about is that this depends on the specific node_module, but this setting would be for all. WDYT?

@mhdawson
Copy link
Member

I agree with letting the default be false as I hope there will be no issues but think having an escape hatch if there are is a good idead.

I'm good with BP_KEEP_NODE_BUILD_CACHE.

In terms of this depends on the specific node_module based on my googling/investigation node_modules/.cache is used but a number of modules and is as close to a standard place for putting module cache files into as exists so I'm not too worried about that unless that's not what you meant.

@c0d1ngm0nk3y
Copy link
Contributor Author

...unless that's not what you meant...

Not quite. Let's have you have two modules a and b...

  • a is filling the cache during build to speedup the application
  • b needs the cache to be writeable during runtime

You could not use BP_KEEP_NODE_BUILD_CACHE since b would crash, so you would loose the speedup from a

BUT

  • it is a cache after all, so the app should be fine if the cache get emptied
  • this is highly hypothetical

So I will go on with your suggestion.

@c0d1ngm0nk3y
Copy link
Contributor Author

@mhdawson Is this the kind of use case you had in mind?

paketo-buildpacks/node-run-script#126

@mhdawson
Copy link
Member

@c0d1ngm0nk3y that does look like an example where not maintaining the cache could lead to slowdowns.

@mhdawson
Copy link
Member

Just to add on

You could not use BP_KEEP_NODE_BUILD_CACHE since b would crash, so you would loose the speedup from a

Agree with your points of it not being a blocking concern. Also regressing existing applications is not a concern as there cannot be any working ones (since a would crash/not work).

Copy link
Member

@mhdawson mhdawson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mhdawson
Copy link
Member

@c0d1ngm0nk3y I'm ready to land but it tells me that the brach is out of date with the base and is not giving me the option to rebase in the UI. Could you rebase and then I'll land once the tests complete?

Thanks for all of your work, explanation and tweaks along the way on this.

@c0d1ngm0nk3y c0d1ngm0nk3y merged commit 990013e into paketo-buildpacks:main Oct 11, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver:minor A change requiring a minor version bump
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Permission problem with using Jammy builder
3 participants