
Upload images to global location, with filenames that are hashes of file contents #178

Open
lencioni opened this issue Jan 5, 2017 · 4 comments


Contributor

lencioni commented Jan 5, 2017

I was chatting with @lelandrichardson about some of our plans for supporting multiple rendering targets (#73) and generating a review HTML page (#176), and we thought it might not be a bad idea to change the way we upload images.

Currently, we upload an HTML file and a set of images as a sort of bundle to a single "directory". Over many runs, this is likely to produce duplicate images. What if, instead, we put the images in a more general location where the filenames are hashes of the file contents? That way, if multiple images are identical, we only ever upload them once.

This would have the advantage of allowing us to make all examples browsable on the review page with minimal overhead.

If we do this, the files should be stored in sub-directories so that the directory doesn't end up growing to contain millions of files over time.
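To make the naming concrete, here is a minimal sketch of how the keys could be derived, using Node's built-in crypto and fs modules. The `snapshots/` prefix, the two-character shard, and the `imageKey` name are illustrative assumptions, not an existing happo API:

```js
const crypto = require('crypto');
const fs = require('fs');

// Hypothetical helper: derive an upload key from the file contents so that
// identical images always map to the same key.
function imageKey(filePath) {
  const hash = crypto
    .createHash('sha256')
    .update(fs.readFileSync(filePath))
    .digest('hex');

  // Shard on the first two hex characters of the hash so no single
  // "directory" grows to millions of objects over time.
  return `snapshots/${hash.slice(0, 2)}/${hash}.png`;
}
```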

We could probably also apply the same technique to the JavaScript files that we need for the review page, to cut down on duplication, but that seems less valuable than doing it for the images.

Contributor

trotzig commented Jan 5, 2017

This is a good idea. Do you have an idea for how it could be implemented as well? It could be as simple as using the uploader configuration option that I'm adding (or thinking of adding) as part of supporting #176:

```js
// .happo.js
const S3Uploader = require('happo/server/S3Uploader');

module.exports = {
  uploader: new S3Uploader({
    imageDirectory: 'some-global-dir',
  }),
};
```

Contributor Author

lencioni commented Jan 5, 2017

It would be nice to avoid re-uploading assets that already exist, so we might want to build that check into the upload process. The imageDirectory option sounds okay to me, but I think we could probably just set it to a fixed value and not make it configurable, right?

Contributor

trotzig commented Jan 5, 2017

Ah, you might be right, since people mostly have to specify an S3 bucket name anyway. We can pick a dedicated folder for images and check for existing files before uploading.
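For what it's worth, a rough sketch of that check-before-upload step, assuming the aws-sdk v2 S3 client; `uploadIfMissing` and the bucket/key arguments are hypothetical names, not part of happo today:

```js
const AWS = require('aws-sdk');

const s3 = new AWS.S3();

// Upload the object only if it is not already in the bucket.
async function uploadIfMissing(bucket, key, body) {
  try {
    // headObject succeeds only when the object already exists.
    await s3.headObject({ Bucket: bucket, Key: key }).promise();
    return false; // already uploaded, skip
  } catch (err) {
    if (err.code !== 'NotFound') throw err;
  }
  await s3.putObject({ Bucket: bucket, Key: key, Body: body }).promise();
  return true;
}
```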

Contributor Author

lencioni commented Jan 18, 2017

We might also try an optimization where we upload a JSON file for every sha that maps examples to their images. ~~Then we could download all of the images for the base commit if they already exist. Not sure if that would be faster, but it might be, depending on how long the build takes.~~ Then we can compare the hash of the current snapshot to the hash in the JSON file, which should be a lot faster.
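To illustrate, here is a sketch of what such a per-sha manifest and the comparison could look like; the file layout and the `changedExamples` helper are assumptions for the sake of the example, not an agreed format:

```js
// snapshots/<sha>.json would map example names to image content hashes, e.g.
// { "Button-default": "3f2a…", "Button-hover": "9bc1…" }

// Return the examples whose image hash is new or differs from the base commit.
function changedExamples(baseManifest, currentManifest) {
  return Object.keys(currentManifest).filter(
    (example) => baseManifest[example] !== currentManifest[example]
  );
}
```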
