Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE REQUEST] Prevent cache upload (to S3) if cache file/folder is empty/non-existent #62

Open
ghost opened this issue Jun 28, 2023 · 2 comments
Assignees
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@ghost
Copy link

ghost commented Jun 28, 2023

In one of my CI jobs, I noticed that the plugin was able to upload an empty/non-existent file/folder to S3.

[2023-06-27T23:17:34Z] [typescript-cache] - 🔍 Locating cache:  my_project/some_cacheable_folder
[2023-06-27T23:17:34Z] tar: my_project/some_cacheable_folder: Warning: Cannot stat: No such file or directory
upload: ./cacheable_folder-v1-checksum123.tar.gz to s3://bucket/caches/cacheable_folder-v1-checksum123.tar.gz

This has consequences for CI jobs that have the same cache key because those jobs end up downloading and using an empty cache. For example, this might lead to slower build times (e.g. imagine a bundler like Webpack trying to read from an empty cache which causes bundling time to be slower compared to not using a cache).

Ideally the build would complete but not upload the cache if the cache file is not found. Maybe we can add an option to do this.

e.g.

nienbo/cache#v2.4.14:
    id: node-modules-cache
    backend: s3
    ...
    paths:
      - "my_project/some_cacheable_folder"
    upload: 
      - "always"(default) | "skip-if-tarfile-empty" | "skip"
@gencer gencer self-assigned this Jun 29, 2023
@gencer gencer added enhancement New feature or request help wanted Extra attention is needed labels Jun 29, 2023
@gencer
Copy link
Collaborator

gencer commented Jun 29, 2023

We could do something like this:

  if [ -d "$TAR_TARGETS" ]
  then
  	if [ "$(ls -A $TAR_TARGETS)" ]; then
       info "Caching... (hit)"
  	else
      warning "Root exist but empty (exist-but-empty)"
  	fi
  else
  	echo "Root dir not found (non-existent)"
  fi

However, this will work on your example above but will ultimately fail on most users because many of them uses multiple paths. Globs will be fine, however, giving multiple dirs. to cache can fail. One of them could be empty but other is not.

So, this is still doable, but we need to think better on how to properly implement those checks.

Probably making a loop and excluding paths that's empty or non-existents then cache what remains.

@kliakhovskii-brex
Copy link

In case when there are multiple paths and some empty, some non-empty, it's still ok to cache as cache will have smth at least. In case when everything's empty, it doesn't make sense at all to create an empty cache.

PS Not sure if using multiple paths that common as well, as usually there is one cache per tool (e.g. lint cache for linter, prettier cache for prettier and so on)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants