clean-npm-cache: Fix buggy garbage-collection logic.

We saw /srv/zulip_npm_cache being cleaned incorrectly by this tool in
production (more precisely, we noticed broken symlinks to those
directories, even from the current deployment).  Print-debugging
showed that older deployments were indeed being ignored, because the
logic in `get_caches_in_use` was broken (this was partly masked
because we also keep the last week's deployments).

The specific bug here turned out to be that we weren't passing the
`production` argument to `generate_sha1sum_node_modules`, but the
broader problem is that this logic isn't robust to changes in the
hashing algorithm.

Fix this by replacing the broken logic for trying to compute the
correct hash for that deployment with just checking the symlink inside
the deployment to let it self-report.

We can't easily make the same change for clean-venv-cache, because we
use multiple virtualenvs there.  But a similar change could be useful
for the emoji cache as well.

Fixes #8116.
Tim Abbott 2018-03-28 15:32:47 -07:00
parent 00dd86967b
commit 7b2c9223e7
1 changed file with 4 additions and 4 deletions


@@ -40,12 +40,12 @@ def get_caches_in_use(threshold_days):
         caches_in_use.add(CURRENT_CACHE)
     for setup_dir in setups_to_check:
-        PACKAGES_FILE = os.path.join(setup_dir, "package.json")
-        if os.path.exists(PACKAGES_FILE):
+        node_modules_link_path = os.path.join(setup_dir, "node_modules")
+        if not os.path.exists(node_modules_link_path):
             # If 'package.json' file doesn't exist then no node_modules
             # cache is associated with this setup.
-            sha1sum = generate_sha1sum_node_modules(setup_dir=setup_dir)
-            caches_in_use.add(os.path.join(NODE_MODULES_CACHE_PATH, sha1sum))
+            continue
+        caches_in_use.add(os.readlink(node_modules_link_path))
     return caches_in_use
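
The self-reporting approach described in the commit message can be tried in isolation with a rough sketch like the one below. It is not the tool's actual code: the function names `caches_in_use_from_symlinks` and `demo`, the temporary directory layout, and the `abc123` cache name are all hypothetical, and it assumes a POSIX system where creating symlinks is permitted.

```python
import os
import tempfile


def caches_in_use_from_symlinks(setup_dirs):
    """Collect the target of each deployment's node_modules symlink.

    Hypothetical helper: each deployment reports its own cache through
    the symlink it already has, so no hash is recomputed.
    """
    in_use = set()
    for setup_dir in setup_dirs:
        link_path = os.path.join(setup_dir, "node_modules")
        # os.path.exists follows symlinks, so a deployment with no link
        # (or with a dangling one) is skipped here.
        if not os.path.exists(link_path):
            continue
        # os.readlink returns the target exactly as stored in the link.
        in_use.add(os.readlink(link_path))
    return in_use


def demo():
    # Throwaway layout: one cache directory and two deployments, only
    # one of which links to the cache.
    root = tempfile.mkdtemp()
    cache = os.path.join(root, "cache", "abc123", "node_modules")
    os.makedirs(cache)
    linked = os.path.join(root, "deploy-a")
    unlinked = os.path.join(root, "deploy-b")
    os.makedirs(linked)
    os.makedirs(unlinked)
    os.symlink(cache, os.path.join(linked, "node_modules"))
    return cache, caches_in_use_from_symlinks([linked, unlinked])
```

Calling `demo()` returns the cache path together with the set of caches found in use. Because nothing is recomputed, the result does not depend on how the cache directory name was derived. One caveat of this sketch: `os.readlink` raises `OSError` if `node_modules` is a real directory rather than a symlink.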