Pre-commit hooks

This repository uses the Python package pre-commit to manage pre-commit hooks. Pre-commit hooks are actions that run automatically, typically on each commit, to perform a common set of tasks. For example, a pre-commit hook might lint code automatically before it is committed, helping to maintain a consistent level of code quality.

Purpose

For this repository, we use pre-commit for two purposes:

  • Checking for secrets being committed accidentally (there is a strict definition of a “secret”); and

  • Checking for any large files (over 5 MB) being committed.

We have configured pre-commit to run automatically on every commit. Running on each commit lets pre-commit flag these problems before they enter the repository history, which helps keep the repository in a healthy state. It cannot guarantee that every contravention is caught.
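As a rough illustration of the large-file check listed above, the sketch below flags any path whose size exceeds 5 MB. It is not the hook's actual implementation: the function name find_large_files and the reading of 5 MB as 5 × 1024 × 1024 bytes are assumptions made for this example.

```python
import os
import tempfile

# Assumption: "5 MB" is taken to mean 5 * 1024 * 1024 bytes in this sketch.
MAX_BYTES = 5 * 1024 * 1024


def find_large_files(paths, max_bytes=MAX_BYTES):
    """Return the paths whose size on disk exceeds max_bytes."""
    return [path for path in paths if os.path.getsize(path) > max_bytes]


if __name__ == "__main__":
    # Build one small and one oversized file in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        small = os.path.join(tmp, "small.txt")
        large = os.path.join(tmp, "large.bin")
        with open(small, "wb") as handle:
            handle.write(b"x" * 10)
        with open(large, "wb") as handle:
            handle.write(b"\0" * (MAX_BYTES + 1))
        # Only the oversized file should be reported.
        print(find_large_files([small, large]))
```

A real hook would typically receive the list of staged files from pre-commit rather than a hand-built list.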

Installation

Before pre-commit can run, you need to set it up on your system:

  • Install the pre-commit package by running poetry install; and

  • Run poetry run pre-commit install in your terminal to set up pre-commit to run when code is committed.

Using the detect-secrets pre-commit hook

Secret detection limitations

The detect-secrets package does its best to prevent secrets from being committed accidentally, but it may miss some. Do not rely on it alone; focus on good software development practices!

See the definition of a secret for further information.

We use detect-secrets to check that no secrets are accidentally committed. This hook requires you to generate a baseline file if one is not already present within the root directory. To create the baseline file, run the following at the root of the repository:

poetry run detect-secrets scan > .secrets.baseline

Next, audit the baseline that has been generated by running:

poetry run detect-secrets audit .secrets.baseline

When you run this command, you’ll enter an interactive console. This will present you with a list of high-entropy strings and anything else that could be a secret, and ask you to verify whether each one is a real secret. This allows the hook to remember false positives in future, and to alert you to new secrets.
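To give a feel for what “high-entropy” means here, the sketch below computes the Shannon entropy of a string in bits per character. It is a simplified illustration, not the calculation detect-secrets performs internally; the function names and the 3.5 threshold are assumptions chosen for this example.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_high_entropy(text, threshold=3.5):
    """Return True if text scores above the threshold.

    Assumption: the threshold is arbitrary and chosen only for this sketch.
    """
    return shannon_entropy(text) > threshold
```

Repetitive strings score low and strings that use many different characters evenly score high, which is why randomly generated keys and tokens tend to be flagged, and why some harmless strings, such as hashes, are flagged too.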

Definition of a “secret” according to detect-secrets

The detect-secrets documentation, as of January 2021, says it works:

…by running periodic diff outputs against heuristically crafted [regular expression] statements, to identify whether any new secret has been committed.

This means it uses regular expression patterns to scan your code changes for anything that looks like a secret according to the patterns. By definition, there are only a limited number of patterns, so the detect-secrets package cannot detect every conceivable type of secret.
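A minimal sketch of pattern-based scanning follows. The two patterns are illustrative assumptions, not the expressions that detect-secrets ships, and scan_lines is a hypothetical helper written for this example.

```python
import re

# Assumption: illustrative patterns only, not the ones detect-secrets uses.
PATTERNS = {
    "AWS access key ID": re.compile(r"AKIA[0-9A-Z]{16}"),
    "Private key header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}


def scan_lines(lines):
    """Return (line_number, pattern_name) pairs for lines matching a pattern."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings
```

With only these two patterns, a line such as password = "hunter2" produces no findings, which illustrates the limitation described above: anything that no pattern anticipates goes undetected.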

To understand what types of secrets will be detected, read the detect-secrets documentation on caveats, and the list of supported plugins that detect-secrets uses. Also, name any variables that hold secrets using words that will trip the KeywordDetector plugin; see the DENYLIST variable for the full list of words.
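The idea behind keyword-based detection can be sketched as follows. The denylist here is a short, hypothetical stand-in for the real DENYLIST, and the matching logic is a simplification written for this example rather than the plugin's own code.

```python
import re

# Assumption: a tiny, hypothetical denylist for illustration; the real
# DENYLIST in detect-secrets is longer and may differ.
HYPOTHETICAL_DENYLIST = ("password", "secret", "api_key", "token")

# Matches a simple "name = 'string literal'" assignment.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*['\"].+['\"]")


def flags_assignment(line, denylist=HYPOTHETICAL_DENYLIST):
    """Return True if line assigns a string literal to a denylisted name."""
    match = ASSIGNMENT.match(line)
    if match is None:
        return False
    name = match.group(1).lower()
    return any(word in name for word in denylist)
```

In this sketch, db_password = "hunter2" is flagged because of its name, while x = "hunter2" is not, even though the value is identical. This is why descriptive variable names for secrets make detection more likely.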

If pre-commit detects secrets during commit

If pre-commit detects any secrets when you try to create a commit, it will detail what it found and where to go to check the secret.

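If the detected secret is a false positive, there are two options to resolve this and prevent your commit from being blocked:

  • Inline allowlisting, by adding the comment # pragma: allowlist secret to the end of the flagged line; or

  • Updating .secrets.baseline to include the false positive.

In either case, if an actual secret is detected (or a combination of actual secrets and false positives), first remove the actual secret before following either of these processes.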

Updating .secrets.baseline

To exclude a false positive, you can also update the .secrets.baseline by repeating the two commands from the initial setup: scan first, then audit.

During auditing, if the detected secret is actually a secret (or other sensitive information), remove the secret and re-commit. There is no need to update the .secrets.baseline file in this case.

If your commit contains a mixture of false positives and actual secrets, remove the actual secrets before updating and auditing the .secrets.baseline file.
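After auditing, it can be useful to check how many entries in the baseline are still unlabelled. The sketch below assumes the baseline is a JSON file with a "results" object that maps file names to lists of entries, and that audited entries carry a boolean "is_secret" key. Both the layout and the helper names are assumptions for this example, so check them against your own .secrets.baseline before relying on them.

```python
import json


def summarise_baseline(baseline):
    """Count baseline entries by audit status.

    Assumes baseline["results"] maps file names to lists of entries, and
    that audited entries carry a boolean "is_secret" key.
    """
    summary = {"real": 0, "false_positive": 0, "unaudited": 0}
    for entries in baseline.get("results", {}).values():
        for entry in entries:
            label = entry.get("is_secret")
            if label is True:
                summary["real"] += 1
            elif label is False:
                summary["false_positive"] += 1
            else:
                summary["unaudited"] += 1
    return summary


def load_baseline(path=".secrets.baseline"):
    """Read and parse the baseline file at path."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling summarise_baseline(load_baseline()) from the root of the repository would then report the counts; a non-zero "real" count suggests a secret still needs to be removed.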