Benutzer:Thk/GIBF
Git Ignore Big File
This page tracks information about a tool idea waiting for implementation.
Manpage
Name
gibf - manage big files with git without checking them in by tracking information in gitignore file comments
Synopsis
- gibf download FILE [FILE...] - Download files
- gibf lsurls FILE - List available download URLs for FILE
- gibf add FILE URL [URL...] - Add FILE and its hash to .gitignore and record one or more download URLs
- gibf rm FILE [FILE...] - Remove .gitignore entries and associated gibf comments for the given files
- gibf verify PATTERN [PATTERN...] - Verify matching files managed by gibf against their sha512 values
Description
This command manages big files inside a git repository safely without committing the files themselves. Instead, download URLs and file hashes are tracked in .gitignore files, next to the ignore lines for the respective files.
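As a rough illustration of what "gibf add" might write, the following Python sketch hashes a file and builds the matching .gitignore lines. The helper names sha512_of and format_entry are hypothetical, and the output layout simply mirrors the example in the Gitignore Syntax section; since the tool is not implemented yet, this is a sketch under those assumptions and not its actual behaviour.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def format_entry(path, urls):
    """Build the ignore line and gibf comment lines for one file.

    Hypothetical helper: the ignore line is the bare file name, followed by
    one "#gibf: url=" line per URL and a final "#gibf: sha512: " line.
    """
    lines = [Path(path).name]
    lines += [f"#gibf: url={url}" for url in urls]
    lines.append(f"#gibf: sha512: {sha512_of(path)}")
    return "\n".join(lines) + "\n"
```

Appending the returned text to the .gitignore file in the same directory as the file would keep the ignore line and its metadata together.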
This tool was written to manage installation archives in a configuration management system instead of downloading them from an external source at deploy time. Keeping the archives local has a couple of advantages, not all of which will be relevant for everybody:
- External inputs MUST always be distrusted. This tool records a hash of each file, which provides at least trust on first use (TOFU). Ideally, one also verifies upstream signatures at import time or compares the tarball with the version control checkout.
- Deployment is independent of the current availability of an external server that could be down at the worst time.
- Deployment of multiple nodes does not cause unnecessary traffic on the external server.
- The external source does not learn about the public IP address where the archive is eventually used.
- The deployment node does not need to have a route to the external server. This is especially interesting for nodes in a private network behind a load-balancer.
Gitignore Syntax
This tool parses specially formatted comment lines that follow gitignore entries referring unambiguously to one specific file, i.e. entries that do not end with a slash and do not contain any wildcard or pattern. All comment lines parsed by this tool start with "#gibf: ", followed by a key-value pair, e.g.:
somegreatsoftware-0.1.tar.gz
#gibf: url=https://example.com/downloads/somegreatsoftware-0.1.tar.gz
#gibf: url=https://intranet/artefactbackup/somegreatsoftware-0.1.tar.gz
#gibf: sha512: e68f8e37d65d38b28af0c995dc6f6790e8baeff0d557b0537c7f7205b1235127bf0e90072a89753f4979f6bd54b683322c52a05f2e4730a649c6143dcac0bf19
otherbigfile.iso
#gibf: url=https://example.com/images/otherbigfile.iso
#gibf: sha512: 35a71fcbdd1a9f0ed0548450facb213ee18f79e012f668b06ea8c7e7d0394ea53f9aca56d4a4ab0fb4f1752adb6697d6525395dd83ce21d3921dd05d5b71f7ea
The tool currently understands these keys:
- url - A download url for the file
- sha512 - The sha512 hash of the file to download
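The syntax described above could be parsed roughly as in the following Python sketch. The function names parse_gibf and is_plain_entry are hypothetical. The sketch assumes that both "=" and ": " are valid separators between key and value, because the example uses "url=" but "sha512: ", and that blank lines and ordinary comments are skipped without ending the current entry; neither point is specified on this page.

```python
import re

PREFIX = "#gibf: "
# Key, then "=" or ":" as separator (assumption: both forms are accepted).
KEY_VALUE = re.compile(r"^(\w+)\s*[=:]\s*(.*)$")


def is_plain_entry(pattern):
    """True if the ignore pattern names exactly one file (no slash suffix, no wildcard)."""
    return not (pattern.endswith("/") or pattern.startswith("!")
                or any(ch in pattern for ch in "*?["))


def parse_gibf(text):
    """Map each plainly named ignore entry to its gibf metadata."""
    entries = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(PREFIX):
            match = KEY_VALUE.match(line[len(PREFIX):])
            if current is None or match is None:
                continue  # no file entry to attach to, or malformed pair
            key, value = match.groups()
            if key == "url":
                current["url"].append(value)
            elif key == "sha512":
                current["sha512"] = value.lower()
        elif not line or line.startswith("#"):
            continue  # blank line or ordinary comment
        elif is_plain_entry(line):
            current = entries.setdefault(line, {"url": [], "sha512": None})
        else:
            current = None  # pattern entries carry no gibf metadata
    return entries
```

With such a mapping, "gibf lsurls FILE" would amount to printing entries[FILE]["url"], and "gibf verify" to comparing each stored sha512 value with the hash of the file on disk.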
Examples
gibf download somegreatsoftware-0.1.tar.gz
Looks up the download location for somegreatsoftware-0.1.tar.gz in a .gitignore file in the same folder, downloads the file to a temporary location and verifies its sha512 hash before moving it to its place in the git worktree.
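The download step described above can be sketched in Python as follows. The function name fetch_verified is hypothetical, and two details are assumptions not stated on this page: the temporary file is created in the destination directory so that the final move is a rename on the same filesystem, and the recorded URLs are tried in order until one yields a file with the expected hash.

```python
import hashlib
import os
import tempfile
import urllib.request


def fetch_verified(urls, expected_sha512, destination):
    """Install the first download whose SHA-512 matches; return the URL used."""
    target_dir = os.path.dirname(os.path.abspath(destination))
    for url in urls:
        # Assumption: temp file lives next to the destination.
        fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".gibf-")
        try:
            digest = hashlib.sha512()
            with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as response:
                for chunk in iter(lambda: response.read(1 << 20), b""):
                    digest.update(chunk)
                    tmp.write(chunk)
            if digest.hexdigest() == expected_sha512.lower():
                # Only a verified file ever appears under its final name.
                os.replace(tmp_path, destination)
                return url
        except OSError:
            pass  # unreachable URL or I/O error: try the next URL
        finally:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
    raise RuntimeError(f"no URL delivered a file matching the sha512 for {destination}")
```

Hashing while the data is written avoids a second read of a large file, and a file that fails verification is deleted without ever being placed in the worktree.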
See also
gitignore(5), sha512sum(1)