Rclone update hashes can make your syncing noticeably faster and more reliable.
The feature is provided by rclone's hasher backend, an overlay that sits on top of another remote and keeps a cache of checksums for its files, which can improve syncing performance and accuracy.
This is especially useful if you have a large number of files or a slow internet connection, because a cached checksum spares rclone from re-reading a file just to hash it.
Cached hashes can also help you detect corrupted files so you can transfer them again, which matters with any remote storage service.
By keeping the cached hashes up to date, you can verify your syncs against accurate checksums and catch data corruption early.
Configuration
To configure rclone for updating hashes, you'll need to edit your rclone config file, usually located at YOURHOME/.config/rclone/rclone.conf. Running `rclone config path` prints the path to your current active config file.
You can create a new section for the hasher by following the manual configuration instructions. This involves adding the required parameters to the section, including remote, hashes, and max_age.
Here are the parameters for the hasher: remote is required; hashes is a comma-separated list of supported checksums (md5,sha1 by default); max_age is the maximum time to keep a checksum value in the cache, where 0 disables caching completely and off caches "forever" (that is, until the files get changed).
Manual Configuration
Manual configuration is a powerful way to customize your rclone setup. You can find the current active config file by running `rclone config path`; it is typically located at `YOURHOME/.config/rclone/rclone.conf`.
To manually configure rclone, you'll need to open this file in your favorite text editor. Find the section for the base remote and create a new section for the hasher. The hasher takes the following parameters:
- remote: required.
- hashes: a comma-separated list of supported checksums (md5,sha1 by default).
- max_age: the maximum time to keep a checksum value in the cache; 0 disables caching completely, and off caches "forever" (that is, until the files get changed).
Make sure the remote has a colon in it, or rclone will use a local directory of that name. So if you use a remote of `/local/path`, rclone will handle hashes for that directory.
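As a rough illustration, the hasher section described above can be generated with a few lines of Python. This is only a sketch: the section name Hasher2, the 24h max_age value and the helper name `hasher_section` are assumptions chosen for the example, and you would still paste the result into your own rclone.conf by hand.

```python
import configparser
import io


def hasher_section(name, remote, hashes=("md5", "sha1"), max_age="off"):
    """Render an rclone.conf section for a hasher remote as text."""
    config = configparser.ConfigParser()
    config[name] = {
        "type": "hasher",
        "remote": remote,                # the base remote or local path to wrap
        "hashes": ",".join(hashes),      # comma-separated checksum types
        "max_age": max_age,              # "0" disables caching, "off" caches forever
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Example values only: a hasher named Hasher2 wrapping the local directory /local/path.
print(hasher_section("Hasher2", "/local/path", max_age="24h"))
```

Printing the text rather than writing the file directly keeps the sketch from touching your real configuration.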
Configuration Reference
The configuration reference lists the standard options specific to hasher, the backend that provides better checksums for other remotes.
Each option can be set in the config file or on the command line: remote (`--hasher-remote`) is the remote to cache checksums for, hashes (`--hasher-hashes`) is the comma-separated list of checksum types, and max_age (`--hasher-max-age`) is the maximum time to keep checksums in the cache.
Better checksums pay off most with large files or large trees, where re-reading data just to hash it is expensive. With these options set, rclone can check files against cached hashes instead.
Hashing
Hashing is an essential part of rclone, allowing you to verify the integrity of your files. The checksum types a hasher remote handles are set by the hasher-hashes configuration, which is a comma-separated list.
Rclone's hasher-hashes configuration includes md5 and sha1 by default. You can also specify additional checksum types if needed. For example, if you need to support sha256, you can add it to the list.
When a hash is requested, rclone first checks whether it is supported by the lower level (the base remote) and, if so, uses it directly. If not, it either calculates the hash on the fly or builds an object fingerprint that it uses to look the hash up in the cache. This lets rclone avoid re-reading files whose hashes are already known.
Here are the default checksum types in rclone, as specified in the hasher-hashes configuration:
- md5
- sha1
You can check the rclone documentation for more information on how to configure and use hasher-hashes.
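The lookup process described in this section can be sketched as a small Python model. This is not rclone's code: the `RemoteObject` class, the `lookup_hash` function, the dictionary cache and the size-plus-modtime fingerprint are all simplifications made up for illustration, and the `auto_size` threshold is modeled on my understanding of hasher's option of the same name, so treat it as an assumption.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class RemoteObject:
    """Toy stand-in for a file on the base remote (hypothetical model)."""
    path: str
    data: bytes
    modtime: float
    native_hashes: dict = field(default_factory=dict)  # hashes the base remote provides


def fingerprint(obj):
    """Identify an object's current state by size and modification time."""
    return (len(obj.data), obj.modtime)


def lookup_hash(obj, hash_type, cache, auto_size=0):
    """Return (hash, source) following a simplified hasher-style lookup."""
    # 1. The base remote supports the hash natively: pass it through.
    if hash_type in obj.native_hashes:
        return obj.native_hashes[hash_type], "base remote"
    # 2. Small objects are simply hashed on the fly.
    if len(obj.data) < auto_size:
        return hashlib.new(hash_type, obj.data).hexdigest(), "on the fly"
    # 3. Otherwise consult the cache, guarded by the object's fingerprint.
    current = fingerprint(obj)
    entry = cache.get(obj.path)
    if entry and entry["fingerprint"] == current and hash_type in entry["hashes"]:
        return entry["hashes"][hash_type], "cache"
    # 4. Miss or stale entry: re-read the object and store the result.
    if entry is None or entry["fingerprint"] != current:
        entry = {"fingerprint": current, "hashes": {}}
        cache[obj.path] = entry
    digest = hashlib.new(hash_type, obj.data).hexdigest()
    entry["hashes"][hash_type] = digest
    return digest, "computed and cached"


demo_cache = {}
demo = RemoteObject("dir/file.bin", b"hello", modtime=1.0)
print(lookup_hash(demo, "sha1", demo_cache)[1])  # first call has to read the data
print(lookup_hash(demo, "sha1", demo_cache)[1])  # second call is served from the cache
```

The fingerprint check is what keeps the cache honest: once the size or modification time changes, the stored entry no longer matches and the hash is recomputed.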
Pre-Seed and Cache
Pre-seeding with a SUM file is a one-time action that fills in cache entries with fingerprints from the file. The command will not check if the supplied values are correct, so you must know what you're doing.
Paths in the SUM file are treated as relative to the hasher directory. This means each entry should be written relative to the directory you pass to the command, not as a path on your local disk.
The SUM file will not get "attached" to the remote, and cache entries can still be overwritten later if the object's fingerprint changes.
The stickyimport command is similar to import but works much faster by skipping the initial tree walk and stat checks. It creates sticky entries bound to the file name alone, ignoring size, modification time, etc.
Here are some key differences between import and stickyimport:
- Import requires a tree walk, while stickyimport skips it.
- Import binds cache entries to file fingerprints, while stickyimport binds them to the file name alone.
- Import can be slower due to stat checks, while stickyimport is faster.
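To make the difference concrete, here is a toy model of how a fingerprint-bound entry and a sticky entry behave once a file changes. The entry layout and the `is_entry_valid` helper are invented for this illustration and are not how rclone stores its cache.

```python
def is_entry_valid(entry, name, size, modtime):
    """Decide whether a cached hash may still be served for a file."""
    if entry["name"] != name:
        return False
    if entry["sticky"]:
        # Sticky entries are bound to the file name alone.
        return True
    # Regular entries must also match the recorded size and modification time.
    return entry["fingerprint"] == (size, modtime)


imported = {"name": "a.txt", "sticky": False, "fingerprint": (5, 100.0)}
sticky = {"name": "a.txt", "sticky": True, "fingerprint": None}

# The file has since grown from 5 to 6 bytes and has a newer modification time.
print(is_entry_valid(imported, "a.txt", 6, 200.0))
print(is_entry_valid(sticky, "a.txt", 6, 200.0))
```

The sticky entry keeps answering for the changed file, which is the consistency you give up in exchange for the faster import.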
Pre-Seed from Sum File
Hasher supports two backend commands for pre-seeding: generic SUM file import and faster but less consistent stickyimport. The generic SUM file import command can use any hash supported by the remote, not just SHA1.
You can point the command to a local or remote text file in SUM format, and it will parse the file and fill in the cache entries accordingly. Paths in the SUM file are treated as relative to hasher:dir/subdir.
Here are some key things to keep in mind when using the generic SUM file import command:
- Paths in the SUM file are treated as relative to hasher:dir/subdir.
- The command will not check that supplied values are correct.
- This is a one-time action.
- The tree walk can take long depending on the tree size.
The command will not attach the SUM file to the remote, so cache entries can still be overwritten later if the object's fingerprint changes. You can increase the number of checkers with the `--checkers` flag to make the tree walk faster.
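If you don't already have a SUM file, one way to produce it is sketched below. It writes the usual "digest, two spaces, path" lines with paths relative to the directory being hashed, which is how the import command resolves them. The function name `write_sum_file` and the choice of SHA1 are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def write_sum_file(root, sum_path, hash_type="sha1"):
    """Write '<hex digest>  <relative path>' lines for every file under root."""
    root = Path(root)
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.new(hash_type)
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large files are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        # Paths are written relative to root, with forward slashes.
        lines.append(f"{digest.hexdigest()}  {path.relative_to(root).as_posix()}")
    Path(sum_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Calling `write_sum_file("/local/path", "/tmp/SHA1SUM")` would then give you a file to hand to the import command, which, as far as I understand the rclone documentation, takes the form `rclone backend import hasher:dir/subdir SHA1 /path/to/SHA1SUM`. Since the import does not verify the values, generate the file from data you trust.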
Cache Storage
Cache storage is a crucial aspect of rclone's efficiency. Rclone stores cached checksums as bolt database files under the rclone cache directory, usually ~/.cache/rclone/kv/.
Databases are maintained one per base backend, named like BaseRemote~hasher.bolt. This means you'll have a separate database for each base backend you're using.
Checksums for multiple aliases into a single base backend are stored in that one database. This is useful for managing multiple paths that point to the same base backend.
Local paths are treated as aliases into the local backend (unless encrypted or chunked) and stored in ~/.cache/rclone/kv/local~hasher.bolt. This is a convenient way to manage local files.
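The naming scheme above is easy to reproduce if you want to find or back up a particular database. This is a small sketch; it assumes the default ~/.cache/rclone location mentioned earlier, and the helper name `hasher_db_path` is made up for the example.

```python
from pathlib import Path


def hasher_db_path(base_remote, cache_dir=None):
    """Return the expected bolt database path for a base remote's hasher cache."""
    # Assumption: the usual cache directory unless another one is passed in.
    cache_dir = Path(cache_dir) if cache_dir else Path.home() / ".cache" / "rclone"
    return cache_dir / "kv" / f"{base_remote}~hasher.bolt"


print(hasher_db_path("local"))
```

If your cache directory lives elsewhere, pass it as the second argument instead of relying on the default.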
Basic Operations
Once a hasher remote is configured, for example one named Hasher2, you can update hashes for a specific file or directory by referring to it as "Hasher2:subdir/file" instead of going through the base remote.
Hasher will automatically update the cache with new checksums when a file is fully read or overwritten.
To refresh all cached checksums for a subtree, you can re-download all files in the subtree by running `rclone hashsum` with the `--download` flag and any supported hash type on the command line.
This command will re-read the files in the subtree, updating the cache with the latest checksums.
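If you want to script the refresh, a thin Python wrapper around that command could look like this. It is a sketch under assumptions: the function names are made up, MD5 is just one possible hash type, and the remote name Hasher2 is only an example.

```python
import shutil
import subprocess


def refresh_command(subtree, hash_type="MD5"):
    """Build the rclone invocation that re-reads every file under subtree."""
    return ["rclone", "hashsum", hash_type, "--download", subtree]


def refresh_cache(subtree, hash_type="MD5"):
    """Run the refresh, discarding the listing because only the re-read matters."""
    if shutil.which("rclone") is None:
        raise RuntimeError("rclone was not found on PATH")
    subprocess.run(refresh_command(subtree, hash_type),
                   stdout=subprocess.DEVNULL, check=True)


# Show the command without running it.
print(" ".join(refresh_command("Hasher2:path/to/subtree")))
```

Keep in mind that this downloads every file in the subtree, so it is best reserved for cases where the cache really needs rebuilding.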