Working with hash algorithms (The Libgcrypt Reference Manual)

7.2 Working with hash algorithms

To use most of these functions it is necessary to create a context; this is done using:

Function: gcry_error_t gcry_md_open (gcry_md_hd_t *hd, int algo, unsigned int flags)

Create a message digest object for algorithm algo. flags may be given as a bitwise OR of the constants described below. algo may be given as 0 if the algorithms to use are set later using gcry_md_enable. hd is guaranteed to receive either a valid handle or NULL.

For a list of supported algorithms, see Available hash algorithms.

The allowed values for flags are:

GCRY_MD_FLAG_SECURE

Allocate all buffers and the resulting digest in "secure memory". Use this if the hashed data is highly confidential.

GCRY_MD_FLAG_HMAC

Turn the algorithm into an HMAC message authentication algorithm. This only works if just one algorithm is enabled for the handle and that algorithm is not an extendable-output function. Note that the function gcry_md_setkey must be used to set the MAC key. The size of the MAC is equal to the size of the message digest of the underlying hash algorithm. If you want CBC message authentication codes based on a cipher, see Working with cipher handles.

GCRY_MD_FLAG_BUGEMU1

Versions of Libgcrypt before 1.6.0 had a bug in the Whirlpool code which led to a wrong result for certain input sizes and write patterns. Using this flag emulates that bug. This may for example be useful for applications which use Whirlpool as part of their key generation. It is strongly suggested to use this flag only if really needed; and if possible, the data should be re-processed using the regular Whirlpool algorithm.

Note that this flag applies to the entire hash context. If the need arises, it may be used to enable bug emulation for other hash algorithms. Thus you should not use this flag for a multi-algorithm hash context.

You may use the function gcry_md_is_enabled to later check whether an algorithm has been enabled.

If you want to calculate several hash algorithms at the same time, you have to use the following function right after the gcry_md_open:

Function: gcry_error_t gcry_md_enable (gcry_md_hd_t h, int algo): Add the message digest algorithm algo to the digest object described by handle h. Duplicated enabling of algorithms is detected and ignored.

If the flag GCRY_MD_FLAG_HMAC was used, the key for the MAC must be set using the function:

Function: gcry_error_t gcry_md_setkey (gcry_md_hd_t h, const void *key, size_t keylen): For use with the HMAC feature or BLAKE2 keyed hash, set the MAC key to the value of key of length keylen bytes. For HMAC, there is no restriction on the length of the key. For keyed BLAKE2b hash, length of the key must be in the range 1 to 64 bytes. For keyed BLAKE2s hash, length of the key must be in the range 1 to 32 bytes.

After you are done with the hash calculation, you should release the resources by using:

Function: void gcry_md_close (gcry_md_hd_t h): Release all resources of hash context h. h should not be used after a call to this function. A NULL passed as h is ignored. The function also zeroises all sensitive information associated with this handle.

Often you have to do several hash operations using the same algorithm. To avoid the overhead of creating and releasing a context, a reset function is provided:

Function: void gcry_md_reset (gcry_md_hd_t h): Reset the current context to its initial state. This is effectively identical to a close followed by an open and enabling all currently active algorithms.

Often it is necessary to start hashing some data and then continue to hash different data. To avoid hashing the same data several times (which might not even be possible if the data is received from a pipe), a snapshot of the current hash context can be taken and turned into a new context:

Function: gcry_error_t gcry_md_copy (gcry_md_hd_t *handle_dst, gcry_md_hd_t handle_src): Create a new digest object as an exact copy of the object described by handle handle_src and store it in handle_dst. The context is not reset and you can continue to hash data using this context and independently using the original context.

Now that we have prepared everything to calculate hashes, it is time to see how it is actually done. There are two ways for this: one to update the hash with a block of memory and one macro to update the hash by just one character. Both methods can be used on the same hash context.

Function: void gcry_md_write (gcry_md_hd_t h, const void *buffer, size_t length): Pass length bytes of the data in buffer to the digest object with handle h to update the digest values. This function should be used for large blocks of data. If this function is used after the context has been finalized, it will keep on pushing the data through the algorithm specific transform function and change the context; however the results are not meaningful and this feature is only available to mitigate timing attacks.

Function: void gcry_md_putc (gcry_md_hd_t h, int c): Pass the byte in c to the digest object with handle h to update the digest value. This is an efficient function, implemented as a macro to buffer the data before an actual update.

The semantics of the hash functions do not provide for reading out intermediate message digests because the calculation must be finalized first. This finalization may for example include the number of bytes hashed in the message digest or some padding.

Function: void gcry_md_final (gcry_md_hd_t h): Finalize the message digest calculation. This is not really needed because gcry_md_read and gcry_md_extract do this implicitly. After this has been done no further updates (by means of gcry_md_write or gcry_md_putc) should be done; however, to mitigate timing attacks it is sometimes useful to keep on updating the context after having stored away the actual digest. Only the first call to this function has an effect. It is implemented as a macro.

The way to read out the calculated message digest is by using the function:

Function: unsigned char * gcry_md_read (gcry_md_hd_t h, int algo): gcry_md_read returns the message digest after finalizing the calculation. This function may be used as often as required but it will always return the same value for one handle. The returned message digest is allocated within the message context and is therefore valid until the handle is released or reset (using gcry_md_close or gcry_md_reset) or until it has been updated as a mitigation measure against timing attacks. algo may be given as 0 to return the only enabled message digest or it may specify one of the enabled algorithms. The function returns NULL if the requested algorithm has not been enabled.

The way to read the output of an extendable-output function is by using the function:

Function: gpg_err_code_t gcry_md_extract (gcry_md_hd_t h, int algo, void *buffer, size_t length): gcry_md_extract returns output from an extendable-output function. This function may be called as often as required to generate more of the algorithm's output byte stream. It writes length new output bytes to buffer, which is always filled completely. algo may be given as 0 to select the only enabled algorithm or it may specify one of the enabled algorithms. The function returns a non-zero value if the requested algorithm has not been enabled.

Because it is often necessary to get the message digest of blocks of memory, two fast convenience functions are available for this task:

Function: gpg_err_code_t gcry_md_hash_buffers ( int algo, unsigned int flags, void *digest, const gcry_buffer_t *iov, int iovcnt )

gcry_md_hash_buffers is a shortcut function to calculate a message digest from several buffers. This function does not require a context and immediately returns the message digest of the data described by iov and iovcnt. digest must be allocated by the caller, large enough to hold the message digest yielded by the specified algorithm algo. This required size may be obtained by using the function gcry_md_get_algo_dlen.

iov is an array of buffer descriptions with iovcnt items. The caller should zero out the structures in this array and for each array item set the field .data to the address of the data to be hashed and .len to the number of bytes to be hashed. If .off is also set, the data is taken starting at .off bytes from the beginning of the buffer. The field .size is not used.

The only supported flag value for flags is GCRY_MD_FLAG_HMAC which turns this function into an HMAC function; the first item in iov is then used as the key.

On success the function returns 0 and stores the resulting hash or MAC at digest.

Function: void gcry_md_hash_buffer (int algo, void *digest, const void *buffer, size_t length)

gcry_md_hash_buffer is a shortcut function to calculate a message digest of a buffer. This function does not require a context and immediately returns the message digest of the length bytes at buffer. digest must be allocated by the caller, large enough to hold the message digest yielded by the specified algorithm algo. This required size may be obtained by using the function gcry_md_get_algo_dlen.

Note that in contrast to gcry_md_hash_buffers this function will abort the process if an unavailable algorithm is used.

Hash algorithms are identified by internal algorithm numbers (see Available hash algorithms for a list). However, in most applications they are referred to by name, so two functions are available to map between string representations and hash algorithm identifiers.

Function: const char * gcry_md_algo_name (int algo): Map the digest algorithm id algo to a string representation of the algorithm name. For unknown algorithms this function returns the string "?". This function should not be used to test for the availability of an algorithm.

Function: int gcry_md_map_name (const char *name): Map the algorithm with name to a digest algorithm identifier. Returns 0 if the algorithm name is not known. Names representing ASN.1 object identifiers are recognized if the IETF dotted format is used and the OID is prefixed with either "oid." or "OID.". For a list of supported OIDs, see the source code at cipher/md.c. This function should not be used to test for the availability of an algorithm.

Function: gcry_error_t gcry_md_get_asnoid (int algo, void *buffer, size_t *length): Return a DER-encoded ASN.1 OID for the algorithm algo in the user-allocated buffer. length must point to a variable holding the available size of buffer; on return it receives the actual size of the returned OID. The returned error code may be GPG_ERR_TOO_SHORT if the provided buffer is too short to receive the OID; it is possible to call the function with NULL for buffer to have it only return the required size. The function returns 0 on success.

To test whether an algorithm is actually available for use, the following macro should be used:

Function: gcry_error_t gcry_md_test_algo (int algo): The macro returns 0 if the algorithm algo is available for use.

If the length of a message digest is not known, it can be retrieved using the following function:

Function: unsigned int gcry_md_get_algo_dlen (int algo): Retrieve the length in bytes of the digest yielded by algorithm algo. This is often used prior to gcry_md_read to allocate sufficient memory for the digest.

In some situations it might be hard to remember the algorithm used for the ongoing hashing. The following function might be used to get that information:

Function: int gcry_md_get_algo (gcry_md_hd_t h): Retrieve the algorithm used with the handle h. Note that this does not work reliably if more than one algorithm is enabled in h.

The following macro might also be useful:

Function: int gcry_md_is_secure (gcry_md_hd_t h): This function returns true when the digest object h is allocated in "secure memory"; i.e. h was created with the GCRY_MD_FLAG_SECURE flag.

Function: int gcry_md_is_enabled (gcry_md_hd_t h, int algo): This function returns true when the algorithm algo has been enabled for the digest object h.

Tracking bugs related to hashing is often a cumbersome task which requires adding a lot of printf statements to the code. Libgcrypt provides an easy way to avoid this: the actual data hashed can be written to files on request.

Function: void gcry_md_debug (gcry_md_hd_t h, const char *suffix): Enable debugging for the digest object with handle h. This creates files named dbgmd-<n>.<string> while doing the actual hashing. suffix is the string part in the filename. The number is a counter incremented for each new hashing. The data in the file is the raw data as passed to gcry_md_write or gcry_md_putc. If NULL is used for suffix, the debugging is stopped and the file closed. This is only rarely required because gcry_md_close implicitly stops debugging.