ARPA2 Identity: Local-part Encryption

This document details a procedure that may be used to encrypt most of the local-part of an ARPA2 Identity.

Signature Format

The local-part is quite small; it allows at most 64 characters of content. Signatures come with a number of fields, which we shall represent as compactly as possible.

The signature is normally written in BASE32 form. It encodes an operator-defined number of bits for a key identity, followed by the beginning of a hash. The highest bits of the first byte of the hash are the highest bits of the first character of BASE32 after the key identity.

The operator chooses locally how long the key identity and hash are, in bits. The total number of bits is rounded up to match with a character ending of the encoding.
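The packing described above could be sketched as follows. This is only an illustration under stated assumptions: the name pack_signature and the lowercase RFC 4648 alphabet are hypothetical choices, the key identity bits and hash bits are assumed to be concatenated most-significant-bit first, and the rounding up is assumed to be filled with further hash bits. When the key identity length is not a multiple of 5, this sketch lets the hash start in the middle of a character; the text does not settle whether that is intended.

```python
# Assumed alphabet: RFC 4648 BASE32, written in lowercase.
B32_ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"

def pack_signature(key_id: int, key_id_bits: int,
                   digest: bytes, min_hash_bits: int) -> str:
    """Encode key identity bits followed by leading hash bits (hypothetical helper)."""
    if key_id >> key_id_bits:
        raise ValueError("key identity does not fit in key_id_bits")
    # Round the total up to a whole number of 5-bit characters.
    total_bits = key_id_bits + min_hash_bits
    total_bits += -total_bits % 5
    hash_bits = total_bits - key_id_bits
    if hash_bits > 8 * len(digest):
        raise ValueError("digest too short for the requested signature")
    # Take the leading (most significant) bits of the hash output.
    top_hash = int.from_bytes(digest, "big") >> (8 * len(digest) - hash_bits)
    value = (key_id << hash_bits) | top_hash
    # Emit one character per 5 bits, most significant group first.
    return "".join(B32_ALPHABET[(value >> shift) & 31]
                   for shift in range(total_bits - 5, -1, -5))
```

For example, a 5-bit key identity with a minimum of 8 hash bits gives 13 bits, which is rounded up to 15 bits, so three characters carrying 10 hash bits.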

BASE32 is used because it is case-insensitive. Since this is a local convention, administrators may instead assume that all foreign mail servers retain case and therefore require reply addresses to be matched case-sensitively; in that case they might opt for another encoding that captures more bits in fewer characters.

The purpose of the key identity is to allow key rollover without having to try a signature against every key in use. The length of the key identity may be set per key, so the administrator may adjust it with experience. Keys with the longest key identity are probably the most recent, so they are good to test first. In general, the key identity should be assumed to allow a key to be ruled out quickly, but it may still match multiple keys.
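A minimal sketch of this candidate selection is given here. It assumes that the decoded signature is available as an integer with a known bit length and that the key identity occupies its leading bits; the function name candidate_keys and the tuple layout are hypothetical.

```python
def candidate_keys(sig_value: int, sig_bits: int, keys):
    """Return keys whose identity matches the leading signature bits.

    keys is a list of (key_id_bits, key_id, label) tuples (assumed layout).
    The result is ordered with the longest key identity first.
    """
    matches = []
    for key_id_bits, key_id, label in keys:
        if key_id_bits > sig_bits:
            continue  # signature too short to carry this key identity
        if sig_value >> (sig_bits - key_id_bits) == key_id:
            matches.append((key_id_bits, key_id, label))
    # Longest key identity first: probably the most recent keys.
    matches.sort(key=lambda entry: entry[0], reverse=True)
    return matches
```

Each candidate still needs a full signature check, because a short key identity may match more than one key.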

Key Information to Store

The signature uses a secure hash algorithm, set up as a local choice by an operator or by the software.

Keys hold as much entropy as the output of the hash. They are stored with the same size as the hash's block size, which can be longer than the output size. When sufficient entropy fits in fewer bits than the block size, the remainder may be filled with fixed bits, such as zeroes.

The information to save for a key involves:

The number of bits in the key identity.
The least number of bits in the signatures with this key.
The key identity as a number.
The hash algorithm used.
The key data for the block size of the hash algorithm.
The signature flags supported by this key.
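One way to hold these fields is sketched below. The class name SigningKey, the field names, the helper pad_key and the use of an integer for the flags are assumptions made for illustration; only the list of stored items comes from the text above.

```python
import hashlib
from dataclasses import dataclass

def pad_key(entropy: bytes, hash_name: str) -> bytes:
    """Fill key material up to the hash block size with zero bytes."""
    block_size = hashlib.new(hash_name).block_size
    if len(entropy) > block_size:
        raise ValueError("key material exceeds the hash block size")
    return entropy + bytes(block_size - len(entropy))

@dataclass(frozen=True)
class SigningKey:
    key_id_bits: int      # number of bits in the key identity
    min_sig_bits: int     # least number of bits in signatures with this key
    key_id: int           # the key identity as a number
    hash_name: str        # hash algorithm, as a hashlib name
    key_data: bytes       # key data, exactly one hash block long
    flags_supported: int  # signature flags supported (assumed bit field)

    def __post_init__(self):
        block_size = hashlib.new(self.hash_name).block_size
        if len(self.key_data) != block_size:
            raise ValueError("key data must match the hash block size")
        if self.key_id >> self.key_id_bits:
            raise ValueError("key identity does not fit in key_id_bits")
```

With SHA-256, for instance, 32 bytes of entropy would be padded to the 64-byte block size before storage.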

Selective use of Signature Flags

Applications such as Access Control Lists may impose minimum requirements on Signature Flags. When they do, they express this as the set of flag bits that are minimally required. This set is itself a Signature Flags field; any acceptable signature must have at least these bits set (since every flag is restrictive).
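If the flags are held as an integer bit field (an assumption; the text does not fix a representation), the minimum requirement reduces to a mask test. The function name flags_acceptable is hypothetical.

```python
def flags_acceptable(signature_flags: int, required_flags: int) -> bool:
    """True when every required flag bit is also set in the signature flags."""
    return (signature_flags & required_flags) == required_flags
```

A signature may carry more flags than required; it is only rejected when a required bit is missing.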

Signature Calculation

The signature is computed over a sequence of fields, as indicated by the signature flags in the order in which these are mentioned. Nothing of the local-part is included yet, however.

Before anything else, the key data, with the same size as a block size of the hash algorithm, is included.

The first data to include after the key data are the signature flags in their BASE32 representation. These flags describe where they terminate, so there is no need to prefix this data with a length.

Each user data part starts with a 16-bit count of the included data bytes, in network byte order, followed by that many bytes; a part can therefore not exceed 65535 bytes. Anything longer is rejected. This applies to the data per flag, as well as to the local identity data described below.
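The length prefix can be sketched with the standard struct module, where the format "!H" denotes an unsigned 16-bit integer in network byte order. The function name length_prefixed is hypothetical.

```python
import struct

def length_prefixed(data: bytes) -> bytes:
    """Prefix data with its length as 16 bits in network byte order."""
    if len(data) > 0xFFFF:
        raise ValueError("user data part exceeds 65535 bytes")
    return struct.pack("!H", len(data)) + data
```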

If encryption is requested, a clone of the hash state is needed before including the local part. This is because we need to encrypt and sign. The clone is extended with a literal string (not prefixed with a length) that holds the printable ASCII codes for

-----ARPA2 IDENTITY ENCRYPTION BLOCK 0-----

and then the hash is computed. The outcome serves as an encryption key, to be used for mapping the local identity as described below. Should that routine need more bits than one such calculation supplies, BLOCK 0 can be changed into BLOCK 1 and the hash computed once more (starting, of course, from a separate clone of the hash state), and so on.

The pre-clone hash state continues with the insertion of the decrypted local-part information of the local ARPA2 Identity. This may or may not involve freetext, as determined by the signature flags, and the result is one string, included in the pre-clone hash state with an update that starts with the usual 16-bit length prefix.
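With Python's hashlib, cloning a hash state corresponds to the copy() method of a hash object. The sketch below assumes the state has already absorbed the key data, flags and per-flag data; the function name encryption_block is hypothetical.

```python
import hashlib

def encryption_block(state, index: int) -> bytes:
    """Derive encryption key block number index from a clone of the state."""
    clone = state.copy()  # the original state is left untouched
    marker = f"-----ARPA2 IDENTITY ENCRYPTION BLOCK {index}-----"
    clone.update(marker.encode("ascii"))  # literal string, no length prefix
    return clone.digest()
```

Because every block starts from a fresh clone, the original state can still be extended with the local-part afterwards.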

The signature calculation is terminated with the string holding the printable ASCII codes for

-----ARPA2 IDENTITY SIGNATURE BLOCK 0-----

As before, additional bits may be created by replacing BLOCK 0 with BLOCK 1 and so on.
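Putting the steps of this section together, the signature hash might be computed as sketched below. The function name signature_blocks and its parameters are hypothetical, the flags are assumed to arrive already in BASE32 text form, and the per-flag data is assumed to be supplied as a list in flag order.

```python
import hashlib
import struct

def signature_blocks(key_data: bytes, flags_b32: str, flag_data: list,
                     local_part: bytes, hash_name: str = "sha256",
                     count: int = 1) -> bytes:
    """Compute count signature blocks over key, flags, flag data and local-part."""
    state = hashlib.new(hash_name)
    if len(key_data) != state.block_size:
        raise ValueError("key data must match the hash block size")
    state.update(key_data)                   # key data comes first
    state.update(flags_b32.encode("ascii"))  # flags, without length prefix
    for part in list(flag_data) + [local_part]:
        if len(part) > 0xFFFF:
            raise ValueError("user data part exceeds 65535 bytes")
        state.update(struct.pack("!H", len(part)) + part)
    blocks = []
    for index in range(count):
        clone = state.copy()                 # one clone per output block
        marker = f"-----ARPA2 IDENTITY SIGNATURE BLOCK {index}-----"
        clone.update(marker.encode("ascii"))
        blocks.append(clone.digest())
    return b"".join(blocks)
```

The leading bits of the returned bytes would then be truncated to the configured signature length and written in BASE32.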

The length of the signature is bounded below by a minimum. There are two ways to grow beyond this value:

When there are extra bits due to the encoding, which gives blocks of 5 or 6 bits.
When less than the full 64 bytes are used in the local-part, and signer policy dictates that a signature always fills the remainder to the formally permitted maximum.

Encryption for Local Parts

The local parts can only be encrypted insofar as their values are not incorporated into the signature, or at least not before the cloning for derivation of an encryption key. This is why the base identity and freetext are incorporated in the signing scheme after the cloning part. As a result, these parts can be encrypted.

Though not currently done, it would also be possible to apply encryption to a subdomain. This is not simple and not yet defined. It would involve an extra flag whose interpretation would be mandatory, as for all flags, so such an extension can be made later on.

For all the characters open to encryption, we define a class of characters, and roll with the key material.

When the local part is considered case sensitive, the following classes of characters are subject to encryption:

All lowercase and uppercase letters and digits

When the local part is considered case insensitive, the following classes of characters are subject to encryption:

All lowercase letters and digits
All uppercase letters are mapped to lowercase

Each class needs a number of bits to describe its range, rounded up to the nearest integer. That many bits are taken from the hash that was derived as an encryption key. If the value they form falls outside the range, these bits are all dropped and the next group of that many bits is taken. (As an alternative, one might div/mod by the range of the class.) When a full group no longer fits in what remains of the hash outcome, the remainder of the block is dropped and the next block of encryption output must be used to continue.

Since we have a sequence of random bits we can make the one-time pad assumption. Once we have a random part with the same range as the class of a character, we can offset the value by that number, cycling around, to find the new value. Encryption adds a value with wrap-around, decryption subtracts the same value with wrap-around. The rules above are made to avoid reuse of the same bits and any bias is removed, so the one-time pad assumption is reasonable.
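The mapping can be sketched for the case-insensitive setting as follows. Several points are assumptions rather than part of the description: lowercase letters and digits are treated as one class of 36 symbols, bits are taken from each block most significant first, and the names PadReader and transform are hypothetical. The state argument is a hashlib object that has absorbed everything up to the cloning point.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase + string.digits  # assumed single class of 36

class PadReader:
    """Hands out pad bits from successive encryption blocks."""
    def __init__(self, state):
        self.state, self.index, self.bits, self.left = state, 0, 0, 0

    def take(self, nbits: int) -> int:
        if self.left < nbits:  # drop the remainder, derive the next block
            clone = self.state.copy()
            marker = f"-----ARPA2 IDENTITY ENCRYPTION BLOCK {self.index}-----"
            clone.update(marker.encode("ascii"))
            digest = clone.digest()
            self.index += 1
            self.bits, self.left = int.from_bytes(digest, "big"), 8 * len(digest)
        self.left -= nbits
        return (self.bits >> self.left) & ((1 << nbits) - 1)

    def offset(self, size: int) -> int:
        nbits = (size - 1).bit_length()  # bits needed to describe the range
        while True:
            value = self.take(nbits)
            if value < size:             # out-of-range groups are dropped
                return value

def transform(text: str, state, decrypt: bool = False) -> str:
    pad, out = PadReader(state), []
    for ch in text:
        if ch in string.ascii_uppercase:
            ch = ch.lower()              # uppercase is mapped to lowercase
        pos = ALPHABET.find(ch)
        if pos < 0:
            out.append(ch)               # outside the class: passed unmodified
            continue
        shift = pad.offset(len(ALPHABET))
        out.append(ALPHABET[(pos - shift if decrypt else pos + shift) % len(ALPHABET)])
    return "".join(out)
```

Encryption and decryption consume the pad identically, because a character inside the class always maps to another character inside the class, so both directions skip the same positions.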

When the Signature Flag for encryption is set, the base part and freetext in the local-part are passed through this mapping; otherwise, these parts of the local-part pass unmodified.