### Entropy is forever ...

Dear all:
Quoting the basic definition of entropy from Wikipedia, "In information
theory, entropy is the average amount of information contained in each
message received. Here, message stands for an event, sample or character
drawn from a distribution or data stream." In applied cryptography, the
entropy of a truly random source of "messages" is an important
characteristic to ascertain. There are significant challenges when
applying the information theory, probability, and statistics concepts to
applied cryptography. The truly random source (and computations using
random message data) must be kept secret. Also the following dilemma
should be noted: the truly random source is needed in a digital
processor system typically engineered with determinism as a design goal
derived from the basic reliability requirement. Quantitatively, the
entropy measure for applied cryptography, in the order of hundreds of
bits, is way beyond any workable statistical analysis processes. In
practice, a truly random source usable for applied cryptography is a
system procurement issue that can seldom be blindly delegated as an
ordinary operating system service. Thus, one wants a reliable source of
uncertainty, a trustworthy one than can barely be tested, as a universal
operating system service totally dependent on hardware configuration.
Applying the information theory to actual situations is error-prone. Is
there a lower entropy in "Smith-725" than in "gg1jXWXh" as a password
string? This question makes no sense as the entropy assessment applies
to the message source. A password management policy that rejects
"Smith-725" as a message originating from the end-user population
actually constraints this message source with the hope of a higher
average amount of information in user-selected passwords. From a single
end-user perspective having to deal with an ever growing number of
passwords, the entropy concept appears as a formalization of the
impossible task he/she faces.
Significant conceptual deviation may occur from the common (and correct)
system arrangement where a software-based pseudo-random number generator
(PRNG) of a suitable type for cryptography is initially seeded from a
secret true random source and then used for drawing secret random
numbers. It is often inconvenient for statistical testing to apply
directly to the true random source messages, but statistical testing of
the PRNG output gives no clue about the true random source. The design
of PRNG seeding logic is an involved task dependent on the true random
source which may be hard to modelize in the first place. In actual
system operations, the inadequate seeding may have catastrophic indirect
consequences but it may be difficult to detect, and it is certainly a
challenging error condition for service continuity (programmers may be
inclined to revert to insecure PRNG seeding when the proper true random
source breaks down).
Despite these pitfalls, I assume my reader to share my endorsement of
the true random seeding of a cryptographic PRNG as the main source of
random secrets for a digital processor system dedicated to cryptographic
processing. As this PRNG output is being used in various ways, chunks of
the output sequence may be disclosed to remote parties. It is an
essential requirement for a cryptographic PRNG that no output chunk may
allow the recovery of its internal state (i.e. some data equivalent to
PRNG seed data leading to the same PRNG output sequence as the secret PRNG).
In this note, I challenge the view that an entropy pool maintained by an
operating system ought to be depleted as it is used. I am referring here
to the Linux "entropy pool." My challenge does not come through a review
of the theory applied to the implementation. Instead, I propose a
counter-example in the form of the above arrangement and a very specific
example of its use.
The central question is this problem. A system is booted and receives
2000 bits of true randomness (i.e. a 2000 bits message from a source
with 2000 bits of entropy) that are used to seed a cryptographic PRNG
having an internal state of 2000 bits. This PRNG is used to generate 4
RSA key pairs with moduli sizes of 2400 bits. The private keys are kept
secret until their use in their respective usage contexts. No data leak
occurred during the system operation. After the key generation, the
system memory is erased. What is the proper entropy assessment for each
of the RSA key pairs (assume there are 2^2000 valid RSA moduli for a
moduli size of 2400 bits, a number-theoretic assumption orthogonal to
the entropy question)?
My answer is that each of the 4 RSA key pairs are independently backed
by 2000 bits of entropy assurance. The entropy characterization
(assessment) of a data element is a meta-data element indicating the
entropy of a data source at the origin of the data, plus the implicit
statement that no information loss occurred in the transformation of the
original message into the assessed data element. Accordingly, my answer
should be made more precise by referring to an unbiased RSA key
generation process (which should not be considered a reasonable
assumption for the endorsement of lower ranges of entropy assessments).
To summarize, the entropy assessment is a characterization of a the data
source being used as a secret true random source. It also refers to the
probability distribution of messages from the data source and the
quantitative measure of information contents derived from the
probability distribution according to the information theory. This
mathematical formalism is difficult to apply to actual arrangements
useful for cryptography, notably because the probability distribution is
not reflected in any message. The information theory is silent about the
secrecy requirement essential for cryptographic applications. Maybe
there is confusion by assuming that entropy is lost when part of the
random message is disclosed, while only (!) data suitability for
cryptographic usage is being lost. In applying the information theory to
the solution of actual difficulties in applied cryptography, we should
address secrecy requirements independently. The probability distribution
preservation through random message transformations is an important
lesson from the theory that might have been overlooked (at least as an
explicit requirement).
A note about the genesis of the ideas put forward. In my efforts to
design applied cryptography key management schemes without taking
anything for granted and paying attention to the lessons from the
academia and their theories, I came with a situation very similar to the
above problem statement. The 2000 bit random message from a 2000 bits
entropy truly random source is a simplification to the actual situation
in which a first message transformation preserves the probability
distribution of random dice shuffling. In the above problem statement,
the PRNG seeding is another distribution preserving transformation. The
actual PRNG is based on the Blum-Blum-Shub x^2 mod N generator, which
comes with two bits of entropy loss upon seeding. The above problem
statement is thus concrete.
Maybe the term entropy is used, more or less by consensus, with a
definition departing from the information theory. Indeed, NIST documents
covering the topic of secret random numbers for cryptography use
conflicting definitions surrounding the notion of entropy.
Although my own answer to the stated problem puts into question the
Linux "entropy pool" depletion on usage, I do not feel competent to make
suggestions. For instance, my note hints that a PRNG algorithm selection
should be part of the operating system service definition for
*/dev/*?random offered for cryptographic purposes but I have just a vague
idea of whether and how the open source community might move in this
direction.
Entropy is forever ... until a data leak occurs.
A diamond is forever ... until burglars break in.
Regards,
- Thierry Moreau