Cryptology with small message spacesWayne Patterson
Howard University
Cybersecurity Education and Research Center
Small Message SpacesConsider the standard cryptographic environment with a message space M, a cipher text space C, and a family of transformations parameterized by the elements of a key space K. Consider further the extraordinary restriction that M is small, that is that there are a very limited number of messages to pass.
Although this is somewhat of a pathological example, it certainly has its analog both historically and in contemporary practice. For example, on a battlefield, in a critical situation, only a limited number of messages are required to be transmitted among the troops: “advance”, “retreat”, “flank left”, “flank right”, and so on. We will give another example later that is perhaps more common in contemporary society.
Current cryptosystems in widespread use are designed to be flexible, secure, and efficient with very large message spaces. For example, in the Advanced Encryption Standard (AES), the simplest configuration allows for messages drawn from a message space of cardinality
2^{128}, which is approximately
3 x 10^{38}. In the RivestShamirAdleman Public Key
Cryptosystem (RSA), the designers’ recommendation is that the message
space have cardinality approximately 10^{400}.
In a cryptosystem where the number of possible messages is small  in the previous example and in one to follow M < 100  either AES or RSA can be broken.
Consequently, if we were to use, for example, an RSA cryptosystem for this restricted case, with the small set of potential messages m1, m2, …, mk, then the cryptanalyst monitors and stores c1 = m1e (mod n), … and assuming he or she has a known plaintext, then the ci’s can be easily mapped back to the mi.
RSA for Small Message Spaces
There are several methods that will allow us to modify standard cryptosystems to enable them to be used in the situation with small message spaces. For example:
Key generation: p, q are large primes as in the RivestShamirAdleman
public key cryptosystem. They are kept secret. n = p × q, and e, a
number pseudorandomly chosen that is relatively prime to
f(n)
are made public, as is a hash function H with the property that it is
easy to generate integers k in the range [2,n1] that will hash to any
given value. A simple example of such a function could be the sum of the
individual digits. The final element of the cryptosystem is also secret:
it is d, the inverse of e mod (f(n)),
that is e× d = 1 (mod (f(n)).
Preprocessing: For each element
in the small message space, a
number m’ in the range [2,n1] is pseudorandomly generated by Alice and
sent to Bob. This initial phase is not publickey, as the correspondence
between the element of the message space m and the corresponding m’ must
be communicated to Bob using a common key.
Encryption: Alice chooses the element m to be
communicated, and computes its hash value H(m); then she finds another
m’ such that H(m) = H(m’), encrypts m’ (computes c’ = m’^{e}(mod
n) and sends to Bob.
Decryption: Bob receives c’, decrypts using his
private key d, and obtains c’^{d} = m’ (mod n). Then he computes
H(m’), which is also H(m), and so he can compare H(m) to each of the
hash values of the original messages he had received in the
preprocessing phase to obtain the required message.
Cryptanalysis: If Trudy intercepts c’, she cannot
compute the hash function H because the value of c’ yields no
relationship to the value of m’ or of H(m’). Thus Trudy is unable to
intercept the signal.
Although this is a simple enough modification to
RSA, it could be extended to other common cryptosystems. In addition, it
has been suggested that
either El Gamal approaches or probabilistic encryption could also be
used instead of the hash function approach described above.
A Second
Example
In June of 2010, I was invited to give a
presentation at the Baseball Hall of Fame in Cooperstown, New York, for
the 22^{nd} Annual Symposium on Baseball and American Culture,
on the subject of “The Cryptology of Baseball.” I had given numerous
talks at the Hall of Fame in the past, on various aspects of baseball
and how it impacts and is impacted by cultural factors in society. My
research and writing on baseball is really only an avocation for me.
As a professor of computer science over the past
thirty years, most of my scholarly work in the field has been with
respect to cryptorelated issues. So although I thought at first in
speaking to an audience of persons who study baseball that I would have
to address the issue in a very nontechnical fashion.
One might question
the value of encryption in baseball given the limited amount of
information being communicated. However, a baseball game can be decided
by one pitch, and consequently one pitch can determine the winner or
loser of a series. It should perhaps be noted that the share for the
World Series winning team in 2009 was $21,266,321.79, and for each
player earning a full share, $365,052.73.
The use of cryptography and cryptanalysis in the
baseball world is normally referred to as “signals” and “sign stealing”.
Sending signals to batters and runners provides the greatest requirement
for encryption, and as a consequence, the greatest opportunity for
cryptanalysis. These are signals that must be conveyed both to a batter
and any baserunner for the team at bat. The signals originate with the
manager, and are relayed to the thirdbase coach (always the thirdbase
coach and not the firstbase coach, although there is no valid reason
for this).
The batter and any baserunners receive their
signals from the thirdbase coach, and so this is the transaction we
have studied in the greatest detail.
The ThirdBase Coach Cryptosystem
As indicated, the most critical set of signals with
the greatest complexity and highest risk involved in the interception of
the signals are those that are given by the thirdbase coach to the
batter and any base runners.
In the thirdbase coach cryptosystem (henceforth,
TBC), the messages that need to be conveyed to the batter and
baserunner(s) largely define the following message space:
M_{TBC} = {
Steal, Hold, StealOnOverthrow, Take, Bunt, Swing, HitAndRun, RunAndHit,
SqueezeBunt, Null}
When the signal is to be given, the thirdbase
coach selects a number of body movements, such as touching his belt
(Belt), clapping his hands (Clap), touching the brim of his cap (Cap),
and so on. The collection of potential body movements we will call BODY.
An example might be:
BODY = {Belt, Clap, Cap, Leg, Nose,
Ear, Shoulder, Thigh, Cheek, Wipe }
Prior to the game, a mapping from the M_{TBC}
to BODY is defined, say s: M_{TBC}
® BODY. Touch the cap might mean bunt, for example, so
than s(Bunt) = Cap.
However, the ciphertext consists of a sequence
selected from the product set HS × BODY, such as (1, Belt), (2, Leg),
(3, Nose), (4, Ear), (5, Shoulder), (6, Clap). In order to decrypt the
ciphertext, the batter and runner, knowing the hot sign, look for the
body movement associated with the hot sign and perform the inverse map
that originally associated the plaintext message to the body movement.
As an example, if x = 4
Î HS; and the ciphertext sequence is (1, Belt), (2, Leg),
(3, Nose), (4, Ear), (5, Shoulder), (6, Clap) as indicated above, the
signal is the preimage of the value Ear in M_{TBC}. So if s^{1}(Ear)
= Take, then the correct decryption by the batter and runner would be
for the batter to take (that is, not swing) at the next pitch.
Thus in more standard crypto terminology, the “hot
sign” and the mapping s define the key.
A full description of the TBC cryptosystem and its
cryptanalysis will be found in the forthcoming paper
“The Cryptology of Baseball”, by this author, in the journal
Cryptologia, appearing in 2011.
Wayne Patterson
Howard University Cybersecurity Education and Research Center
December 31, 2010
