Archive for the 'Cryptology' Category

Geometry Club Talk: Computational aspects of ECDLP

Wednesday, April 23rd, 2008

On Friday I gave a geometry club seminar, speaking about some of the computational aspects of discrete-logarithm cryptography in general and as implemented for elliptic curves. My notes supplement rather than completely describe the talk, being heavier on the formalities and lighter on the narrative.

The topics covered are: Diffie-Hellman and one-way functions for key exchange; the generic Discrete Logarithm Problem and BSGS algorithm; scalar multiplication- addition chains, fast exponentiation, m-ary methods and windowing; group law implementations, Side-channel attacks and the Edwards form.

I’ve discussed several of these ideas elsewhere on this blog, as well as the cryptanalysis ideas I mentioned on the day but which are not in the notes. I also refered to a recent real-world example of a side-channel attack; see this story from The Register for details.

Tate pairing computation in SAGE III

Thursday, January 10th, 2008

The latest version of my ellnet class is ellnet2d_lowmem.spyx. It combines all the tricks I know of:

  • The use of precomputed inverses for all steps, and precomputed squares/products for each step, as described by Stange,
  • computation with a local vector to avoid overhead from function calls to keep the dictionary up-to-date,
  • mixed block lengths as described in the previous post,
  • and compilation to pyrex.

Thus it’s the fastest implementation I currently have for finding Tate pairings in SAGE (about twice as fast as accessing Stange’s PARI script from SAGE). Attach it in the normal way; example calculations are here.

Understanding Public Key Cryptography with Paint

Saturday, December 8th, 2007

(this post roughly corresponds to the narrative from part of a talk I’m preparing for S5 (approximately 16 years old) students, intended to portray a modern research area in an accessible manner. So I’d very much appreciate feedback in the comments!)

For centuries, cryptography - literally, `secret writing’ - has been used to securely send and receive messages. But although the sophistication of these systems increased, the core idea remained the same: combining a secret encryption rule with the plaintext message yields a ciphertext, from which the message is recovered by a corresponding decryption rule. Thus the secrecy of messages depended on preserving the secrecy of the cryptographic system (or at least certain parameters).

While this might be feasible for governments or armies, it leads to a fatal flaw when trying to communicate securely with a stranger, a task that underpins the millions of ecommerce transactions that take place every day, for instance. To share secrets, you must first share a secret, the particulars of the cryptographic system you wish to protect the message with. This presents a seemingly impossible hurdle: how can you share that first secret with a previously-uncontacted individual, if any instructions you give will also be available to your adversaries?

Public Key encryption is the solution to this problem; to get a feel for how this is achieved, we’ll consider a non-mathematical formulation in terms of mixing paint, before abstracting to the properties that make it work.

Secret Sharing with Paint

Suppose, then, that our protagonists, Alice and Bob, wish to share a secret: but all their communication is intercepted by an eavesdropper, Eve. How can Alice and Bob arrive at a colour without Eve also knowing it?

Alice and Bob are assumed to know a public base colour- and there’s no problem with Eve knowing this too. They then choose a private colour of their own, and combine some of that with the base colour to create a public mix. They can then send these mixtures to each other: Eve sees both public colours, but (since it’s a lot harder to unmix paint), has no idea what private colours were used to produce them.

Having received each others’ mix, Alice and Bob can then mix in their own private colour again, to produce a blend of three colours. But, each of them will have the same colour, since the order in which we mix paint is irrelevant. Eve, however, has no idea what this new mixture looks like.

The following table summarises who sees what, for a particular set of chosen private colours.

Useful Properties

What are the key ideas that make the example above work? and how can we mimic them mathematically?

Unmixing is necessary…

No combination of the colours Eve has seen will mix to give the fetching shade of mustard yellow that Alice and Bob know. Since they had to agree on the procedure, Eve would be able to recreate the desired shade if she knew either of the private colours, since then she could mix it with the corresponding public colour just like Alice and Bob. Unfortunately, the private colours are never disclosed, only their combination with the base colour. So Eve must analyse the public colours in the hope of extracting a private colour.

…but unmixing is hard!

Without an encyclopaedic knowledge of all combinations of paint, Eve cannot know what private colours have been used to generate the public ones. So her only apparent option is to keep trying candidates, mixing each of them with the base coat until she arrives at one of the public colours by sheer luck. This brute force approach will obviously take a very long time!

Fortunately, Alice and Bob don’t need to unmix.

For Alice and Bob, this is irrelevant- they’re only ever required to mix, which is much easier than unmixing. However, it’s vital that their two routes through the colours lead to the same result: that is, that Blue+Red+Green is the same as Blue+Green+Red.

Secret Sharing with Maths

This leads us in search of a ‘one-way function’: roughly speaking, a mathematical function with the property that it’s much more difficult to recover the inputs from the outputs (reverse the function) than it is to compute those outputs in the first place, thus satisfying the second property above. Moreover, we need a procedure by which Alice and Bob can make use of such functions to independently arrive at a mutual secret which cannot be obtained by Eve. To do so, we therefore require that the only way to deduce a shared secret is indeed to reverse the function (the first property). Finally, to make the whole thing work as described above, we require that two applications of the function can be performed in either order to give the same result. This is the third property, described mathematically as commutativity.

Unfortunately, no-one has been able to demonstrate a genuinely one-way function. Fortunately, there are a few candidates, for which even the best publically-known techniques for the reversal are painfully slow. But there remains a risk that someone, somewhere, will devise a smarter way to perform this mathematical unmixing, rendering the function useless for cryptographic application.

A candidate One-Way Function: Modular Exponentiation

For a number x, we say x is congruent to y modulo N (written x=y mod N) if y is the remainder after dividing x by N. This might seem a strange idea, but it’s something we do every day: a clock face works “mod 12″, so that if you add 6 to 7 you get 1, as 13 = 12*1 +1 = 1 mod 12.

So, for a fixed base g and modulus N (our equivalent of the base colour), we can compute the modular exponent of any x, defined as gx mod N (that is, multiply g by itselfx times, subtracting lots of N until the answer is at most N-1).

For instance, with a base of 2 and a modulus of 11, the modular exponent of x=6 is 26 mod 11. Working that out, we get 2*2*2*2*2*2 mod 11 = 64 mod 11 = 9 mod 11 since 64 = 5*11 +9.

The possibly surprising (but desired) result is that going backwards- that is, given y, finding x such that gx mod N =y mod N - is apparently hard for decently-sized N. This reversal is known as the Discrete Logarithm Problem, and generalises to some very abstract mathematical objects, known as finite groups, with varying difficulty.

For our example function, with N=21024+643 and assuming a computer capable of a billion tests per second, naively trying each possible private key in turn can take up to a staggering 3, 671743, 063000, 000000, 000000, 000000, 000000, 000000, 000000, 000000, 000000 years to match with a public key. This is significantly longer than the life of the universe so far - some 13700000000 years - and definitely longer than anyone is prepared to wait. Of course, there are smarter ways to try keys- and finding yet smarter ones for a given DLP is a very active area of research- but for such values of N none of the publically known ones fall within feasible time scales.

Further, we have the commutativity property we required: working out (ga)b is the same as (gb)a. So we should be able to share secret numbers via modular exponentiation just as we were able to share secret colours with paint mixing. This process is known as Diffie-Hellman Key Exchange.

Diffie-Hellman Key Exchange

  • Alice and Bob agree (in public) on a modulus N and a base g (the public base colour)
  • Each chooses a private key between 1 and N-1; a and b respectively (their private colour)
  • They each construct a public key by computing A=ga mod N and B=gb mod N (mixing private with base)
  • These can be safely exchanged, as it’s hard to get back a or b from the public keys A and B (unmixing hard)
  • Each then performs another modular exponentiation on the public key received:
    • Alice computes Ba mod N = (gb)a mod N = (gab) mod N = S mod N for some S between 0 and N-1.
    • Bob computes Ab mod N = (ga)b mod N = (gab) mod N = S mod N, the same S.

Hence (assuming N is chosen for the DLP to be sufficiently hard) Alice and Bob know a secret,S, which Eve does not.

Man-in-the-Middle Attacks

But is Diffie-Hellman truly secure? If we alter the ‘intruder power’, replacing our eavesdropper Eve with a more powerful character, the malicious Mallory, then we can construct a scenario in which Alice and Bob think they share a secret with each other, but instead share one with Mallory.

To achieve this, Mallory must be able to not just listen in on messages, but replace them with messages of his own. Given that power, he can pick a private key of his own, m, and generate the corresponding public key M, since the choice of g and N must be made in the clear.

Then, when Alice attempts to retrieve Bob’s public key B, Mallory instead supplies her with M, keeping A; and when Bob asks for Alice’s key A, revealing B, he is given M as well. Alice then computes S=Ma mod N, but Mallory has seen A and thus can compute S as Am mod N; Bob meanwhile computes T=Mb mod N which is also known to Mallory, being Bm mod N.

Thus, if Alice the tries to use a classical encryption system depending on the secrecy of S, then Mallory will be able to decode the ciphertext. Even more cleverly, he can re-encrypt it using T and forward it to Bob- who can decrypt it with the secret he thinks he shares with Alice, T. Thus no suspicion is raised, yet Mallory has also read the message, without ever having to reverse the one-way function.

Fortunately, this attack depends on near-total control of Alice and Bob’s communication. If Alice ever looks up Bob’s public key without Mallory intervening, then she’ll notice that it’s B, not M. Or, if she sends Bob a message encrypted with S that Mallory doesn’t intercept and repack, Bob will recieve a message that cannot be decrypted with the key he has, T: and knows that the Diffie-Hellman key exchange must have been compromised.

Conclusions

Identifying and preventing such attacks is part of cryptanalysis, and even with perfect cryptography, forms a vital part of designing secure communication systems. Since we don’t yet have a provably one-way function, cryptography itself remains an active field of mathematical research, drawing on a range of topics from both pure and applied areas to assess the difficulty of functions. Together, these two fields are known as cryptology; a subject which is becoming increasingly vital as computers and communication systems work themselves further into our everyday lives.

Tate pairing computation in SAGE II

Thursday, November 22nd, 2007

A couple of comments now that I’ve tested my Tate pairing procedures on some random input.

First, it turns out to be not too difficult to make use of PARI scripts from within SAGE. I spent ages trying to figure out how to call them through the gp interface, until I discovered that new functions simply become additional methods of the gp object; although some SAGE objects have to be converted to their PARI equivalent before you can use them as input. For instance, assuming you have Stange’s tate_via_nets script in your working directory, you can attach it with gp.read(”tate_via_nets.gp”). Then for SAGE points P,Q on a curve E with the order of P stored as n, the following will compute the Tate pairing via PARI:

Egp=gp(E)
Pgp=gp([gp(P.xy()[0]),gp(P.xy()[1])])
Qgp=gp([gp(Q.xy()[0]),gp(Q.xy()[1])])
gp.tate_pairing_alg(Egp,Pgp,Qgp,n)

It’s hard to get accurate measures of the time taken by a PARI subprocess (it’s invisible to SAGE’s cputime, and walltime will be influenced by any other running processes), and these calculations are pretty quick, but it seemed the PARI implementation was faster. So I’ve developed a SAGE version which avoids the overhead of dictionary read/writes by not caching intermediate terms; I’ve also set it up as a pyrex file, which despite containing purely python code (I’ve not ported any of the internals to C) is faster than both the old ellnet2d and the above PARI approach after a one-time compilation. You can download that here; then use attach “ellnet2d_lowmem.spyx” instead of attach “ellnet2d.sage” before proceeding as before.

The good news is that in every test I’ve run (over both prime and prime-power fields), the various implementations all agree. The bad news is that it doesn’t seem possible to compute elliptic curve point orders beyond about 64bits in SAGE (or rather in PARI, upon which it depends), and before that stage the order calculation is significantly slower than the pairing computation. It also seems that a random curve tends not to feature particularly high powers of 2 in its order, so the speed-up I suggested earlier doesn’t offer much advantage at all. Hopefully I can find a class of curves where it is of merit!

Tate pairing computation in SAGE

Thursday, November 8th, 2007

Eventually, then, we’re in a position to compute Tate pairings in SAGE. The necessary ingredients are an elliptic net class (the more general one or a faster version for pairings only), an eds class (this one based on Stange’s algorithm is recommended and used in the examples below; the old one should also work), and supporting procedures tate.

Load these in the usual fashion:

sage: attach "tate.sage"
sage: attach "ellnet1d.sage"
sage: attach "ellnet2d.sage"

We now need to introduce a curve and some points. Stange gives a simple example of y^2+y=x^3+x^2-2x over the field with 5 elements corresponding to points P=(0,0) and Q=(1,0). We enter the field, curve and points and set up an elliptic net:

sage: K=GF(5)
sage: E=EllipticCurve(K,[0,1,1,-2,0])
sage: E
Elliptic Curve defined by y^2 + y = x^3 + x^2 + 3*x over Finite Field of size 5
sage: P=E([0,0])
sage: Q=E([1,0])
sage: X=netFromPoints(P,Q)
sage: X

Elliptic net with (1,0) block:
                1       1       1
4       4       0       1       1       2       1       3

The block centred at 1 is displayed; this has been generated from the initial data, and suffices to compute any term of the form (x,0) or (x,1). If such a term is known, it is simply retrieved from the hash table; otherwise, the algorithm is employed. Requests for unreachable points give a ValueError:

sage: X.has_element((5,0))
True
sage: X[(5,0)]
3
sage: X.has_element((100,1))
False
sage: time X[(100,1)]
Computing terms
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.01
1
sage: X.has_element((100,1))
True
sage: time X[(100,1)]
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00
1
sage: X[(100,2)]
---------------------------------------------------------------------------
<type 'exceptions.ValueError'>            Traceback (most recent call last)

/home/s0677951/SAGE/eds/<ipython console> in <module>()

/home/s0677951/SAGE/eds/ellnet2d.py in __getitem__(self, A)
     41                                 # Term not entered and cannot be computed; give an error.
     42                                 raise ValueError, \
---> 43                                 "Term unavailable, cannot compute unless of form (x,0) or (x,1)"
     44
     45         def has_element(self, A):

<type 'exceptions.ValueError'>: Term unavailable, cannot compute unless of form (x,0) or (x,1)

You can also test for and print entire (small)blocks; this illustrates the route taken during computation of elements, for instance in computing the 18th term, the blocks with centres 1, 2, 4, 9 are constructed, then the small block centred on 18:

sage: X.has_block((9,0))
False
sage: X[(18,0)]
Computing terms
0
sage: X.has_block((9,0))
True
sage: X.print_block(9,0)
                1       1       3
4       3       2       0       3       2       1       2
sage: X.has_block((18,0))
False
sage: X.has_small_block((18,0))
True
sage: X.print_small_block(18,0)
                2       4       3
3       4       4       0       1       1       2

To compute the pairing of P and Q, use TatePairing(m,P,Q), where m is the order of P (determining this left as an exercise!):

sage: TatePairing(9,P,Q)
Computing terms
3

Finally, a call to TatePairing with P=Q will construct an eds rather than an ellnet. We can verify an example of Stange’s (see slides from ECC):

sage: K=GF(73)
sage: E=EllipticCurve(K,[0,1,1,-2,0])
sage: P=E([0,0])
sage: TatePairing(9,P,P)
Computing terms
24

Elliptic Nets

Thursday, November 8th, 2007

As always in mathematics, when something turns out to be useful, it’s worth asking whether it can be generalised to a broader setting. For Elliptic Divisibility Sequences, that generalisation is Elliptic Nets. For a free abelian group of finite rank, this is a function W satisfying the condition:

for any p,q,r,s from the group.
(EDS are then recovered as the special case where the chosen group is the integers).

The theory of elliptic nets has been developed extensively by Katherine Stange, and the papers/presentations from her website provide a wealth of information. I’ve been writing a SAGE implementation of her results, from which we need a few key ideas.

What’s remarkable is that it is not just that sequences can be generalised to arrays: the connection to elliptic curves is also preserved. That is, given n distinct points of an elliptic curve, there is a rank n elliptic net corresponding to those points. The terms of the EDS corresponding to each point will appear in the array, but more complicated interactions of the points are also captured. In the rank 1 case, it was possible to estimate heights purely by manipulation of terms of an EDS; in a similar way, routine manipulation of elliptic nets allows for computation of the Tate pairing.

The use of pairings is currently a hot topic in cryptography, with both destructive and constructive applications. The Tate Pairing allows pairs of elements of an elliptic curve group to be mapped to a finite field. In certain cases, and provided the pairing can be computed efficiently, this transplants the Discrete Logarithm Problem (DLP) into an easier environment, thus weakening the cryptosystem. On the other hand, the gap in difficulty between the two DLPs can itself be used to design cryptosystems and protocols, such as Identity based cryptography.

For the theoretical background, I recommend again Stange’s work. The take-home message however, is as follows:

  • For a point P of order m on an elliptic curve E, to compute the pairing of P with itself it suffices to recover terms m+1 and m+2 of the EDS associated with P.
  • For distinct points P,Q on an elliptic curve E with P of order m, to compute the pairing of P with Q it suffices to recover elements (m,0) and (m,1) of the elliptic net associated with P and Q.

Thus I have been interested in the sequence/net calculations necessary to recover these quantities. Notice that it is not necessary to compute general (i,j) elements, but only the rows (i,0) and (i,1). The former is in fact the EDS associated with P. Like Shipsey’s algorithm, Stange presents a double-and-add algorithm for the general case of reaching elements (m,0) and (m,1) via successive blocks of elements. For details on this, a possible refinement of my own, and a SAGE implementation, see the next post. This would solve the first problem as well, by choosing a random Q, but the (i,1) terms introduce an unnecessary overhead. Eliminating this gives rise to an alternative to Shipsey’s algorithm for EDS, so I’ve implemented this case separately with an equivalent interface to my existing eds class. It’s unclear which approach is preferable where (although Stange’s is almost definitely the one to use over finite fields), so I hope to furnish both with equivalent functionality and compare their performance for height computation (details to follow). Finally, the wrappers to move from points to nets and compute pairings have been implemented, although I need a lot more test cases to identify potential shortcomings! Ultimately, I’d like to bundle all these ideas together into a more friendly, rank-aware elliptic net class for both roles (heights and pairings).

The Secrecy Problem

Thursday, May 3rd, 2007

I took a number of interesting modules in formal logic as an undergraduate, and I’m often fascinated by ideas in this field. There are natural connections between cryptographic protocol analysis and string rewriting, and with enough freedom one can emulate general models of computation such as Turing machines by protocols. One particularly neat side effect of this is that it allows an elegant proof of the undecidability of the secrecy problem: namely, given a protocol and a secret message, is there a decision procedure to establish whether an intruder can learn the secret? By reducing to Post’s Correspondence Problem, it can be shown that the answer is no! Of course, this raises issues of intruder strength and ‘honest’ protocols; I’ve written up some of these ideas (along with the proof and an outline of PCP) over on e2.

Geometry Club Talk: Hyperelliptic curves

Friday, April 27th, 2007

Today I spoke at the Geometry Club about the use of hyperelliptic curves in public key cryptography. You can find my slides here, although they were supplemented by some boardwork that you’ll have to figure out from my other postings!

Baby Steps, Giant Steps, and element orders

Monday, April 2nd, 2007

The discrete logarithm problem is the vital part of elliptic curve cryptography, but can be defined (to varying cryptographic strength) for any cyclic group:

Discrete Logarithm Problem (DLP)
Let G be a cyclic group, with operation ⊕. Let [n] represent (n-1)-fold application of ⊕ i.e., [2]g=g⊕g, [3]g=g⊕g⊕g etc.

Given g,h from G, the discrete logarithm problem is to find t such that [t]g=h.

The baby-step, giant step (BSGS) algorithm is a generic algorithm for solving the DLP- that is to say, it makes no appeal to properties of the group involved, merely calculating abstractly with ⊕. There is a theoretical upper bound to the effectiveness of generic algorithms, and BSGS approaches are of that order of magnitude.

The simplest demonstration of BSGS (the original by Shanks) assumes that g generates G, with both of known order n: recall that the group order is simply the number of elements it contains, whereas the order of an element g is the least n such that [n]g=idG, should such a value exist. The order n of an element always divides the order of the group N, with equality when g generates G. Between the two there is also the exponent, the least value e such that [e]g=idG for all g in G; for cyclic groups this is N, but for products of cyclic groups (the structure of groups of rational points) it is the lowest common multiple of their respective orders, and thus, although a divisor of N, may be significantly smaller than it.

So, if we can find an order m rational point on a curve, we know that the cardinality of the group of rational points is zero modulo m. By testing random points and taking the lowest common multiple of their order we can usually find the exponent e of the group. With luck, but not always, this will be large enough that when combined with bounds on the cardinality the latter is established exactly (as with modular information in the SEA algorithm).

But it is no good attempting to do so using an algorithm that requires the order of the point! There are BSGS algorithms for the DLP which can handle unknown order, for a given g we can apply these to solve for h=idG in the cyclic group generated by g. Provided the algorithm gives the minimal t such that [t]g=h=idG in G=<g>, then t is the order of g.

One such algorithm, by Terr, is given in Cohen and Frey’s Handbook of Elliptic and Hyperelliptic curve cryptography (my current bible!) It relies on the following observation:

Let Tn be the nth triangular number: that is, defined by the recursion T1=0, Tn+1=Tn+n.

Then any non-negative integer t, there are unique integers j and k with t=Tj+1-k with 0≤k<j.

To see how, notice that there is a unique j such that t lies in the interval (Tj , … , Tj+1]=(Tj+1-j , … , Tj+1-0] and that this interval has width j so there is an appropriate choice of k.

We then suppose that t satisfies [t]g=h. Then we have [Tj+1]g=h⊕[k]g. So, instead of testing each [t] in turn until we hit equality, for a given j we need only test the ‘giant step’ [Tj+1]g against the set of ‘baby steps’ β={βi}={h⊕[0]g,…,h⊕[i]g,…,h⊕[j-1]g. If that j yields no matches, we move to j’=j+1: by keeping track of [j]g at each iteration, the new Tj’+1 and the extra h⊕[j’-1]g are found by a single group operation each. Hence this is attractive from both storage and time complexity perspectives: we need only record the baby steps β, [j]g and a single giant step at any given iteration j, whilst the time complexity is of order t1/2.

Terr’s BSGS variant for the DLP
Finds t such that [t]g=h for g generating G of unknown order, h from G.

Initialise β={β0=h}, γ=δ=idG, j=0. ((For each iteration, we have γ=[Tj+1]g and δ=[j]g)
Then loop over j as follows:

  • Increment j by 1, then update:
  • Set δ=δ⊕g=[j-1]g⊕g=[j]g.
  • Set γ=γ⊕δ=[Tj]g+[j]g=[Tj+1]g.
  • If j≥2 then
    • For s from 0 to j, if γ=βs then return Tj+1-s
  • Add βjj-1⊕g to β

I’ve implemented this generic DLP algorithm and an order-finding version (which requires you to catch g=idG) as Maple procedures for arbitrary groups- details on that will be in the next post! BSGS-based approaches become impractical at finite field sizes well within the grasp of SEA for counting on elliptic curves; but are of interest in the higher genus case which lacks an Elkies procedure.

Type Flaw Attacks

Tuesday, March 6th, 2007

Another day, another cryptographic protocol analysis writeup for Everything2. This time I covered the type flaw attack; a particularly devious vulnerability that is invisible to many formal analytic techniques.