Archive for November, 2007

Addition Chains

Thursday, November 29th, 2007

Suppose you have a rule for addition of some objects (most generally, elements of a semigroup), and you wish to compute the sum of n copies of the same object. How could you achieve this?

The primary school answer is to do just that: given g, compute 2g=g+g; then compute 3g=2g+g and so on, for n steps. But this rapidly becomes tedious, and isn’t (I hope) how we’d perform such a calculation in our heads: instead, we’d try to take bigger jumps. For instance, to compute the number 16g, we can find 2g, add that to itself to reach 4g, and again for 8g, hitting 16g after just 4 steps. Had we wanted 17, we’d then just add g again.

The sequence of intermediate values correspond to an addition chain for n: this is a sequence starting at a0=1 and ending at al=n with the property that for any term ak there exist terms ai,aj with ak=ai+aj. That is, every term is the sum of two earlier terms (not necessarily distinct), so we can reach ng by computing each akg in turn without needing anything fancier than our addition law.

As a side effect, an addition chain also tells us how to exponentiate if we know how to multiply, since xa+b=xaxb so we can build up to xn by finding the powers xak in turn.

Thus our tedious sequence for 16 is 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16, with the more sophisticated one being 1,2,4,8,16. If I asked you to compute 16x for a number x, you’d probably do something different: compute 10x (easy in base 10!) and 6x, then add those. With access to a multiplication procedure, this is definitely more efficient: but without that luxury it can still be used to give an addition chain for 16, by finding chains for 10 and 6. These are 1,2,4,5,10 and 1,2,3,6 respectively, so we can merge them into the chain 1,2,3,4,5,6,10,16.

We describe the length of the chain as l: that is, there are l terms after the 1. Notice that at worst, therefore, we need n-1 terms. If we move as quickly as possible by taking ak+1=ak+ak=2ak then the greatest value we can reach after m terms is 2m, as demonstrated with the chain for 24=16. However, if we can compute n and m in ln,lm terms, it needn’t take as many as ln+lm+1 terms to compute n+m, since there may be common terms in the chains. Finally, as defined an addition chain needn’t make use of all it’s terms: for example, 1,2,4,6,8,16 is still a chain for 16, although the 6 isn’t needed. We will obviously be interested in chains without any such wasted terms.

The observation for powers of two motivates a first attempt at an efficient algorithm for building addition chains. if n is 1, then we have nothing to do: the chain 1 suffices. Otherwise, we can work recursively: for an even number n=2k, we ask for the chain for k and then add 2k to the end; whereas for an odd number n=2k+1, we ask for the chain for k and add 2k then 2k+1 to the end. This will require at most log2n calls, and each step adds 1 or 2 entries to the chain.

This can also be captured by considering the binary representation of n, which will also indicate whether we hit an even or odd number at each stage. Suppose n is of the form

where each di is either 0 or 1, and dk=1 so there are no leading zeros. Then we start the chain with 1, and read from left to right, ignoring the leading 1: if we see a 0, then we double the last term of the chain; if we see a 1, add 1 to this new term as well. More formally:

Input: the binary expansion of n.

Output: an addition chain for n.

Set a0=1
Set t=1
for i from k-1 down to 0:
    Set at=2at-1 and t=t+1.
    If d_i=1:
        Set at=at-1+1 and t=t+1
return a0…at

For instance, with n a power of 2 we will see only a run of zeros, and thus perform the appropriate number of doublings, for a chain of length log2n. For n one less than a power of two, the expansion consists entirely of 1s, costing 2log2n. For an ‘average’ number where each digit has around a 50% chance of being a 1 or a 0, the chain will therefore be of approximate length (3/2)log2n. More formally, we can define the Hamming Weight v(n) to be the number of 1s in the binary expansion of n; then the chain will have length floor(log2n) + v(n) -1.

Is this optimal? A counterexample suffices to show it is not, the first being 15, which is 1111 in binary. Using the binary expansion we get the sequence 1,2,3,6,7,14,15 of length 6 (=3+4-1 as claimed). But the sequence would be 1,2,3,5,10,15 of length 5 and hence 1 shorter.

Nonetheless, our binary chain gives a much better upper bound than the naive chain of (n-1) steps, and is very simple to implement both in logic and memory. The idea also generalises to arbitrary bases, as follows.

Suppose n is of the form

where each di is in the range 0…m-1, dk non-zero. Then we can compute an m-ary addition chain for n:

Input: the m-ary expansion of n.

Output: an addition chain for n.

Set a=0
Start the chain with 1,2,3,…,m-1.
for i from k down to 0:
    Extend the chain to m*a (if not already present) and set a=m*a.
    Add a+di to the chain (if not already present), and set a=a+di.
return chain.

Notice that a tracks the last term computed, and that extending the chain to m*a is itself an addition chain problem: this makes some 2k a good choice for m, since this extension can then be accomplished in k additions.

The shorter chain for 15 arises from its ternary expansion: considered as (120)3 1*9+2*3+0*1, we have an initial chain of 1,2 then apply the loop. With a=0,d2=1 we set a=3a=0 then a=a+1=1; neither of these need to be added to the chain; for a=1, d1=2 we set a=3a=3 which requires the addition of 3 to the chain; then a=a+2=5 which extends the chain to 1,2,3,5; finally for a=5,d2=0 we set a=3a=15, adding first 10 then 15 to the chain; then a=a+0=15 so we terminate with chain 1,2,3,5,10,15.

The optimal choice of m depends on n, since although there will be less steps (logmn) for greater m, each step will require more additions in the computation of m*a. Nor are m-ary expansions the end of the story- there are various other tricks available to finesse addition chains. However, there is no known algorithm for finding shortest addition chains- and a generalisation of the problem to finding shortest addition sequences (chains that contain a desired list of values n1,n2,…,nk) is NP-complete. Searching for short chains is therefore an interesting challenge, and is constrained by the fact that unless you are precomputing them for future use, any shorter chain must offer a speedup in computation greater than the time taken to find it.

There are, however, bounds for the length of the chain:

Where v(n) is the Hamming weight as before.

However, if you have access to basic operations other than addition, all bets are off. In some contexts (such as elliptic curves), finding the inverse of an element is easy and so we can achieve subtraction, giving rise to addition-subtraction chains. For the pairing computations I’ve been interested in, I’ve never actually described a block addition. It is possible, following an algorithm of Shipsey, to do so, although it is much more expensive than the block double. But even if it cost no more, and we had an oracle for optimal chains, such an approach would only rarely beat the current algorithm. This is because access to the double-add procedure makes the Hamming weight irrelevant: for instance, with 15 we could form the chain 1,3,7,15. This means that we can compute in log2n block operations, and comparing to the lower bound for addition chains above shows that this is almost always better.

Still, ideas from addition chains can motivate construction of other operations: access to triple, tripleadd and tripleadd2 procedures would allow for chain of log3n length, although it is likely that these block operations would be more expensive than the double and doubleadds. Even if they don’t turn out to be useful for my work, the subtleties of this innocent-looking problem still makes for interesting mathematics!

References/Further Reading

Knuth- The Art of Computer Programming Vol. 2: Seminumerical Algorithms.

Bos and Coster- Addition Chain Heuristics in Advances in Cryptology- Crypto’89 (Lecture Notes in Computer Science) (Available via SpringerLink.)

Gordon- A Survey of Fast Exponentiation Methods. (Available via citeseer.)

Tate pairing computation in SAGE II

Thursday, November 22nd, 2007

A couple of comments now that I’ve tested my Tate pairing procedures on some random input.

First, it turns out to be not too difficult to make use of PARI scripts from within SAGE. I spent ages trying to figure out how to call them through the gp interface, until I discovered that new functions simply become additional methods of the gp object; although some SAGE objects have to be converted to their PARI equivalent before you can use them as input. For instance, assuming you have Stange’s tate_via_nets script in your working directory, you can attach it with gp.read(”tate_via_nets.gp”). Then for SAGE points P,Q on a curve E with the order of P stored as n, the following will compute the Tate pairing via PARI:

Egp=gp(E)
Pgp=gp([gp(P.xy()[0]),gp(P.xy()[1])])
Qgp=gp([gp(Q.xy()[0]),gp(Q.xy()[1])])
gp.tate_pairing_alg(Egp,Pgp,Qgp,n)

It’s hard to get accurate measures of the time taken by a PARI subprocess (it’s invisible to SAGE’s cputime, and walltime will be influenced by any other running processes), and these calculations are pretty quick, but it seemed the PARI implementation was faster. So I’ve developed a SAGE version which avoids the overhead of dictionary read/writes by not caching intermediate terms; I’ve also set it up as a pyrex file, which despite containing purely python code (I’ve not ported any of the internals to C) is faster than both the old ellnet2d and the above PARI approach after a one-time compilation. You can download that here; then use attach “ellnet2d_lowmem.spyx” instead of attach “ellnet2d.sage” before proceeding as before.

The good news is that in every test I’ve run (over both prime and prime-power fields), the various implementations all agree. The bad news is that it doesn’t seem possible to compute elliptic curve point orders beyond about 64bits in SAGE (or rather in PARI, upon which it depends), and before that stage the order calculation is significantly slower than the pairing computation. It also seems that a random curve tends not to feature particularly high powers of 2 in its order, so the speed-up I suggested earlier doesn’t offer much advantage at all. Hopefully I can find a class of curves where it is of merit!

Tate pairing computation in SAGE

Thursday, November 8th, 2007

Eventually, then, we’re in a position to compute Tate pairings in SAGE. The necessary ingredients are an elliptic net class (the more general one or a faster version for pairings only), an eds class (this one based on Stange’s algorithm is recommended and used in the examples below; the old one should also work), and supporting procedures tate.

Load these in the usual fashion:

sage: attach "tate.sage"
sage: attach "ellnet1d.sage"
sage: attach "ellnet2d.sage"

We now need to introduce a curve and some points. Stange gives a simple example of y^2+y=x^3+x^2-2x over the field with 5 elements corresponding to points P=(0,0) and Q=(1,0). We enter the field, curve and points and set up an elliptic net:

sage: K=GF(5)
sage: E=EllipticCurve(K,[0,1,1,-2,0])
sage: E
Elliptic Curve defined by y^2 + y = x^3 + x^2 + 3*x over Finite Field of size 5
sage: P=E([0,0])
sage: Q=E([1,0])
sage: X=netFromPoints(P,Q)
sage: X

Elliptic net with (1,0) block:
                1       1       1
4       4       0       1       1       2       1       3

The block centred at 1 is displayed; this has been generated from the initial data, and suffices to compute any term of the form (x,0) or (x,1). If such a term is known, it is simply retrieved from the hash table; otherwise, the algorithm is employed. Requests for unreachable points give a ValueError:

sage: X.has_element((5,0))
True
sage: X[(5,0)]
3
sage: X.has_element((100,1))
False
sage: time X[(100,1)]
Computing terms
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.01
1
sage: X.has_element((100,1))
True
sage: time X[(100,1)]
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00
1
sage: X[(100,2)]
---------------------------------------------------------------------------
<type 'exceptions.ValueError'>            Traceback (most recent call last)

/home/s0677951/SAGE/eds/<ipython console> in <module>()

/home/s0677951/SAGE/eds/ellnet2d.py in __getitem__(self, A)
     41                                 # Term not entered and cannot be computed; give an error.
     42                                 raise ValueError, \
---> 43                                 "Term unavailable, cannot compute unless of form (x,0) or (x,1)"
     44
     45         def has_element(self, A):

<type 'exceptions.ValueError'>: Term unavailable, cannot compute unless of form (x,0) or (x,1)

You can also test for and print entire (small)blocks; this illustrates the route taken during computation of elements, for instance in computing the 18th term, the blocks with centres 1, 2, 4, 9 are constructed, then the small block centred on 18:

sage: X.has_block((9,0))
False
sage: X[(18,0)]
Computing terms
0
sage: X.has_block((9,0))
True
sage: X.print_block(9,0)
                1       1       3
4       3       2       0       3       2       1       2
sage: X.has_block((18,0))
False
sage: X.has_small_block((18,0))
True
sage: X.print_small_block(18,0)
                2       4       3
3       4       4       0       1       1       2

To compute the pairing of P and Q, use TatePairing(m,P,Q), where m is the order of P (determining this left as an exercise!):

sage: TatePairing(9,P,Q)
Computing terms
3

Finally, a call to TatePairing with P=Q will construct an eds rather than an ellnet. We can verify an example of Stange’s (see slides from ECC):

sage: K=GF(73)
sage: E=EllipticCurve(K,[0,1,1,-2,0])
sage: P=E([0,0])
sage: TatePairing(9,P,P)
Computing terms
24

Stange’s Algorithm for EDS

Thursday, November 8th, 2007

Since EDS are the rank 1 case of elliptic nets, Stange’s approach to the rank 2 case can be specialised to them: we are only concerned with terms of the form (x,0), so only the bottom row of a block need be calculated. Such sequences are considered when pairing a point with itself.

Thus I’ve simplified the ellnet class to give another eds class. I’ve changed the internals so that this class can be initialized and interacted with in the same way as my previous eds class based on Shipsey’s algorithm. (You retrieve the n term from a rank 1 net X using X[n] rather than the X[(n,0)] required for a general elliptic net, for instance.) There will probably be a few discrepencies to be ironed out, but I’ve thrown in the ward_point methods necessary for height calculation, and hope to have the two versions interchangeable in due course.

This requires only 1 initial inversion, and continues to use the step-by-step precomputation speedups given by Stange. It doesn’t support the larger jumps possible with Shipsey’s algorithm, although my existing code doesn’t exploit these as fully as it could anyway. Final doublings are still performed with smaller blocks, so for height estimation purposes it’s probably wise to choose a power of two for the term used.

The next post illustrates pairing computation with rank 1 nets.

Stange’s Algorithm for Elliptic Nets

Thursday, November 8th, 2007

In Shipsey’s algorithm1 for EDS, the septuples centred on terms mk and (m+1)k can be used to construct the septuple about 2mk, (2m+1)k or (2m+2)k from that about k. With k=1 this gives rise to a double-and-add algorithm, such that term n can be obtained in logarithmic time; knowledge of larger k introduces an additional speedup.

However, Shipsey’s approach makes use of various inverses, potentially giving division-by-zero errors for a sequence corresponding to a point of finite order (any point, in the finite field cases of interest for pairing computations). Further, the recurrence relation for a rank 2 elliptic net is more complicated than for an EDS, and we ultimately wish to find terms (m,0) and (m,1). Stange provides an algorithm to achieve this, replacing the septuple with an 11-element block centred on k as follows:


(k-1,1) (k,1) (k+1,1)
(k-3,0) (k-2,0) (k-1,0) (k,0) (k+1,0) (k+2,0) (k+3,0) (k+4,0)

Notice that the blocks centred at any of m, m-1 or m+1 will contain the information needed to compute the pairing. Thus we generate such a block by a double-and-add algorithm.

Algorithm

Double-and-add Algorithm

INPUT: Integer n and block centred at 1.

OUTPUT: Block centred at n.

  1. Compute binary digits di of n such that n=(dkdk-1…d1)2 with dk=1.
  2. Set c=1 (centre)
  3. For i=k-1 down to 1 do:
    • If d_i=0 Compute block with centre 2c; Set c to 2c
    • Else, Compute block with centre 2c+1; Set c to 2c+1
  4. Return final block.

Clearly, this requires procedures to generate new blocks from old- Stange2 gives formulae for these ((12)-(17)). A speed-up (Id. S5.1) can be achieved by computing commonly used squares/products at each step, denoted by the A(i) (squares) and B(i) (products) below.

Double

DoubleAdd

The inversions in the recurrence formulae are independent of k, so are precomputed and made available to Double and DoubleAdd as multipliers. Including multiplication by these values, the total cost of a block double/double-and-add is 35 multiplications, 7 squarings, 11 additions and no inversions.

A minor refinement

Define a 10-element short block centred at k by


(k-1,1) (k,1) (k+1,1)
(k-3,0) (k-2,0) (k-1,0) (k,0) (k+1,0) (k+2,0) (k+3,0)

shortDouble

Notice from the Double dependencies that the (k+4,0) term is necessary only to generate (2k+4,0). Thus given a short block centred at k, we can double it to obtain the short block centred at 2k, with the following dependencies:

The inversions in the recurrence formulae are independent of k, so are precomputed and made available to shortDouble as multipliers. Including multiplication by these values, the total cost of a short block double is 31 multiplications, 6 squarings, 10 additions and no inversions.

shortDoubleAdd

From the DoubleAdd dependencies it can be seen that the missing (k+4,0) is required for both (2k+4,0) and (2k+5,0)- although the latter can be dropped for a short block, the former cannot. Thus it is not possible to adapt this choice of recurrence formulae to give a DoubleAdd for short blocks. Nonetheless, the basic algorithm can be ammended to exploit the simpler short block Double once no further DoubleAdds are required; as the short block is sufficient for finding the Tate Pairing.

Algorithm

Amended Double-and-add Algorithm

INPUT: Integer n and block centred at 1.

OUTPUT: Block centred at n.

  1. Compute s,p such that p is odd and n=2sp.
  2. Compute binary digits di of p such that p=(dkdk-1…d1)2 with dk=1.
  3. Set c=1 (centre)
  4. For i=k-1 down to 1 do:
    • If d_i=0 Compute block with centre 2c using Double; Set c to 2c
    • Else, Compute block with centre 2c+1 using DoubleAdd; Set c to 2c+1
  5. This gives the block with centre p, and thus the short block with centre p.
  6. For i from 1 to s do:
    • Compute short block with centre 2c using shortDouble; Set c to 2c
  7. Return final block.

The efficiency gain of this variant depends on n- if n is a power of 2, then we may proceed entirely by shortDouble with 47 instead of 53 operations, a saving of around 11 percent (this will vary depending on the precise cost of multiplication, squaring and addition). If, however, n is odd then there is no gain. Thus pairing computation can be improved in the case of points with order a high power of two.

Implementation

A SAGE ellnet class is available, which implements all these ideas for rank 2 (Stange’s algorithm, with precomputation of inverses, caching of commonly-used factors at each step, and finishing by short blocks). See this post for how to generate nets from points and compute pairings.

References

1: Shipsey- Elliptic Divisibility Sequences (Download)

2: Stange- The Tate Pairing via Elliptic Nets (Preprint)

Elliptic Nets

Thursday, November 8th, 2007

As always in mathematics, when something turns out to be useful, it’s worth asking whether it can be generalised to a broader setting. For Elliptic Divisibility Sequences, that generalisation is Elliptic Nets. For a free abelian group of finite rank, this is a function W satisfying the condition:

for any p,q,r,s from the group.
(EDS are then recovered as the special case where the chosen group is the integers).

The theory of elliptic nets has been developed extensively by Katherine Stange, and the papers/presentations from her website provide a wealth of information. I’ve been writing a SAGE implementation of her results, from which we need a few key ideas.

What’s remarkable is that it is not just that sequences can be generalised to arrays: the connection to elliptic curves is also preserved. That is, given n distinct points of an elliptic curve, there is a rank n elliptic net corresponding to those points. The terms of the EDS corresponding to each point will appear in the array, but more complicated interactions of the points are also captured. In the rank 1 case, it was possible to estimate heights purely by manipulation of terms of an EDS; in a similar way, routine manipulation of elliptic nets allows for computation of the Tate pairing.

The use of pairings is currently a hot topic in cryptography, with both destructive and constructive applications. The Tate Pairing allows pairs of elements of an elliptic curve group to be mapped to a finite field. In certain cases, and provided the pairing can be computed efficiently, this transplants the Discrete Logarithm Problem (DLP) into an easier environment, thus weakening the cryptosystem. On the other hand, the gap in difficulty between the two DLPs can itself be used to design cryptosystems and protocols, such as Identity based cryptography.

For the theoretical background, I recommend again Stange’s work. The take-home message however, is as follows:

  • For a point P of order m on an elliptic curve E, to compute the pairing of P with itself it suffices to recover terms m+1 and m+2 of the EDS associated with P.
  • For distinct points P,Q on an elliptic curve E with P of order m, to compute the pairing of P with Q it suffices to recover elements (m,0) and (m,1) of the elliptic net associated with P and Q.

Thus I have been interested in the sequence/net calculations necessary to recover these quantities. Notice that it is not necessary to compute general (i,j) elements, but only the rows (i,0) and (i,1). The former is in fact the EDS associated with P. Like Shipsey’s algorithm, Stange presents a double-and-add algorithm for the general case of reaching elements (m,0) and (m,1) via successive blocks of elements. For details on this, a possible refinement of my own, and a SAGE implementation, see the next post. This would solve the first problem as well, by choosing a random Q, but the (i,1) terms introduce an unnecessary overhead. Eliminating this gives rise to an alternative to Shipsey’s algorithm for EDS, so I’ve implemented this case separately with an equivalent interface to my existing eds class. It’s unclear which approach is preferable where (although Stange’s is almost definitely the one to use over finite fields), so I hope to furnish both with equivalent functionality and compare their performance for height computation (details to follow). Finally, the wrappers to move from points to nets and compute pairings have been implemented, although I need a lot more test cases to identify potential shortcomings! Ultimately, I’d like to bundle all these ideas together into a more friendly, rank-aware elliptic net class for both roles (heights and pairings).

Some SAGE bits and bobs

Tuesday, November 6th, 2007

I’ve made a few cosmetic and minor functional tweaks to some of my SAGE offerings. The latest version of eds suppresses some diagnostic information, and the representation of an eds object now gives the defining terms rather than the entire dictionary (which runs to scores of pages for sequences over the integers). Further, heights will now catch the zero division exception that sometimes arises when you use Shipsey’s algorithm on a sequence corresponding to a point of finite order. It will now give a warning to this effect, and return a height of zero as required.
This, coupled with the singular curve catch introduced last time should mean that batch testing of large numbers of curves shouldn’t now fail- but I’m sure I’ll find some new exception in due course!

I’ve taken stock of my various coding efforts listed on my university space and trimmed out the least useful / developed / efficient ones from the past year. What’s left generally offers features not otherwise available for the given CAS- generic group operations and their use for [hyper]elliptic curve groups in Maple; EDS/height calculations for SAGE. The bug I found in hyperelliptic curve divisor class addition over finite fields has since been fixed, making my alternative class redundant; and the NTL interface seems to have changed, rendering my faster version broken; so those have been dropped from the site.

Finally, I’ll be back in the West Country for the SAGE Days 6 talks this weekend.