Niels Möller | 29 Oct 14:17 2014

Side-channel silence

I've written a simple program to set up valgrind to check for bad uses of
secret data. Maybe you have heard of Adam Langley's ctgrind, but it seems
to work fine to use plain valgrind and mark the secret data with
VALGRIND_MAKE_MEM_UNDEFINED.
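
A minimal sketch of the marking approach, assuming valgrind's
memcheck.h client-request header is installed (the macros fall back to
no-ops otherwise). The names toy_sbox, lookup_leaky, lookup_silent and
demo_lookup are made up for illustration and are not taken from
cipher-sidechannel-test.c. Run under valgrind, the table-indexed lookup
should be flagged because an undefined value is used as an address,
while the scanning lookup should stay quiet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use the real client requests when the header exists; otherwise
   compile them away so the example still builds (assumption: the
   compiler provides __has_include). */
#if defined(__has_include)
# if __has_include(<valgrind/memcheck.h>)
#  include <valgrind/memcheck.h>
# endif
#endif
#ifndef VALGRIND_MAKE_MEM_UNDEFINED
# define VALGRIND_MAKE_MEM_UNDEFINED(p, n) ((void) (p), (void) (n), 0)
#endif
#ifndef VALGRIND_MAKE_MEM_DEFINED
# define VALGRIND_MAKE_MEM_DEFINED(p, n) ((void) (p), (void) (n), 0)
#endif

/* Hypothetical 256-entry table standing in for a cipher sbox. */
static uint8_t toy_sbox(unsigned i) { return (uint8_t) (7 * i + 3); }

/* Leaky: the secret selects the address that is read. */
uint8_t lookup_leaky(const uint8_t *table, uint8_t secret)
{
  return table[secret];
}

/* Silent: every entry is read, and a mask selects the wanted one. */
uint8_t lookup_silent(const uint8_t *table, uint8_t secret)
{
  uint8_t r = 0;
  unsigned i;
  for (i = 0; i < 256; i++)
    {
      uint32_t diff = i ^ secret;
      /* 0xff when diff == 0, else 0, computed without branching. */
      uint8_t mask = (uint8_t) ((diff - 1) >> 8);
      r |= table[i] & mask;
    }
  return r;
}

/* Marks the secret as undefined, runs the silent lookup, and marks
   the (public) result as defined again. */
uint8_t demo_lookup(uint8_t secret)
{
  uint8_t table[256];
  uint8_t out;
  unsigned i;
  for (i = 0; i < 256; i++)
    table[i] = toy_sbox(i);

  (void) VALGRIND_MAKE_MEM_UNDEFINED(&secret, sizeof(secret));
  out = lookup_silent(table, secret);
  (void) VALGRIND_MAKE_MEM_DEFINED(&out, sizeof(out));
  return out;
}
```

Swapping lookup_silent for lookup_leaky inside demo_lookup is the
experiment of interest: the result is the same, but the memory access
pattern then depends on the secret.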

Non-silent algorithms include aes, camellia, cast128, twofish, arctwo, des,
blowfish, gcm (the hashing), arcfour. Almost anything using sboxes.

Silent algorithms include serpent (sboxes, but with nice bit-slicing
instead of tables), salsa20, chacha, poly1305.

Not sure if and how this testing could be added to a plain

  make check EMULATOR='$(VALGRIND)'

As for AES, an implementation using the aesni instructions ought to be
side-channel silent. And if one is concerned about side channel attacks
on AES, but too conservative to jump to salsa20 or chacha, serpent might
be a good alternative.

(I'll be doing a short talk on side-channel attacks at Southpole's
15-year anniversary party on November 7, so that's why I'm looking into
this now).


/* cipher-sidechannel-test.c


Nikos Mavrogiannopoulos | 26 Oct 09:59 2014


 I was checking what is required for the chacha-poly1305 implementation
to be kept up to date with the current draft [0], on Last-Call. My
understanding is that the current implementation:
1. Is missing support for 96-bit nonce Chacha (could be solved by adding
a chacha_set_nonce96 function)
2. Misses the optimization which you proposed to CFRG (and was

It seems however, that if nettle is changed for the latter (i.e., to pad
AAD), then using chacha_poly1305_update() becomes tricky. It could only
be called once. Would it in that case make sense to rename it to
chacha_poly1305_set_aad() rather than update?
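
To illustrate why padding makes the call one-shot, here is a small
sketch, assuming the draft pads the AAD with zero bytes up to a
multiple of 16 before the ciphertext is fed to poly1305. pad16_len and
toy_mac_input_len are hypothetical helpers, not nettle functions.

```c
#include <assert.h>
#include <stddef.h>

/* Number of zero bytes needed to bring n up to a multiple of 16. */
size_t pad16_len(size_t n)
{
  return (16 - (n % 16)) % 16;
}

/* Bytes fed to the MAC for the AAD part. The padding depends on the
   total AAD length, so it can only be emitted once all AAD is known;
   a second "update" call after that would land after the padding. */
size_t toy_mac_input_len(size_t aad_len)
{
  return aad_len + pad16_len(aad_len);
}
```

With 12 bytes of AAD, 4 zero bytes follow; with 16 bytes, none do.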


Niels Möller | 22 Oct 10:03 2014

Fat binaries

I think I will leave the curve25519 and eddsa code for now, even though
there are several important optimizations left to do (see the just

I think it's getting time to do fat binaries. To make progress, I think
it's best to start with something simple, relying on
__attribute__((constructor)) and/or __attribute__((ifunc ...)).
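
A rough sketch of the constructor variant, assuming GCC or clang. The
names toy_memxor, memxor_bytes, memxor_wide and fat_init are
hypothetical, and memxor_wide is plain C standing in for an sse2
assembly routine so that the example runs anywhere. It is only meant to
show the shape of the dispatch, not how nettle would do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void memxor_func(uint8_t *dst, const uint8_t *src, size_t n);

/* Portable byte-at-a-time version. */
static void memxor_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    dst[i] ^= src[i];
}

/* Stand-in for an sse2 routine: works on 8-byte blocks, then lets
   the byte loop handle the tail. */
static void memxor_wide(uint8_t *dst, const uint8_t *src, size_t n)
{
  while (n >= 8)
    {
      uint64_t a, b;
      memcpy(&a, dst, 8);
      memcpy(&b, src, 8);
      a ^= b;
      memcpy(dst, &a, 8);
      dst += 8; src += 8; n -= 8;
    }
  memxor_bytes(dst, src, n);
}

/* Selected implementation; defaults to the portable one. */
static memxor_func *memxor_impl = memxor_bytes;

#if defined(__GNUC__)
/* Runs once when the object is loaded, before main. */
__attribute__((constructor))
static void fat_init(void)
{
  memxor_func *candidate = memxor_wide;
  int have_sse2 = 0;
# if defined(__x86_64__) || defined(__i386__)
  __builtin_cpu_init();
  have_sse2 = __builtin_cpu_supports("sse2");
# endif
  if (have_sse2)
    memxor_impl = candidate;
}
#endif

/* Public entry point: one indirect call through the chosen pointer. */
void toy_memxor(uint8_t *dst, const uint8_t *src, size_t n)
{
  memxor_impl(dst, src, n);
}
```

The ifunc variant would move the selection into a resolver function run
by the dynamic linker, which avoids the indirect call in the entry
point but is less portable.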

For the case of memxor (where on x86_64, the fat binary mechanism needs
to select between sse2 and non-sse2 code), I'm also considering some

 * Use smaller assembly routines doing one case each, and let the main
   entry point always be C code which can sort out the different cases
   and handle bytes at the beginning and end of the buffer.

 * Fix the cases where the current code reads a few bytes
   outside of input buffers (but luckily without crossing word
   boundaries, iirc).

 * Add some internal entry points, for cases where alignment is known by
   the caller.

I think some additional overhead is acceptable for the cases of small
badly aligned buffers, if one can gain cleaner or more efficient
handling of the other cases.



Niels Möller | 24 Sep 21:10 2014

ecc_curve and ecc_modulo

I just pushed some reorganization of the ecc code. It introduces an
internal struct ecc_modulo, which keeps the data and function pointers
needed for modulo p (or modulo q) arithmetic.

The immediate benefit was that a couple of wrapper functions could be
deleted. E.g., ecc_generic_modp and ecc_generic_modq both called ecc_mod,
but with different constants taken from different fields of struct
ecc_curve. Now, one can instead call ecc_mod(&ecc->p, ...) and
ecc_mod(&ecc->q, ...), respectively.
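
The idea can be sketched with toy types, using single 64-bit words in
place of limb arrays. struct toy_modulo, struct toy_curve and toy_mod
are hypothetical names, and the field layout is an assumption, not the
actual struct ecc_modulo.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct ecc_modulo: one modulus plus a pointer to
   the function that reduces with respect to it. */
struct toy_modulo
{
  uint64_t m;
  uint64_t (*mod)(const struct toy_modulo *mo, uint64_t x);
};

/* Toy stand-in for struct ecc_curve, holding both moduli. */
struct toy_curve
{
  struct toy_modulo p;  /* field prime */
  struct toy_modulo q;  /* group order */
};

/* One shared reduction routine replaces the per-modulus wrappers;
   the constants come from whichever struct is passed in. */
uint64_t toy_mod(const struct toy_modulo *mo, uint64_t x)
{
  return x % mo->m;
}

const struct toy_curve toy_ecc = {
  { 97, toy_mod },
  { 89, toy_mod },
};
```

A caller then writes toy_mod(&toy_ecc.p, x) or toy_mod(&toy_ecc.q, x),
mirroring the two ecc_mod calls above.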

I also added an invert function pointer, and wrote a specialized modp
invert function for curve25519, which gives a nice little speedup. The
code can be shared with sqrt, since the main part of the addition chain
is the same for p-2 = 2^{255} - 21 (for invert) and (p-5)/8 = 2^{252} -
3 (for sqrt). Similar functions for the secp curves also make sense, at
least for the mod p inversion; modq is less structured.
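
The shared part of the chain can be made explicit. With p = 2^{255} -
19, both exponents are a small correction away from a multiple of
2^{250} - 1. (Which common power the committed code uses is an
assumption here; 2^{250} - 1 is the usual choice.)

```latex
\begin{align*}
p - 2 &= 2^{255} - 21 = 2^{5}\,(2^{250} - 1) + 11, \\
\frac{p - 5}{8} &= 2^{252} - 3 = 2^{2}\,(2^{250} - 1) + 1 .
\end{align*}
```

So after computing t = x^{2^{250} - 1}, invert needs five more squarings
and a multiplication by x^{11}, while sqrt needs two more squarings and
a multiplication by x.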

The plan is to expand struct ecc_modulo with add, mul and sqr function
pointers, to make it possible to have a different internal
representation. In particular, using radix-51 for curve25519 modp
arithmetic, on 64-bit machines.
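
For readers unfamiliar with radix-51, a minimal sketch of the
representation: five limbs of 51 bits each, with the carry out of the
top limb folded back using 2^255 = 19 (mod p). struct fe51, fe51_add
and fe51_carry are hypothetical names, and this shows only addition and
one carry pass, not a full reduction.

```c
#include <assert.h>
#include <stdint.h>

#define FE51_MASK ((UINT64_C(1) << 51) - 1)

/* Value is v[0] + v[1] 2^51 + v[2] 2^102 + v[3] 2^153 + v[4] 2^204. */
struct fe51 { uint64_t v[5]; };

/* Limb-wise addition; no carry handling is needed as long as the
   limbs stay well below 2^63. */
struct fe51 fe51_add(struct fe51 a, struct fe51 b)
{
  struct fe51 r;
  int i;
  for (i = 0; i < 5; i++)
    r.v[i] = a.v[i] + b.v[i];
  return r;
}

/* One carry pass: push overflow upward, and fold the carry out of
   the top limb into limb 0 multiplied by 19. The result is only
   weakly reduced (limb 0 may slightly exceed 51 bits). */
struct fe51 fe51_carry(struct fe51 a)
{
  uint64_t c;
  int i;
  for (i = 0; i < 4; i++)
    {
      c = a.v[i] >> 51;
      a.v[i] &= FE51_MASK;
      a.v[i + 1] += c;
    }
  c = a.v[4] >> 51;
  a.v[4] &= FE51_MASK;
  a.v[0] += 19 * c;
  return a;
}
```

The headroom of 13 bits per limb is what lets several additions be
done before any carry pass is needed.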



Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
Niels Möller | 31 Aug 21:12 2014

Hashing with EdDSA

I'm looking into EdDSA. According to the paper, signing of a message M,
using private key (a, k), corresponding to public key A, is essentially

  r = H(k | M),    with k the second half of the private key
  R = rB,          with B the specified generator of the curve,
  S = (r + H(R | A | M) a) mod l,  with l the order of the generator B

with some rules to encode R, A, S as strings. H is typically sha-512.

If M is the original, arbitrarily long, message to be signed, this
breaks the common structure that you can first compute a message digest,
and then apply the secret key to produce a signature. But this doesn't
work above, because the complete message has to be hashed twice, first
with the secret prefix k, next with the prefix R | A, and any hashing
without the private key available is useless. And even worse, one has to
buffer the complete message because the prefix of the second hash
depends on the output of the first hash.
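
The data flow can be sketched with a toy model. Everything here is a
stand-in and completely insecure: the "group" is the integers mod a
small prime with B = 7, and H is FNV-1a rather than sha-512. The names
toy_hash, toy_public, toy_sign and toy_verify are hypothetical. The
point is only that toy_sign walks over the full message twice, and that
the second pass cannot start before the first has finished.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_L UINT64_C(2147483647)  /* toy group order, 2^31 - 1 */
#define TOY_B UINT64_C(7)           /* toy "generator" */

/* Toy hash (FNV-1a) of two 64-bit prefix words followed by msg,
   reduced mod TOY_L. Stand-in for sha-512; not cryptographic. */
static uint64_t toy_hash(uint64_t p0, uint64_t p1,
                         const uint8_t *msg, size_t n)
{
  uint64_t h = UINT64_C(14695981039346656037);
  uint64_t prefix[2];
  size_t i, j;
  prefix[0] = p0;
  prefix[1] = p1;
  for (i = 0; i < 2; i++)
    for (j = 0; j < 8; j++)
      {
        h ^= (prefix[i] >> (8 * j)) & 0xff;
        h *= UINT64_C(1099511628211);
      }
  for (i = 0; i < n; i++)
    {
      h ^= msg[i];
      h *= UINT64_C(1099511628211);
    }
  return h % TOY_L;
}

struct toy_sig { uint64_t R, S; };

/* Public key A = a B in the toy group. */
uint64_t toy_public(uint64_t a)
{
  return (a % TOY_L) * TOY_B % TOY_L;
}

struct toy_sig toy_sign(uint64_t a, uint64_t k,
                        const uint8_t *msg, size_t n)
{
  struct toy_sig sig;
  uint64_t A = toy_public(a);
  /* First pass over the whole message, with the secret prefix k. */
  uint64_t r = toy_hash(k, 0, msg, n);
  uint64_t h;
  sig.R = r * TOY_B % TOY_L;
  /* Second pass: the prefix R | A is known only after the first. */
  h = toy_hash(sig.R, A, msg, n);
  sig.S = (r + h * (a % TOY_L) % TOY_L) % TOY_L;
  return sig;
}

/* Checks S B = R + H(R | A | M) A in the toy group. */
int toy_verify(uint64_t A, struct toy_sig sig,
               const uint8_t *msg, size_t n)
{
  uint64_t h = toy_hash(sig.R, A, msg, n);
  return sig.S * TOY_B % TOY_L == (sig.R + h * A % TOY_L) % TOY_L;
}
```

Note that toy_sign takes the whole message buffer, not a digest; that
is exactly the interface problem described above.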

Or should M itself be a digest of the message to be signed? That will
enable a more main-stream signature interface, where the inputs to the
signature function are the private key and the short message digest.



Niels Möller | 29 Jul 16:51 2014

curve25519 progress

I've now pushed some crude code (on the curve25519 branch) which agrees
with the test vectors in draft-josefsson-tls-curve25519-05.

It uses the equivalent Edwards curve for the internal operations. For
scalar multiplication of the fixed generator, it uses Pippenger's
algorithm and tables very similar to the other curves, just with
different point operations and no special cases (since the Edwards
operations are "complete"). At the end, the x-coordinate of the
corresponding point on the Montgomery-form curve25519 is computed.
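
What "complete" buys can be seen on a toy curve of the same shape,
x^2 + y^2 = 1 + d x^2 y^2 with d a non-square. The sketch below uses
p = 13 and d = 2, chosen only because the numbers are small; the names
toy_point, toy_add and toy_on_curve are hypothetical. The same formula
handles doubling and adding the neutral point (0, 1), with no branches
on the inputs.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_P 13u  /* toy prime */
#define TOY_D 2u   /* a non-square mod 13 */

struct toy_point { uint32_t x, y; };

static uint32_t mulm(uint32_t a, uint32_t b)
{
  return a * b % TOY_P;
}

/* Inverse via Fermat's little theorem: a^(p-2) mod p. */
static uint32_t invm(uint32_t a)
{
  uint32_t r = 1;
  unsigned i;
  for (i = 0; i < TOY_P - 2; i++)
    r = mulm(r, a);
  return r;
}

/* Edwards addition in affine coordinates. Since d is a non-square,
   neither denominator can be zero for points on the curve. */
struct toy_point toy_add(struct toy_point p, struct toy_point q)
{
  struct toy_point r;
  uint32_t t = mulm(TOY_D, mulm(mulm(p.x, q.x), mulm(p.y, q.y)));
  uint32_t xn = (mulm(p.x, q.y) + mulm(p.y, q.x)) % TOY_P;
  uint32_t yn = (mulm(p.y, q.y) + TOY_P - mulm(p.x, q.x)) % TOY_P;
  r.x = mulm(xn, invm((1 + t) % TOY_P));
  r.y = mulm(yn, invm((1 + TOY_P - t) % TOY_P));
  return r;
}

/* Checks x^2 + y^2 = 1 + d x^2 y^2 (mod p). */
int toy_on_curve(struct toy_point p)
{
  uint32_t x2 = mulm(p.x, p.x);
  uint32_t y2 = mulm(p.y, p.y);
  return (x2 + y2) % TOY_P == (1 + mulm(TOY_D, mulm(x2, y2))) % TOY_P;
}
```

For example, (4, 4) is on this toy curve, and adding it to itself with
toy_add gives (1, 0) without any separate doubling formula.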

For scalar multiplication of an arbitrary point (with only x coordinate
provided), I first have to compute the y-coordinate using
Shanks-Tonelli (this could be used to implement "point compression" also
for other curves). Then transform to a point on the Edwards curve, using
homogeneous/projective coordinates. Then the actual scalar multiply is
currently done with the binary algorithm; I have code for window-based
scalar multiply, but it needs a bit more debugging. All this is very
similar to the other curves, but without special cases.



Daniel Kahn Gillmor | 11 Jul 19:27 2014

[PATCH] fix typo in ecc-mod.c

 ecc-mod.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ecc-mod.c b/ecc-mod.c
index 7876d02..3301506 100644
--- a/ecc-mod.c
+++ b/ecc-mod.c
@@ -40,7 +40,7 @@
 #include "ecc-internal.h"

 /* Computes r mod m, where m is of size mn. bp holds B^mn mod m, as mn
-   limbs, but the upper mn - bn libms are zero. */
+   limbs, but the upper mn - bn limbs are zero. */
 ecc_mod (mp_limb_t *rp, mp_size_t rn, mp_size_t mn,
 	 const mp_limb_t *bp, mp_size_t bn,

Niels Möller | 2 Jul 10:24 2014


I've started to look closer at curve25519, and I've added support for it
in the eccdata program.

For the ecc operations, my current thinking is that one should use the
Edwards curve equivalence outlined in, rather than the Montgomery
x-coordinate method suggested in the curve25519 paper. The x-coordinate
method needs an addition chain with additional values, which is a bit
alien to all other scalar ecc multiplication in Nettle. The equivalent
Edwards curve is

   x^2 + y^2 = 1 + (121665/121666) x^2 y^2 (mod p).

Computations should be about as fast (according to the paper), and since
the constant (121665/121666) is a non-square, the formulas are
"complete", with no need to handle any special cases.

One needs conversion of the coordinates, and one also needs a square
root to get the y coordinate, but I hope that shouldn't be too



Martin Storsjö | 26 Jun 09:30 2014

[PATCH 1/3] Don't hardcode the -lgmp linker flag in the hogweed pkg-config file

--- | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ b/
index 457f5f2..839f7d0 100644
--- a/
+++ b/
@@ -13,6 +13,6 @@ URL:
 Version: @PACKAGE_VERSION@
 Requires.private: nettle
 Libs: -L${libdir} -lhogweed
-Libs.private: -lgmp
+Libs.private: @LIBS@
 Cflags: -I${includedir}


-- (Apple Git-48)
Niels Möller | 24 Jun 21:32 2014

Simplifying pic vs non-pic object files

Currently, the nettle Makefile creates two object files for each source
file, .o for inclusion in the static library, and .po ("pure object")
for the shared library. By default, both are compiled as pic code, but
--disable-pic drops the pic flags when compiling the .o files, to get
non-pic code into the static library.

I'm considering dropping this complication. Just build a single .o file,
which is pic by default, and non-pic if --disable-pic is given.

To build a non-pic static library, one would configure with
"--disable-pic --disable-shared" (since just --disable-pic would produce
a shared library with non-pic code, which is highly undesirable). And to
produce a static non-pic library and a shared pic library, one would
need to use separate build trees.

Does that make sense? It would make things simpler, shorten build time,
and eliminate the problem of naming two types of object files.

One might also have --disable-pic imply --disable-shared 
(unless explicitly overridden by the user).



Nikos Mavrogiannopoulos | 20 Jun 14:50 2014

OCB mode

 What do you think of having OCB mode in nettle? I particularly like
OCB due to its simplicity and performance compared to GCM/CCM, but it
was always patented. However in [0] there is license1 which states:
"Under this license, you are authorized to make, use, and distribute
open-source software implementations of OCB. This license terminates
for you if you sue someone over their open-source software
implementation of OCB claiming that you have a patent covering their

What do you think? Should the FSF be consulted on that?