Niels Möller | 17 May 2013 09:39

Bug in nettle-2.7 ecc code

Testing on the debian build machines revealed a bug in the new ecc code.
Curiously enough, it caused failures only on 32-bit sparc (see
https://buildd.debian.org/status/fetch.php?pkg=nettle&arch=sparc&ver=2.7-2&stamp=1368128015),
but the bug is not really platform-specific.

It turned out that ecc_j_to_a called GMP's mpn_mul_n (via ecc_modp_mul)
with overlapping input and output arguments, which is not supported.
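
To illustrate the kind of restriction involved, here is a minimal sketch
in C. It does not use GMP: toy_mul_n and toy_mul_inplace are hypothetical
names invented for this example, and the 32-bit limb size is an
assumption. It only shows why a multiply that writes its result while
still reading its inputs needs a separate output area, so that an
"in-place" product has to go via scratch space and be copied back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a limb multiply; this is NOT GMP's mpn_mul_n.
   Writes the 2n-limb product of {ap, n} and {bp, n} to {rp, 2n}, using
   32-bit limbs, least significant limb first. rp must not overlap ap or
   bp: rp is cleared and then updated while ap and bp are still read. */
static void
toy_mul_n (uint32_t *rp, const uint32_t *ap, const uint32_t *bp, size_t n)
{
  size_t i, j;

  memset (rp, 0, 2 * n * sizeof (*rp));
  for (i = 0; i < n; i++)
    {
      uint64_t carry = 0;
      for (j = 0; j < n; j++)
        {
          /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
          uint64_t t = (uint64_t) ap[i] * bp[j] + rp[i + j] + carry;
          rp[i + j] = (uint32_t) t;
          carry = t >> 32;
        }
      rp[i + n] = (uint32_t) carry;
    }
}

/* Hypothetical helper: replaces {xp, n} by the 2n-limb product
   {xp, n} * {yp, n}. Both xp and scratch need room for 2n limbs.
   The product is written to scratch first and copied back afterwards,
   so the multiply itself never sees overlapping operands. */
static void
toy_mul_inplace (uint32_t *xp, const uint32_t *yp, uint32_t *scratch,
                 size_t n)
{
  assert (scratch != xp && scratch != yp);
  toy_mul_n (scratch, xp, yp, n);
  memcpy (xp, scratch, 2 * n * sizeof (*xp));
}
```

For example, with n = 2 and x = y = 2^64 - 1, toy_mul_inplace leaves the
limbs {1, 0, 0xfffffffe, 0xffffffff} in x.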

The following patch seems to solve the problem:

diff --git a/ecc-j-to-a.c b/ecc-j-to-a.c
index df8b876..26c1a03 100644
--- a/ecc-j-to-a.c
+++ b/ecc-j-to-a.c
@@ -46,6 +46,7 @@ ecc_j_to_a (const struct ecc_curve *ecc,
 #define up   (scratch + ecc->size)
 #define iz2p (scratch + ecc->size)
 #define iz3p (scratch + 2*ecc->size)
+#define izBp (scratch + 3*ecc->size)
 #define tp    scratch

   mp_limb_t cy;
@@ -72,11 +73,11 @@ ecc_j_to_a (const struct ecc_curve *ecc,
       if (flags & 1)
 	{
 	  /* Divide this common factor by B */
-	  mpn_copyi (iz3p, izp, ecc->size);
-	  mpn_zero (iz3p + ecc->size, ecc->size);
-	  ecc->redc (ecc, iz3p);
-      

Nikos Mavrogiannopoulos | 8 May 2013 09:19

arm compilation

Hello,
 Trying to build nettle 2.7 for an arm10 system of mine using its (old)
toolchain fails with assembler errors.

arm-linux-gcc -Os  -I[...] -I.  -DHAVE_CONFIG_H -g -O2 -ggdb3
-Wno-pointer-sign -Wall -W   -Wmissing-prototypes -Wmissing-declarations
-Wstrict-prototypes   -Wpointer-arith -Wbad-function-cast
-Wnested-externs -fpic -MT aes-decrypt-internal.o -MD -MP -MF
aes-decrypt-internal.o.d -fpic -c aes-decrypt-internal.s
aes-decrypt-internal.s: Assembler messages:
aes-decrypt-internal.s:81: Error: bad instruction `push
{r4,r5,r6,r7,r8,r10,r11,lr}'
aes-decrypt-internal.s:87: Error: register or shift expression expected
-- `orr r4,r8,lsl#8'
aes-decrypt-internal.s:89: Error: register or shift expression expected
-- `orr r4,r8,lsl#16'
aes-decrypt-internal.s:91: Error: register or shift expression expected
-- `orr r4,r8,lsl#24'
aes-decrypt-internal.s:93: Error: bad arguments to instruction -- `eor
r4,r8'
[...]
make[2]: *** [aes-decrypt-internal.o] Error 1

$ arm-linux-as -v
GNU assembler version 2.16.1 (arm-linux-uclibc) using BFD version 2.16.1

Trying with the latest buildroot, a different error is issued:

arm-buildroot-linux-uclibcgnueabi-as -v
GNU assembler version 2.21.1 (arm-buildroot-linux-uclibcgnueabi) using

Niels Möller | 3 May 2013 12:47

Portable rotates

I'm rewriting the cast128 key schedule to get rid of false warnings,
avoid lots of conditions, and separate the rotation and the mask
subkeys.

Then I noticed a portability problem with the rotation macros,

  #define ROTL32(n,x) (((x)<<(n)) | ((x)>>(32-(n))))

For n == 0, this will work on most machines, but it's not portable,
since x >> 32 gives undefined behaviour according to the C spec (when x
is a 32-bit type). (On typical hardware, the result of x >> 32 will be
either x or 0, and the rotation macro gives the intended result in
either case).

In most of nettle, there's no problem, because rotation counts are
constant and non-zero.

cast128 is an exception, with key-dependent rotation counts, which can
well be zero (don't know if that's exercised by the test suite, though).

A fix is to redefine the macro as

  #define ROTL32(n,x) (((x)<<(n)) | ((x)>>((-(n))&31)))

It should make no difference when n is constant, but for cast128, this
portability fix makes the code almost 20% slower. Apparently, gcc
doesn't recognize this as a rotate. I just filed a bug report at

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157
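
As a small sketch of the portable form in use (the wrapper function
rotl32 and the choice of an unsigned count are assumptions made for this
example, not part of nettle):

```c
#include <assert.h>
#include <stdint.h>

/* Portable form from the message above: the right-shift count is
   (-n) & 31, which always lies in 0..31, so n == 0 never shifts by 32. */
#define ROTL32(n,x) (((x)<<(n)) | ((x)>>((-(n))&31)))

/* Hypothetical wrapper, for illustration only. Taking the count as
   unsigned makes -(n) well defined (it wraps modulo 2^k). */
static uint32_t
rotl32 (unsigned n, uint32_t x)
{
  assert (n < 32);
  return ROTL32 (n, x);
}
```

With this definition, rotl32 (0, x) returns x unchanged, and
rotl32 (8, 0x12345678) gives 0x34567812.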


Niels Möller | 30 Apr 2013 15:33

raspberry pi

I finally got a Raspberry Pi set up. It took almost 20 minutes to compile
nettle, and the configure script figured out not to use neon
instructions. It passes the testsuite. Benchmark for the public key
functions:

           name size   sign/ms verify/ms
            rsa 1024    0.1073    1.9099
            rsa 2048    0.0167    0.5539
            dsa 1024    0.1992    0.1014
          ecdsa  192    0.5297    0.1922
          ecdsa  224    0.3623    0.1354
          ecdsa  256    0.2621    0.0971
          ecdsa  384    0.1125    0.0391
          ecdsa  521    0.0551    0.0190

So 3 to 5 times slower than the pandaboard (although that's not a
completely fair comparison, since the raspberry pi is using gmp-5.0.5,
not the gmp bleeding edge).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

Florian Pritz | 27 Apr 2013 14:29

Problems when compiling 32bit version on 64bit host (multilib)

Hi,

I'm compiling nettle for the Arch Linux multilib repository (32bit
libraries for a 64bit system so people can run skype and wine without
needing a chroot). Looks like the new ecc code doesn't properly handle
integer type sizes.

Here[1] is the output of make, make check along with the PKGBUILD used
to build this and a sample of the generated ecc-*.h files.

[1]: http://flo.server-speed.net/tmp/lib32-nettle/

Can I fix that by setting some env variable or does it need code changes?

If you need any more files, please tell me.

_______________________________________________
nettle-bugs mailing list
nettle-bugs@...
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs

Niels Möller | 24 Apr 2013 16:17

ANNOUNCE: Nettle-2.7

I'm happy to announce a new version of GNU Nettle, a low-level
cryptographic library. The Nettle home page can be found at
http://www.lysator.liu.se/~nisse/nettle/.

NEWS for the 2.7 release

	This release includes an implementation of elliptic curve
	cryptography (ECC) and optimizations for the ARM architecture.
	This work was done at the offices of South Pole AB, and
	generously funded by the .SE Internet Fund.

	Bug fixes:

	* Fixed a bug in the buffer handling for incremental SHA3
	  hashing, with a possible buffer overflow. Patch by Edgar
	  E. Iglesias.

	New features:

	* Support for ECDSA signatures. Elliptic curve operations over
	  the following curves: secp192r1, secp224r1, secp256r1,
	  secp384r1 and secp521r1, including x86_64 and ARM assembly
	  for the most important primitives.
	  
	* Support for UMAC, including x86_64 and ARM assembly.

	* Support for 12-round salsa20, "salsa20r12", as specified by
	  eSTREAM. Contributed by Nikos Mavrogiannopoulos.
	
	Optimizations:

Martin Storsjö | 23 Apr 2013 16:54

[PATCH 1/2] sha: Add the missing W64_EXIT epilogue macro

---
 x86_64/sha256-compress.asm |    1 +
 x86_64/sha512-compress.asm |    1 +
 2 files changed, 2 insertions(+)

diff --git a/x86_64/sha256-compress.asm b/x86_64/sha256-compress.asm
index 6bfb7a7..385654c 100644
--- a/x86_64/sha256-compress.asm
+++ b/x86_64/sha256-compress.asm
@@ -192,5 +192,6 @@ PROLOGUE(_nettle_sha256_compress)
 	mov	112(%rsp),%r15

 	add	$120, %rsp
+	W64_EXIT(3, 0)
 	ret
 EPILOGUE(_nettle_sha256_compress)
diff --git a/x86_64/sha512-compress.asm b/x86_64/sha512-compress.asm
index 21df82a..663e68e 100644
--- a/x86_64/sha512-compress.asm
+++ b/x86_64/sha512-compress.asm
@@ -192,5 +192,6 @@ PROLOGUE(_nettle_sha512_compress)
 	mov	176(%rsp),%r15

 	add	$184, %rsp
+	W64_EXIT(3, 0)
 	ret
 EPILOGUE(_nettle_sha512_compress)
-- 
1.7.9.5

Martin Storsjö | 23 Apr 2013 16:12

[PATCH 1/2] Use movdqu instead of movdqa for saving xmm registers

The stack is not guaranteed to be 16-byte aligned on win64.
---
 x86_64/machine.m4 |   40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/x86_64/machine.m4 b/x86_64/machine.m4
index dc23dde..d5d5b37 100644
--- a/x86_64/machine.m4
+++ b/x86_64/machine.m4
@@ -71,34 +71,34 @@ define(<W64_ENTRY>, <
   ifelse(W64_ABI,yes,[
     ifelse(eval($2 > 6), 1, [
       sub	[$]eval(8 + 16*($2 - 6)), %rsp
-      movdqa	%xmm6, 0(%rsp)
+      movdqu	%xmm6, 0(%rsp)
     ])
     ifelse(eval($2 > 7), 1, [
-      movdqa	%xmm7, 16(%rsp)
+      movdqu	%xmm7, 16(%rsp)
     ])
     ifelse(eval($2 > 8), 1, [
-      movdqa	%xmm8, 32(%rsp)
+      movdqu	%xmm8, 32(%rsp)
     ])
     ifelse(eval($2 > 9), 1, [
-      movdqa	%xmm9, 48(%rsp)
+      movdqu	%xmm9, 48(%rsp)
     ])
     ifelse(eval($2 > 10), 1, [
-      movdqa	%xmm10, 64(%rsp)

Martin Storsjö | 23 Apr 2013 13:19

[PATCH 1/2] Consistently use EXEEXT_FOR_BUILD

This fixes cross building for cases where EXEEXT differs from
EXEEXT_FOR_BUILD, such as when building for windows from unix.
---
 Makefile.in |   19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index c96e3a3..cbc001c 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -31,8 +31,9 @@ getopt_TARGETS = $(getopt_SOURCES:.c=.$(OBJEXT))
 internal_SOURCES = nettle-internal.c
 internal_TARGETS = $(internal_SOURCES:.c=.$(OBJEXT))

-TARGETS = aesdata$(EXEEXT) desdata$(EXEEXT) twofishdata$(EXEEXT) \
-	  shadata$(EXEEXT) gcmdata$(EXEEXT) \
+TARGETS = aesdata$(EXEEXT_FOR_BUILD) desdata$(EXEEXT_FOR_BUILD) \
+          twofishdata$(EXEEXT_FOR_BUILD) shadata$(EXEEXT_FOR_BUILD) \
+          gcmdata$(EXEEXT_FOR_BUILD) \
 	  $(getopt_TARGETS) $(internal_TARGETS) \
 	  $(LIBTARGETS) $(SHLIBTARGETS)
 IMPLICIT_TARGETS = @IF_DLL@ $(LIBNETTLE_FILE) $(LIBHOGWEED_FILE)
@@ -273,7 +274,7 @@ des.$(OBJEXT): des.c des.h $(des_headers)
 #	k =  7, c = 6, 320 entries, ~15 KB
 #	k =  9, c = 7, 512 entries, ~24 KB
 ecc-192.h: eccdata.stamp
-	./eccdata$(EXEEXT) 192 7 6 $(GMP_NUMB_BITS) > $@T && mv $@T $@
+	./eccdata$(EXEEXT_FOR_BUILD) 192 7 6 $(GMP_NUMB_BITS) > $@T && mv $@T $@
 # Some possible choices for 224:
 #	k = 18, c = 4,  64 entries,  ~4 KB

Niels Möller | 21 Apr 2013 21:50

nettle-2.7 release candidate

I've generated a release candidate, available at

  http://www.lysator.liu.se/~nisse/archive/nettle-2.7rc1.tar.gz

Testing appreciated.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

Shaun Murphy | 20 Apr 2013 21:52

Support for via quadcore SHA512 hw acceleration

The limited literature for the newer VIA QuadCore E-Series embedded platform says that it now supports
"Secure Hash Algorithm: SHA-1, SHA-256, SHA-384, SHA-512" but I'm not seeing any acceleration for
SHA512 in the kernel modules or nettle / gnutls. I would appreciate some pointers on what I need to do to
access that SHA512 acceleration in nettle.

Here's my setup:
  Via Artigo A1250
  Ubuntu 12.04 x86_64
  Gnutls - built from git
  Nettle - built from 2.6 source
  Kernel modules: padlock_aes, padlock_sha

Here's my dmesg output for the loaded modules:
[    2.345061] padlock_aes: Using VIA PadLock ACE for AES algorithm.
[    2.364105] padlock_sha: Using VIA PadLock ACE for SHA1/SHA256 algorithms.

gnutls Benchmark Soft Ciphers:
Checking SHA1 (16kb payload)...  Processed 464.73 MB in 5.00 secs: 92.95 MB/sec
Checking SHA256 (16kb payload)...  Processed 180.04 MB in 5.00 secs: 36.01 MB/sec
Checking SHA512 (16kb payload)...  Processed 267.39 MB in 5.00 secs: 53.48 MB/sec

gnutls Benchmark Ciphers:
Checking SHA1 (16kb payload)...  Processed 1.51 GB in 5.00 secs: 0.30 GB/sec
Checking SHA256 (16kb payload)...  Processed 1.30 GB in 5.00 secs: 0.26 GB/sec
Checking SHA512 (16kb payload)...  Processed 267.45 MB in 5.00 secs: 53.49 MB/sec

The SHA256 numbers are great but I really need SHA512 for my application.
Thank you. 		 	   		  
