And Sch | 28 Aug 20:02 2014

[PATCH 1/1] Improved whirlpool hash performance

* cipher/whirlpool.c (whirlpool_transform, sbox, added macro): Added macro and rearranged round
function to alternate between reading from and writing to different state and key variables. Two
whirlpool_context_t variables were removed and two were replaced; the sizes of state and key doubled, so
overall the burn stack stays the same. buffer_to_block and block_xor were combined into one operation.
The sbox was converted to one large table, because it is faster than many small tables.
--

Benchmark on different systems:

Intel(R) Atom(TM) CPU N570   @ 1.66GHz
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 WHIRLPOOL      |     63.40 ns/B     15.04 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 WHIRLPOOL      |     46.21 ns/B     20.64 MiB/s         - c/B

Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 WHIRLPOOL      |      7.75 ns/B     123.0 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 WHIRLPOOL      |      6.70 ns/B     142.3 MiB/s         - c/B

This one actually shows greater improvement on the Atom system.
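
For anyone who wants to re-check the figures above, the two columns are related by MiB/s = 10^9 / (ns/B) / 2^20. The following is only a small sketch for redoing that arithmetic; the function names are made up for this illustration and are not part of libgcrypt or its benchmark tool.

```c
/* Convert a per-byte cost in nanoseconds into throughput in MiB/s.
   Hypothetical helper, not taken from the libgcrypt benchmark code. */
double
example_ns_per_byte_to_mib_per_sec (double ns_per_byte)
{
  return 1e9 / ns_per_byte / (1024.0 * 1024.0);
}

/* Relative reduction in time per byte, in percent.
   Hypothetical helper for comparing "before" and "after" rows. */
double
example_time_saved_percent (double before_ns, double after_ns)
{
  return 100.0 * (before_ns - after_ns) / before_ns;
}
```

With the numbers quoted above, the Atom goes from 63.40 to 46.21 ns/B (about 27% less time per byte) and the i5 from 7.75 to 6.70 ns/B (about 14% less), which matches the remark that the Atom benefits more.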

Jussi Kivilinna | 28 Aug 18:35 2014

[PATCH] Add new Poly1305 MAC test vectors

* tests/basic.c (check_mac): Add new test vectors for Poly1305 MAC.
--

Patch adds new test vectors for Poly1305 MAC from Internet Draft
draft-irtf-cfrg-chacha20-poly1305-01.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
---
 tests/basic.c |   66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/tests/basic.c b/tests/basic.c
index 6d70cfd..e406db4 100644
--- a/tests/basic.c
+++ b/tests/basic.c
@@ -6008,6 +6008,72 @@ check_mac (void)
         "\xf3\x47\x7e\x7c\xd9\x54\x17\xaf\x89\xa6\xb8\x79\x4c\x31\x0c\xf0",
         NULL,
         0, 32 },
+      /* draft-irtf-cfrg-chacha20-poly1305-01 */
+      /* TV#5 */
+      { GCRY_MAC_POLY1305,
+        "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF",
+        "\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
+        "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+        "\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+        NULL,
+        16, 32 },
+      /* TV#6 */
+      { GCRY_MAC_POLY1305,

And Sch | 28 Aug 19:45 2014

[PATCH 1/1] Improved ripemd160 performance

* cipher/rmd160.c (transform): Interleave the left and right lane rounds to introduce more instruction
level parallelism.
--

The benchmarks on different systems:

Intel(R) Atom(TM) CPU N570   @ 1.66GHz
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |     13.07 ns/B     72.97 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |     11.37 ns/B     83.84 MiB/s         - c/B

Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |      3.31 ns/B     288.0 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |      2.08 ns/B     458.5 MiB/s         - c/B

Signed-off-by: Andrei Scherer <andsch@inbox.com>

---


Stephan Mueller | 25 Aug 12:37 2014

Cipher FIPS flag enforcement

Hi,

all of the cipher definitions contain and define the flag

	unsigned int fips:1;

in their _spec_t types.

Up to 1.4.x, that field was enforced in the cipher init functions (i.e. 
if the FIPS mode is set, only fips=1 ciphers are allowed).

The random/ code still contains such logic. But all other fips flag 
enforcement code is gone.

Is this intentional? Note, FIPS can live without such restrictions, but 
then why keep the fips flag lingering?
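
To make the question concrete, the kind of check described above could look roughly like the sketch below. This is an illustration only: the struct, the error codes and the function name are hypothetical stand-ins, not the actual libgcrypt 1.4.x code; only the one-bit fips field mirrors the flag quoted above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cipher spec type. */
typedef struct
{
  const char *name;
  unsigned int fips:1;   /* Set if the algorithm may be used in FIPS mode. */
} example_cipher_spec_t;

/* Hypothetical result codes for this sketch. */
enum { EXAMPLE_OK = 0, EXAMPLE_ERR_NOT_ALLOWED = 1 };

/* Refuse SPEC when FIPS mode is active and its fips flag is not set.
   An init function would call this before setting up the cipher. */
int
example_cipher_check_fips (const example_cipher_spec_t *spec,
                           int fips_mode_active)
{
  if (fips_mode_active && !spec->fips)
    return EXAMPLE_ERR_NOT_ALLOWED;
  return EXAMPLE_OK;
}
```

Outside FIPS mode the flag is ignored in this sketch, so it only ever restricts the set of usable algorithms.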

Ciao
Stephan

Stephan Mueller | 22 Aug 22:45 2014

DCO signed

Libgcrypt Developer's Certificate of Origin.  Version 1.0
=========================================================

By making a contribution to the Libgcrypt project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the free software license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the
    best of my knowledge, is covered under an appropriate free
    software license and I have the right under that license to
    submit that work with modifications, whether created in whole
    or in part by me, under the same free software license
    (unless I am permitted to submit under a different license),
    as indicated in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including
    all personal information I submit with it, including my
    sign-off) is maintained indefinitely and may be redistributed
    consistent with this project or the free software license(s)
    involved.

Signed-off-by: Stephan Mueller <smueller@chronox.de>

And Sch | 22 Aug 18:32 2014

[PATCH 1/1] Improved ripemd160 performance

Here is the ripemd160 performance patch, signed.
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |      3.31 ns/B     288.0 MiB/s         - c/B

after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |      2.08 ns/B     458.5 MiB/s         - c/B

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

diff -ruNp libgcrypt-1.6.2/cipher/rmd160.c libgcrypt-1.6.3/cipher/rmd160.c
- --- libgcrypt-1.6.2/cipher/rmd160.c	2014-08-21 07:50:39.000000000 -0500
+++ libgcrypt-1.6.3/cipher/rmd160.c	2014-08-21 17:15:13.678664524 -0500
@@ -178,8 +178,7 @@ static unsigned int
 transform ( void *ctx, const unsigned char *data )
 {
   RMD160_CONTEXT *hd = ctx;
- -  register u32 a,b,c,d,e;
- -  u32 aa,bb,cc,dd,ee,t;
+  register u32 al, ar, bl, br, cl, cr, dl, dr, el, er;
   u32 x[16];
   int i;

@@ -201,196 +200,186 @@ transform ( void *ctx, const unsigned ch
 #define F2(x,y,z)   ( ((x) | ~(y)) ^ (z) )
 #define F3(x,y,z)   ( ((x) & (z)) | ((y) & ~(z)) )

And Sch | 22 Aug 18:27 2014

[PATCH 1/1] Improved whirlpool hash performance

Hello again, I have signed the DCO now. Here is the whirlpool patch again, signed. I uploaded my public key to
the keyserver as well.
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 WHIRLPOOL      |      7.75 ns/B     123.0 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 WHIRLPOOL      |      6.70 ns/B     142.3 MiB/s         - c/B
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

diff -ruNp libgcrypt-1.6.2/cipher/whirlpool.c libgcrypt-1.6.3/cipher/whirlpool.c
- --- libgcrypt-1.6.2/cipher/whirlpool.c	2014-08-21 07:50:39.000000000 -0500
+++ libgcrypt-1.6.3/cipher/whirlpool.c	2014-08-22 11:17:06.496754032 -0500
@@ -87,6 +87,17 @@ typedef struct {
   for (i = 0; i < 8; i++) \
     block_dst[i] ^= block_src[i];

+/* XOR lookup boxes with index SRC [(SHIFT + n) & 7] >> x. */
+#define WHIRLPOOL_XOR(src, shift) \
+	C[((unsigned int)(src[ (shift)         ] >> 56)       )          ] ^ \
+	C[((unsigned int)(src[((shift) + 7) & 7] >> 48) & 0xff) +  256   ] ^ \
+	C[((unsigned int)(src[((shift) + 6) & 7] >> 40) & 0xff) + (256*2)] ^ \
+	C[((unsigned int)(src[((shift) + 5) & 7] >> 32) & 0xff) + (256*3)] ^ \
+	C[((unsigned int)(src[((shift) + 4) & 7] >> 24) & 0xff) + (256*4)] ^ \
+	C[((unsigned int)(src[((shift) + 3) & 7] >> 16) & 0xff) + (256*5)] ^ \
+	C[((unsigned int)(src[((shift) + 2) & 7] >>  8) & 0xff) + (256*6)] ^ \
+	C[((unsigned int)(src[((shift) + 1) & 7]      ) & 0xff) + (256*7)] \

And Sch | 22 Aug 18:11 2014

DCO signed


Libgcrypt Developer's Certificate of Origin.  Version 1.0
=========================================================

By making a contribution to the Libgcrypt project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the free software license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the
    best of my knowledge, is covered under an appropriate free
    software license and I have the right under that license to
    submit that work with modifications, whether created in whole
    or in part by me, under the same free software license
    (unless I am permitted to submit under a different license),
    as indicated in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including
    all personal information I submit with it, including my
    sign-off) is maintained indefinitely and may be redistributed
    consistent with this project or the free software license(s)
    involved.

Signed-off-by: Andrei Scherer <andsch@inbox.com>

Meta Schima | 22 Aug 03:51 2014

[PATCH 1/1] Improved whirlpool hash performance

This patch improves whirlpool hash performance.
before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 WHIRLPOOL      |      7.74 ns/B     123.1 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 WHIRLPOOL      |      6.63 ns/B     143.8 MiB/s         - c/B

Note that I changed some variables around: I removed 4 x whirlpool_block_t sized variables
(BLOCK_SIZE/8), but added 2 x u64 [2][BLOCK_SIZE / 8], so the size to be burned from the
stack is still the same as before. I also added a macro and merged the lookup tables into one large
array, because this improves lookup time.
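
As a rough illustration of the two points above, here is a loop-form sketch of a lookup into one flat table, where box n starts at offset 256*n, together with the stack-size arithmetic. It is not the patch code: the function names are hypothetical, the caller supplies the table (no real Whirlpool constants are included), and BLOCK_SIZE is assumed to be 64 bytes so that a block is 8 u64 words.

```c
#include <stddef.h>
#include <stdint.h>

enum { EXAMPLE_BLOCK_WORDS = 8 };   /* Assumes BLOCK_SIZE / 8 with BLOCK_SIZE = 64. */
typedef uint64_t example_block_t[EXAMPLE_BLOCK_WORDS];

/* XOR together eight entries of one flat table of 8 * 256 words.
   Box N lives at TABLE[256 * N .. 256 * N + 255]; the byte used for
   box N is taken from word (SHIFT - N) mod 8 of SRC, starting with the
   most significant byte.  SHIFT must be in the range 0..7. */
uint64_t
example_xor_lookup (const uint64_t *table, const uint64_t src[8],
                    unsigned int shift)
{
  uint64_t result = 0;
  unsigned int n;

  for (n = 0; n < 8; n++)
    {
      unsigned int word = (shift + 8 - n) & 7;
      unsigned int byte = (unsigned int)(src[word] >> (56 - 8 * n)) & 0xff;

      result ^= table[byte + 256 * n];
    }
  return result;
}

/* Stack used by four block-sized variables (the old layout). */
size_t
example_old_stack_bytes (void)
{
  return 4 * sizeof (example_block_t);
}

/* Stack used by two u64 [2][BLOCK_SIZE / 8] arrays (the new layout). */
size_t
example_new_stack_bytes (void)
{
  return 2 * sizeof (uint64_t[2][EXAMPLE_BLOCK_WORDS]);
}
```

Under the 64-byte block assumption both layouts come to 256 bytes, which is consistent with the burn size staying unchanged.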

~ Andrei

----
diff -ruNp libgcrypt-1.6.2/cipher/whirlpool.c libgcrypt-1.6.3/cipher/whirlpool.c
--- libgcrypt-1.6.2/cipher/whirlpool.c	2014-08-21 07:50:39.000000000 -0500
+++ libgcrypt-1.6.3/cipher/whirlpool.c	2014-08-21 20:43:46.446406268 -0500
@@ -87,6 +87,17 @@ typedef struct {
   for (i = 0; i < 8; i++) \
     block_dst[i] ^= block_src[i];

+/* XOR lookup boxes with index SRC [(SHIFT + n) & 7] >> x. */
+#define WHIRLPOOL_XOR(src, shift) \
+	C[((unsigned int)(src[ (shift)         ] >> 56)       )          ] ^ \
+	C[((unsigned int)(src[((shift) + 7) & 7] >> 48) & 0xff) +  256   ] ^ \
+	C[((unsigned int)(src[((shift) + 6) & 7] >> 40) & 0xff) + (256*2)] ^ \

Meta Schima | 22 Aug 00:34 2014

[PATCH 1/1] Improved ripemd160 performance

Hello, I am new to the mailing list. I have a patch here to improve ripemd160 performance by interleaving
left and right lanes. On my system before:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |      3.31 ns/B     288.1 MiB/s         - c/B
after:
Hash:
                |  nanosecs/byte   mebibytes/sec   cycles/byte
 RIPEMD160      |      2.08 ns/B     458.3 MiB/s         - c/B

Note that I removed one temporary u32 variable, which reduces the stack burn from 108 to 104; I hope this is right.
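
The idea of interleaving can be shown with a toy example. The sketch below is not RIPEMD-160: the mixing step, constants and names are invented for illustration. It only shows that two lanes which do not depend on each other give the same result whether they are run one after the other or step by step in alternation, while the alternating form puts independent operations next to each other for the CPU to overlap.

```c
#include <stdint.h>

typedef struct { uint32_t l, r; } example_lanes_t;

/* Rotate left; N must be in the range 1..31. */
static inline uint32_t
example_rol (uint32_t x, unsigned int n)
{
  return (x << n) | (x >> (32 - n));
}

/* One toy mixing step, a stand-in for a real round function. */
#define EXAMPLE_STEP(a, b, x, s)                     \
  do                                                 \
    {                                                \
      (a) = example_rol ((a) + ((b) ^ (x)), (s));    \
      (b) = example_rol ((b), 10);                   \
    }                                                \
  while (0)

/* Run the whole left lane first, then the whole right lane. */
example_lanes_t
example_sequential (const uint32_t x[16])
{
  uint32_t al = 0x67452301, bl = 0xefcdab89;
  uint32_t ar = 0x67452301, br = 0xefcdab89;
  example_lanes_t out;
  int i;

  for (i = 0; i < 16; i++)
    EXAMPLE_STEP (al, bl, x[i], 7);
  for (i = 0; i < 16; i++)
    EXAMPLE_STEP (ar, br, x[15 - i], 11);
  out.l = al;
  out.r = ar;
  return out;
}

/* Alternate left and right steps; each lane still sees its own steps
   in the same order, so the results are unchanged. */
example_lanes_t
example_interleaved (const uint32_t x[16])
{
  uint32_t al = 0x67452301, bl = 0xefcdab89;
  uint32_t ar = 0x67452301, br = 0xefcdab89;
  example_lanes_t out;
  int i;

  for (i = 0; i < 16; i++)
    {
      EXAMPLE_STEP (al, bl, x[i], 7);
      EXAMPLE_STEP (ar, br, x[15 - i], 11);
    }
  out.l = al;
  out.r = ar;
  return out;
}
```

Whether this ordering is actually faster depends on the compiler and the CPU; the benchmark numbers above are the evidence for the real patch.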

~ Andrei

----
diff -ruNp libgcrypt-1.6.2/cipher/rmd160.c libgcrypt-1.6.3/cipher/rmd160.c
--- libgcrypt-1.6.2/cipher/rmd160.c	2014-08-21 07:50:39.000000000 -0500
+++ libgcrypt-1.6.3/cipher/rmd160.c	2014-08-21 17:15:13.678664524 -0500
@@ -178,8 +178,7 @@ static unsigned int
 transform ( void *ctx, const unsigned char *data )
 {
   RMD160_CONTEXT *hd = ctx;
-  register u32 a,b,c,d,e;
-  u32 aa,bb,cc,dd,ee,t;
+  register u32 al, ar, bl, br, cl, cr, dl, dr, el, er;
   u32 x[16];
   int i;

@@ -201,196 +200,186 @@ transform ( void *ctx, const unsigned ch
 #define F2(x,y,z)   ( ((x) | ~(y)) ^ (z) )

Stephan Mueller | 21 Aug 21:56 2014

SP800-90A Deterministic Random Bit Generator

Hi,

I created the bug tracker entry https://bugs.g10code.com/gnupg/issue1701 
holding an updated patch set for the DRBG.

This implementation of the DRBG shares large portions of the DRBG 
implementation now present in the Linux kernel 3.17 RC1. Note that both were 
developed from the same code base.

This DRBG implementation is required for a successful validation of libgcrypt 
according to FIPS 140-2. Although the previous submissions of the DRBG patch 
set seem to have gone unnoticed, I am asking for a review of the code and 
for its inclusion into libgcrypt.

--

-- 
Ciao
Stephan
