bnis-subscribe | 29 Mar 2010 01:16
Picon

A year ago you came to Russia, I remember you, write me!

Still single?look at my profile, Olga from Russia http://slip225.spaces.live.com

jerry | 5 Jun 2008 01:19
Picon

Your email requires verification verify#WpyWNOT7o2yCj1uSmZVmPqQMJ0pC1zWJ

The message you sent requires that you verify that you 
are a real live human being and not a spam source.

To complete this verification, simply reply to this message and leave
the subject line intact.

The headers of the message sent from your address are show below:

From bnis <at> list.cr.yp.to Wed Jun 04 19:19:09 2008
Received: from pool-96-248-40-198.pghkny.fios.verizon.net ([96.248.40.198])
 by TK1.MJWEBHOSTING1.COM with esmtp (Exim 4.68)
 (envelope-from <bnis <at> list.cr.yp.to>)
 id 1K42G5-0007Hk-4Y
 for jerry <at> groovecollection.nl; Wed, 04 Jun 2008 19:19:09 -0400
Message-ID: <000401c8c699$04a77ce5$48523fb8 <at> bwsqgihu>
From: "jeth avraham" <bnis <at> list.cr.yp.to>
To: <jerry <at> groovecollection.nl>
Subject: Your video file jerry
Date: Wed, 04 Jun 2008 21:31:44 +0000
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.3138
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198

info | 5 Jun 2008 01:17

MESSAGE NOT DELIVERED: Your video file info

Your message could not be delivered. The User is out of space. Please try to send your message again at a later time.

Nicolas Higgins | 17 Feb 2008 19:01

Compass

Why should God make people knowing they are going to hell forever? 

http://patricabostqc.blogspot.com

Online tracking, safe, secure purchase.

D. J. Bernstein | 12 Aug 2007 18:55
Picon

The tangent FFT

If you're using the complex-floating-point split-radix FFT, you can gain
a constant factor in total floating-point operations (asymptotically
more than 5%!) by switching to the tangent FFT:

   http://cr.yp.to/papers.html#tangentfft

The operation count is due to Van Buskirk. The algorithm in my paper has
the virtue of being simultaneously simple, in-place, and cache-friendly.

I haven't seen a serious analysis of the impact of Van Buskirk's ideas
for cost measures more complicated than counting arithmetic operations.
The Johnson-Frigo paper on the topic claims that practical performance
of FFTs is "unpredictable"; I see no justification for this claim, and
my own FFT optimization experience tells me that the opposite is true.
Has anyone tried a serious implementation of Van Buskirk's FFT?

---D. J. Bernstein, Professor, Mathematics, Statistics,
and Computer Science, University of Illinois at Chicago

D. J. Bernstein | 28 Apr 2007 21:27
Picon

fuerer multiplication

The Schoenhage-Strassen O(n log n log log n) bound is obsolete: Martin
Fuerer, in a paper "Faster integer multiplication" to appear at STOC
2007, has proven a bound of n (log n) 2^(log* n), where log* is the
number of log iterations required to reach 1.

This was stated as a purely asymptotic result, and an improvement from
log log n to 2^(log* n) might not sound noticeable for any reasonable
value of n; but I think that Fuerer's idea is worth investigating for
multiplications in practice.

The core idea, which I'll choose some concrete parameters to illustrate,
is to perform an FFT of size 1048576 over the ring \C[x]/(x^16 + 1),
using a 32768th root of x in the ring as the 1048576th root of 1. This
involves

   10485760 additions in the ring (so 335544320 additions in \R),
   10485760 subtractions in the ring, and
   10485760 multiplications in the ring.

You're supposed to know that---as in the split-radix FFT---you can
arrange for most of the multiplications in an FFT to be by "easy" roots
of 1. In this case, most of the multiplications are by powers of x, and
are thus simple coefficient reshufflings, with negations absorbed into
subsequent operations. A small fraction of the multiplications, only
about 2 million, are general multiplications in \C[x]/(x^16+1), which
asymptotically should be embedded into smaller integer multiplications,
with precomputations on the constant inputs.

Actually, \C is unnecessary. The ring \R[x]/(x^16 + 1) has

(Continue reading)

Felix von Leitner | 24 Mar 2007 22:20
Picon

Re: big number arithmetic

Thus spake Paul Zimmermann (Paul.Zimmermann <at> loria.fr):
> I cannot reproduce your GMP timings. Using gmp-4.2.1 on a Core 2, I get with
> the default installation:

You can get my code (including the tfm and gmp test programs) here:

  cvs -d :pserver:cvs <at> cvs.fefe.de:/cvs co crypto

The fast multiplication code is comba_16.s (for amd64) or comba32_s (for
i386), generated when you run make.

See gmp.c and tomfastmath.c for the benchmarks, t.c for the benchmark
and exercise of my own code.

It's all very experimental, so please don't expect anything polished.

Felix

Felix von Leitner | 24 Mar 2007 10:21
Picon

Re: big number arithmetic

> > My hunch is that that's because gmp's schoolbook multiplication is not
> > very fast.  I haven't polished mine yet, but it's significantly faster
> > than even tomfastmath's one, which in turn beats gmp.
> Can you prove those assertions with some figures? Here are some cycles
> with gmp-4.2.1 on a Pentium M:

I don't have a Pentium M handy right now, but here are my measurements
for a 1024 bit multiplicate (1024x1024->2045 bits) on my Core 2 Duo
notebook (all 32-bit mode):

my code:	7716 cycles
gmp:		9600 cycles
tomfastmath:	9156 cycles

> Can you provide cycle numbers for tomfastmath and your code?

1024 bits == 32 words.

I called mpz_mul for gmp and fp_mul for tomfastmath.  In my code I'm
cheating a little because I directly call the 1024x1024 bit multiply
routine.  Once I finish my library, there will be added overhead to
check that there is enough space allocated.

I implemented Karatsuba, too, but that implementation is comparatively
unoptimized so it's not fair to use it for comparison.

Felix

Felix von Leitner | 22 Mar 2007 21:38
Picon

Re: big number arithmetic

Thus spake Paul Zimmermann (Paul.Zimmermann <at> loria.fr):
> > Don't.  Karatsuba starts to be useful (depending on the architecture) at
> > 1024 bit numbers and up.  Do the school method for smaller numbers, as
> > someone else already suggested.
> I do not agree. On a Pentium M, "make tune" in gmp-4.2.1 gives:

The point was: it's not useful at 128 bits. :-)

> #define MUL_KARATSUBA_THRESHOLD          22

> which means that Karatsuba is faster than the schoolbook method up from
> 22 words, i.e., 704 bits.

My hunch is that that's because gmp's schoolbook multiplication is not
very fast.  I haven't polished mine yet, but it's significantly faster
than even tomfastmath's one, which in turn beats gmp.

> PS: it's funny that after several years without any message, this list
> suddenly wakes up. I also recommend the gmp-discuss list.

:-)

Felix

John Nietzsche | 22 Mar 2007 12:40
Picon

big number arithmetic

Dear list members,

i am trying to imlement multiplication function for two 64 bit number.
I wondered which would it be the best possible algorithm for such a
feat? May someone suggest one?
After have implemented, what about make it generic, i.e., for n bits?

Thanks in advance.

Parerga | 16 Nov 2006 03:01
Picon
Favicon

Analysis on Toom multiply.


Dear developers,
We started some analysis on toom-cook inversion phase and preprinted a short
paper. We are looking for actual implementations of Toom-Cook 3-way, 4-way,
5-way and possibly higher to compare. If you know some, please let us know.
Peference: http://bodrato.it/papers/#CIVV2006
Marco & Alberto
--

-- 
View this message in context: http://www.nabble.com/Analysis-on-Toom-multiply.-tf2640267.html#a7370311
Sent from the cr.yp.to - bnis mailing list archive at Nabble.com.


Gmane