David A. Ventimiglia | 7 Aug 2009 19:16
Picon
Favicon

how to optimize for speed in SBCL

Hi!

How exactly does one use (declare (optimize (speed 3) (safety0))), as in
the following code example, which is adapted from the last chapter in
"Practical Common Lisp"?

(defun add (x y)
  (declare (optimize (speed 3) (safety 0)))
  (declare (double-float x y))
  (the double-float (+ x y)))

In Peter Seibel's book, x and y are fixnums, which seems to work ok in
SBCL.  With double-floats, however, it emits this note:

note: doing float to pointer coercion (cost 13) to "<return value>"

Not a huge deal for me, but I'm just trying to figure out how all these
things work.

All the best,
David

 
--

-- 
David A. Ventimiglia <ventimig <at> msu.edu>
Michigan State University

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
(Continue reading)

Tobias C. Rittweiler | 7 Aug 2009 23:13
Picon

Re: how to optimize for speed in SBCL

"David A. Ventimiglia" <ventimig <at> msu.edu> writes:

> Hi!
>
> How exactly does one use (declare (optimize (speed 3) (safety0))), as in
> the following code example, which is adapted from the last chapter in
> "Practical Common Lisp"?
>
> (defun add (x y)
>   (declare (optimize (speed 3) (safety 0)))
>   (declare (double-float x y))
>   (the double-float (+ x y)))
>
> In Peter Seibel's book, x and y are fixnums, which seems to work ok in
> SBCL.  With double-floats, however, it emits this note:
>
> note: doing float to pointer coercion (cost 13) to "<return value>"
>
> Not a huge deal for me, but I'm just trying to figure out how all these
> things work.

The note means that the return value of ADD will be boxed, i.e. a kind
of wrapper around the actual floating point number will be allocated on
the heap. If you know Java, it's the same difference between double, the
intermediate value, and Double, the object.

The reason for boxing is that all values in Lisp pertain their type
until and during run-time (for example, so the garbage collector knows
how to deal with them.) This is most often achieved by tagging, that
is by reserving a few bits of each pointer to be used for type
(Continue reading)

Martin Cracauer | 8 Aug 2009 00:59

Re: how to optimize for speed in SBCL

David A. Ventimiglia wrote on Fri, Aug 07, 2009 at 10:16:50AM -0700: 
> Hi!
> 
> How exactly does one use (declare (optimize (speed 3) (safety0))), as in
> the following code example, which is adapted from the last chapter in
> "Practical Common Lisp"?
> 
> (defun add (x y)
>   (declare (optimize (speed 3) (safety 0)))
>   (declare (double-float x y))
>   (the double-float (+ x y)))
> 
> In Peter Seibel's book, x and y are fixnums, which seems to work ok in
> SBCL.  With double-floats, however, it emits this note:
> 
> note: doing float to pointer coercion (cost 13) to "<return value>"

Return values and function arguments will have to be type-tagged and
in the case of floats that means full boxing.  Even in fast mode as
this fast-mode module might be called from safe-mode code.

The canonical way around this is to pass around arrays or struct
instances that have slots of the desired type.  Of course that is not
worth it if you have to allocated an array or struct instance on every
call, you'll have to do reuse.

We have seen extremely fast cryptographic code for Common Lisp using a
specially allocated array of 32 bit integers as a kind of manually
managed heap and pass that around between functions as argument.

(Continue reading)

David A. Ventimiglia | 8 Aug 2009 19:08
Picon
Favicon

Re: how to optimize for speed in SBCL

Thanks for the detailed explanations. So then, should something like
this be sufficient, provided inlining is supported?

(declaim (inline add))

(defun add (x y)
  (declare (optimize (speed 3) (safety 0)))
  (declare (double-float x y))
  (the double-float (+ x y)))

On Fri, 2009-08-07 at 18:59 -0400, Martin Cracauer wrote:
> David A. Ventimiglia wrote on Fri, Aug 07, 2009 at 10:16:50AM -0700: 
> > Hi!
> > 
> > How exactly does one use (declare (optimize (speed 3) (safety0))), as in
> > the following code example, which is adapted from the last chapter in
> > "Practical Common Lisp"?
> > 
> > (defun add (x y)
> >   (declare (optimize (speed 3) (safety 0)))
> >   (declare (double-float x y))
> >   (the double-float (+ x y)))
> > 
> > In Peter Seibel's book, x and y are fixnums, which seems to work ok in
> > SBCL.  With double-floats, however, it emits this note:
> > 
> > note: doing float to pointer coercion (cost 13) to "<return value>"
> 
> Return values and function arguments will have to be type-tagged and
> in the case of floats that means full boxing.  Even in fast mode as
(Continue reading)

Thomas F. Burdick | 8 Aug 2009 19:44
Picon

Re: how to optimize for speed in SBCL

2009/8/8 David A. Ventimiglia <ventimig <at> msu.edu>:
> Thanks for the detailed explanations. So then, should something like
> this be sufficient, provided inlining is supported?

You're talking about SBCL, so yes, inlining is supported.

> (declaim (inline add))

This should suffice.

> (defun add (x y)
>  (declare (optimize (speed 3) (safety 0)))
>  (declare (double-float x y))
>  (the double-float (+ x y)))

Two things here. First, you probably don't want to use (safety 0). It
has its uses, mostly when you want to lie to the compiler and you know
that it's okay to do so. Where you might get efficiency notes at
(safety 1), you'll likely get incorrect runtime behavior at (safety
0). Better to figure out what the notes are trying to tell you, and to
ask here if you can't.

Second, you want to be careful with using (the ...). It will turn into
a run-time assertion if the compiler can't figure out at runtime that
what you're asserting is true. In this case, it's a no-op, since
adding two double-floats results in a double-float.

What you probably want to write here is this:

(declaim (inline add))
(Continue reading)

David A. Ventimiglia | 10 Aug 2009 18:33
Picon
Favicon

Re: how to optimize for speed in SBCL

Thanks, again.  I've another question about these optimization
strategies, then.  What's your (or others') advice on attempting these
optimizations when working with interdependent functions.  What I mean
is this.  I have this function, which calculates a Gaussian conditional
probability density function of l given m, and uses the GNU Scientific
Library via GSLL (http://common-lisp.net/project/gsll/) to do it.  It's
one of the factors in an integrand that gets evaluated many, many times.

(declaim (inline P-l-given-m))
(defun P-l-given-m (l m z omega-m-0 omega-l-0 w log-Ax beta sig-l)
  (declare (optimize (speed 3) (safety 1))
	   (double-float  l m z omega-m-0 omega-l-0 w log-Ax beta sig-l))
  (let* ((l-expected (l-of-m m z omega-m-0 omega-l-0 w log-Ax beta))
	 (del-l (- l l-expected)))
    (gaussian-pdf 
     (coerce del-l 'double-float)
     (coerce sig-l 'double-float))))

I get compiler notes for the 6th line, with this form

(del-l (- l l-expected)

and on the 8th line, with this form

(coerce del-l 'double-float)

essentially because it is not known to the compiler that l-expected is a
double-float.  

So, what are ways to deal with this?  Do I go to the function l-of-m and
(Continue reading)

Leslie P. Polzer | 10 Aug 2009 18:42
Picon

Re: how to optimize for speed in SBCL


David A. Ventimiglia wrote:

> So, what are ways to deal with this?  Do I go to the function l-of-m and
> declare its parameters and return type?  I can do that, but that
> function itself calls other functions, so won't I have to chase them
> down as well?  Is it possible to pick out one function, which may not be
> a leaf function in the tree of calls, and insert declarations into it in
> isolation, or do you have to visit every function in a chain of calls,
> declaring types for each in turn, and keep doing that until you've
> mopped up all the compiler notes?

Have you tried to DECLARE l-expected to be a double-float after its LET
line?

  Leslie

--

-- 
http://www.linkedin.com/in/polzer

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with 
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
Martin Cracauer | 10 Aug 2009 18:56

Re: how to optimize for speed in SBCL

David A. Ventimiglia wrote on Mon, Aug 10, 2009 at 09:33:20AM -0700: 
> Thanks, again.  I've another question about these optimization
> strategies, then.  What's your (or others') advice on attempting these
> optimizations when working with interdependent functions.  What I mean
> is this.  I have this function, which calculates a Gaussian conditional
> probability density function of l given m, and uses the GNU Scientific
> Library via GSLL (http://common-lisp.net/project/gsll/) to do it.  It's
> one of the factors in an integrand that gets evaluated many, many times.
> 
> (declaim (inline P-l-given-m))
> (defun P-l-given-m (l m z omega-m-0 omega-l-0 w log-Ax beta sig-l)
>   (declare (optimize (speed 3) (safety 1))
> 	   (double-float  l m z omega-m-0 omega-l-0 w log-Ax beta sig-l))
>   (let* ((l-expected (l-of-m m z omega-m-0 omega-l-0 w log-Ax beta))
> 	 (del-l (- l l-expected)))
>     (gaussian-pdf 
>      (coerce del-l 'double-float)
>      (coerce sig-l 'double-float))))

I'm afraid this general style of coding for doing floating point work
gets you nowhere in Common Lisp, any existing implementation.  Since
we are not allowed to cut a couple of bits off floats (like we do for
integers) any kind of cross-function boundary will kill you sooner or
later.  Inlining helps but obviously you will either have to put a
stop on inlining and hence suffer horrendously from boxing, or you
will drive your whole code package into a hell of cache misses from
too much inlining.

If you really need this to run fast you will have to operate
exclusively on struct members or on floats in arrays.
(Continue reading)

David A. Ventimiglia | 10 Aug 2009 19:35
Picon
Favicon

Re: how to optimize for speed in SBCL

I see.  And structs are one way of (possibly) achieving constant-space,
compact data?  Maybe that's within reach for me, given that, what I have
is a tree of function calls, most of which use mostly the same
parameters.  Most of them need some or all members of a set of model
parameters, theta_1 through theta_N, which are constant throughout the
chain of calls.  A few of the functions are integrands, so they take
dummy variables which obviously are not constant, in addition to the
constant model parameters.  Finally, there are a few functions
(typically very small, utility, and often simply converting functions)
that don't take any of the model parameters.  

I could start by stuffing my model parameters, theta_1 through theta_N,
as double-floats into a struct, making an instance of that struct at the
root of the tree of calls, then propagating that same instance (without
allocating copies) throughout the tree.  Is that a reasonable way to
start?

Then what do I do about return values?  Typically, as in the functions
I'm most concerned about, which are in the integrands, a function will
take some model parameters plus some dummy parameters, do some math, and
return a number.  I guess I want that return value to be unboxed, just
as I want my parameters to be unboxed.  Could I reserve a slot (or
slots) in my struct for return values?

All the best,
David  

On Mon, 2009-08-10 at 12:56 -0400, Martin Cracauer wrote:
> David A. Ventimiglia wrote on Mon, Aug 10, 2009 at 09:33:20AM -0700: 
> > Thanks, again.  I've another question about these optimization
(Continue reading)

David A. Ventimiglia | 10 Aug 2009 19:49
Picon
Favicon

Re: how to optimize for speed in SBCL

I did, after Martin made that recommendation in an earlier post.  Thanks
to both of you.

Best,
David

On Mon, 2009-08-10 at 18:42 +0200, Leslie P. Polzer wrote:
> David A. Ventimiglia wrote:
> 
> > So, what are ways to deal with this?  Do I go to the function l-of-m and
> > declare its parameters and return type?  I can do that, but that
> > function itself calls other functions, so won't I have to chase them
> > down as well?  Is it possible to pick out one function, which may not be
> > a leaf function in the tree of calls, and insert declarations into it in
> > isolation, or do you have to visit every function in a chain of calls,
> > declaring types for each in turn, and keep doing that until you've
> > mopped up all the compiler notes?
> 
> Have you tried to DECLARE l-expected to be a double-float after its LET
> line?
> 
>   Leslie
> 
--

-- 
David A. Ventimiglia <ventimig <at> msu.edu>
Michigan State University

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
(Continue reading)


Gmane