Joshua Haberman | 2 Aug 08:50 2009

equivalent of EXPLAIN PLAN with GHC?

Hello, I'm quite new to Haskell, but experienced in other languages (C,
Python, Ruby, SQL, etc).  I am interested in Haskell because I've heard
that the language is capable of lots of optimizations based on laziness,
and I want to learn more about that.

I dug in with Project Euler problem #1, and wrote:

main = print (show (sum [x | x <- [3..999], x `mod` 3 == 0 || x `mod` 5 == 0]))

So far so good, but I want to have some way of observing what
optimizations GHC has performed.  Most notably, in this example I want
to know if the list was ever actually constructed in memory.  The "sum"
function only needs the elements one at a time, in order, so if Haskell
and GHC are everything I've heard about them, I would fully expect the
list construction to be optimized out.  :)

Unfortunately I was not able to see any way of examining ghc's output to
determine whether it had performed this optimization.  The C it produced
with '-fvia-C -C' was totally unreadable -- it looked like something
from the IOCCC. :(  I couldn't find any way to match up any of the code
it had generated with code I had written.  My attempts to objdump the
binaries was similarly unproductive.

Is there any kind of intermediate form that a person can examine to see
how their code is being optimized?  Anything like EXPLAIN PLAN in SQL?
It would make it much easier to understand the kinds of optimizations
Haskell can perform.

I'm not looking so much for profiling -- obvious this program is trivial
and takes no time.  I just want to better understand what kind of
(Continue reading)

Thomas DuBuisson | 2 Aug 09:12 2009
Picon

Re: equivalent of EXPLAIN PLAN with GHC?

Josh,

In general you'll find the haskell-cafe (haskell-cafe <at> haskell.org) to
be a more lively place for this type of discussion, but since we're
here I might as well mention that memory use of a Haskell function is
one of the hardest things to gain an understanding about.

> main = print (show (sum [x | x <- [3..999], x `mod` 3 == 0 || x `mod` 5 == 0]))
> I want
> to know if the list was ever actually constructed in memory.

For such a simple program I suggest you test with lists of various lengths:

---------------- main with n == 999 -------------------
[tommd <at> Mavlo Test]$ ghc --make -O2 opt.hs
[1 of 1] Compiling Main             ( opt.hs, opt.o )
Linking opt ...
[tommd <at> Mavlo Test]$ ./opt +RTS -sstderr
./opt +RTS -sstderr
"233168"
          94,604 bytes allocated in the heap
             700 bytes copied during GC
          17,144 bytes maximum residency (1 sample(s))
          23,816 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:     0 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
(Continue reading)

Joshua Haberman | 2 Aug 09:45 2009

Re: equivalent of EXPLAIN PLAN with GHC?

Hi Thomas, thanks for the reply!

Thomas DuBuisson <thomas.dubuisson <at> gmail.com> writes:
> Josh,
> 
> In general you'll find the haskell-cafe (haskell-cafe <at> haskell.org) to
> be a more lively place for this type of discussion

Good to know, I just wasn't sure if it was appropriate for
GHC-specific questions.

> but since we're
> here I might as well mention that memory use of a Haskell function is
> one of the hardest things to gain an understanding about.

Hmm, that's unfortunate.  :(

> > main = print (show (sum [x | x <- [3..999], x `mod` 3 == 0 ||
> > x `mod` 5 == 0]))
> > I want
> > to know if the list was ever actually constructed in memory.
> 
> For such a simple program I suggest you test with lists of various lengths:

Cool, thanks for the example.

>           94,604 bytes allocated in the heap

Is there any way I could find out what these 94kb of RAM were
allocated for?  This seems high -- my entire program's working set
(Continue reading)

Don Stewart | 2 Aug 09:50 2009

Re: equivalent of EXPLAIN PLAN with GHC?

joshua:
> Hello, I'm quite new to Haskell, but experienced in other languages (C,
> Python, Ruby, SQL, etc).  I am interested in Haskell because I've heard
> that the language is capable of lots of optimizations based on laziness,
> and I want to learn more about that.
> 
> I dug in with Project Euler problem #1, and wrote:
> 
> main = print (show (sum [x | x <- [3..999], x `mod` 3 == 0 || x `mod` 5 == 0]))
> 
> So far so good, but I want to have some way of observing what
> optimizations GHC has performed.  Most notably, in this example I want
> to know if the list was ever actually constructed in memory.  The "sum"
> function only needs the elements one at a time, in order, so if Haskell
> and GHC are everything I've heard about them, I would fully expect the
> list construction to be optimized out.  :)
> 
> Unfortunately I was not able to see any way of examining ghc's output to
> determine whether it had performed this optimization.  The C it produced
> with '-fvia-C -C' was totally unreadable -- it looked like something
> from the IOCCC. :(  I couldn't find any way to match up any of the code
> it had generated with code I had written.  My attempts to objdump the
> binaries was similarly unproductive.
> 
> Is there any kind of intermediate form that a person can examine to see
> how their code is being optimized?  Anything like EXPLAIN PLAN in SQL?
> It would make it much easier to understand the kinds of optimizations
> Haskell can perform.
> 
> I'm not looking so much for profiling -- obvious this program is trivial
(Continue reading)

Bulat Ziganshin | 2 Aug 10:07 2009
Picon

Re[2]: equivalent of EXPLAIN PLAN with GHC?

Hello Joshua,

Sunday, August 2, 2009, 11:45:57 AM, you wrote:

>>           94,604 bytes allocated in the heap

> Is there any way I could find out what these 94kb of RAM were
> allocated  for?  This seems high -- my entire program's working set
> is <6kb.

as Don said, compiled code works on Int# (which is the same as C int)
but probably not via registers. instead, each intermediate value
allocated in heap, so there are total 90 bytes per iteration = about
22 integers

--

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin <at> gmail.com
Don Stewart | 2 Aug 10:10 2009

Re: equivalent of EXPLAIN PLAN with GHC?

bulat.ziganshin:
> Hello Joshua,
> 
> Sunday, August 2, 2009, 11:45:57 AM, you wrote:
> 
> >>           94,604 bytes allocated in the heap
> 
> > Is there any way I could find out what these 94kb of RAM were
> > allocated  for?  This seems high -- my entire program's working set
> > is <6kb.
> 
> as Don said, compiled code works on Int# (which is the same as C int)
> but probably not via registers. instead, each intermediate value
> allocated in heap, so there are total 90 bytes per iteration = about
> 22 integers

In my example, there's no heap allocation. In the version with
build/foldr though, there will be heap nodes passed to sum.

The fused one results in,

          32,160 bytes allocated in the heap

But there's no heap allocation in the filtering code, only once we have
to construct the string to print.

-- Don
Don Stewart | 2 Aug 10:35 2009

Re: equivalent of EXPLAIN PLAN with GHC?

dons:
> Showing what transformations happened. Notably, 2 occurences of the "streamU/unstreamU"
> transformation, to remove intermediate structures.
> 
> The final code looks like:
> 
>     $s$wfold :: Int# -> Int#
>    $s$wfold =
>   \ (sc_s19l :: Int#) ->
>     case modInt# (-9223372036854775807) 3 of wild21_a14L {
>       __DEFAULT ->
>         case modInt# (-9223372036854775807) 5 of wild211_X159 {
>           __DEFAULT -> $wfold sc_s19l (-9223372036854775806);
>           0 ->
>             $wfold
>               (+# sc_s19l (-9223372036854775807)) (-9223372036854775806)
>         };
>       0 ->
>         $wfold
>           (+# sc_s19l (-9223372036854775807)) (-9223372036854775806)
>     }
>     $wfold :: Int# -> Int# -> Int#

Oh, this is also a good illustration of why you should use `rem` instead
of `mod` , if you're not expecting negative numbers:

The core with filterU (\x -> x `rem` 3 == 0 || x `rem` 5 == 0) avoids all the
checks for -9223372036854775806.

    $s$wfold :: Int# -> Int#
(Continue reading)

Serge D. Mechveliani | 2 Aug 11:24 2009
Picon

Re to `optimization for list'

Joshua Haberman <joshua <at> reverberate.org>  writes on  2 Aug 2009 

> Hello, I'm quite new to Haskell, but experienced in other languages (C,
> Python, Ruby, SQL, etc).  I am interested in Haskell because I've heard
> that the language is capable of lots of optimizations based on laziness,
> and I want to learn more about that.
>
> I dug in with Project Euler problem #1, and wrote:
> 
> main = 
>  print (show (sum [x | x <- [3..999], x `mod` 3 == 0 || x `mod` 5 == 0]))
>
> So far so good, but I want to have some way of observing what
> optimizations GHC has performed.  Most notably, in this example I want
> to know if the list was ever actually constructed in memory.  The "sum"
> function only needs the elements one at a time, in order, so if Haskell
> and GHC are everything I've heard about them, I would fully expect the
> list construction to be optimized out.  :)
> [..]

Your example is rather complex to start with, because `sum' has rather a 
complex definition in the Haskell library.
Consider first a more pure example:

  main = putStr (shows (or' ys) "\n")
         where
         n  = 10^8
         ys = replicate n False

         or' :: [Bool] -> Bool
(Continue reading)

Bertram Felgenhauer | 2 Aug 20:32 2009

Re: Working with GHC HEAD

Antoine Latter wrote:
> I was trying to see what GHC head was like, but I've run into a few
> snags compiling packages.

There's a discrepancy between ghc and ghc-pkg that causes this.
See http://hackage.haskell.org/trac/ghc/ticket/3410

> My existing binary for cabal-install can install quite a few packages,
> but then starts giving me strange errors eventually:

I believe cabal-install goes through ghc-pkg rather than manipulating
the package configuration directly. So this should work in theory.

> - Does anyone have a version of 'network' which builds against GHC
> head? I could bludgeon in the new GHC.IO.FD.FD type myself, but I'd
> thought I'd ask around first.

http://int-e.home.tlink.de/haskell/network-ghc-6.11.dpatch

works for me.

Bertram
Joshua Haberman | 3 Aug 09:05 2009

Re: equivalent of EXPLAIN PLAN with GHC?

Don Stewart <dons <at> galois.com> writes:
> joshua:
> > Is there any kind of intermediate form that a person can examine to see
> > how their code is being optimized?  Anything like EXPLAIN PLAN in SQL?
> > It would make it much easier to understand the kinds of optimizations
> > Haskell can perform.
> > 
> > I'm not looking so much for profiling -- obvious this program is trivial
> > and takes no time.  I just want to better understand what kind of
> > optimizations are possible given Haskell's language model.
> > 
> 
> So the optimization you're looking for here is fusion of some kind.
> GHC ships with build/foldr fusion, and there are libraries for an
> alternative system, stream fusion. 
> 
> GHC uses an intermediate representation called 'Core', which is a
> mini-Haskell, essentially, that is optimized repeatedly via
> type-preserving transformations. You can inspect this with a number of
> tools, including "ghc-core" (available on Hackage).

Excellent Don, thanks a lot for taking the time to spell this
out for me.  I don't understand it all yet, but I know what to
dig into next time I want to observe the compiler's behavior.

Josh

Gmane