Optimising a combinatorial algorithm?
Mario Lang <mlang <at> delysid.org>
2015-09-24 14:11:29 GMT
TL;DR: My implementation looks elegant (to me), but is rather slow
compared to a well optimized C++ impl. of the same algorithm. Runtime
differences are currently huge. Looking for input on what I could
change to match C++ performance, without loosing too much expressiveness
of the code.
I am writing software to deal with Braille music code.
As an exercise for me, and as an attempt to boil the algorithm down
to a more readable form, I am in the process of translating my C++
program to Haskell. So far, so good. Given that I only came back to
Haskell (after 10 years of a break) roughly 2 weeks ago, I think I
have made quite nice progress. However, now that I reached a stage
where I can actually do a performance comparison, the dreaded thing has
happened: The performance of my Haskell implementation is rather
horrible to what my C++ implementation can deliver. While the code size
has gone down roughly by a factor of 10, runtime has gone up roughly by
the same factor...
For the impatient, here is the code I am talking about:
The first half of the code is concerend with parsing braille.
Performance is not critical here, so I opted to use Parsec, mostly
because I am interested in good quality error messages.
The fun starts with the definition of 'pvs'.
Braille music values are inherently ambiguous. The sign for a whole
note can also mean a 16th note, a sign for a half note can also
mean a 32th note, the sign for a quarter note can also mean a 64th, and
a 8th note could also mean a 128th note. This is histoical, and there
is no way around this. The time signature (meter) is used to actually
tell which note is which. This is relatively easy to do for an
experienced human, but quite complicated for a computer.
A measure (bar) of braille music can contain parallel *and* sequential
parts. The data structure is basically:
data Sign = ...
type PartialVoice = [Sign]
type PartialMeasure = [PartialVoice]
type Voice = [PartialMeasure]
type Measure = [Voice]
As a constraint, all PartialVoices in a PartialMeasure need to have the
exact same duration. Similarily, all Voices inside a Measure also need to
have the exact same duration.
So 'pvs', 'pms', 'vs' and 'ms' in my Data.Braille.Music module are concerned
with calculating all possible interpretations of the input.
All in all, the current Haskell implementation is doing what it should.
However, profiling tells me I am allocating roughly 2GB of memory
to disambiguate a single measure of music in 3/2 time.
This takes roughly 1.2s on my rather fast CPU.
The amount of allocation and the runtime are unacceptable.
If I leave it at that, I might as well ditch the code and terminate the
As a comparison, my C++ impl. takes roughly 1s to disambiguate 32 measures of
the same piece, while Haskell already takes 1.2s to disambiguate just
one of these measures.
Profiling tells me I am spending 90% of all the time in 'dur', which is my
small helper method to calculate the duration of a PartialVoice,
PartialMeasure or Voice.
The project is setup to support profiling:
git clone https://github.com/mlang/hbmc
cabal run profiling
Do you see a way to significantly speed this algorithm up, without
loosing a lot of expressiveness/elegancy?
Since I am back to Haskell since only 2 weeks, I am basically happy abut
every sort of hint/tip you can give me. Stylistic, as well as
For now, this is really just a toy project. However, if I can manage to
improve the performance significantly without destroying the readability
of the code too much, I might be interested to finish translating my
current C++ work into a more-or-less complete Haskell library.
After all, BMC is more or less a compiler.
And Haskell is supposed to be very good at this stuff
Here is a link to the original C++ project:
Beginners mailing list
Beginners <at> haskell.org