5 Dec 2002 21:01
large size hash-tables and CPU time
Lin Xu <lxu <at> cse.unl.edu>
2002-12-05 20:01:05 GMT
2002-12-05 20:01:05 GMT
We are running ACL 6.2 on Sun OS and wonder what we are wrong..
Our implementation relies on storing data in hash-tables. We do not
know a priori the number of data items to be stored (could vary from
20 data items to 60,000 even more).
We noticed (with profiler) that, with a fixed hash-table size, much
effort is spent on (re-)hashing. So, we tried to declare the size of
the hashtable as some function of the input parameters. We use the
following rehash arguments :rehash-threshold 0.9 :rehash-size 1.5.
When we run our experiments, the CPU time performance (measured with
the function time, w/o accounting for the effort spent on gc) is very
erratic: sometimes the fixed size hash-table shows a better
performance sometimes the opposite is true...
This is a serious problem as it is preventing us from drawing any
conclusions WRT to the `algorithmic' benefits of our research.
Are we doing anything wrong? how can we improve our implementation in
order to have a more `stable' results?
Thank you for any pointer.
Lin XU
.'*
:`...' `.,' '
(Continue reading)
RSS Feed