1 Jul 02:26
Re: Proposed templated integer_sort
Steven Ross <spreadsort <at> gmail.com>
2008-07-01 00:26:14 GMT
2008-07-01 00:26:14 GMT
On Mon, Jun 30, 2008 at 12:26 PM, Simonson, Lucanus J < lucanus.j.simonson <at> intel.com> wrote: > Steven Ross wrote: > >For larger n, the advantage vs. pure comparison-based algorithms > increases. > >In effect, Spreadsort smoothly transitions from pure MSB Radix sort to > >std::sort > >with its data cut up by a couple iterations of MSB Radix sort > >(reducing the size of the log(n) factor), > >depending upon the distribution and size of the data. > > The analysis seems a little too optimistic and seems to be based upon > the assumption that the input is evenly distributed. To analyze the > complexity you need to think like an adversary who knows your algorithm > and intentionally creates an input that will make it perform poorly. In > the worst case your algorithm would be slower than std::sort because a > data set with a cluster of values that all fall within the same bin and > a handful of outliers that give rise to the unfortunate binning would > result in runtime for std::sort + runtime for one or more recursive > stages of profitless binning. So, the algorithm should be expected to > be faster than std::sort alone for some inputs, but slower for others. > The worst case is actually binning that only marginally splits up the data each iteration in such a fashion that std::sort must be called on an only slightly reduced size list at the end. This corresponds to the square_root(bit_sizeof(T)) + a constant number of std::sort iterations situation. Because each radix-based sorting iteration is significantly slower than an introsort iteration, this situation may(Continue reading)

RSS Feed