Whoops! Correction to Statistical Condorcet
Kristofer Munsterhjelm <km_elmet <at> tonline.de>
2014-09-03 18:10:47 GMT
Apparently the unconstrained MLE for the multinomial distribution with
given probabilities isn't the Sainte-Laguë or Webster apportionment, but
the D'Hondt or Jefferson apportionment[1].
This came as quite a surprise to me, given that the chi-squared and
G-tests are claimed to approach the exact test as n goes to infinity. I
may look into this in detail later, but I suspect the situation is that
although the exact test value, call it x, and the G-test value, call it
y, obey lim n->inf (x_n - y_n) = 0, for any finite n, the exact test will
give the D'Hondt assignment a greater value (more likely draw) than the
Sainte-Laguë one, whereas it's the other way around for the G-test (or
chi-squared test).
I am not a formal statistician, though! And since I got the implications
of the convergence wrong, I might be wrong about this as well.
For clarification purposes, letting s be the number of seats, the exact
test for the multinomial distribution is: given a draw vector x (i.e.
so many of x_1, so many of x_2, etc.), a probability vector p = (p_1,
p_2, ...), and the multinomial pmf P,
Pr(x) = sum [for all y so that P(y; s, p) <= P(x; s, p)] P(y; s, p)
And again letting s be the number of seats, with n the number of parties,
define the chi-squared test statistic as
chi(x, p) = SUM i=1...n (x_i/s - p_i)^2/(p_i)
Then, say, for the following votes: (10, 9, 8, 5, 4) and 5 seats, we have:
p-vector: 0.28, 0.25, 0.22, 0.14, 0.11
Sainte-Laguë: (1, 1, 1, 1, 1)
D'Hondt: (2, 1, 1, 1, 0)
Exact test value for Sainte-Laguë: 0.899
Exact test value for D'Hondt: 1.0
multinomial pmf for the Sainte-Laguë assignment: 0.028
multinomial pmf for the assignments with greater probability than this:
0.03234 for [1, 2, 1, 1, 0]
0.03622 for [2, 1, 1, 1, 0]
0.03234 for [2, 2, 1, 0, 0]
But the chi-squared statistic for Sainte-Laguë is 0.13403 while the one
for the D'Hondt apportionment is 0.19896, thus ranking the former higher
than the latter.
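
The figures in this small example can be checked by brute force. The
following is only a sketch, not the code used to produce the numbers
above: the function names are made up for illustration, and it uses the
rounded p-vector as printed. Under that rounded vector some allocations
tie exactly with the Sainte-Laguë pmf (for instance [2, 1, 1, 0, 1]), so
the "<=" comparison is done with a small relative tolerance, which is an
assumption of this sketch.

```python
from math import factorial, prod

def multinomial_pmf(x, p):
    """Multinomial pmf P(x; s, p), with s = sum(x) seats."""
    coef = factorial(sum(x))
    for xi in x:
        coef //= factorial(xi)  # stays an exact integer at every step
    return coef * prod(pi ** xi for pi, xi in zip(p, x))

def allocations(s, k):
    """Yield every way of splitting s seats among k parties."""
    if k == 1:
        yield (s,)
        return
    for first in range(s + 1):
        for rest in allocations(s - first, k - 1):
            yield (first,) + rest

def exact_test(x, p, rel_tol=1e-9):
    """Sum of P(y; s, p) over all y with P(y; s, p) <= P(x; s, p)."""
    px = multinomial_pmf(x, p)
    total = 0.0
    for y in allocations(sum(x), len(x)):
        py = multinomial_pmf(y, p)
        if py <= px * (1 + rel_tol):  # tolerance: float noise must not break ties
            total += py
    return total

def chi_statistic(x, p):
    """chi(x, p) = SUM (x_i/s - p_i)^2 / p_i, as defined in this post."""
    s = sum(x)
    return sum((xi / s - pi) ** 2 / pi for xi, pi in zip(x, p))

p = (0.28, 0.25, 0.22, 0.14, 0.11)
for name, x in (("Sainte-Laguë", (1, 1, 1, 1, 1)), ("D'Hondt", (2, 1, 1, 1, 0))):
    print(name, round(exact_test(x, p), 3), round(chi_statistic(x, p), 5))
```

Enumerating every allocation is only feasible for small cases like this
one (126 allocations for 5 seats and 5 parties).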
This is true even for large s, e.g.:
p = (0.3786, 0.245265, 0.1846, 0.06637, 0.06583, 0.059335)
150 seats
Sainte-Laguë: [56, 37, 28, 10, 10, 9]
D'Hondt: [57, 37, 28, 10, 9, 9]
pmf for Sainte-Laguë: 1.64*10^-5
pmf for D'Hondt: 1.66*10^-5
chi-square for Sainte-Laguë: 0.00012
chi-square for D'Hondt: 0.00058
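
For a case this size, brute-force enumeration is out of the question, but
the two apportionments and their pmf values can still be recomputed
directly. The sketch below is an illustration with made-up function
names: it uses the usual sequential highest-averages formulation of both
divisor methods and evaluates the pmf in log space via lgamma so the
factorials don't overflow. Small differences from the rounded figures
above are possible in the last digit.

```python
from math import exp, lgamma, log

def highest_averages(weights, seats, divisor):
    """Sequential divisor method: each seat goes to the party with the
    largest weight / divisor(seats already held); ties go to the lowest
    index (an arbitrary choice for this sketch)."""
    alloc = [0] * len(weights)
    for _ in range(seats):
        winner = max(range(len(weights)),
                     key=lambda i: weights[i] / divisor(alloc[i]))
        alloc[winner] += 1
    return alloc

def log_multinomial_pmf(x, p):
    """log P(x; s, p), computed with lgamma to avoid huge factorials."""
    out = lgamma(sum(x) + 1)
    for xi, pi in zip(x, p):
        out += xi * log(pi) - lgamma(xi + 1)
    return out

def chi_statistic(x, p):
    """chi(x, p) = SUM (x_i/s - p_i)^2 / p_i, as defined in this post."""
    s = sum(x)
    return sum((xi / s - pi) ** 2 / pi for xi, pi in zip(x, p))

p = (0.3786, 0.245265, 0.1846, 0.06637, 0.06583, 0.059335)
sl = highest_averages(p, 150, lambda n: 2 * n + 1)  # divisors 1, 3, 5, ...
dh = highest_averages(p, 150, lambda n: n + 1)      # divisors 1, 2, 3, ...
for name, x in (("Sainte-Laguë", sl), ("D'Hondt", dh)):
    print(name, x, exp(log_multinomial_pmf(x, p)), chi_statistic(x, p))
```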

Of course, if you like D'Hondt (for stability reasons or otherwise), you
don't need to do anything to Statistical Condorcet to fix the above.
Because it's Condorcet-based, it should also favor compromise parties
rather than parties that get large numbers of first-preference votes, so
it is better than ordinary D'Hondt in that respect.
But if you don't, then the elegance of maximizing the pmf is lost. So we'd
have to find some way of using, say, the global optimality properties
mentioned in http://rangevoting.org/Apportion.html directly. But this is
tricky because they are all minimization properties, which means that
the optimizer might just decide to set zero voters to participate and
thus get a perfect zero every time.
That is again something to investigate later. Perhaps taking the area of
the chi-squared distribution above the point given by the chi-squared or
G-test would work: that turns it into a maximization problem again. But
since the cdf for chi-squared involves gamma functions, optimizing that
might be rather difficult.
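
Evaluating that upper-tail area, at least, is manageable for integer
degrees of freedom. The sketch below is not part of any existing
Statistical Condorcet code and says nothing about how hard the
optimization would be. It assumes two things: that the statistic
defined earlier is multiplied by s to get the usual Pearson statistic
(since SUM (x_i - s*p_i)^2/(s*p_i) = s * chi(x, p)), and that the
degrees of freedom are the number of parties minus one. The function
names are made up.

```python
from math import erfc, exp, lgamma, log, sqrt

def chi2_upper_tail(t, df):
    """Area of the chi-squared distribution with integer df above t,
    i.e. the regularized upper incomplete gamma Q(df/2, t/2)."""
    if t <= 0:
        return 1.0
    x = t / 2.0
    if df % 2 == 0:
        a, q = 1.0, exp(-x)        # Q(1, x) = e^-x
    else:
        a, q = 0.5, erfc(sqrt(x))  # Q(1/2, x) = erfc(sqrt(x))
    while a < df / 2.0:
        # Recurrence: Q(a + 1, x) = Q(a, x) + x^a e^-x / Gamma(a + 1)
        q += exp(a * log(x) - x - lgamma(a + 1))
        a += 1.0
    return q

def chi_statistic(x, p):
    """chi(x, p) = SUM (x_i/s - p_i)^2 / p_i, as defined in this post."""
    s = sum(x)
    return sum((xi / s - pi) ** 2 / pi for xi, pi in zip(x, p))

def tail_score(x, p):
    """Larger is better: tail area above the Pearson statistic s * chi(x, p)."""
    return chi2_upper_tail(sum(x) * chi_statistic(x, p), len(x) - 1)

p = (0.28, 0.25, 0.22, 0.14, 0.11)
print(tail_score((1, 1, 1, 1, 1), p))  # Sainte-Laguë allocation
print(tail_score((2, 1, 1, 1, 0), p))  # D'Hondt allocation
```

On the five-seat example this score ranks the Sainte-Laguë allocation
above the D'Hondt one, the same order as the raw chi-squared statistic.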
Alternatively, we might go deeper. Why choose Sainte-Laguë to begin
with? Because it's unbiased: it doesn't consistently favor small or
large parties (and, because it's a divisor method, it has certain
favorable properties we'd like to carry over). So find something that is
unbiased. But the problem with that is that we might lose the "reduction
to Webster when everybody plumps" property.

[1] I uncovered this when reading "A fast and simple algorithm for
finding the modes of a multinomial distribution" by White and Hendy. It
gives an algorithm for finding a mode of the multinomial, i.e. an
apportionment that maximizes the exact test value. The paper is
paywalled, but the algorithm is essentially a combination of Jefferson
and D'Hondt: first they get a Jefferson solution for a number of seats
that's close enough to the number of seats specified, and then they run
D'Hondt either forwards or in reverse until they get the number of seats
you want. The authors don't appear to recognize the solution as D'Hondt,
though.
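
To make "running D'Hondt forwards or in reverse" concrete, here is a
minimal sketch of the two single-seat steps. It is not White and Hendy's
algorithm: the starting allocation, the function names and the
tie-breaking are all assumptions made for illustration, and the reverse
step only gives a D'Hondt result if the starting allocation is itself a
D'Hondt solution for its own seat total.

```python
def dhondt_forward(votes, alloc):
    """Add one seat: it goes to the party maximizing votes / (seats + 1)."""
    i = max(range(len(votes)), key=lambda j: votes[j] / (alloc[j] + 1))
    alloc[i] += 1

def dhondt_reverse(votes, alloc):
    """Remove one seat: it comes from the seated party minimizing votes / seats."""
    i = min((j for j in range(len(votes)) if alloc[j] > 0),
            key=lambda j: votes[j] / alloc[j])
    alloc[i] -= 1

def adjust_to(votes, alloc, seats):
    """Step a copy of alloc forwards or backwards until it has `seats` seats."""
    alloc = list(alloc)
    while sum(alloc) < seats:
        dhondt_forward(votes, alloc)
    while sum(alloc) > seats:
        dhondt_reverse(votes, alloc)
    return alloc

votes = (10, 9, 8, 5, 4)
six = adjust_to(votes, [0] * 5, 6)  # forwards from nothing to 6 seats
print(six, adjust_to(votes, six, 5))  # then one step in reverse to 5 seats
```

Stepping forwards from an empty allocation to 5 seats, or backwards from
the 6-seat result, gives the same (2, 1, 1, 1, 0) allocation as in the
example in the main text.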

Election-Methods mailing list - see http://electorama.com/em for list info