Re: Large Datastores, sponsorship
Matthew Toseland <toad@...
2002-12-27 03:01:28 GMT
On Fri, Dec 27, 2002 at 03:39:10AM +0100, Frank v Waveren wrote:
> On Fri, Dec 27, 2002 at 02:08:11AM +0000, Matthew Toseland wrote:
> > Now, we can solve the above problems and greatly reduce the amount of
> > memory used by:
> > * Implementing a hack, storeSize=0, which does not attempt to limit
> > store space usage, and therefore does not need the LRU list.
> Sounds good, if the sponsor (thanks btw!) is prepared to add disks
> indefinitely. However, just purging the datastore (partially) once in
> a while seems far from optimal.
Yeah, the intention is to add disks indefinitely, or set the DS size to
something nonzero after a while.
> > * Using a bit-vector hash to optimize lookups. A largish datastore might
> > have 200,000 files; with a scarcity factor of 50, this is only
> > ~10,000,000 bits, i.e. just over a meg. This would be a config
> > parameter - the power of two to use for the number of bits.
> I'm not sure what you mean by a 'bit-vector hash' but I'd suggest
> using a bloom filter (if that wasn't already what you meant).
Well, I'm not familiar with all the CS terminology, but the idea is this:
run each key through a good hash function, take the first n bits of the
hash as an index into a vector of 2^n bits, and set that bit to true. A
set bit means there is a key in the store whose hash starts with that
bitstring; a clear bit means the key is definitely not in the store.
Bithashes are really cheap: 1,000,000 keys at 50:1 scarcity (probably
total overkill) makes 50,000,000 bits ~= 6MB; rounding up to the next
power of two, 2^26 bits, gives 8MB. Compared to the reported hundreds of
megabytes of RAM used by the current structures, this is a bargain.
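
A minimal sketch of the bithash described above, in Python for brevity.
It is an illustration and not the Freenet implementation: the class name
BitHash, the helper bits_exponent, and the choice of SHA-1 as the "nice
hash function" are all assumptions made for the example.

```python
import hashlib


def bits_exponent(num_keys, scarcity):
    """Smallest n such that 2**n >= num_keys * scarcity (hypothetical helper)."""
    return (num_keys * scarcity - 1).bit_length()


class BitHash:
    """Vector of 2**n bits indexed by the first n bits of a key's hash."""

    def __init__(self, n_bits):
        # SHA-1 (assumed here) yields 160 bits, so n cannot exceed that.
        assert 0 <= n_bits <= 160
        self.n_bits = n_bits
        self.bits = bytearray(((1 << n_bits) + 7) // 8)

    def _index(self, key):
        digest = hashlib.sha1(key).digest()
        # Keep only the first n bits of the 160-bit digest.
        return int.from_bytes(digest, "big") >> (160 - self.n_bits)

    def add(self, key):
        i = self._index(key)
        self.bits[i >> 3] |= 1 << (i & 7)

    def might_contain(self, key):
        # False means "definitely not in the store";
        # True means "possibly in the store, go and check the disk".
        i = self._index(key)
        return bool(self.bits[i >> 3] & (1 << (i & 7)))


n = bits_exponent(1_000_000, 50)   # 50,000,000 bits rounded up to 2**n
size_bytes = (1 << n) // 8         # memory needed for the vector
```

With 1,000,000 keys and a scarcity factor of 50, n comes out as 26 and
the vector occupies 8MB, matching the estimate above.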
> A problem would however be having to regenerate the cache every time
> you remove something, which would mean having to either do store
> purges very coarsely, or not at all as you suggested.
Yeah. Eventually, we will have to deal with this properly. It will
involve messy stuff like on-disk LRU list structures.
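
To illustrate why removal is awkward: several keys can share one bit, so
clearing a bit when a key is deleted could hide another key that is
still in the store. The safe option with a plain bit vector is to
rebuild it from the keys that remain. The Python sketch below shows that
rebuild; the function names and the use of SHA-1 are assumptions for the
example, not existing Freenet code.

```python
import hashlib


def bit_index(key, n_bits):
    """First n_bits of the key's SHA-1 digest, as an integer index."""
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest, "big") >> (160 - n_bits)


def build_bits(keys, n_bits):
    """Build a fresh 2**n_bits bit vector from every key in the store."""
    bits = bytearray(((1 << n_bits) + 7) // 8)
    for key in keys:
        i = bit_index(key, n_bits)
        bits[i >> 3] |= 1 << (i & 7)
    return bits


def lookup(bits, key, n_bits):
    """True if the key's bit is set, i.e. the key may be in the store."""
    i = bit_index(key, n_bits)
    return bool(bits[i >> 3] & (1 << (i & 7)))


# After a purge, regenerate the vector from the keys that survived.
store = {b"key-a", b"key-b", b"key-c"}
store.discard(b"key-a")
bits = build_bits(store, 16)
```

The rebuild costs one pass over every key in the store, which is why
purges would have to be coarse and infrequent with this structure.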
> Frank v Waveren
> fvw <at> [var.cx|stack.nl|chello.nl] ICQ#10074100
> Public key: hkp://wwwkeys.pgp.net/fvw@...
> Fingerprint: 21A7 C7F3 1FF3 47FF 545C CB53 7BD9 09C0 3AC1 6DF2
Freenet/Coldstore open source hacker.
Employed full time by Freenet Project Inc. from 11/9/02 to 11/1/03