Joseph Ashwood | 2 Dec 02:59 2006

Load questions

I've only got some very old information on this, but I know there are many
newer advances (my information is pre-DHT). What kind of server load am I
looking at for BT? Assuming good clients (since I'll have significant
control over the clients as well), and the availability of DHT. I'm
particularly curious about the scaling as 1M+ users come into the system.
Due to the data being distributed there would be an exponential curve for
the popularity of torrents, that is to say that >80% of the peers will be
connected to <20% of the torrents, probably more like 90:10, or even 95:5.
For this I will obviously be throwing the compute power necessary behind it,
as well as the RAM and storage, but for estimation purposes, does anyone
have any approximations on these?

I'm expecting imperfect answers but I am hoping to get something I can
base some ballpark estimates on.
                        Joe
Olaf van der Spek | 2 Dec 11:12 2006

Re: Load questions

It depends a lot on the tracker itself.
PHP can in general handle up to 40k peers, XBT Tracker (C++) can
handle up to at least 800k peers. RAM is no issue for XBTT and the
first bottleneck most operators encounter is network bandwidth.

Joseph Ashwood | 2 Dec 13:32 2006

Re: Load questions

----- Original Message ----- 
From: "Olaf van der Spek" <olafvdspek <at> gmail.com>
Sent: Saturday, December 02, 2006 2:12 AM
Subject: Re: [bittorrent] Load questions

> It depends a lot on the tracker itself.
> PHP can in general handle up to 40k peers,

That obviously won't cut it then.

> XBT Tracker (C++) can
> handle up to at least 800k peers.

What class of processor/memory? It's one thing to say that about a BlueGene, 
another to say that about a PII. I'm thinking that's about the level of a P4 
@ 3GHz. Also, is XBTT written in such a way that it can be clustered and 
have multiple copies running locally? If so, what are the limitations (e.g. 
shared torrents?), or would I have to use virtualization for that?

> RAM is no issue for XBTT and the
> first bottleneck most operators encounter is network bandwidth.

I already know bandwidth is going to be the choking point on this (primarily 
because of the seeding of rare files), and that is why the final 
implementation for this will likely preference peer-of-peer referrals to 
tracker referrals. RAM may not normally be an issue, but when you're trying 
to make something as dense as possible it becomes one. What kind of RAM 
usage is expected, and what is a rough upper bound for XBTT?
                Joe

Olaf van der Spek | 2 Dec 13:49 2006

Fwd: Load questions

On 12/2/06, Joseph Ashwood <ashwood <at> msn.com> wrote:
> > XBT Tracker (C++) can
> > handle up to at least 800k peers.
>
> What class of process/memory? It's one thing to say that about a BlueGene,
> another to say that about a PII, I'm thinking that's about the level of a P4

I don't run the trackers myself and I don't have the details, but I'm
thinking about a decent K8/P4 at least.

> @ 3GHz. Also is XBTT written in such a way that it can be clustered, and
> have multiple copies running locally, if so what are the limitations (e.g.
> shared torrents?), or would I have to use virtualization for that?

What do you want to achieve?
There's no special support for clustering, but you could run two
independent instances on two servers and have 50% of the torrents
tracked by one and the others by the other.
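
The split across independent instances could be sketched roughly as follows. This is only an illustration, assuming torrents are assigned by their info_hash; the tracker URLs and the function name are hypothetical, and nothing here is specific to XBTT.

```python
import hashlib

# Hypothetical announce URLs for two independent tracker instances.
TRACKERS = [
    "http://tracker-a.example.org/announce",
    "http://tracker-b.example.org/announce",
]


def tracker_for(info_hash: bytes, trackers=TRACKERS) -> str:
    """Deterministically pick one tracker instance for a torrent."""
    # An info_hash is a SHA-1 digest, so its leading bytes are close to
    # uniformly distributed and give a roughly even split.
    bucket = int.from_bytes(info_hash[:4], "big") % len(trackers)
    return trackers[bucket]


if __name__ == "__main__":
    sample = hashlib.sha1(b"example torrent").digest()
    print(tracker_for(sample))
```

Because the choice depends only on the info_hash, every .torrent file for a given torrent would be generated with the same announce URL, and no state has to be shared between the two servers.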

> > RAM is no issue for XBTT and the
> > first bottleneck most operators encounter is network bandwidth.
>
> I already know bandwidth is going to be the choking point on this (primarily
> because of the seeding of rare files), and that is why the final

But I mean, even for just tracker traffic, bandwidth is usually the bottleneck.

> implementation for this will likely preference peer-of-peer referrals to
> tracker referrals. RAM may not normally be an issue, but when you're trying
> to make something as dense as possible it becomes one, what kind of RAM

Harold Feit | 2 Dec 14:28 2006

Re: Load questions


BNBT and its variants, CBTT and XBNBT, have high efficiency ratings.
CBTT has been tested on a dual P3 500 system running approximately 50k
peers without problems (running on Linux I believe; he has since closed
it down).
A Windows-based tracker host has tested CBTT into the 100k peer range.
Prior to these, the highest a BNBT core tracker has gotten (as far as
I'm aware) is 300k peers (topped out because of torrent interest, not
tracker performance).

Other than this, the tracker core has not been benchmarked, so I'm
unaware of the current capabilities of the core.

Memory footprints for these tests have stayed under 64MB as far as I'm
aware. Any newly built server (current tier equipment) should be able to
handle a tracker fine.


Alan McGovern | 2 Dec 14:38 2006

Fwd: Load questions

(forgot to send to mailing list originally ;) )

It's impossible to guess how many peers a tracker can take before it grinds to a halt without doing extensive testing. If each torrent tracked takes 20kB + 200 bytes * (number of peers in that torrent) and you assume an average of 1000 peers per torrent (a gross overestimation), that would be about 220kB per active torrent.

That'd give roughly 10,000 active torrents and 10 million active peers in about 2.2GB of RAM (ignoring all overheads involved in running the tracker, and the overheads would probably be quite substantial once you start receiving large numbers of announces). So with a slightly more realistic (but still exaggerated) load of 10,000 active torrents with 50 active peers per torrent, you're looking at about 300MB of RAM.
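
As a rough check, the per-torrent formula can be evaluated directly. A minimal sketch, taking the 20 kB per torrent and 200 bytes per peer at face value and using decimal units; the function name is made up for illustration, and the figures it prints follow from the formula as stated and ignore all tracker overheads.

```python
PER_TORRENT_BYTES = 20_000  # fixed per-torrent cost assumed in the estimate
PER_PEER_BYTES = 200        # per-peer cost assumed in the estimate


def tracker_ram_bytes(torrents: int, peers_per_torrent: int) -> int:
    """RAM needed for peer bookkeeping only, under the assumptions above."""
    return torrents * (PER_TORRENT_BYTES + PER_PEER_BYTES * peers_per_torrent)


if __name__ == "__main__":
    for torrents, peers in [(10_000, 1_000), (10_000, 50)]:
        mb = tracker_ram_bytes(torrents, peers) / 1e6
        print(f"{torrents} torrents x {peers} peers each: about {mb:.0f} MB")
```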

Now, assuming 10 million peers and an announce period of 30 minutes, that's 5500+ announces per second. A typical announce/response would be at least 1kB in size, so that's a sustained transfer rate of 5500kB/sec, which ain't so bad. Of course, I could be well off the mark here, but the only way to prove me wrong is to benchmark an active tracker.
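
The announce arithmetic can be reproduced the same way. This sketch assumes announces are spread evenly over the interval and that each announce plus response costs a flat 1 kB; both are simplifications, and the function name is hypothetical.

```python
def announce_load(peers: int, interval_s: int = 1800, bytes_per_announce: int = 1000):
    """Return (announces per second, bytes per second) for evenly spread announces."""
    rate = peers / interval_s
    return rate, rate * bytes_per_announce


if __name__ == "__main__":
    rate, bps = announce_load(10_000_000)
    print(f"{rate:.0f} announces/s, {bps / 1e6:.1f} MB/s, {bps * 8 / 1e6:.0f} Mbit/s")
```

Real traffic is burstier than this, so the result is better read as a floor for capacity planning than as a prediction.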

Then of course you run into OS limitations. Each of those 5000+ sockets takes RAM and CPU time to create and destroy. Some OSes can't handle 5000 simultaneous connections. If the number of announces goes up, the bandwidth used goes up. Depending on the implementation in the backend, performance could deteriorate hugely the more torrents you host, or performance could decrease linearly, or maybe performance won't decrease that drastically. Who knows.

Alan.


_______________________________________________
BitTorrent mailing list
BitTorrent <at> lists.ibiblio.org
http://lists.ibiblio.org/mailman/listinfo/bittorrent
Arnaud Legout | 4 Dec 12:17 2006

statistics on the trackers

Hi,

Are there any statistics on the types of trackers used on the Internet?
Which trackers are the most popular, and for which use?
For instance, do sites like mininova or isohunt maintain statistics on 
tracker type and version?

The problem with trackers, unlike peers, is that you do not know their 
version and type when you
connect to them.

Thanks,

Arnaud.
Joseph Ashwood | 5 Dec 13:30 2006

Re: Fwd: Load questions

----- Original Message ----- 
From: "Olaf van der Spek" <olafvdspek <at> gmail.com>
Subject: Re: [bittorrent] Load questions

> On 12/2/06, Joseph Ashwood <ashwood <at> msn.com> wrote:
>> > XBT Tracker (C++) can
>> > handle up to at least 800k peers.
>>
>> What class of process/memory? It's one thing to say that about a 
>> BlueGene,
>> another to say that about a PII, I'm thinking that's about the level of a 
>> P4
>
> I don't run the trackers myself and I don't have the details, but I'm
> thinking about a decent K8/P4 at least.

So then for a safety margin I'll just assume it is at least 600K on the 
Xeons and Opterons I'll be using. Shouldn't be an issue, I suspect each of 
those systems could handle in the neighborhood of 1M.

>
>> @ 3GHz. Also is XBTT written in such a way that it can be clustered, and
>> have multiple copies running locally, if so what are the limitations 
>> (e.g.
>> shared torrents?), or would I have to use virtualization for that?
>
> What do you want to achieve?

Scalability and reliability. If a server goes down for whatever reason, 
having a cluster of some kind (even multi-tracker torrents) provides 
fail-over and the system stays running. Multiple trackers per system also 
allows for software fault recovery in many cases. Virtual machines are a 
truly blessed thing for huge scalability.

> There's no special support for clustering, but you could run two
> independent instances on two servers and have 50% of the torrents
> tracked by one and the others by the other.

I'll be doing that too, but the multi-tracker torrents present something 
very attractive to me as well. Since most of the torrents will basically 
result in HTTP seeding only it shouldn't be too much of an issue to have a 
large quantity of torrents on a system, or a large quantity of trackers each 
with a few torrents virtualized onto the same system (more likely the 
former).
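
For the fail-over side of multi-tracker torrents, a client-side sketch might look like the following. It assumes the tiered "announce-list" layout of the multitracker metadata extension; the URLs and the try_announce callback are hypothetical, and the shuffling is simplified to happen on every call rather than once when the torrent is loaded.

```python
import random

# Hypothetical tracker URLs grouped into tiers: primaries first, then a backup.
ANNOUNCE_LIST = [
    ["http://tracker-a.example.org/announce", "http://tracker-b.example.org/announce"],
    ["http://backup.example.org/announce"],
]


def announce_with_failover(announce_list, try_announce, rng=random):
    """Try trackers tier by tier; return the first URL that answers, else None."""
    for tier in announce_list:
        order = list(tier)   # copy so the caller's list is left untouched
        rng.shuffle(order)   # spread load across trackers within a tier
        for url in order:
            if try_announce(url):
                return url
    return None
```

Here try_announce stands in for whatever performs the actual HTTP announce and reports success; if every tracker in the first tier is down, the client falls through to the backup tier.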

>> > RAM is no issue for XBTT and the
>> > first bottleneck most operators encounter is network bandwidth.
>>
>> I already know bandwidth is going to be the choking point on this 
>> (primarily
>> because of the seeding of rare files), and that is why the final
>
> But I mean, even for just tracker traffic bandwidth is usually the 
> bottleneck.

I know. Actually, Alan's numbers look pretty decent on that front; at 
the very least they have reasonable arguments behind them.

>> implementation for this will likely preference peer-of-peer referrals to
>> tracker referrals. RAM may not normally be an issue, but when you're 
>> trying
>> to make something as dense as possible it becomes one, what kind of RAM
>> usage is expected, and what is a rough upper bound for XBTT?
>
> To be honest I really don't know. I think 640k is enough. :)
>
>
>
> Just kidding. 640m may be enough though. I'll have to ask one of the 
> admins.

I'll plan on 1G, RAM is so cheap it's not worth bickering over if 640M is 
the right ballpark. I have some old numbers for a 100K user site hitting 
256MB, but I don't know what tracker.
                    Joe
                    (hopefully soon speccing out a massive setup) 
Bernard Morin | 9 Dec 14:30 2006

Bitfield data... First bits meaning?

Hi everyone...

I'm a newbie on this forum and in bittorrent development too...

For a semester project, I'm writing a bittorrent client in Java.
Everything works fine, message exchange is set up and peers communicate correctly, except that I don't understand the meaning of the first bits of the bitfield messages...
Actually, the bitfield should represent the pieces that the remote peer has. But when I check this field, it always contains cleared (0) bits at the beginning and the end, even when I'm contacting a seed...

For the trailing bits, it is ok, since they are spare bits and should be ignored. But the first ones, I just don't get what they mean...
Could someone help me please...

Alan McGovern | 9 Dec 14:46 2006

Re: Bitfield data... First bits meaning?

Hi,

From here: http://wiki.theory.org/BitTorrentSpecification

bitfield: <len=0001+X><id=5><bitfield>

The bitfield message may only be sent immediately after the handshaking sequence is completed, and before any other messages are sent. It is optional, and need not be sent if a client has no pieces.

The bitfield message is variable length, where X is the length of the bitfield. The payload is a bitfield representing the pieces that have been successfully downloaded. The high bit in the first byte corresponds to piece index 0. Bits that are cleared indicate a missing piece, and set bits indicate a valid and available piece. Spare bits at the end are set to zero.

A bitfield of the wrong length is considered an error. Clients should drop the connection if they receive bitfields that are not of the correct size, or if the bitfield has any of the spare bits set.
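
The rules quoted above can be turned into a small decoder. This is a sketch in Python rather than Java, assuming the payload passed in is only the bitfield itself, with the 4-byte length prefix and the id byte already stripped; the function name is made up for illustration.

```python
import math


def parse_bitfield(payload: bytes, num_pieces: int) -> list:
    """Decode a bitfield payload into a list of booleans, one per piece."""
    expected = math.ceil(num_pieces / 8)
    if len(payload) != expected:
        raise ValueError(f"bitfield is {len(payload)} bytes, expected {expected}")

    # Spare bits are the low-order bits of the last byte and must be zero.
    spare = expected * 8 - num_pieces
    if spare and payload[-1] & ((1 << spare) - 1):
        raise ValueError("spare bits are set")

    have = []
    for index in range(num_pieces):
        mask = 0x80 >> (index % 8)  # high bit of each byte is the lowest index
        have.append(bool(payload[index // 8] & mask))
    return have


if __name__ == "__main__":
    print(parse_bitfield(bytes([0b10100000]), 3))
```

One possible cause of cleared bits at the start of a seed's bitfield (an assumption, not something the thread confirms) is decoding from the start of the whole message: the length prefix is four big-endian bytes, and for any reasonably sized torrent its leading bytes are zero.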


