### Re: [i2p] Single Node Predecessor Attack Framework And Analysis

<closedshop <at> gmx.de>

2005-11-19 09:04:03 GMT

hi thanks for you analysis.
you can eliminate all your concerns with 2 simple measurements:
1. each i2p session a new GUID-Kex for i2p.
antsp2p uses a pair of private and public key, while the public key is each
session ne wgenerated. but if there is a good bootstrapping process for i2p
(central nodes like webcaches?), then you just can change the key each
session.
2. all attacs are obsolut, if im have end to end encryption with the key for
the end point transmitted over a second, alternative route.
Waste has a approval for security which is symmetric, this means you have to
give your client key over mail or mesenger or telephone tho the approvd
trusted person. in a p2p network you have to transfer this client key over
the network WITHIN THE SYSTEM itself. So the solution to perfect anonymity
without any attacs you just need different routes for the same destination .
one for the ncrypted packet and one for the key
So:
KEY: A B C D E F G H
PACKET: A C F D B D H
All other analysis like traffic analysis is then not needed, because the
packet is closed and the neighbour connection itself is not a crime, it is
essential for beeing in the internet.
implementing "e2e-second route" (TM) is now not done, neither in ants p2p as
well not in i2p.
The problem is to find the host over the second route. neither ants nor i2p
is scalable like this, that a destination is always found over parallel
routes. So a supernode arcitecture is needed. this could be designed with a
DHT or a g2 system maybe as well quite comfortable, if you wnat to redesign
a new network based on e2e-second route.
probably a mixtrue of a waste symmetric way plus p2p overlay network would
be a solution like this:
Create a buddylist of trusted friends, Each friend has to be approoved in a
symmetric way like waste does, so you need to transfer the Client-Key over
another system (telefon, official mesenger, email etc.).
If you have found this list of trusted friends, you only connect to those
friends directly.
After this proxy environment of your buddies, you use the p2p principle of
i2p or ants. This would redouce the found media until the network as a
certain growth. Because if you hae random direct nodes the search results
wount be redundant, otherwsie you only search the freinds of your friends.
But with a magnet link system or edonkey links and a good DHT or a god
search system you can support this process of only trusted approved direct
connections.
This is why jetiaants (Jeti= Messenger) and Ants have a very great potential
to merge in this direction. Ants has 5 direct connections and imagine 4 of
them would be trusted friends like in waste.
What you as well could do, if you do not want to implement a symmetric key
approvation like in waste is to run the messenger over i2p, so the ip adress
is spoofed. Then the direct conection is eliminated.
BTW: THIS WOULD BE A GREAT SOLUTION FOR JETIANTS, TO RUN THE DIRECT
NEIGHBOUR CONNECTIONS OVER I2P - SO DIRECT CONNECTIONS ARE BASE ON
64-BIT-Sockets FOR ANTS.
Anyway, if you have 10% or up to 100 % direct connections made to your (not,
normally, symmetric) trusted buddies, the output of searchresults would be
redundant or not, but in the end you have more security and in a growing
network it would be possible to search a lot of noded.
Summary: Ants have becasue of JETI much more potentials to be a trusted
anonymous network thant i2p, which is just a mixture of ip adresses.
So a good start to develop an e2e-secound route network is,
a) to have special and not random trusted direct neighbours
b) a secound route to destintion for end to end encryption.
Runnig direct ants connections over i2p would be a great solution or to make
direct connections only to trusted buddies (maybe symmetric key exchange
like in waste) would do.
Whatever, end to end secound route is the way to go and to make direct
connections trusted.
trusted buddies is not the problem, but to find buddies in a decentral
messenger system.
> --- Ursprüngliche Nachricht ---
> Von: The23rd Raccoon <the23rdraccoon <at> yahoo.com>
> An: i2p <at> i2p.net
> Betreff: [i2p] Single Node Predecessor Attack Framework And Analysis
> Datum: Fri, 18 Nov 2005 15:46:08 -0800 (PST)
>
> In this email I will attempt to generalize the predecessor attack in
> combination with a membership and position detector, and give a
> framework for its execution and use this same framework to analyze the
> expected wall-clock runtimes of a series of attacks.
>
> I use the methodology to prove the runtimes of four attacks on the I2P
> network as it exists now (approximately 300 nodes).
>
> Attack 1: Using the leaseSet holder as a membership AND position
> detector to perform a predecessor attack against 2 +/- 1 hop eepsites
> with an expected runtime of 3.8 days.
>
> Attack 2: Using the leaseSet predecessor as a membership AND position
> detector to perform a predecessor attack against 2 + 0-1 hop eepsites
> with an expected runtime of 5 million years.
>
> Attack 3: Using the leaseSet predecessor as a position detector and a
> timing attack as a membership detector against 2 + 0-1 hop eepsites in
> 2.04 days.
>
> Attack 4: Using just a timing attack as a membership detector to
> perform a predecessor attack against 3 hop rendezvous points in 9.62
> days.
>
>
> THE FRAMEWORK:
>
> So lets start by outlining the general framework for combining the
> predecessor attack with membership and position detectors.
>
> Shorthand:
> Alice - eepsite hoster
> Charlie - an intermediary node
> Mallory - Any one of the adversary's c nodes
> Lester - LeaseSet endpoint
>
> E - An event that happens with probability P(E)
> !E - An event with probability 1-P(E).
> A - The event that Mallory is in one of Alice's tunnels.
> P - The event that Mallory has Alice as a predecessor
> M=p - Mallory is in some detectable position p in A's tunnel
> Y - The event that our membership detector says "yes"
>
> c - number of nodes Mallory controls
> n - the number of nodes in the network
> t - number of tunnels each node participates in on average
> m - the number of tunnels Mallory is participating in
> l - the number of tunnels Lester is participating in
> b - the number of tunnels each node has created on average
> a - number of tunnels Alice has created
> h - number of hops in a tunnel
>
> To mount an effective attack, five basic prerequisites are needed:
>
> 1. You need to be chosen to be a hop in a tunnel by Alice. Note
> that this is the probability PER HOP CHOICE, not per tunnel.
> This is P(H) = (c*m*a)/(t*b*n) or approx c/n. Each hop choice is
> assumed to be independent. This is not entirely correct, but is
> a nice simplifying assumption that doesn't change things much. We
> do not care if our nodes are chosen more than once in a tunnel.
>
> You also need to determine the distribution of M's position in
> tunnel configurations Alice is choosing, written as P(M=p|A).
>
> 2. You need a tunnel membership detector that can determine your
> membership in A's tunnels with some P(A|Y), where Y is the state
> of your detector saying "yes" or no".
>
> You also need to determine P(Y), the percentage of tunnels that
> your detector says "yes" to. This is the sum of the probability
> of true positives plus the probability of false positives.
>
> 3. You need a predecessor distribution that specifies how often you
> end up next to Alice given that you are in detectable position of
> M=p, and you are in one of Alice's tunnels. P(P|M=p,A).
>
> With this result, and the distributions from step 2, you need to
> derive P(P|M=p,Y), which is the probability that you are next to
> Alice when your detector says you are in a desirable position.
>
> If the tunnel position is not detectable, you are left with just
> P(P|M=p,A) = P(P|A) and P(P|M=p,Y) = P(P|Y).
>
> 4. Determine the number of samples S that are required for an accurate
> estimator. Using P(P|M=p,Y), it is possible to derive the number S
> of good tunnel memberships before you are able to determine Alice
> as the most frequent predecessor with some confidence.
>
> I'm a bit shaky at statistics, but it seems to me that the way this
> is done is to approximate each sample as a Bernoulli trial. With
> probability p=P(P|M=p,Y) you have Alice, but without any
> bias, each sample should have probability 1/n of being any other
> node. I believe this means you can use the second formula in [2]
> with the (1/2) replaced by 1/n. So to be sure to a 1% margin of
> error, S = ln(1/sqrt(.01))/(p-1/n)^2. Please correct me if I am
> wrong. It may be possible that we have to normalize these
> probabilities to sum to 1 first somehow, but when I to that,
> the results for S are very small.
>
> Since this is shaky, an alternate method is to pick S such that
> A is expected to appear 5-10 times more often than any other node.
> For Attacks 1 and 3, this agrees with Chernoff. For Attack 2, it
> provides a much higher estimate, and for Attack 4 a much lower one.
>
> 5. Your attack must finish in a reasonable amount of time, given by
> S/P(A,Y,M=p) where P(A,Y,M=p) is the probability that you are in
> A's tunnel AND your tunnel detector has detected it AND you are
> in a detectable tunnel position. Note this is not P(A,P,Y), because
> S includes samples that you are explicitly not the predecessor.
>
> That is, you require S/P(A,Y,M=p) tunnel memberships in order
> to gather S samples using your detector, which is correct
> P(A,Y,M=p) of the time. From Bayes rule, P(A,Y,M=p)
> = P(A,M=p|Y)*P(Y).
>
> Again, if you cannot detect tunnel position, that variable is
> omitted from the analysis.
>
> So now lets walk through a couple attacks in this framework, starting
> with the most dangerous (and simplest) one: usage of the leaseSet
> information as a membership AND position detector.
>
>
> ATTACK 1:
>
> In this attack, Mallory uses the fact that he is able to look up the
> leaseSets on Alice's eepsite. He assumes Alice is using the default 2
> +/- 1 eepsite tunnel length setting. He then looks for instances when
> he is Alice's leaseSet provider.
>
>
> Step 1: Calculate P(H) and P(M=p|A)
>
> Figure c=3 and n=300, or 1% compromise. What this figure really means
> is that 1% of all tunnels go through M's nodes, and thus the
> probability of choosing M for a hop in any tunnel is P(H).
>
> Assuming 1, 2, and 3 hop tunnels show up with equal probability (33%),
> the following placements of M are possible:
>
> P(M=p|A):
>
> A - M - C - L 11%
> A - C - M - L 11%
> A - C - C - M 11% = 33% of A's tunnels
>
> A - M - L 16.6%
> A - C - M 16.6% = 33% of A's tunnels
>
> A - M 33% of A's tunnels
> = 100% of A's tunnels with M in them
>
>
>
> Step 2: Determine Membership Detector, Calculate P(A|Y) and P(Y)
>
> So figuring out the Tunnel Membership Detector is a key part of the
> attack. Lots of techniques are available that work to varying degrees,
> such as application-level timing attacks, traffic shaping, traffic
> dropping, connection closing, etc. It is likely that most detectors
> will have to be developed on a test network with values determined
> empirically (especially if there are difficult to model mitigating
> factors at work like strict host ordering), but there is one detector
> that works extremely reliably since it provides both position and
> membership information: leaseSet endpoints. We will evaluate the
> various ways a node can show up either as the leaseSet endpoint or its
> immediate neighbor.
>
> Notice that P(A|Y) = 1 since M is able to recognize every instance
> using leaseSet information. There are no false positives, since M knows
> that it is the leaseSet and knows its neighbor in that tunnel with 100%
> probability.
>
> The way to obtain P(Y) is to notice that A must choose one node as it's
> lease set for each tunnel. The probability of choosing M for a hop is
> P(H). Therefore, the probability of A choosing M as a leaseset for a
> tunnel is P(H), and thus the probability of M detecting it is the end
> of A's tunnel is P(H).
>
> P(Y) = P(A,M=xm,Y) = P(H) = 0.01
>
>
> Step 3: Calculate predecessor distributions P(P|M=p,A) and P(P|M=p,Y)
>
> So the way our estimator will work is to participate in S tunnel
> memberships, and count how many times it appears next to each host. So
> lets model the probability distributions:
>
> When M sees the ordering X - M in A's tunnel, it knows there are three
> possibilities:
>
> P(P|M=xm,A):
>
> A - C - C - M = 33%
> A - C - M = 33%
> A - M = 33%
>
> So this means 66% of the time M is in X-M of A's tunnel, it is not next
> to A, and 33% of the time, it is.
>
> If node distribution was uniform, this would be more than enough to
> find A. If, however, A selected some fast peer approximately 40% of
> the time (because say, that peer had a 45Mbit line and was providing
> 40% of the network bandwidth), this peer actually will act to protect
> A, by confusing itself with A.
>
> On the other hand, these peers would be readily identifiable by anyone
> on the network, and could be eliminated from consideration, or
> inversely weighted based on their tunnel participation statistic.
>
> Notice in this case, with no false positives and no false negatives,
> P(P|M=xm,Y) = P(P|M=xm,A).
>
>
> Step 4: Determine samples needed using S=ln(1/sqrt(.01))/(p-1/n)^2
>
> So p=P(P|M=xm,Y)=0.33
>
> Assuming either uniform distribution, or a distribution weighted to
> remove tunnel participation bias, S=ln(1/sqrt(.01))/(.33-1/300)^2 =
> 22 samples.
>
> So after S=22 tunnel participations, A should show up 22*.33 = 7 times,
> where as a uniform distribution of other nodes would yield them to be
> selected only once or twice.
>
>
> Step 5: Determine attack duration S/P(A,M=p,Y)
>
> So how long would 22 tunnel participations take? Well, first we must
> determine the number of tunnels we have to participate in in order to
> gather 22 "useful" samples. The probability of gathering a useful
> sample is how often we are in a useful position in one of A's tunnels
> AND we recognize this fact, ie P(A,M=xm,Y), which we determined is just
> P(Y)=0.01.
>
> So, 22/(.01) = 2200 tunnel creations.
>
> Since by default, A creates 4 new tunnels every 10 minutes,
> 2200*10/60/24/4 = 3.8 days before we can determine A's IP.
>
> Obviously 1 hop tunnels are deadly. Even if you do them "only
> sometimes".
>
> As a site note, 0 hop tunnels are just as bad, if not worse, because
> all someone has to do is harvest leaseSets. You don't even have to be
> actively building any tunnels, nor does the adversary have to be
> participating in the network (other than querying the netdb). Be
> afraid, I2Phex users, be very afraid. As soon as the RIAA employs
> someone with a brain, you're toast (unless you rotate I2Phex keys and
> rare content every 3 days).
>
> In an unrelated matter, I am available for contract work. ;)
>
> (That is, so long as nobody minds employing a Golden Worm :).
>
>
> -------------------------------------------------------------
>
>
> ATTACK 2: How to deal with false positives
>
> In this attack, Mallory again uses the fact that he is able to look up
> the leaseSets on Alice's eepsite. Let's assume that the reason he's
> doing this is because I2P has changed the default tunnel length to 2
> +/- 1 tunnels in favor of 2 + 0-1. (It turns out that this change
> doesn't affect Attack 2 very much at all for those who are curious).
>
> So those short on time can skip this attack entirely. It is ineffective
> by itself, but is useful as a warmup to help understand Attack 3.
>
>
> Step 1: Calculate P(H) and P(M=p|A)
>
> P(H) = 1%, same as above
>
> This time, P(M=p|A):
>
> A - M - C - L 16.6%
> A - C - M - L 16.6%
> A - C - C - M 16.6% = 50% of A's tunnels
>
> A - M - L 25%
> A - C - M 25% = 50% of A's tunnels
> = 100% of A's tunnels with M in them
>
>
> Step 2: Determine Membership Detector, Calculate P(A|Y) and P(Y)
>
> The probability of M noticing that X - M - L is present versus the
> other possibilities is P(M=xml|A) = (16.6+25) = 41.6% of all tunnels of
> A that M is in.
>
> Now we need to quantify the ability of M to notice that when it says "I
> am in an xml tunnel of A" it actually is. In this case, our detector is
> much worse, since in many cases, it falsely reports this configuration
> simply because it is a neighbor of L for some other tunnel. This is
> P(A|Y).
>
> Without any sort of timing information, P(A|Y) is 1/l, where l is the
> average number of tunnels that L is participating in. Examination of
> netDB stats shows that this figure ranges from 20-500. Most consumer
> broadband connections seem to do 100. Let's pick 200.
>
> In this case, the probability of the detector saying yes is the
> probability of M landing next to L in any given tunnel, A's or not.
>
> Every time it makes a tunnel, at each hop a node has P(H) of choosing
> us. This means that for each tunnel that L is participating in, some
> node made a choice for the position before they choose to be one of L's
> 200 participants. This means that Mallory can expect to observe
> p=P(H)*l nodes with L as a hop after him at any given time. This is
> .01*200=2.
>
> Note that this result includes A's tunnels (the true positives) and the
> false positives.
>
> The probability P(Y) for any given tunnel of M's being a positive is
> p/m, where m is the total number of tunnels M has. For the sake of
> simplification, lets say that m has c*l tunnels. This means that P(Y)
> = (P(H)*l)/(c*l) = P(H)/c = 0.01/3 = 0.00333.
>
> P(Y) = P(H)/c = .00333
>
>
> Step 3: Calculate predecessor distributions P(P|M=p,A) and P(P|M=p,Y)
>
> When M sees the ordering X - M - L, it knows there are two
> possibilities if it is in Alice's tunnel.
>
> P(P|M=xml,A):
>
> A - C - M - L 16.6/(25+16.6) ~= 40%
> A - M - L 25/(25+16.6) ~= 60%
>
>
> Since we have false positives, we must decondition (sum out) A from
> this distribution to determine P(P|M=xml) = P(P|Y), the probability of
> being next to A when our detector says "yes".
>
> P(P|Y) = P(P|M=xml) = P(P|M=xml,A)*P(A) + P(P|M=xml,!A)*(1-P(A))
>
> P(A) = P(H)*h, where h=2 is the average number of hop decisions in A's
> tunnels.
>
> P(P|M=xml,!A) is the distribution of finding A versus a random node in
> our tunnels when we ARE NOT in one of A's leaseset tunnels (but could
> be in some other tunnel A and L just happen to be participating in).
>
> P(P|M=xml,!A):
>
> C - A - M - L P(H)*P(H)
> C - M - L 1 - P(H)*P(H)
>
> P(H)*P(H) is the probability of being chosen as a neighbor of both A
> and L in the same tunnel of some random node.
>
> So now we can do the sum:
>
> P(P|Y):
>
> C - M - L .40*P(A) + (1 - P(H)*P(H))*(1-P(A)) = .988
> A - M - L .60*P(A) + P(H)*P(H)*(1-P(A)) = .012
>
>
> Step 4: Determine samples needed using S=ln(1/sqrt(.01))/(p-1/n)^2
>
> As you can see, we're going to need a shitload more samples for this
> detector to work by itself.
>
> p=P(P|Y) = 0.012
>
> S=ln(1/sqrt(.01))/(.012 - 1/300)^2 = 30,655 tunnel durations. In 30,655
> tunnel durations, we can expect to see A 367 times, but we can also
> expect to see any other node 30655/n = 102 times.
>
> This seems like an awful lot of samples. Perhaps the Chernoff bound is
> too weak, and/or we modified it incorrectly.. I expected this attack to
> be bad, but not this bad.
>
>
> Step 5: Determine attack duration S/P(A,M=p,Y)
>
> So how long does it take to gather 1000 tunnel participations?
> S/P(A,M=xml,Y). Again, from Bayes, P(A,M=xml,Y) = P(A,M=xml|Y)*P(Y)
> = (1/l)*P(H)/c
>
> P(A,M=xml,Y) = 1/200*0.01/3 = .00001
>
> S/P(A,M=xml,Y) = 30655/.00001 = 3x10^9 tunnel durations! Now that's
> more like it! This is 5 million years without an auxiliary detector.
>
> But don't rejoice yet, things are about to get worse. MUCH worse.
>
>
> -------------------------------------------------------------
>
>
> ATTACK 3: Dealing with multiple detectors and false negatives
>
> Ok, so obviously Mallory doesn't like the sound of that attack. So he
> needs some kind of tunnel membership detector. He consults the
> anonymity literature, and discovers [1], which seems like it should do
> quite nicely.
>
> It should be noted that on networks without cover traffic, the
> researchers experienced false positive and negative rates as low as
> 0.04% (yes, that is 4/100 of 1 percent), even on extremely high latency
> networks.
>
> Let's say for whatever reason that he's not quite as good at
> implementing the attack on the I2P network, and gets results as high as
> 5% instead. This is the error rate the researchers experienced with
> full cover traffic on a medium loss, mid-latency network.
>
> Step 1: Calculate P(H) and P(M=p|A)
>
> P(H) = 0.01
>
> Again, P(M=p|A):
>
> A - M - C - L 16.6%
> A - C - M - L 16.6%
> A - C - C - M 16.6% = 50% of A's tunnels
>
> A - M - L 25%
> A - C - M 25% = 50% of A's tunnels
> = 100% of A's tunnels with M in them
>
>
> Step 2: Determine Membership Detector, Calculate P(A|Y) and P(Y)
>
> So the best way to use these detectors is to use the timing detector
> for membership, and the leaseset detector for position. Obviously the
> leaseset detector buys us nothing as far as membership determination
> (unless we know L is participating in only a few tunnels), since
> P(A|M=xml) = 1/l, but P(A) = P(H)*h. With 1% compromise and 2.5 hops, l
> would have to be reliably 25 or less for P(A|M=xml) to be of any use as
> a membership detector.
>
>
> So, our only membership detector is the timing detector, which does not
> have any dependence on position.
>
> Using the results at the bottom of page 11 of [1],
>
> P(Y|A) = (1-fn) = Probability of Y detecting we are in A
> P(Y|!A) = fp = False positive, Y wrongly saying we are in A
>
> P(Y) = P(Y|A)*P(A) + P(Y|!A)*(1-P(A))
> = (1 - fn)*P(A) + fp*(1-P(A))
> = (1 - fn - fp)*P(A) + fp
> = 7.25%
>
> P(A|Y) = P(A,Y)/P(Y)
> = P(Y|A)*P(A)/P(Y)
> = (1-fn)*P(A)/((1 - fn - fp)*P(A) + fp)
> = 32.75%
>
> Again, P(A)=P(H)*h, where h=2.5 is the average hop length
>
>
> Step 3: Calculate predecessor distributions P(P|M=p,A) and P(P|M=p,Y)
>
> Now, we will still use our M=xml detector to determine position
> information. Again:
>
> P(P|M=xml,A):
>
> A - C - M - L 16.6/(25+16.6) = 40%
> A - M - L 25/(25+16.6) = 60%
>
> So now we need to determine P(P|M=xml,Y) using the M=xml detector for
> position, and the Y detector for membership.
>
> So this can be approximated by observing that the probability of being
> next to A and L in some tunnel not owned by A is insignificant, and so
> therefore all our detector really does is reduce that 60% probability
> if being next to A in one of A's tunnels by the probability of
> detecting we are in A given the fact that it says "yes".
>
> P(P|M=xml,Y) ~= P(P|M=xml,A)*P(A|Y)
>
> P(P|M=xml, Y) ~= 60% * 32.75% = 19.65%.
>
> Therefore P(!P|M=xml, Y)) ~= 80.35%.
>
> If anyone requires a more rigorous proof, I can provide one using
> Bayesian Networks that fully takes into account false positives and
> false negatives. I actually completed this proof before realizing the
> simplification.. In this case, it turns out to be almost exactly the
> same value, but a Bayesian Network analysis might be useful to someone
> working on an alternate attack. Perhaps I will post it as a separate
> email, but I think it would just add needless confusion if I included
> it here, since it is complicated.
>
>
> Step 4: Determine samples needed using S=ln(1/sqrt(.01))/(p-1/n)^2
>
> >>From here we can see that p=19.65% of the time we should expect to see A
> as our neighbor in a sample we liked.
>
> S = ln(sqrt(.01))/(.1965-1/300)^2 = 61.7 -> 62 samples.
>
> After S=62 samples, A should show up 12 times, where as every other node
> should show up once.
>
>
> Step 5: Determine attack duration S/P(A,M=p,Y)
>
> This time we don't have Bayes Rule to help us, so we have to go back to
> the joint distribution from the Bayesian Network, and sum out P:
>
> P(M=xml,Y,A) = P(P|M=xml,A)*P(M=xml|A)*P(A|Y)*P(Y)
> + P(!P|M=xml,A)*P(M=xml|A)*P(A|Y)*P(Y)
>
> P(M=xml,Y,A) = .60*(.166+.25)***.325***.0725 + .40*(.166+.25)***.325***.0725
> = .0102
>
> So after S/.0102 = 1177 samples, we compromise A in 2.04 days.
>
>
> -----------------------------------------------------
>
>
> ATTACK 4: Rendezvous Analysis
>
> Ok, so one thing we can attempt to do to mitigate the attack is to
> remove the position detector probability derived from the leaseSet.
> One way to do this is to implement tor style rendezvous points.
>
> For the sake of argument, lets say that we do not mind creating a new
> rendezvous tunnel for each connection. Furthermore, lets say that the
> rendezvous point itself is chosen by the client, and the eepsite hoster
> sends a garlic routed message to make a connection to the RP. In this
> way, none of the eepsite tunnel members can reliably recognize their
> position, or even that they are the endpoint.
>
> Step 1: P(H) = 0.01
>
> So in 3 hop rendezvous points, we have the following setup:
>
> P(M=p|A)
>
> A - M - C - C 33%
> A - C - M - C 33%
> A - C - C - M 33%
>
>
> Step 2: Determine Membership Detector, Calculate P(A|Y) and P(Y)
>
> Now this time, since R is not published in the NetDB, we have no
> additional help as far as tunnel membership goes. So all we have to go
> on is the membership detector from [1].
>
> Again, let us say that we have 5% false positives and 5% false
> negatives. P(Y|A) = (1-fn) = 0.95 and P(Y|!A) = fp = 0.05.
>
> However, this time we must be careful to note that
> P(A)=P(H)*(h=3) = .03
>
> Using the result at the bottom of page 11 of [1],
>
> P(Y) = P(Y|A)*P(A) + P(Y|!A)*(1-P(A))
> = (1 - fn)*P(A) + fp*(1-P(A))
> = (1 - fn - fp)*P(A) + fp
> = 7.7%
>
> P(A|Y) = P(A,Y)/P(Y)
> = P(Y|A)*P(A)/P(Y)
> = (1-fn)*P(A)/((1 - fn - fp)*P(A) + fp)
> = 37.01%
>
> Again, P(A)=P(H)*h, where h=2.5 is the average hop length
>
>
> Step 3: Calculate predecessor distributions P(P|M=p,A) and P(P|M=p,Y)
>
> So now we have NO given information about M's position given, so
> P(P|M=p,A) and P(P|M=p,Y) reduce to P(P|A) and P(P|Y) respectively.
>
> Now our Bayes net simplifies to a straight line:
>
> Y -----> A -----> P
>
> So P(P,Y,A) = P(P|A)*P(A|Y)*P(Y).
>
> P(P|A) is 33%. When in A's tunnel, 33% of the time we will be in
> position M=amcc. The probability of being in A's tunnel given the fact
> that our detector says yes is 37.01%. And the probability that it says
> yes is 7.7%.
>
> Now we need to sum out A:
>
> P(P,Y) = P(P|A)*P(A|Y)*P(Y) + P(P|!A)*(1-P(A|Y))*P(Y)
>
> P(P|!A) is P(H)/c. This is
>
> P(P,Y) = .33***.3701***.077 + (.01/3)*(1-.3701)*.077
> = .009565
> P(!P,Y) = .66***.3701***.077 + (1-.01/3)*(1-.3701)*.077
> = .06715
>
> P(P|Y) = P(P,Y)/P(Y)
> = .124
> P(!P|Y) = .872
>
> Sums to 99.6%. Perhaps the P(P|!A) = P(H)/c assumption?
>
>
> Step 4: Determine samples needed using S=ln(1/sqrt(.01))/(p-1/n)^2
>
> p=P(P|Y)=.124
>
> S=ln(1/sqrt(.01))/(.124-1/300)^2 = 158 samples.
>
> So now we need S=158 samples to gather 19 hits for A, which should be
> 19 times as many as any other node.
>
> Again, this seems a bit weak. Intuition tells me that if we were to
> gather only 100 samples, we should see A 12 times as often as every other
> node, which is what we saw in Attack 3.
>
> So perhaps I'm using the Chernoff bounds improperly somehow.
>
>
> Step 5: Determine attack duration S/P(A,M=p,Y)
>
> So again, since we have no position information, P(A,M=p,Y)=P(A,Y).
> >>From Bayes, this is P(A|Y)*P(Y) = .3701*.077=.0285
>
> 158/.0285 = 5543 tunnel durations, or 9.62 days.
>
> 100/.0285 = 1754 tunnel durations, or 6.09 days.
>
>
>
> CONCLUSION
>
> So there are a couple of immediately obvious points:
>
> 1. 0 and 1 hop tunnels should never be used, except in cases of rapid
> key rotation and/or an entirely unmotivated adversary (for example,
> most clients are not at risk for reading content and their
> relationship as a client to a site is typically on the order of
> minutes or at most hours, not days).
>
> 2. Picking low bandwidth leaseSet endpoints is very bad, especially if
> statistics are published. leaseSet endpoints should be chosen from
> the absolute fastest nodes available, and tunnel membership
> statistics probably should not be published in 1.0, or else the
> detector in Attack 2 is able to use this information to its
> advantage.
>
> 3. Increasing tunnel duration by a factor of X automatically
> increases the attack expectancy by that factor. Perhaps we want to
> consider longer-lived tunnels? At least to give people sufficient
> time to notice the repeated requests that would be evident of a
> timing attack.
>
> Other than that, there are a few possible options for mitigating
> factors that are of less certain utility.
>
> Rendezvous points are certainly an option, but it could just be it only
> looks that way because my Chernoff bound usage is improper. The other
> factor to consider is if it is wise to create a new tunnel for every
> request to an eepsite. If for some reason this is a bad idea (it does
> introduce another class of application-level timing attacks that may or
> may not be as effective as the packet-based one in [1]), the
> alternative is to use a pool of rendezvous points. Unfortunately, in
> this case you reveal information about tunnel position to nodes in that
> they will be able to tell they are not the tunnel endpoint. This
> eliminates 1/3 of the guesswork malicious nodes have to do.
>
> One thing that would certainly curtail the effectiveness of the timing
> attacks would be to make all tunnel data messages source routed garlic
> messages, so that hops would not be able to tell which packets are part
> of the same tunnel. jrandom informs me that this is too expensive
> computationally to be practical, however, because it would require use
> of asymmetric keys to decrypt every message.
>
> One possible alternative might be to create smaller secondary
> asymmetric keys that are changed frequently (say a 1024bit key rotated
> once or twice a day, or even once per netdb republishing). This smaller
> key could be used to encrypt a symmetric key at the top of each
> message. The symmetric key could then be used to decrypt the rest of
> the message, which would include next hop information, but no
> tunnel-specific characteristics. This might be fast enough to be make
> full garlic routing feasible.
>
> However, it is not clear how to garlic route inbound tunnel messages,
> since the inbound gateway should not know the rest of the hops in the
> tunnel. The upshot is that it is much harder to upload the the amount
> of data required to mount a proper timing attack to an eepsite. So it
> may be the case that garlic routing is only necessary for outbound
> tunnels.
>
> Additionally, key and content rotation for I2P applications is a
> possibility, especially for applications like i2p-bt and i2phex, where
> multiple trackers and/or searching mechanisms make it feasible to
> rotate keys frequently without affecting the network. In the case of
> i2phex, it probably is also a good idea to rotate content, so your
> keys can't be correlated by your possession of certain rare files.
>
> It should also be noted that a proper restricted routes implementation
> where peers can actually be trusted effectively eliminates the
> predecessor attack, because your trusted peers would be the only ones
> chosen to be your neighbor. An alternate "russian-roulette" version
> where you "trust" only k nodes for a certain period of time could slow
> down the expected runtime, but if you happened to select a malicious
> node among those k, they will rapidly be able to detect you selecting
> them again and again to be your neighbor, especially if they have an
> auxiliary tunnel membership detector.
>
> Finally, jrandom hinted at some possibilities for strict tunnel
> ordering that damage a node's ability to reliably appear as your
> neighbor. jrandom, could you elaborate on those here?
>
>
>
> [1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
> [2] http://en.wikipedia.org/wiki/Chernoff_bound
>
>
>
>
> __________________________________
> Start your day with Yahoo! - Make it your home page!
> http://www.yahoo.com/r/hs
> _______________________________________________
> i2p mailing list
> i2p <at> dev.i2p.net
> http://dev.i2p.net/mailman/listinfo/i2p
>
--
--
Telefonieren Sie schon oder sparen Sie noch?
NEU: GMX Phone_Flat http://www.gmx.net/de/go/telefonie
-------------------------------------------------------
This SF.Net email is sponsored by the JBoss Inc. Get Certified Today
Register for a JBoss Training Course. Free Certification Exam
for All Training Attendees Through End of 2005. For more info visit:
http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click