Christian Grothoff | 1 Jan 2004 03:37

GNUnet 0.6.1a released


A couple of bugs that were found in 0.6.1 have been fixed.  In particular, 
some remaining performance problems (low bandwidth utilization, collisions in 
the routing table stalling downloads, and others) have been addressed (special thanks 
to Igor for helpful plots and discussions).  There are no new features and no 
changes to the configuration, so any update from 0.6.1 should be trivial.

As a special note at the end of the year, I will not have access to an 
RPM-based distribution in 2004 and will thus no longer provide RPM binaries.  
If someone wants to step up to package GNUnet for RPM-based distributions, 
please let me know (reply to grothoff AT cs.purdue.edu, not to any of the 
lists).

Christian
Hendrik Pagenhardt | 7 Jan 2004 10:58

Some small patches

Hello,

While roaming through the source trying to find the reasons for some
minor annoyances, I patched a few places:

Index: src/applications/afs/esed2/block.c
===================================================================
RCS file: /var/cvs/GNUnet/GNUnet/src/applications/afs/esed2/block.c,v
retrieving revision 1.10
diff -r1.10 block.c
77c77
<     if ((0 == STAT(filename, &st))) { /* if file exists, try truncate */
---
>     if ((0 == STAT(filename, &st)) && getFileSize(filename) > filesize) { /* if file exists, try truncate */

This fixes a bug I encountered when trying to continue a download on a
mounted Windows partition: the "truncation" of a file to a bigger
size failed. Since gnunet grows downloaded files automatically, the
"truncation" is not really necessary in this case...

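The intent of the block.c change can be sketched in isolation as follows. This is a minimal sketch, not the GNUnet code: `shrink_if_larger` is a hypothetical helper, and plain POSIX `stat` and `truncate` stand in for GNUnet's `STAT` and `getFileSize` wrappers.

```c
#define _POSIX_C_SOURCE 200809L  /* for truncate() under strict compiler modes */
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: shrink `filename` to `filesize` bytes, but only
 * when the file exists and is currently larger than that.
 * Returns 0 if nothing had to be done or the truncate succeeded,
 * -1 if the truncate itself failed. */
int shrink_if_larger(const char *filename, off_t filesize) {
  struct stat st;

  if (stat(filename, &st) != 0)
    return 0;                  /* no such file: nothing to shrink */
  if (st.st_size <= filesize)
    return 0;                  /* never "truncate" to a bigger size */
  return truncate(filename, filesize);
}
```

With this guard, a partially downloaded file that is smaller than the target size is left alone, so a filesystem that refuses to extend a file through truncation no longer causes an error.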
Index: src/applications/afs/module/high_mysql.c
===================================================================
RCS file: /var/cvs/GNUnet/GNUnet/src/applications/afs/module/high_mysql.c,v
retrieving revision 1.11
diff -r1.11 high_mysql.c
574c574
<         "REPLACE INTO data%uof%u "
---
>         "REPLACE DELAYED INTO data%uof%u "


Hendrik Pagenhardt | 7 Jan 2004 13:39

Usage of super queries

Hello,

watching the stats of my gnunetd (which runs for some months now, with
interruptions) I never once saw a value different from 0 for "# lookup
(super query)". Does this mean that my server never got asked for a
super query block? If yes, what are those used for then? Shouldn't they
enable more efficient downloads by bundling queries? Maybe there is a
bug preventing the usage of super queries?

Ciao,
     Hendrik
Igor Wronsky | 7 Jan 2004 20:49

Re: Some small patches

On Wed, 7 Jan 2004, Hendrik Pagenhardt wrote:

> Index: src/applications/afs/module/high_mysql.c
> ===================================================================
> RCS file: /var/cvs/GNUnet/GNUnet/src/applications/afs/module/high_mysql.c,v
> retrieving revision 1.11
> diff -r1.11 high_mysql.c
> 574c574
> <         "REPLACE INTO data%uof%u "
> ---
> >         "REPLACE DELAYED INTO data%uof%u "
> This speeds up insertion/indexation a little when using mysql as database.

Looking at 4.0.17 reference manual, I'm not so sure
if this is a good idea. Particularly, I suspect
that errors might go unnoticed or that the database
size might be incorrectly reported when delayed
inserts are used. The manual also says that insert
delayed should only be used if you're really sure
you need it.

All in all, I don't think gnunet is generally so
faultless that we could afford adding any potential
troublemakers.

Igor
Igor Wronsky | 7 Jan 2004 20:32

Re: Usage of super queries

On Wed, 7 Jan 2004, Hendrik Pagenhardt wrote:

> watching the stats of my gnunetd (which runs for some months now, with
> interruptions) I never once saw a value different from 0 for "# lookup
> (super query)". Does this mean that my server never got asked for a
> super query block? If yes, what are those used for then? Shouldn't they
> enable more efficient downloads by bundling queries? Maybe there is a
> bug preventing the usage of super queries?

Nope. That is a misleading debug statement I once put there
as I tracked data flow inside gnunetd. The value should never
be nonzero as super-queries are not something that should
be looked up from disk.

I'll replace that particular entry with a proper,
incrementing counter.

Igor
Hendrik Pagenhardt | 8 Jan 2004 09:12

Re[2]: Some small patches


>> [proposal of delayed inserts]

> Looking at 4.0.17 reference manual, I'm not so sure if this is a good
> idea. Particularly, I suspect that errors might go unnoticed or that
> the database size might be incorrectly reported when delayed inserts
> are used. The manual also says that insert delayed should only be used
> if you're really sure you need it.

I'm not sure how many more uncertainties delayed inserts would
introduce. I tested it on my machine and it worked really well. I even
ran gnunet-check -a after a big insertion session (not really to check
this, but because I suspected inconsistencies from a not gnunetd related
crash of my machine) without any errors showing up. So I think the use
of delayed inserts could be made an option for the more dangerous living
folks among us...

> All in all, I don't think gnunet is generally so faultless that we
> could afford adding any potential troublemakers.

You're probably right at this stage of development.

I thought a bit about the topic, and I think a good way to increase
insertion throughput might be the bundling of inserts within gnunet.
This probably could even be more efficient than the delayed inserts. The
abysmal performance of inserts is IMHO closely related to the sequential
nature of the insertion process (correct me if I'm wrong). And it's not
helping that we can't profit from the potentially parallel select and
insert capabilities of the database, because every bucket is locked with
a semaphore when a request is in progress. BTW which threads can run in

Christian Grothoff | 8 Jan 2004 21:37

Re: Some small patches


On Wednesday 07 January 2004 04:58 am, Hendrik Pagenhardt wrote:
> cvs server: Diffing src/applications/afs/tools
> Index: src/applications/afs/tools/gnunet-insert.c
> ===================================================================
> RCS file:
> /var/cvs/GNUnet/GNUnet/src/applications/afs/tools/gnunet-insert.c,v
> retrieving revision 1.85
> diff -r1.85 gnunet-insert.c
> 38c38
> <     printf("%8u of %8u bytes inserted\n",
> ---
>
> >     printf("%8u of %8u bytes inserted\r",
>
> This allows using the -V option with gnunet-insert without cluttering
> the screen with progress messages...

Probably a good idea (applied).
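The effect of replacing the newline with a carriage return can be sketched as below. This is a minimal sketch, not the gnunet-insert code: `format_progress` and `print_progress` are hypothetical names, and only the format string is taken from the patch.

```c
#include <stdio.h>

/* Format one progress line. The trailing '\r' moves the cursor back to
 * the start of the line, so the next call overwrites this one. */
int format_progress(char *buf, size_t len, unsigned done, unsigned total) {
  return snprintf(buf, len, "%8u of %8u bytes inserted\r", done, total);
}

/* Write the progress line and flush, because no newline is written
 * that would trigger a flush on a line-buffered terminal. */
void print_progress(FILE *out, unsigned done, unsigned total) {
  char line[64];

  format_progress(line, sizeof line, done, total);
  fputs(line, out);
  fflush(out);
}
```

A caller would typically print a single newline after the last update so that the shell prompt does not overwrite the final progress line.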

> As I looked through the sources I saw a few places where MALLOCs were
> used for temporary variables (string buffers etc.). The only reasons I
> can imagine for doing so are:
> - replacing buffer overflows on the stack by corruptions in the heap,
> which are harder to exploit successfully (not really a good reason)
> - maximum stack size is reduced during runtime (should not be that big
> a problem)
>
> If I haven't overlooked another justification for this, I think it would
> be better to replace those MALLOCs by stack allocations wherever

Christian Grothoff | 8 Jan 2004 21:27

Re: Re[2]: Some small patches


On Thursday 08 January 2004 03:12 am, Hendrik Pagenhardt wrote:
> I thought a bit about the topic, and I think a good way to increase
> insertion throughput might be the bundling of inserts within gnunet.
> This probably could even be more efficient than the delayed inserts. The
> abysmal performance of inserts is IMHO closely related to the sequential
> nature of the insertion process (correct me if I'm wrong). And it's not
> helping that we can't profit from the potentially parallel select and
> insert capabilities of the database, because every bucket is locked with
> a semaphore when a request is in progress.

Again, this type of optimization is likely to cause some form of trouble
(like the asynchronous errors that you noted) and somehow sounds even worse 
than 'DELAYED' to me (but I don't know enough about MySQL to truly 
comprehend the extent of trouble DELAYED may or may not cause, so I'll leave 
that decision to Igor). Furthermore, I am not sure that insertion speed would 
be so much of an issue once we have the insertion/download-manager (far far 
in the future) where all of these things would just go into the background.  
And even now, why is it a problem to run 'gnunet-insert' overnight (assuming 
your machine is on 24/7)?
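For readers unfamiliar with the bundling idea under discussion, it can be sketched as a small buffer that hands rows to the database in groups. This is a single-threaded sketch under stated assumptions, not GNUnet code: `InsertBatch`, `batch_add`, `BATCH_MAX`, and `BATCH_TIMEOUT` are hypothetical, and the callback stands in for whatever would issue the actual multi-row statement.

```c
#include <stddef.h>
#include <time.h>

#define BATCH_MAX     32   /* hypothetical threshold: rows per bundle */
#define BATCH_TIMEOUT 5    /* hypothetical maximum age in seconds */

/* Callback that writes `count` queued rows in one go. */
typedef void (*flush_fn)(const void **rows, size_t count, void *ctx);

typedef struct {
  const void *rows[BATCH_MAX];
  size_t count;
  time_t first_added;      /* when the oldest pending row was queued */
  flush_fn flush;
  void *ctx;
} InsertBatch;

void batch_init(InsertBatch *b, flush_fn flush, void *ctx) {
  b->count = 0;
  b->first_added = 0;
  b->flush = flush;
  b->ctx = ctx;
}

/* Hand all pending rows to the callback; does nothing when empty. */
void batch_flush(InsertBatch *b) {
  if (b->count == 0)
    return;
  b->flush(b->rows, b->count, b->ctx);
  b->count = 0;
}

/* Queue one row; flush when the bundle is full or its oldest row has
 * waited BATCH_TIMEOUT seconds. `now` is passed in by the caller. */
void batch_add(InsertBatch *b, const void *row, time_t now) {
  if (b->count == 0)
    b->first_added = now;
  b->rows[b->count++] = row;
  if (b->count == BATCH_MAX ||
      difftime(now, b->first_added) >= BATCH_TIMEOUT)
    batch_flush(b);
}

/* Example callback that only counts what it is given. */
typedef struct { size_t flushes; size_t rows; } FlushStats;

void count_flush(const void **rows, size_t count, void *ctx) {
  FlushStats *s = ctx;
  (void)rows;
  s->flushes++;
  s->rows += count;
}
```

The sketch also shows where the concern about asynchronous errors comes from: `batch_add` returns before the row is written, so a failure can only be reported when the bundle is flushed.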

> BTW which threads can run in
> parallel when gnunetd is running? I would hope that at least one thread
> for each connection (local or remote) is used?

We never used a thread per peer-connection (2 total, always) and since 0.6.1 
we only use one thread for all local clients (previously one thread per client).  
The rules for all of these threads are that they must not block (more than 
for a bounded amount of disk-IO), and I don't see anything wrong with that.


Igor Wronsky | 9 Jan 2004 18:16

Re[2]: Some small patches

On Thu, 8 Jan 2004, Hendrik Pagenhardt wrote:

> I'm not sure how many more uncertainties delayed inserts would
> introduce. I tested it on my machine and it worked really well. I even
> ran gnunet-check -a after a big insertion session (not really to check
> this, but because I suspected inconsistencies from a not gnunetd related
> crash of my machine) without any errors showing up.

I'm not saying it would cause anything in normal operation. What
I'm saying is that if something goes wrong for some reason in some
system, we might not get to know that in the delayed setup and
might even go on believing that everything is proceeding fine.

> So I think the use
> of delayed inserts could be made an option for the more dangerous living
> folks among us...

We might make it selectable by a 'wizard' .conf option
but that's about it at the moment.

> nature of the insertion process (correct me if I'm wrong). And it's not
> helping that we can't profit from the potentially parallel select and
> insert capabilities of the database, because every bucket is locked with
> a semaphore when a request is in progress.

I don't actually know how necessary that locking is. I initially
put it there to be certain of being on the safe side.

> For mysql this might be improved by collecting the inserts per bucket in
> a separate thread and when a threshold number or a timeout is reached

Hendrik Pagenhardt | 9 Jan 2004 14:15

MYSQL module

Hello,

always searching for easy hacks to improve performance of GNUnet, I
looked at the source of "estimateAvailableBlocks" in the MYSQL module.
What I really want is to reduce the number of calls to this function
(I'll try that at the weekend), but for now I have a question about the
specific implementation.

First, I noticed that the average row length is multiplied by the
number of rows (costing two SQL statements), although MYSQL already returns the size
of the data file in the "Data_length" column of SHOW TABLE STATUS. So
row counting should not be necessary. Secondly, this calculation omits
the space the table indexes use. As those currently add another 20% of
the table data size to the space consumed by the content storage, I
would suggest to either document that the quota is used for the data
files only, or, preferably, to adjust the calculation to include the
index size. That could be easily done by adding the "Data_length" and
"Index_length" columns of SHOW TABLE STATUS, still saving the row count.

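The difference between the two ways of estimating used space can be sketched as follows. This is a minimal sketch, not the module's code: `TableStatus` and the function names are hypothetical, the fields mirror the "Rows", "Avg_row_length", "Data_length" and "Index_length" columns of SHOW TABLE STATUS, and the block size passed to `blocks_available` is an assumption left to the caller.

```c
#include <stdint.h>

/* Values from one row of SHOW TABLE STATUS (hypothetical container). */
typedef struct {
  uint64_t rows;
  uint64_t avg_row_length;
  uint64_t data_length;
  uint64_t index_length;
} TableStatus;

/* Estimate as described in the message: row count times average row
 * length. It ignores the space taken by the indexes. */
uint64_t used_bytes_by_rows(const TableStatus *t) {
  return t->rows * t->avg_row_length;
}

/* Suggested estimate: data file plus index file. */
uint64_t used_bytes_total(const TableStatus *t) {
  return t->data_length + t->index_length;
}

/* Number of blocks of `block_size` bytes that still fit into the
 * quota, counting index space as used. Returns 0 once the quota is
 * reached or if block_size is 0. */
uint64_t blocks_available(const TableStatus *t, uint64_t quota_bytes,
                          uint64_t block_size) {
  uint64_t used = used_bytes_total(t);

  if (block_size == 0 || used >= quota_bytes)
    return 0;
  return (quota_bytes - used) / block_size;
}
```

With the 125 MB data file and 25 MB index mentioned later in this message, the row-based estimate would report roughly 125 MB used, while the combined estimate reports 150 MB.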
A way to shrink the indexes, and possibly speed up data access as a
whole, might be to not use a primary key on the bucket tables. It could
be replaced by an index over the first 3 bytes of the hash. Randomness
assumed, this would be sufficient to distinctly identify 256^3 rows,
which is an order of magnitude higher than what could be placed in a
bucket. I tested this on my system with a copied GNUnet table (125MB
data file, 25 MB index) and the index shrank to less than half (9 MB).
Query performance was as fast or even faster this way. And the reduced
index size should help caching.
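The arithmetic behind the 3-byte prefix can be sketched as below. This is an illustration only: `hash_prefix`, `prefix_space`, and `expected_rows_per_prefix` are hypothetical names, and the hashes are assumed to be uniformly random, as the message itself assumes.

```c
#include <stdint.h>

#define PREFIX_BYTES 3   /* number of leading hash bytes in the index */

/* Pack the first three bytes of a hash into one integer key. */
uint32_t hash_prefix(const unsigned char *hash) {
  return ((uint32_t)hash[0] << 16) |
         ((uint32_t)hash[1] << 8)  |
          (uint32_t)hash[2];
}

/* Number of distinct prefix values: 256^3 = 16777216. */
uint32_t prefix_space(void) {
  return (uint32_t)1 << (8 * PREFIX_BYTES);
}

/* Average number of rows sharing one prefix value, assuming
 * uniformly random hashes. */
double expected_rows_per_prefix(uint64_t rows) {
  return (double)rows / (double)prefix_space();
}
```

Because a prefix index is no longer unique, two rows can share a prefix, so a lookup still has to compare the full hash among the candidates that the prefix selects.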

The only drawback is that the "writeContent" method must be rewritten

