Mike Orr | 1 Sep 2006 01:26

Re: Organizing a Quixote/Durus multiprocess application

Here's what I've got so far.  Does it look reasonable?

I first divert sys.stdout and sys.stderr to a logfile unless envvar
"NO_LOG" is set.  This will be used by all subprocesses.  I don't want
some output sent to the console or swallowed up depending on where the
exception hits.  Then I fork and (in the child) start the Durus server
for the sessions db, and (in the parent) start the SCGI server.  I
catch KeyboardInterrupt in both cases to avoid a traceback in the log
file.  I'm using None for both the Quixote error log and the Durus
error log to prevent those packages from redirecting it further.
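
A minimal sketch of the startup sequence described above, not the actual script. The names redirect_output, run_quietly, main, start_session_db and start_scgi are hypothetical; the two start_* arguments stand in for whatever blocking calls launch the Durus server and the SCGI server.

```python
import os
import signal
import sys


def redirect_output(log_path, environ=None):
    """Point sys.stdout and sys.stderr at log_path unless NO_LOG is set."""
    environ = os.environ if environ is None else environ
    if environ.get("NO_LOG"):
        return False
    log = open(log_path, "a", buffering=1)  # line-buffered text file
    sys.stdout = sys.stderr = log
    return True


def run_quietly(serve):
    """Run a blocking server loop; treat Ctrl-C as a normal shutdown."""
    try:
        serve()
    except KeyboardInterrupt:
        pass  # no traceback in the log file


def main(log_path, start_session_db, start_scgi, environ=None):
    # Redirect before forking so the child inherits the same log file.
    redirect_output(log_path, environ)
    pid = os.fork()
    if pid == 0:
        # Child: serve the sessions db, then exit without returning
        # into the caller's code.
        try:
            run_quietly(start_session_db)
        finally:
            os._exit(0)
    # Parent: serve SCGI requests, then stop and reap the child.
    try:
        run_quietly(start_scgi)
    finally:
        os.kill(pid, signal.SIGINT)
        os.waitpid(pid, 0)
```

Reassigning sys.stdout only affects Python-level writes; anything that writes straight to file descriptors 1 or 2 would need os.dup2() as well.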

It took a while to get the file ownerships and umask correct, but I
finally got it working as user 'apache' from the command line.
But if I start it as a daemon and then stop it, it doesn't kill the
Durus server.  That must be because it's using SIGTERM instead of
SIGINT.  I added some code to kill it manually if it's still running,
but that doesn't help.
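
If the SIGTERM guess is right, one possible approach is to make SIGTERM look like SIGINT inside each process, so the existing KeyboardInterrupt handling runs either way. This is only a sketch; the function names are hypothetical, and the handler has to be installed from the main thread.

```python
import signal


def raise_keyboard_interrupt(signum, frame):
    """Signal handler: turn the signal into a KeyboardInterrupt."""
    raise KeyboardInterrupt


def treat_sigterm_as_sigint():
    """Install the handler for SIGTERM and return the previous handler."""
    return signal.signal(signal.SIGTERM, raise_keyboard_interrupt)
```

Each process would call treat_sigterm_as_sigint() before entering its server loop. The parent still has to signal the child itself, because a daemon manager normally signals only the pid it knows about.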

I may switch to a TCP socket if it gets too frustrating managing the
Unix socket, but I don't think that will help the current problem

Another issue is getting rid of Durus's "sys.stdout already customized.
sys.stderr already customized." messages in the log file.  I set
logginglevel to 101 but that doesn't help because the messages are
sent before that gets set.  I may replace durus.logging.direct_output
to get rid of them.

Another issue is in scgi_server.  I sometimes get a KeyError on
"self.reap_children(self.children[pid])".  I assume that means the
SCGI subprocess has already died, because one of them gave a
[...]

Mike Orr | 1 Sep 2006 18:55

Re: Organizing a Quixote/Durus multiprocess application

On 8/31/06, Mike Orr <sluggoster <at> gmail.com> wrote:
> It took a while to get the file ownerships and umask correct, but I
> finally got it working as user 'apache' from the command line.
> But if I start it as a daemon and then stop it, it doesn't kill the
> Durus server.  That must be because it's using SIGTERM instead of
> SIGINT.  I added some code to kill it manually if it's still running,
> but that doesn't help.

I remembered that session2 has a DirectorySessionStore class that
doesn't need a subprocess, so I went with that instead. That shrunk
the size of my startup script by 25%. :)

-- 
Mike Orr <sluggoster <at> gmail.com>

Jesus Cea | 21 Sep 2006 20:43

Re: RELEASED: Durus-3.5


David Binger wrote:
>     * Fix a bug introduced in version 3.4 that could, under certain
> conditions, allow conflicts to be missed.

Nice fix.

>  * Revise the cache code.  It now uses a WeakValueDict instead of a
> plain dict to hold the references.  This simplifies the code because we no
> longer need to call the weakref instances directly.  It also helps the cache
> shrinking loop because the weakref callbacks have an immediate impact on the
> size of the mapping.

I just discovered a regression problem with this approach, related to
multithreading.

In previous Durus releases, you could play it safe in a multithreaded
environment if a) you didn't share Durus connections between threads or,
more interestingly, b) you took great care when using persistent objects.

Let me explain the last point:

You can share persistent objects/connection between threads if you take
steps to serialize access to persistent objects. The usual approach is
to use a global lock, and be sure you touch persistent objects ONLY when
having the lock under your wing.

This worked very nicely until Durus 3.5.

Previously the connection cache management was involved when a) you have
[...]

Mike Orr | 21 Sep 2006 21:26

Optional prepack file

It would be nice if connection.pack() took an optional backup argument
to prevent the prepack file from being written.  Otherwise I just have
to delete it after it's nicely been created.

-- 
Mike Orr <sluggoster <at> gmail.com>

David Binger | 21 Sep 2006 21:53

Re: Optional prepack file


On Sep 21, 2006, at 3:26 PM, Mike Orr wrote:

> It would be nice if connection.pack() took an optional backup argument
> to prevent the prepack file from being written.  Otherwise I just have
> to delete it after it's nicely been created.

The prepack file is really just a rename of the original file:
it isn't really written by the pack, but it is left in the
file system as a precaution.

But, I see your point.
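
Until pack() grows such an option, the cleanup can be wrapped in a small helper. This is a sketch only: discard_prepack is a hypothetical name, and the ".prepack" suffix next to the database file is an assumption to check against your Durus version.

```python
import os


def discard_prepack(db_path, suffix=".prepack"):
    """Remove the file left behind by a pack, if there is one.

    The suffix is an assumption, not a documented constant.
    Returns True if a file was removed.
    """
    prepack = db_path + suffix
    if os.path.exists(prepack):
        os.remove(prepack)
        return True
    return False
```

Call it only after the pack has finished and the new file has been verified, since the renamed original is the only fallback copy.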

Jesus Cea | 21 Sep 2006 21:59

multithread and implicit cache management (was: Re: RELEASED: Durus-3.5)


Jesus Cea wrote:
> I just discovered a regression problem with this approach, related to
> multithreading.

Some crash examples:

=====
Exception in thread Thread-16566:
Traceback (most recent call last):
[...]
  File "/export/home/correo/lmtp.py", line 85, in _monitor
    persistencia.commit()
  File "/usr/local/lib/python2.4/site-packages/durus/connection.py", line 281, in commit
    self.shrink_cache()
  File "/usr/local/lib/python2.4/site-packages/durus/connection.py", line 216, in shrink_cache
    self.cache.shrink(self)
  File "/usr/local/lib/python2.4/site-packages/durus/connection.py", line 396, in shrink
    heap = self._build_heap(connection.get_transaction_serial())
  File "/usr/local/lib/python2.4/site-packages/durus/connection.py", line 374, in _build_heap
    for oid in islice(chain(all, all), start, start + len(all)):
RuntimeError: dictionary changed size during iteration
=====
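
The RuntimeError above is the generic error Python raises when a mapping loses or gains a key while it is being iterated. A small sketch with a plain dict, where the explicit del stands in for a weakref callback removing an entry from another thread (the function names are hypothetical, and this is not Durus code):

```python
def iterate_while_shrinking(mapping):
    """Delete entries during iteration; return the error message, if any."""
    try:
        for oid in mapping:
            del mapping[oid]  # stands in for a weakref callback firing
    except RuntimeError as exc:
        return str(exc)
    return None


def iterate_over_snapshot(mapping):
    """Iterate over a copy of the keys, so removals cannot break the loop."""
    for oid in list(mapping):
        mapping.pop(oid, None)
    return len(mapping)
```

Iterating over a snapshot of the keys avoids the exception, but it does not by itself make the cache safe to share between threads.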

Other strange error:

[...]

David Binger | 21 Sep 2006 22:21

Re: RELEASED: Durus-3.5

Would your application be crippled by dropping those
references, as below, before releasing the lock?

Better, write a lock-protected get_value()
function that returns the value, and no references
to persistent instances.

If I remember correctly, your database is cycle-free,
so the gc garbage collector probably isn't doing anything
for you, or causing trouble.  If it were, you
could call gc.disable().

On Sep 21, 2006, at 2:43 PM, Jesus Cea wrote:

> def example(s,key) :
>   global_lock.acquire()
>   conn.abort()  # Be sure your data is fresh
>   root=conn.get_root()
>   data=root["data"]
>   value=data[key] # Retrieve the data
>   conn.commit()
     root = None
     data = None
>   global_lock.release()
>   s.send(value) # Can take a lot of time
>   return

Jesus Cea | 21 Sep 2006 23:40

Re: RELEASED: Durus-3.5


David Binger wrote:
> Would your application be crippled by dropping those
> references, as below, before releasing the lock?

The code is several thousand lines long. Moreover, the "issue" would
still be present, ready to bite any other multithreaded usage of Durus.

> Better, write a lock-protected get_value()
> function that returns the value, and no references
> to persistent instances.

That was my first attempt at a workaround, but see the last paragraph :-(.

> If I remember correctly, your database is cycle-free,
> so the gc garbage collector probably isn't doing anything
> for you, or causing trouble.  If it were, you
> could call gc.disable().

"gc.disable()" disables the garbage collection of cycles, but this is
not an issue here. I'm talking about reference counting when I say
"garbage collection".

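The distinction can be checked with a few lines. This is a sketch that relies on CPython's reference counting; Persistent is a hypothetical stand-in class, not the Durus one.

```python
import gc
import weakref


class Persistent:
    """Stand-in for a persistent object (illustration only)."""


def entry_survives_del():
    """Return True if a cache entry outlives its last strong reference."""
    cache = weakref.WeakValueDictionary()
    obj = Persistent()
    cache["oid-1"] = obj
    was_enabled = gc.isenabled()
    gc.disable()  # switches off the cycle collector only
    try:
        del obj  # last strong reference goes away here
        return "oid-1" in cache
    finally:
        if was_enabled:
            gc.enable()
```

On CPython the entry disappears as soon as the last strong reference is dropped, whether or not the cycle collector is enabled.
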
I am testing a WeakValueDictionary subclass implementing my first
suggestion: take note of the deleted objects, but keep them in the
dictionary until "cache.shrink()" calls "dictionary.shrink()".
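
A rough sketch of that idea, not the actual patch: it is a standalone class rather than a WeakValueDictionary subclass, the name DeferredWeakValueDict is hypothetical, and it assumes shrink() is only called while the global lock is held.

```python
import weakref


class DeferredWeakValueDict:
    """Weak-value mapping whose size changes only inside shrink()."""

    def __init__(self):
        self._refs = {}  # key -> weakref.ref to the value
        self._dead = []  # keys whose value has been collected

    def __setitem__(self, key, value):
        owner_ref = weakref.ref(self)

        def on_collect(ref, key=key):
            # Only take note of the deletion; do not touch the mapping.
            owner = owner_ref()
            if owner is not None:
                owner._dead.append(key)

        self._refs[key] = weakref.ref(value, on_collect)

    def get(self, key, default=None):
        ref = self._refs.get(key)
        obj = ref() if ref is not None else None
        return default if obj is None else obj

    def __len__(self):
        return len(self._refs)

    def shrink(self):
        """Drop the entries noted as dead since the last call."""
        dead, self._dead = self._dead, []
        for key in dead:
            ref = self._refs.get(key)
            # Skip keys that were reassigned to a live object meanwhile.
            if ref is not None and ref() is None:
                del self._refs[key]
```

Dead entries still count in len() until shrink() runs, which is the point: the mapping never changes size behind the back of a loop iterating over it.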

Would you be interested in the patch? There is no noticeable performance hit.

We only need to add a "dictionary.shrink()" in "cache.shrink()". The
[...]

Neil Schemenauer | 22 Sep 2006 00:39

Re: RELEASED: Durus-3.5

Jesus Cea <jcea <at> argo.es> wrote:
> With durus 3.5, nevertheless, there is a new point where cache
> management comes around: object garbage collection. Worse, those points
> are "implicit". That is, you can't be really sure where your persistent
> objects will be collected.

It's good that you noticed this problem, but I don't think it was
introduced by 3.5.  AFAIK, the Durus cache has always used weak
references.

  Neil

Jesus Cea | 22 Sep 2006 01:37

Re: multithread and implicit cache management


Jesus Cea wrote:
> * When referencing objects from the cache, keep a strong reference
> around. We can keep those references until a commit/abort for that
> thread comes. Then we delete them. Since the commit/abort is sure to be
> called with a lock acquired, we can be sure that the collection will be
> done with a lock. The code seems fairly trivial (the "recent_objects"
> could be used for this. I don't understand the logic of
> "recent_objects"). Beware, this code is not optimized for speed:

This code doesn't work if a thread does several commits/aborts without
touching persistent objects again :-/.

--
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
jcea <at> argo.es http://www.argo.es/~jcea/ _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:jcea <at> jabber.org         _/_/    _/_/          _/_/_/_/_/
                               _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
