Seth Vidal | 15 Jul 2011 23:53

[PATCH] switch --update to use the sqlite dbs instead of the xml files. Should massively impact memory footprint and hopefully only marginally impact performance.

---
 createrepo/__init__.py     |   40 ++------
 createrepo/readMetadata.py |  240 +++++++++++++------------------------------
 2 files changed, 83 insertions(+), 197 deletions(-)

diff --git a/createrepo/__init__.py b/createrepo/__init__.py
index 44035cc..8549188 100644
--- a/createrepo/__init__.py
+++ b/createrepo/__init__.py
@@ -530,39 +530,19 @@ class MetaDataGenerator:
                 old_pkg = pkg
                 if pkg.find("://") != -1:
                     old_pkg = os.path.basename(pkg)
-                nodes = self.oldData.getNodes(old_pkg)
-                if nodes is not None: # we have a match in the old metadata
+                old_po = self.oldData.getNodes(old_pkg)
+                if old_po: # we have a match in the old metadata
                     if self.conf.verbose:
                         self.callback.log(_("Using data from old metadata for %s")
                                             % pkg)
-                    (primarynode, filenode, othernode) = nodes
-
-                    for node, outfile in ((primarynode, self.primaryfile),
-                                          (filenode, self.flfile),
-                                          (othernode, self.otherfile)):
-                        if node is None:
-                            break
-
-                        if self.conf.baseurl:
-                            anode = node.children

tim.lauridsen@gmail.com | 17 Jul 2011 11:20

Re: [PATCH] switch --update to use the sqlite dbs instead of the xml files. Should massively impact memory footprint and hopefully only marginally impact performance.

On Fri, Jul 15, 2011 at 11:53 PM, Seth Vidal <skvidal <at> fedoraproject.org> wrote:
> ---
>  createrepo/__init__.py     |   40 ++------
>  createrepo/readMetadata.py |  240 +++++++++++++------------------------------
>  2 files changed, 83 insertions(+), 197 deletions(-)
>
> diff --git a/createrepo/__init__.py b/createrepo/__init__.py
> index 44035cc..8549188 100644
> --- a/createrepo/__init__.py
> +++ b/createrepo/__init__.py
> @@ -530,39 +530,19 @@ class MetaDataGenerator:
>                 old_pkg = pkg
>                 if pkg.find("://") != -1:
>                     old_pkg = os.path.basename(pkg)
> -                nodes = self.oldData.getNodes(old_pkg)
> -                if nodes is not None: # we have a match in the old metadata
> +                old_po = self.oldData.getNodes(old_pkg)
> +                if old_po: # we have a match in the old metadata
>                     if self.conf.verbose:
>                         self.callback.log(_("Using data from old metadata for %s")
>                                             % pkg)
> -                    (primarynode, filenode, othernode) = nodes
> -
> -                    for node, outfile in ((primarynode, self.primaryfile),
> -                                          (filenode, self.flfile),
> -                                          (othernode, self.otherfile)):
> -                        if node is None:
> -                            break
> -
> -                        if self.conf.baseurl:

seth vidal | 18 Jul 2011 07:35

Re: [PATCH] switch --update to use the sqlite dbs instead of the xml files. Should massively impact memory footprint and hopefully only marginally impact performance.

On Sun, 2011-07-17 at 11:20 +0200, tim.lauridsen <at> gmail.com wrote:
> > _______________________________________________
> > Rpm-metadata mailing list
> > Rpm-metadata <at> lists.baseurl.org
> > http://lists.baseurl.org/mailman/listinfo/rpm-metadata
> >
> 
> ACK, Looks good to me
> 

I'm going to do some more testing on it today to make sure the memory
footprint is sane before I commit it.

-sv

tim.lauridsen@gmail.com | 18 Jul 2011 17:14

Re: [PATCH] switch --update to use the sqlite dbs instead of the xml files. Should massively impact memory footprint and hopefully only marginally impact performance.

On Mon, Jul 18, 2011 at 7:35 AM, seth vidal <skvidal <at> fedoraproject.org> wrote:
> On Sun, 2011-07-17 at 11:20 +0200, tim.lauridsen <at> gmail.com wrote:
>>
>> ACK, Looks good to me
>>
>
> I'm going to do some more testing on it today to make sure the memory
> footprint is sane before I commit it.
>

Sounds like a good idea :)

Tim

seth vidal | 18 Jul 2011 21:59

Re: [PATCH] switch --update to use the sqlite dbs instead of the xml files. Should massively impact memory footprint and hopefully only marginally impact performance.

On Fri, 2011-07-15 at 17:53 -0400, Seth Vidal wrote:
> ---
>  createrepo/__init__.py     |   40 ++------
>  createrepo/readMetadata.py |  240 +++++++++++++------------------------------
>  2 files changed, 83 insertions(+), 197 deletions(-)

Tested locally on repodata of 9000 pkgs.

Memory in use drops from 1.8-2 GB with the old createrepo code to
325 MB for the same operation. Performance-wise it is not
considerably different. More testing will bear that out, though.
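
For illustration only (this is not the code in the patch): a minimal
sketch of why per-package lookups against a sqlite db stay cheap in
memory compared with holding a parsed XML tree for the whole repo. The
table layout, the build_demo_db() helper and the file names below are
assumptions made for the example; the real primary db has many more
columns and tables.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for a primary sqlite db.

    Hypothetical, heavily simplified schema: only the columns needed
    for this example are present.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE packages ("
        " pkgKey INTEGER PRIMARY KEY,"
        " name TEXT,"
        " location_href TEXT)"
    )
    # 9000 fake packages, mirroring the size of the repo tested above.
    rows = [("pkg%d" % i, "pkg%d-1.0-1.noarch.rpm" % i) for i in range(9000)]
    conn.executemany(
        "INSERT INTO packages (name, location_href) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn


def lookup_old_package(conn, filename):
    """Return (pkgKey, name) for one rpm file name, or None if absent.

    Only the matching row is pulled into Python; nothing else from the
    old metadata has to be held in memory.
    """
    cur = conn.execute(
        "SELECT pkgKey, name FROM packages WHERE location_href = ?",
        (filename,),
    )
    return cur.fetchone()


conn = build_demo_db()
hit = lookup_old_package(conn, "pkg42-1.0-1.noarch.rpm")
miss = lookup_old_package(conn, "not-in-repo.rpm")
print(hit, miss)
```

A miss comes back as None, which fits a plain truthiness test like the
"if old_po:" check in the patch.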

I think I'll push this

-sv

skvidal | 18 Jul 2011 22:30

createrepo/__init__.py createrepo/readMetadata.py

 createrepo/__init__.py     |   40 +------
 createrepo/readMetadata.py |  240 +++++++++++++--------------------------------
 2 files changed, 83 insertions(+), 197 deletions(-)

New commits:
commit 0a67bc57a9eda626735513a4015d8087f3f4bb29
Author: Seth Vidal <skvidal <at> fedoraproject.org>
Date:   Fri Jul 15 17:50:48 2011 -0400

    switch --update to use the sqlite dbs instead of the xml files. Should massively impact
    memory footprint and hopefully only marginally impact performance.

diff --git a/createrepo/__init__.py b/createrepo/__init__.py
index 44035cc..8549188 100644
--- a/createrepo/__init__.py
+++ b/createrepo/__init__.py
@@ -530,39 +530,19 @@ class MetaDataGenerator:
                 old_pkg = pkg
                 if pkg.find("://") != -1:
                     old_pkg = os.path.basename(pkg)
-                nodes = self.oldData.getNodes(old_pkg)
-                if nodes is not None: # we have a match in the old metadata
+                old_po = self.oldData.getNodes(old_pkg)
+                if old_po: # we have a match in the old metadata
                     if self.conf.verbose:
                         self.callback.log(_("Using data from old metadata for %s")
                                             % pkg)
-                    (primarynode, filenode, othernode) = nodes
-
-                    for node, outfile in ((primarynode, self.primaryfile),

tim.lauridsen@gmail.com | 19 Jul 2011 07:13

Re: [PATCH] switch --update to use the sqlite dbs instead of the xml files. Should massively impact memory footprint and hopefully only marginally impact performance.

On Mon, Jul 18, 2011 at 9:59 PM, seth vidal <skvidal <at> fedoraproject.org> wrote:
> On Fri, 2011-07-15 at 17:53 -0400, Seth Vidal wrote:
>> ---
>>  createrepo/__init__.py     |   40 ++------
>>  createrepo/readMetadata.py |  240 +++++++++++++------------------------------
>>  2 files changed, 83 insertions(+), 197 deletions(-)
>
>
> Tested locally on repodata of 9000 pkgs.
>
> Memory in use drops from 1.8-2 GB with the old createrepo code to
> 325 MB for the same operation. Performance-wise it is not
> considerably different. More testing will bear that out, though.

1.8 GB -> 325 MB sounds like a winner :)

Tim

skvidal | 20 Jul 2011 23:03

createrepo/readMetadata.py

 createrepo/readMetadata.py |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

New commits:
commit 7e24b1ba2c939cff52f789b599d907c8fdac9747
Author: Seth Vidal <skvidal <at> fedoraproject.org>
Date:   Wed Jul 20 17:02:42 2011 -0400

    changes to make this more robust:
    1. when making a new repo - use mkdtemp() so we don't run the possibility of a race to the repo cache
    2. try/except when we setup the old repodata so if the dir is not there or the repo is broken we don't traceback

diff --git a/createrepo/readMetadata.py b/createrepo/readMetadata.py
index a449e68..88e5d95 100644
--- a/createrepo/readMetadata.py
+++ b/createrepo/readMetadata.py
@@ -21,8 +21,8 @@ from utils import errorprint, _

 import yum
 from yum import misc
-
-
+from yum.Errors import YumBaseError
+import tempfile
 class CreaterepoPkgOld(yum.sqlitesack.YumAvailablePackageSqlite):
     # special for special people like us.
     def _return_remote_location(self):
@@ -48,13 +48,16 @@ class MetadataIndex(object):
         repodatadir = self.outputdir + '/repodata'
         self._repo = yum.yumRepo.YumRepository('garbageid')
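
For illustration only (this is not the committed code): a minimal
sketch of the two robustness changes named in the commit message, a
private cache directory from tempfile.mkdtemp() and a try/except
around loading the old repodata. OldMetadataError, load_old_repodata()
and setup_old_metadata() are hypothetical names invented for this
example; the commit itself imports YumBaseError for the error case.

```python
import os
import shutil
import tempfile


class OldMetadataError(Exception):
    """Hypothetical stand-in for the error a real metadata loader raises."""


def load_old_repodata(outputdir, cachedir):
    # Hypothetical loader: fails if the repodata directory is missing.
    # cachedir is accepted only to show where a private cache would go.
    repodatadir = os.path.join(outputdir, "repodata")
    if not os.path.isdir(repodatadir):
        raise OldMetadataError("no repodata in %s" % outputdir)
    return sorted(os.listdir(repodatadir))


def setup_old_metadata(outputdir):
    """Return the old repodata file list, or None if it cannot be read."""
    # mkdtemp() creates a new, uniquely named directory, so two
    # concurrent runs never race for the same cache path.
    cachedir = tempfile.mkdtemp(prefix="createrepo-demo-")
    try:
        try:
            return load_old_repodata(outputdir, cachedir)
        except OldMetadataError as exc:
            # Missing or broken old repodata: report it and carry on
            # instead of dying with a traceback.
            print("skipping old metadata: %s" % exc)
            return None
    finally:
        # Always clean up the private cache directory.
        shutil.rmtree(cachedir, ignore_errors=True)
```

With this shape a caller can treat None as "no usable old metadata"
and fall back to reading every package from scratch.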

skvidal | 28 Jul 2011 00:08

2 commits - createrepo/__init__.py genpkgmetadata.py

 createrepo/__init__.py |   40 +++++++++++++++++++++++++++++++++-------
 genpkgmetadata.py      |    2 ++
 2 files changed, 35 insertions(+), 7 deletions(-)

New commits:
commit 79b7871aaba432e267a8e15a4070118a9168f9e4
Author: Seth Vidal <skvidal <at> fedoraproject.org>
Date:   Wed Jul 27 18:07:35 2011 -0400

    reset the worker cmd location, whoops

diff --git a/createrepo/__init__.py b/createrepo/__init__.py
index 769defa..8cce31a 100644
--- a/createrepo/__init__.py
+++ b/createrepo/__init__.py
@@ -107,9 +107,9 @@ class MetaDataConfig(object):
                                    # read in this run of createrepo
         self.collapse_glibc_requires = True
         self.workers = 1 # number of workers to fork off to grab metadata from the pkgs
-        #self.worker_cmd = '/usr/share/createrepo/worker.py'
+        self.worker_cmd = '/usr/share/createrepo/worker.py'

-        self.worker_cmd = './worker.py' # helpful when testing
+        #self.worker_cmd = './worker.py' # helpful when testing
         self.retain_old_md = 0

 class SimpleMDCallBack(object):
commit 180e3042bf7337edba2b80dd867f3f7953622725
Author: Seth Vidal <skvidal <at> fedoraproject.org>
Date:   Wed Jul 27 18:05:17 2011 -0400

Ville Skyttä | 28 Jul 2011 20:03

createrepo.bash

 createrepo.bash |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

New commits:
commit 377318ac5de5a402f8ce537b4ac9d5c64e6286af
Author: Ville Skyttä <ville.skytta <at> iki.fi>
Date:   Thu Jul 28 21:01:23 2011 +0300

    Add --retain-old-md bash completion.

diff --git a/createrepo.bash b/createrepo.bash
index 54ac8b2..4222fa0 100644
--- a/createrepo.bash
+++ b/createrepo.bash
@@ -30,6 +30,10 @@ _cr_createrepo()
             COMPREPLY=( $( compgen -f -o plusdirs -X '!*.rpm' -- "$2" ) )
             return 0
             ;;
+        --retain-old-md)
+            COMPREPLY=( $( compgen -W '0 1 2 3 4 5 6 7 8 9' -- "$2" ) )
+            return 0
+            ;;
         --num-deltas)
             COMPREPLY=( $( compgen -W '1 2 3 4 5 6 7 8 9' -- "$2" ) )
             return 0
@@ -42,8 +46,8 @@ _cr_createrepo()
             --cachedir --checkts --no-database --update --update-md-path
             --skip-stat --split --pkglist --includepkg --outputdir
             --skip-symlinks --changelog-limit --unique-md-filenames
-            --simple-md-filenames --distro --content --repo --revision --deltas