Jun Wu | 26 May 00:51 2016

[PATCH 1 of 3] maps: implement sqlite revmap

# HG changeset patch
# User Jun Wu <quark@...>
# Date 1464214938 -3600
#      Wed May 25 23:22:18 2016 +0100
# Node ID 68cda55fccac3d7bd6e611f0e5d15ef4af62464a
# Parent  a794cbc174a9717efbee2c3a852b717e6f4d34d2
# Available At https://bitbucket.org/quark-zju/hgsubversion-revmap
#              hg pull https://bitbucket.org/quark-zju/hgsubversion-revmap -r 68cda55fccac
maps: implement sqlite revmap

This patch adds SqliteRevMap, which has the same interface as RevMap
but is backed by a sqlite database.

It uses database indexes to accelerate all kinds of queries and disables
iteration to prevent slow code being written in the future.

In practice, it should be faster on large repos with millions of svn
revisions but slower on small repos because of the added overhead.

diff --git a/hgsubversion/maps.py b/hgsubversion/maps.py
--- a/hgsubversion/maps.py
+++ b/hgsubversion/maps.py
@@ -1,5 +1,7 @@
 ''' Module for self-contained maps. '''

+import collections
+import contextlib
 import errno
 import os
 import re
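
The rest of this diff is not shown here, so the following is only a rough sketch of the idea in the commit message: a map keyed by (rev, branch), backed by sqlite, with an index for per-branch queries and iteration disabled. The class name, table layout, and the revs_before helper are all hypothetical and are not taken from the patch.

```python
import sqlite3


class TinySqliteRevMap(object):
    """Hypothetical stand-in; not the SqliteRevMap from the patch."""

    def __init__(self, path=':memory:'):
        self._db = sqlite3.connect(path)
        # (rev, branch) is the lookup key, so make it the primary key
        self._db.execute(
            'CREATE TABLE IF NOT EXISTS revmap ('
            ' rev INTEGER NOT NULL, branch TEXT NOT NULL,'
            ' hash TEXT NOT NULL, PRIMARY KEY (rev, branch))')
        # secondary index for per-branch queries ordered by revision
        self._db.execute(
            'CREATE INDEX IF NOT EXISTS revmap_branch'
            ' ON revmap (branch, rev)')

    def __setitem__(self, key, hexhash):
        rev, branch = key
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                'INSERT OR REPLACE INTO revmap VALUES (?, ?, ?)',
                (rev, branch or '', hexhash))

    def __getitem__(self, key):
        rev, branch = key
        row = self._db.execute(
            'SELECT hash FROM revmap WHERE rev = ? AND branch = ?',
            (rev, branch or '')).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __iter__(self):
        # follow the stated intent: no full-table iteration
        raise NotImplementedError('iteration is disabled')

    def revs_before(self, branch, revnum):
        """Revisions on branch below revnum, newest first."""
        rows = self._db.execute(
            'SELECT rev, hash FROM revmap WHERE branch = ? AND rev < ?'
            ' ORDER BY rev DESC', (branch or '', revnum))
        return rows.fetchall()
```

The revs_before query is the kind of lookup that a dict-backed map has to answer by filtering every item; with the (branch, rev) index sqlite can answer it without a full scan.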

Jun Wu | 24 May 01:34 2016

[PATCH 1 of 6 V2] maps: add "lasthash" property to RevMap

# HG changeset patch
# User Jun Wu <quark@...>
# Date 1464042988 -3600
#      Mon May 23 23:36:28 2016 +0100
# Node ID 71440ebde573db5a46afad10ce808cbbf1bde3ca
# Parent  c161586a6b771b2ae6d61f0cc0152fae75b356f0
# Available At https://bitbucket.org/quark-zju/hgsubversion-revmap
#              hg pull https://bitbucket.org/quark-zju/hgsubversion-revmap -r 71440ebde573
maps: add "lasthash" property to RevMap

This is part of the bigger plan to stop reading or writing rev_map
directly instead of going through the RevMap class.

The "lasthash" property will be used in updatemeta.

diff --git a/hgsubversion/maps.py b/hgsubversion/maps.py
--- a/hgsubversion/maps.py
+++ b/hgsubversion/maps.py
@@ -355,6 +355,13 @@
         check = lambda x: x[0][1] == branch and x[0][0] < rev.revnum
         return sorted(filter(check, self.iteritems()), reverse=True)

+    @property
+    def lasthash(self):
+        lines = list(self.readmapfile(self.meta.revmap_file))
+        if not lines:
+            return None
+        return bin(lines[-1].split(' ', 2)[1])
+
     def revhashes(self, revnum):

Jun Wu | 14 May 22:37 2016

[PATCH 01 of 10] maps: add the "clear" method to RevMap

# HG changeset patch
# User Jun Wu <quark@...>
# Date 1462812484 -3600
#      Mon May 09 17:48:04 2016 +0100
# Node ID bec3ad6e25aa0c9889fd16c2f5f37beaea4e3243
# Parent  94eb844fd4ab6e79f6004669b204635cb73ceb11
maps: add the "clear" method to RevMap

This is part of the bigger plan to stop reading or writing rev_map
directly instead of going through the RevMap class.

The "clear" method is used in rebuildmeta.

diff --git a/hgsubversion/maps.py b/hgsubversion/maps.py
--- a/hgsubversion/maps.py
+++ b/hgsubversion/maps.py
@@ -355,6 +355,11 @@
         check = lambda x: x[0][1] == branch and x[0][0] < rev.revnum
         return sorted(filter(check, self.iteritems()), reverse=True)

+    def clear(self):
+        self._write()
+        dict.clear(self)
+        self._hashes = None
+
     @classmethod
     def readmapfile(cls, path, missingok=True):
         try:


Sean Farley | 6 May 23:47 2016

[PATCH 1 of 3 fix-templatekw-changes] templatekw: move to __init__ to prepare for newer mercurial

# HG changeset patch
# User Sean Farley <sean@...>
# Date 1462515049 25200
#      Thu May 05 23:10:49 2016 -0700
# Node ID 71293bb1156e07545f0065c5c1eb7cf5c85c7710
# Parent  94eb844fd4ab6e79f6004669b204635cb73ceb11
templatekw: move to __init__ to prepare for newer mercurial

diff --git a/hgsubversion/__init__.py b/hgsubversion/__init__.py
--- a/hgsubversion/__init__.py
+++ b/hgsubversion/__init__.py
@@ -163,11 +163,17 @@ def extsetup(ui):
          lambda *args: open(os.path.join(helpdir, 'subversion.rst')).read()),
     )

     help.helptable.extend(entries)

-    templatekw.keywords.update(util.templatekeywords)
+    templatekeywords = {
+        'svnrev': svnrevkw,
+        'svnpath': svnpathkw,
+        'svnuuid': svnuuidkw,
+    }
+
+    templatekw.keywords.update(templatekeywords)

     revset.symbols.update(util.revsets)

     subrepo.types['hgsubversion'] = svnexternals.svnsubrepo


Brian O'Keefe | 2 May 16:14 2016

Support for pulling and pushing merge csets based on svn:mergeinfo properties

I recently submitted a pull request that adds support for merge changesets: https://bitbucket.org/durin42/hgsubversion/pull-requests/30/add-support-for-pulling-and-pusing-merge/diff. It's a little on the large side, so Augie asked me to send a brief overview to the list.

Essentially, there are four categories of merge events (two push, two pull) that I needed to deal with, plus the non-merge cases. I'll briefly explain what I did for each case.


A: Pulling changes

First, whenever we pull a revision from svn, I retrieve the svn:mergeinfo property for that change and its parent. Then I compare the two sets of mergeinfo. Depending on what the differences are, we're in one of three pull cases:

1: No mergeinfo difference

If both sets of mergeinfo are the same, this isn't a merge revision. Carry on with the usual pull model.

2: Pulling a "normal" merge from Subversion

In order for a changeset to be a normal merge, the difference between the two sets of mergeinfo has to be a contiguous range of changesets. This does cover merging a merge changeset, as long as the mergeinfo is right. Once we determine that we're merging contiguous changes, we set up the most recent revision from the changed mergeinfo as the second parent. It's not terribly obvious, but if you cherrypick a revision (e.g. X) into a branch, then merge the parent of the cherrypicked revision (e.g. B-H), svn will record that second merge in the same range as the cherrypicked revision (e.g. (B,X)) instead of two separate ranges (e.g. (B,H), (X,X)). That being the case, Mercurial will see the merge as a merge from the cherrypicked revision instead of a merge from its parent. I think that more accurately reflects the state of the repository, so I left it that way.

3: Pulling a "cherrypick" merge (or multiple merges) from Subversion

Everything else falls into this case. We've either pulled a revision that includes a single change that isn't adjacent to an existing range or several ranges (which could be a single revision) that aren't contiguous. In this case, the best we can do is assume changes were cherrypicked instead of merged. We can't do this sensibly as a merge, so these turn into grafts. That's logical for a single cherrypicked revision, but there isn't a good way to track multiple cherrypicks in one cset. I opted to use the most recent revision in the mergeinfo changes as the source for the graft. It's not perfect, but it looks sensible in the graph.
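
As a rough illustration of the three pull cases (not the code from the pull request), the decision can be sketched as below. It assumes mergeinfo has already been expanded into a dict of source path to a set of revision numbers. The function name, the return shape, and the rule that a lone revision only counts as a merge when it extends an existing range are my reading of the description above, not confirmed behaviour.

```python
def classify_pull(old, new):
    """Classify a pulled revision from its mergeinfo change.

    old/new: dict of source branch path -> set of merged revision numbers
    (a hypothetical, already-expanded form of svn:mergeinfo).
    Returns (kind, path, rev) where kind is 'normal', 'merge' or 'graft'.
    """
    added = {}
    for path, revs in new.items():
        extra = set(revs) - set(old.get(path, ()))
        if extra:
            added[path] = extra

    # case 1: no mergeinfo difference, so this is not a merge
    if not added:
        return ('normal', None, None)

    # most recent revision among all mergeinfo changes
    newest, newest_path = max((max(revs), path)
                              for path, revs in added.items())

    # case 2: one contiguous run of revisions from a single source
    if len(added) == 1:
        (path, revs), = added.items()
        contiguous = max(revs) - min(revs) + 1 == len(revs)
        # assumption: a lone revision must be adjacent to an existing range
        extends = len(revs) > 1 or (min(revs) - 1) in old.get(path, ())
        if contiguous and extends:
            return ('merge', path, newest)

    # case 3: everything else is treated as a cherrypick (graft)
    return ('graft', newest_path, newest)
```

For example, going from {'/branches/b': {2, 3}} to {'/branches/b': {2, 3, 4, 5}} is classified as a merge whose second parent is revision 5, while adding only revision 7 falls through to the graft case.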


B: Pushing changes

Pushing is slightly more complicated because of the way branches are tracked. On the other hand, it's easy to distinguish between push cases:

4: Pushing a merge changeset

A merge changeset has two parents, so it's easy to identify. We just need to do a little transformation to get it into a Subversion-compatible format. First, we need to verify that the changeset is consistent. This requires checking that its first parent is a descendant of a Subversion revision (we'll push the whole branch at once, so it's okay if the immediate parent isn't in svn yet), and that the second parent is a Subversion revision (since we won't push the rest of that branch). If the second parent isn't in Subversion, we abort with a message that you need to push that branch first, or you can rebase that branch onto the Subversion branch.

If we're continuing with the push, we need to do a little transformation. Mercurial records merge csets as a merge of two parents plus a list of files that changed to resolve merge conflicts (or were otherwise edited as part of the merge). Subversion, not really having merge support, needs a list of files that changed relative to the parent (on the same branch), plus changes to the svn:mergeinfo property. Mercurial has repo.status for comparing two revisions, so we compare the outgoing revision to the first parent, and use that set of added, removed, deleted, and modified files as the changed files for the Subversion revision.

Next we update the mergeinfo property on the root directory of the modified branch. We have to walk back along the second parent's history until we find a changeset that's an ancestor of the first parent. That gives us the range of changesets that were merged in. We also have to track branch names, if they change along the way, for storing in the mergeinfo. Finally, we add the new mergeinfo to the existing mergeinfo, squash adjacent ranges together, and add that to the revision properties. That gets pushed to svn, as usual, and when we pull it back, the pull rules above turn it back into a Mercurial merge.
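
The two mechanical steps in that paragraph, walking back along the second parent and squashing adjacent ranges, can be sketched on a toy DAG. This is only an illustration: the parents dict, the function names, and the choice to follow first parents during the walk are assumptions, and the real code works on Mercurial changectx objects rather than plain strings.

```python
def ancestors(parents, node):
    """All ancestors of node, including node itself."""
    seen = set()
    stack = [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def merged_changesets(parents, p1, p2):
    """Walk back from p2 until reaching an ancestor of p1.

    Returns the changesets brought in by the merge, oldest first.
    """
    base = ancestors(parents, p1)
    merged = []
    node = p2
    while node is not None and node not in base:
        merged.append(node)
        ps = parents.get(node, ())
        node = ps[0] if ps else None  # assumption: follow the first parent
    return merged[::-1]


def squash_ranges(ranges):
    """Merge overlapping or adjacent inclusive (start, end) ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # extends or touches the previous range
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

With existing mergeinfo (1, 4) and a newly merged range (5, 8), squash_ranges yields the single range (1, 8), which is what makes the pulled-back revision look like one contiguous merge again.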

5: Pushing a cherrypicked (grafted) changeset

Grafts are much more straightforward than merges. Since Mercurial only allows one graft per changeset, we just check to see if there is a source in the extra properties. If so, we look up the svn revision corresponding to that source. If we find one, we add it to the mergeinfo (well, the range from its parent revision + 1 to itself). If not, I just discard the graft information and push it as a normal changeset. That lets you graft changes from private branches without having to push your private branch first.
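
A minimal sketch of that lookup, assuming the graft source is stored under the 'source' key of the changeset's extra dict and that two hypothetical lookup tables are available: hash_to_svn (changeset hash to (branch path, svn revision)) and parent_of (changeset hash to parent hash). Falling back to a single-revision range when the parent has no svn revision is my assumption.

```python
def graft_mergeinfo(extra, hash_to_svn, parent_of):
    """Return (branch, start, end) to add to mergeinfo, or None.

    None means: push as a normal changeset and drop the graft info.
    """
    source = extra.get('source')
    if source is None or source not in hash_to_svn:
        # not a graft, or the source lives on an unpushed private branch
        return None
    branch, rev = hash_to_svn[source]
    parent = parent_of.get(source)
    if parent in hash_to_svn:
        # range from the parent's svn revision + 1 up to the source itself
        start = hash_to_svn[parent][1] + 1
    else:
        start = rev  # assumption: fall back to just the source revision
    return (branch, start, rev)
```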

6: Pushing normal changes

If we weren't in either of those cases, it's a normal changeset. We can continue pushing that as usual.


I'd also like to note that although subvertpy has a function for getting mergeinfo as a python dictionary, it had some weird off-by-one errors with the start of a range. It was easier to fix those by re-implementing the mergeinfo parser than figuring out how to fix subvertpy, getting those fixes committed, requiring everyone to update subvertpy to use hgsubversion, and bailing out if that wasn't the case. I had to roll my own for the swig bindings anyway, so it wasn't too bad.
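
For readers who haven't looked at the property itself, a parser for the textual svn:mergeinfo value can be quite small. The sketch below is not the parser from the pull request. It assumes one "path:rangelist" entry per line, comma-separated items that are either a single revision or "start-end", inclusive revision numbers in the text, and a trailing "*" marking a non-inheritable range, which is simply dropped here.

```python
def parse_mergeinfo(text):
    """Parse svn:mergeinfo text into {path: [(start, end), ...]}.

    Ranges are returned with inclusive start and end revisions.
    """
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # split on the last colon so the revision list is always on the right
        path, _, rangelist = line.rpartition(':')
        ranges = []
        for item in rangelist.split(','):
            item = item.strip().rstrip('*')  # '*' = non-inheritable
            if not item:
                continue
            start, sep, end = item.partition('-')
            ranges.append((int(start), int(end) if sep else int(start)))
        info[path] = ranges
    return info
```

Keeping both ends inclusive in the returned tuples avoids having to remember which end is exclusive when comparing against Mercurial-side revision lists.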

There's a basic set of tests included for pushing and pulling both merges and grafts. I've also been using this on a couple production repos. The one I use most often is around 6000 revisions and only has a few branches. I only screwed it up once pushing a merge cset before I realized I had to do the transformation I mentioned earlier. The second one is over 250k revisions with hundreds of branches - it's a mess, but I'm able to pull from it without issue. I haven't pushed any merges to that repo yet.


There's one caveat I pointed out in the pull request that I think warrants repeating: Subversion does weird things with "reintegration" merges, so if you merge the head of a branch back into its parent, Subversion treats it as closed and complains if you try to use that branch. Pushing another cset from Mercurial re-opens the branch in Subversion. I'm not sure if anything can be done about that.

--
You received this message because you are subscribed to the Google Groups "hgsubversion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hgsubversion+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
To post to this group, send email to hgsubversion-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
Visit this group at https://groups.google.com/group/hgsubversion.
For more options, visit https://groups.google.com/d/optout.

Sean Farley | 19 Apr 01:54 2016

[PATCH 1 of 9] maps: use regex for better comment handling

# HG changeset patch
# User Sean Farley <sean.michael.farley@...>
# Date 1395678060 18000
#      Mon Mar 24 11:21:00 2014 -0500
# Node ID 54d1d943295539b0b29b3bedd7af2c6a5420b185
# Parent  f8b407577ba2596678eecebe2708d94c33e24ea7
maps: use regex for better comment handling

This is copied straight from mercurial's hgignore parsing.

diff --git a/hgsubversion/maps.py b/hgsubversion/maps.py
--- a/hgsubversion/maps.py
+++ b/hgsubversion/maps.py
@@ -1,9 +1,10 @@
 ''' Module for self-contained maps. '''

 import errno
 import os
+import re
 from mercurial import util as hgutil
 from mercurial.node import bin, hex, nullid

 import subprocess
 import svncommands
@@ -14,10 +15,12 @@ class BaseMap(dict):
     tags.'''
     def __init__(self, meta):
         self.meta = meta
         super(BaseMap, self).__init__()

+        self._commentre = re.compile(r'((^|[^\\])(\\\\)*)#.*')
+
         # trickery: all subclasses have the same name as their file and config
         # names, e.g. AuthorMap is meta.authormap_file for the filename and
         # 'authormap' for the config option
         self.mapname = self.__class__.__name__.lower()
         self.mapfilename = self.mapname + '_file'
@@ -44,12 +47,18 @@ class BaseMap(dict):
         for number, line in enumerate(f):

             if writing:
                 writing.write(line)

-            line = line.split('#')[0]
-            if not line.strip():
+            # strip out comments
+            if "#" in line:
+                # remove comments prefixed by an even number of escapes
+                line = self._commentre.sub(r'\1', line)
+                # fixup properly escaped comments that survived the above
+                line = line.replace("\\#", "#")
+            line = line.rstrip()
+            if not line:
                 continue

             try:
                 src, dst = line.split('=', 1)
             except (IndexError, ValueError):
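
The comment handling in this hunk can be tried in isolation. The sketch below reuses the regex and the two substitution steps from the patch as-is; only the wrapper function strip_comment is hypothetical.

```python
import re

# same pattern as in the patch: a '#' preceded by an even number of
# backslashes (including zero) starts a comment
_commentre = re.compile(r'((^|[^\\])(\\\\)*)#.*')


def strip_comment(line):
    """Remove an unescaped trailing comment and unescape '\\#'."""
    if "#" in line:
        # remove comments prefixed by an even number of escapes
        line = _commentre.sub(r'\1', line)
        # fixup properly escaped comments that survived the above
        line = line.replace("\\#", "#")
    return line.rstrip()
```

Compared with the old line.split('#')[0], this lets a map entry contain a literal "#" by writing it as "\#".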


Sean Farley | 19 Apr 00:29 2016

[PATCH 1 of 9 map-cleanup] maps: make author map inherit from base map

# HG changeset patch
# User Sean Farley <sean.michael.farley@...>
# Date 1395678059 18000
#      Mon Mar 24 11:20:59 2014 -0500
# Node ID b4b904ddd45203f47720b17292f48617932e5dac
# Parent  b7d44cfbb8d6799b6f7b99f76c5eb44d5c49a1a9
maps: make author map inherit from base map

diff --git a/hgsubversion/maps.py b/hgsubversion/maps.py
--- a/hgsubversion/maps.py
+++ b/hgsubversion/maps.py
@@ -70,11 +70,11 @@ class BaseMap(dict):

         f.close()
         if writing:
             writing.close()

-class AuthorMap(dict):
+class AuthorMap(BaseMap):
     '''A mapping from Subversion-style authors to Mercurial-style
     authors, and back. The data is stored persistently on disk.

     If the 'hgsubversion.defaultauthors' configuration option is set to false,
     attempting to obtain an unknown author will fail with an Abort.


Sean Farley | 18 Apr 23:16 2016

[PATCH 1 of 3] maps: call super directly instead of self.super

# HG changeset patch
# User Sean Farley <sean.michael.farley@...>
# Date 1395678058 18000
#      Mon Mar 24 11:20:58 2014 -0500
# Node ID b047156cbca81dee73a25d6f4e55922c130358bf
# Parent  8c17a89122457df6b90683b0a060b6a93be9da3a
maps: call super directly instead of self.super

In the next few patches, we're going to remove self.super because it isn't
reliable for calling up the parent chain. Instead, we'll save ourselves the
headache and change it now.

diff --git a/hgsubversion/maps.py b/hgsubversion/maps.py
--- a/hgsubversion/maps.py
+++ b/hgsubversion/maps.py
@@ -112,11 +112,11 @@ class AuthorMap(dict):
         if self.meta.caseignoreauthors:
             search_author = author.lower()

         result = None
         if search_author in self:
-            result = self.super.__getitem__(search_author)
+            result = super(AuthorMap, self).__getitem__(search_author)
         elif self.meta.mapauthorscmd:
             cmd = self.meta.mapauthorscmd % author
             process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
             output, err = process.communicate()
             retcode = process.poll()


Sean Farley | 15 Apr 00:03 2016

[PATCH 1 of 3] maps: add custom __setitem__ to author map

# HG changeset patch
# User Sean Farley <sean.michael.farley@...>
# Date 1395678058 18000
#      Mon Mar 24 11:20:58 2014 -0500
# Node ID 0cef9a23a06aab2a8602f6a4e5df856f6c840969
# Parent  e1619c051788692046f83b068fb063e6cef7a133
maps: add custom __setitem__ to author map

We add a custom __setitem__ that will encapsulate the meta.caseignoreauthors
logic.

diff --git a/hgsubversion/maps.py b/hgsubversion/maps.py
--- a/hgsubversion/maps.py
+++ b/hgsubversion/maps.py
@@ -86,10 +86,18 @@ class AuthorMap(dict):

         f.close()
         if writing:
             writing.close()

+    def __setitem__(self, key, value):
+        '''Similar to dict.__setitem__, except we check caseignoreauthors to
+        use lowercase string or not
+        '''
+        if self.meta.caseignoreauthors:
+            key = key.lower()
+        super(AuthorMap, self).__setitem__(key, value)
+
     def __getitem__(self, author):
         ''' Similar to dict.__getitem__, except in case of an unknown author.
         In such cases, a new value is generated and added to the dictionary
         as well as the backing store. '''
         if author is None:


shunichi.goto | 23 Feb 03:26 2016

[PATCH] compat: fix some more use of repo.parents()

# HG changeset patch
# User Shun-ichi GOTO <shunichi.goto@...>
# Date 1456193799 -32400
#      Tue Feb 23 11:16:39 2016 +0900
# Node ID 6e4ca442dd88f8fb1410335e1325cbf6d72c900d
# Parent  abc87a62ff51efcc3f71ba835e08b2fdb3f30b3c
compat: fix some more use of repo.parents()

Change repo.parents() to repo[None].parents() in a few more places,
following changeset 4f8b1f202c90.

diff -r abc87a62ff51 -r 6e4ca442dd88 hgsubversion/util.py
--- a/hgsubversion/util.py	Mon Feb 08 17:17:19 2016 -0800
+++ b/hgsubversion/util.py	Tue Feb 23 11:16:39 2016 +0900
@@ -72,7 +72,7 @@
 def parentrev(ui, repo, meta, hashes):
     """Find the svn parent revision of the repo's dirstate.
     """
-    workingctx = repo.parents()[0]
+    workingctx = repo[None].parents()[0]
     outrev = outgoing_revisions(repo, hashes, workingctx.node())
     if outrev:
         workingctx = repo[outrev[-1]].parents()[0]
diff -r abc87a62ff51 -r 6e4ca442dd88 hgsubversion/wrappers.py
--- a/hgsubversion/wrappers.py	Mon Feb 08 17:17:19 2016 -0800
+++ b/hgsubversion/wrappers.py	Tue Feb 23 11:16:39 2016 +0900
@@ -128,7 +128,7 @@
     # split off #rev; TODO implement --revision/#rev support
     svn = other.svn
     meta = repo.svnmeta(svn.uuid, svn.subdir)
-    parent = repo.parents()[0].node()
+    parent = repo[None].parents()[0].node()
     hashes = meta.revmap.hashes()
     common, heads = util.outgoing_common_and_heads(repo, hashes, parent)
     outobj = getattr(discovery, 'outgoing', None)
@@ -147,7 +147,7 @@
     # svnurl, revs, checkout = util.parseurl(dest.svnurl, heads)
     svn = dest.svn
     meta = repo.svnmeta(svn.uuid, svn.subdir)
-    parent = repo.parents()[0].node()
+    parent = repo[None].parents()[0].node()
     hashes = meta.revmap.hashes()
     return util.outgoing_revisions(repo, hashes, parent)

@@ -160,7 +160,7 @@
     meta = repo.svnmeta()
     hashes = meta.revmap.hashes()
     if not opts.get('rev', None):
-        parent = repo.parents()[0]
+        parent = repo[None].parents()[0]
         o_r = util.outgoing_revisions(repo, hashes, parent.node())
         if o_r:
             parent = repo[o_r[-1]].parents()[0]
@@ -213,7 +213,7 @@

         # Strategy:
         # 1. Find all outgoing commits from this head
-        if len(repo.parents()) != 1:
+        if len(repo[None].parents()) != 1:
             ui.status('Cowardly refusing to push branch merge\n')
             return 0 # results in nonzero exit status, see hg's commands.py
         workingrev = repo[None].parents()[0]
@@ -564,7 +564,7 @@
         """
         extra['branch'] = ctx.branch()
     extrafn = opts.get('svnextrafn', extrafn2)
-    sourcerev = opts.get('svnsourcerev', repo.parents()[0].node())
+    sourcerev = opts.get('svnsourcerev', repo[None].parents()[0].node())
     meta = repo.svnmeta()
     hashes = meta.revmap.hashes()
     o_r = util.outgoing_revisions(repo, hashes, sourcerev=sourcerev)


Mateusz Kwapich | 9 Feb 02:26 2016

[PATCH] maps: remove python2.7ism from dynamic author mapping

# HG changeset patch
# User Mateusz Kwapich <mitrandir@...>
# Date 1454980639 28800
#      Mon Feb 08 17:17:19 2016 -0800
# Node ID abc87a62ff51efcc3f71ba835e08b2fdb3f30b3c
# Parent  a17d8874a09937d7a5fe3efb986135e21906c8e0
maps: remove python2.7ism from dynamic author mapping

diff --git a/hgsubversion/maps.py b/hgsubversion/maps.py
--- a/hgsubversion/maps.py
+++ b/hgsubversion/maps.py
@@ -103,8 +103,14 @@ class AuthorMap(dict):
         if search_author in self:
             result = self.super.__getitem__(search_author)
         elif self.meta.mapauthorscmd:
-            self[author] = result = subprocess.check_output (
-                self.meta.mapauthorscmd % author, shell = True).strip()
+            cmd = self.meta.mapauthorscmd % author
+            process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
+            output, err = process.communicate()
+            retcode = process.poll()
+            if retcode:
+                msg = 'map author command "%s" exited with error'
+                raise hgutil.Abort(msg % cmd)
+            self[author] = result = output.strip()
         if not result:
             if self.meta.defaultauthors:
                 self[author] = result = '%s%s' % (author, self.defaulthost)


