Junio C Hamano | 1 Dec 2011 01:27

[PATCH v2 0/5] Bulk Check-in

This is a re-roll of the earlier bulk-check-in series. The only change is
that the last one is a bit re-structured to pay attention to the packsize
limit from the get-go and also not write objects that already exist in the
repository, instead of "oops, I forgot to do that, and here is a fix".

Junio C Hamano (5):
  write_pack_header(): a helper function
  create_tmp_packfile(): a helper function
  finish_tmp_packfile(): a helper function
  csum-file: introduce sha1file_checkpoint
  bulk-checkin: replace fast-import based implementation

 Makefile               |    2 +
 builtin/add.c          |    5 +
 builtin/pack-objects.c |   62 +++--------
 bulk-checkin.c         |  275 ++++++++++++++++++++++++++++++++++++++++++++++++
 bulk-checkin.h         |   16 +++
 cache.h                |    2 +
 config.c               |    4 +
 csum-file.c            |   20 ++++
 csum-file.h            |    9 ++
 environment.c          |    1 +
 fast-import.c          |   25 ++---
 pack-write.c           |   53 +++++++++
 pack.h                 |    6 +
 sha1_file.c            |   67 +-----------
 t/t1050-large.sh       |   94 +++++++++++++++--
 zlib.c                 |    9 ++-
 16 files changed, 516 insertions(+), 134 deletions(-)
 create mode 100644 bulk-checkin.c

Junio C Hamano | 1 Dec 2011 01:27

[PATCH v2 1/5] write_pack_header(): a helper function

Factor a small piece of logic out of the private write_pack_file() function
in builtin/pack-objects.c.

Signed-off-by: Junio C Hamano <gitster <at> pobox.com>
---
 builtin/pack-objects.c |    9 +++------
 pack-write.c           |   12 ++++++++++++
 pack.h                 |    2 ++
 3 files changed, 17 insertions(+), 6 deletions(-)
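
For readers without the full diff: the hdr_* assignments removed in the hunk
below build a 12-byte header (signature, version, object count, each in
network byte order). A minimal sketch of such a helper follows. It writes
into a plain buffer instead of a struct sha1file, and every sketch_ name is
made up for illustration; the real write_pack_header() signature is not
visible in this excerpt.

```c
#include <arpa/inet.h>  /* htonl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values used by git's pack format: the bytes "PACK" and version 2. */
#define SKETCH_PACK_SIGNATURE 0x5041434bu
#define SKETCH_PACK_VERSION   2u

struct sketch_pack_header {
	uint32_t hdr_signature;
	uint32_t hdr_version;
	uint32_t hdr_entries;
};

/*
 * Hypothetical stand-in for the factored-out helper: fill buf (at least
 * 12 bytes) with a pack header and return the number of bytes written.
 */
size_t sketch_write_pack_header(unsigned char *buf, uint32_t nr_entries)
{
	struct sketch_pack_header hdr;

	/* Three 32-bit fields, so no padding is expected. */
	assert(sizeof(hdr) == 12);

	hdr.hdr_signature = htonl(SKETCH_PACK_SIGNATURE);
	hdr.hdr_version = htonl(SKETCH_PACK_VERSION);
	hdr.hdr_entries = htonl(nr_entries);
	memcpy(buf, &hdr, sizeof(hdr));
	return sizeof(hdr);
}
```

A caller that does not yet know the final object count could write a
placeholder count first and rewrite the header later; that part is outside
this sketch.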

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ba3705d..6643c16 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -571,7 +571,6 @@ static void write_pack_file(void)
 	uint32_t i = 0, j;
 	struct sha1file *f;
 	off_t offset;
-	struct pack_header hdr;
 	uint32_t nr_remaining = nr_result;
 	time_t last_mtime = 0;
 	struct object_entry **write_order;
@@ -596,11 +595,9 @@ static void write_pack_file(void)
 			f = sha1fd(fd, pack_tmp_name);
 		}

-		hdr.hdr_signature = htonl(PACK_SIGNATURE);
-		hdr.hdr_version = htonl(PACK_VERSION);
-		hdr.hdr_entries = htonl(nr_remaining);
-		sha1write(f, &hdr, sizeof(hdr));

Junio C Hamano | 1 Dec 2011 01:27

[PATCH v2 5/5] bulk-checkin: replace fast-import based implementation

This extends the earlier approach to stream a large file directly from the
filesystem to its own packfile, and allows "git add" to send large files
directly into a single pack. Older code used to spawn fast-import, but the
new bulk-checkin API replaces it.

Signed-off-by: Junio C Hamano <gitster <at> pobox.com>
---
 Makefile               |    2 +
 builtin/add.c          |    5 +
 builtin/pack-objects.c |    6 +-
 bulk-checkin.c         |  275 ++++++++++++++++++++++++++++++++++++++++++++++++
 bulk-checkin.h         |   16 +++
 cache.h                |    2 +
 config.c               |    4 +
 environment.c          |    1 +
 sha1_file.c            |   67 +-----------
 t/t1050-large.sh       |   94 +++++++++++++++--
 zlib.c                 |    9 ++-
 11 files changed, 403 insertions(+), 78 deletions(-)
 create mode 100644 bulk-checkin.c
 create mode 100644 bulk-checkin.h

diff --git a/Makefile b/Makefile
index 3139c19..418dd2e 100644
--- a/Makefile
+++ b/Makefile
@@ -505,6 +505,7 @@ LIB_H += argv-array.h
 LIB_H += attr.h
 LIB_H += blob.h
 LIB_H += builtin.h

Junio C Hamano | 1 Dec 2011 01:27

[PATCH v2 2/5] create_tmp_packfile(): a helper function

Factor a small piece of logic out of the private write_pack_file() function
in builtin/pack-objects.c.

Signed-off-by: Junio C Hamano <gitster <at> pobox.com>
---
 builtin/pack-objects.c |   12 +++---------
 pack-write.c           |   10 ++++++++++
 pack.h                 |    3 +++
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 6643c16..3258fa9 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -584,16 +584,10 @@ static void write_pack_file(void)
 		unsigned char sha1[20];
 		char *pack_tmp_name = NULL;

-		if (pack_to_stdout) {
+		if (pack_to_stdout)
 			f = sha1fd_throughput(1, "<stdout>", progress_state);
-		} else {
-			char tmpname[PATH_MAX];
-			int fd;
-			fd = odb_mkstemp(tmpname, sizeof(tmpname),
-					 "pack/tmp_pack_XXXXXX");
-			pack_tmp_name = xstrdup(tmpname);
-			f = sha1fd(fd, pack_tmp_name);
-		}
+		else

Junio C Hamano | 1 Dec 2011 01:27

[PATCH v2 3/5] finish_tmp_packfile(): a helper function

Factor a small piece of logic out of the private write_pack_file() function
in builtin/pack-objects.c.

This changes the order of finishing multi-pack generation slightly. The
code used to

 - adjust shared perm of temporary packfile
 - rename temporary packfile to the final name
 - update mtime of the packfile under the final name
 - adjust shared perm of temporary idxfile
 - rename temporary idxfile to the final name

but because the helper does not want to do the mtime thing, the updated
code does that step first and then all the rest.

Signed-off-by: Junio C Hamano <gitster <at> pobox.com>
---
 builtin/pack-objects.c |   33 ++++++++++-----------------------
 pack-write.c           |   31 +++++++++++++++++++++++++++++++
 pack.h                 |    1 +
 3 files changed, 42 insertions(+), 23 deletions(-)
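
The reordering described in the log message can be sketched with plain POSIX
calls. This is only an illustration under assumptions: sketch_finish_pack()
and its parameters are hypothetical, chmod(0444) stands in for "adjust shared
perm", and the mtime update is done unconditionally here for simplicity.

```c
#include <assert.h>
#include <stdio.h>      /* rename */
#include <sys/stat.h>   /* chmod, stat */
#include <sys/types.h>
#include <time.h>
#include <utime.h>      /* utime, struct utimbuf */

/*
 * Hypothetical helper: publish a temporary pack and its index under
 * their final names, using the reordered sequence. Returns 0 on success.
 */
int sketch_finish_pack(const char *tmp_pack, const char *final_pack,
		       const char *tmp_idx, const char *final_idx,
		       time_t mtime)
{
	struct utimbuf utb;

	assert(tmp_pack && final_pack && tmp_idx && final_idx);

	/* Moved to the front: set the mtime while the pack is still temporary. */
	utb.actime = mtime;
	utb.modtime = mtime;
	if (utime(tmp_pack, &utb))
		return -1;

	/* The remaining four steps, which a shared helper can own. */
	if (chmod(tmp_pack, 0444))
		return -1;
	if (rename(tmp_pack, final_pack))
		return -1;
	if (chmod(tmp_idx, 0444))
		return -1;
	if (rename(tmp_idx, final_idx))
		return -1;
	return 0;
}
```

Because rename() keeps the file's timestamps, setting the mtime before the
rename gives the same result for the final packfile as setting it afterwards.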

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 3258fa9..b458b6d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -617,20 +617,8 @@ static void write_pack_file(void)

 		if (!pack_to_stdout) {
 			struct stat st;

Junio C Hamano | 1 Dec 2011 01:27

[PATCH v2 4/5] csum-file: introduce sha1file_checkpoint

It is useful to be able to rewind a check-summed file to a certain
previous state after writing data into it using the sha1write() API. The
fast-import command does this after streaming blob data to the packfile
being generated and then noticing that the same blob has already been
written, and it does this with a private function, truncate_pack(), that is
commented as "Yes, this is a layering violation".

Introduce two API functions: sha1file_checkpoint(), which allows the caller
to save the state of a sha1file, and sha1file_truncate(), which later reverts
it to the saved state. Use them to reimplement truncate_pack().

Signed-off-by: Junio C Hamano <gitster <at> pobox.com>
---
 csum-file.c   |   20 ++++++++++++++++++++
 csum-file.h   |    9 +++++++++
 fast-import.c |   25 ++++++++-----------------
 3 files changed, 37 insertions(+), 17 deletions(-)
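
The save-then-rewind idea from the log message can be sketched on a toy
check-summed file. Assumptions are loud here: FNV-1a stands in for the SHA-1
context, writes are unbuffered (so no flush is needed before saving), and all
sketch_ names are hypothetical rather than the csum-file API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>     /* mkstemp */
#include <sys/types.h>
#include <unistd.h>     /* write, lseek, ftruncate, close, unlink */

/* Toy check-summed file; csum plays the role of the hash context. */
struct sketch_csum_file {
	int fd;
	off_t total;     /* bytes written so far */
	uint32_t csum;   /* running checksum of those bytes */
};

struct sketch_checkpoint {
	off_t offset;
	uint32_t csum;
};

void sketch_init(struct sketch_csum_file *f, int fd)
{
	f->fd = fd;
	f->total = 0;
	f->csum = 2166136261u;  /* FNV-1a offset basis */
}

int sketch_write(struct sketch_csum_file *f, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	if (write(f->fd, buf, len) != (ssize_t)len)
		return -1;
	for (i = 0; i < len; i++)
		f->csum = (f->csum ^ p[i]) * 16777619u;  /* FNV prime */
	f->total += len;
	return 0;
}

/* Remember where we are and what the checksum state is. */
void sketch_save(const struct sketch_csum_file *f, struct sketch_checkpoint *cp)
{
	cp->offset = f->total;
	cp->csum = f->csum;
}

/* Drop everything written after the checkpoint, on disk and in the state. */
int sketch_rewind(struct sketch_csum_file *f, const struct sketch_checkpoint *cp)
{
	if (ftruncate(f->fd, cp->offset) ||
	    lseek(f->fd, cp->offset, SEEK_SET) != cp->offset)
		return -1;
	f->total = cp->offset;
	f->csum = cp->csum;
	return 0;
}

/* Usage: write a header, stream a blob, then discard the blob again. */
int sketch_demo(void)
{
	char path[] = "/tmp/sketch_csum_XXXXXX";
	struct sketch_csum_file f;
	struct sketch_checkpoint cp;
	uint32_t before;
	int fd = mkstemp(path);
	int ok;

	if (fd < 0)
		return -1;
	sketch_init(&f, fd);
	ok = !sketch_write(&f, "header", 6);
	sketch_save(&f, &cp);
	before = f.csum;
	ok = ok && !sketch_write(&f, "duplicate blob", 14);
	ok = ok && !sketch_rewind(&f, &cp) && f.total == 6 &&
	     f.csum == before && lseek(fd, 0, SEEK_END) == 6;
	close(fd);
	unlink(path);
	return ok ? 0 : -1;
}
```

The point of saving the checksum state along with the offset is that
truncating the file alone would leave the running hash covering bytes that
are no longer there.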

diff --git a/csum-file.c b/csum-file.c
index fc97d6e..53f5375 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -158,6 +158,26 @@ struct sha1file *sha1fd_throughput(int fd, const char *name, struct progress *tp
 	return f;
 }

+void sha1file_checkpoint(struct sha1file *f, struct sha1file_checkpoint *checkpoint)
+{
+	sha1flush(f);
+	checkpoint->offset = f->total;

Vitor Antunes | 1 Dec 2011 01:33

Re: [PATCHv2 0/4] git-p4: small fixes to branches and labels; tests

On Wed, Nov 30, 2011 at 10:58 PM, Pete Wyckoff <pw <at> padd.com> wrote:
> This is another fundamental disconnect between p4 and git.
> Reading
>
> http://www.perforce.com/perforce/doc.current/manuals/p4guide/07_labels.html
>
> it is clear that labels are supposed to be used exactly where
> tags cannot:  to specify a collection of files as they existed
> at _different_ points in the commit history.

Check the "Use Tag Fixup Branches" section in the fast-import manual; it
might help with this. The basic concept is to create a special branch
that puts all files in the same state the P4 label would put them in and
then tag it in git.

I tried to use this for my branch work, but without success.

> Thus I think supporting labels is kind of pointless.  But in the
> restricted use case that perforce docs tell us not to do, namely
> using labels to identify change numbers, git can reflect that
> with tags.

I still use labels as simple tags. Saying that we should use
changelists instead of labels is like saying that we should use
IP addresses instead of host names. It works, but I doubt you will
ever remember one unless you write it down somewhere.

-- 
Vitor Antunes

Vitor Antunes | 1 Dec 2011 01:37

Re: [PATCHv2 0/4] git-p4: small fixes to branches and labels; tests

On Wed, Nov 30, 2011 at 11:00 PM, Pete Wyckoff <pw <at> padd.com> wrote:
> And avoids collision with some Vitor code that will get
> added eventually.

I'm starting to doubt I will ever be able to overcome the fast-import
limitation of not allowing branch deletion. Sure, the code I wrote was
garbage! But they seem to be very reluctant about the concept of deleting
branches on the fly.
Did you ever take a look at the patch I sent? Maybe you could help me
shape it up a bit.

--

-- 
Vitor Antunes

Bill Zaumen | 1 Dec 2011 01:41

Re: [PATCH] Implement fast hash-collision detection

On Wed, 2011-11-30 at 01:25 -0500, Jeff King wrote:
> On Tue, Nov 29, 2011 at 01:56:28PM -0800, Bill Zaumen wrote:
> But I
> think the important attacks bypass your CRC anyway. Consider this attack
> scenario:
> 
>   1. Linus signs a tag (or a commit) and pushes it to kernel.org.
> 
>   2. kernel.org gets hacked, and the attacker replaces an object with
>      an evil colliding version[1].
> 
>   3. I clone from kernel.org, and run "git tag --verify". Git says it's
>      OK, because the signature checks out, but I have a bogus object.
> 
> How does your CRC help? If I understand your scheme correctly,
> kernel.org will have told me the CRC of all of the objects during the
> clone. But that isn't part of what Linus signed, so the attacker in step
> 2 could just as easily have overwritten kernel.org's crc file, and the
> signature will remain valid.

First, there is a misconception - the server will not tell you the CRC.
The CRC will be computed locally by the client instead.  

Aside from that, suppose the attacker does what you suggest (providing
a valid CRC so that git commands like verify-pack don't have an error
to detect).  You can't tell that something is wrong, but Linus can,
the next time he does a fetch.  If he fetches, the server sends
some SHA-1 hashes and the client responds with 'have' or 'want' in a
reply.  If the client wants it, the client doesn't have a CRC; if the
client sends 'have', the CRCs are available, so those get sent.  The

Peter Williams | 1 Dec 2011 01:56

Re: [HELP] Adding git awareness to the darning patch management system.

On 30/11/11 17:22, Jeff King wrote:
> On Wed, Nov 30, 2011 at 12:17:22PM +1000, Peter Williams wrote:
>
>> 1. presenting the file tree of the sources being patched in a way
>> that makes sense to the user including the current status of files
>> from the point of view of the underlying SCM (in this case, git), and
>
> I'm not exactly sure what this means.

If you look at the screenshots at sourceforge (which were produced on 
top of a Mercurial repo) you'll notice that file names in the leftmost 
tree have letters in front of them and appear in different foreground 
colours.  These letters are the same as those returned by Mercurial's 
status command and, hence, give a Mercurial user an easy to understand 
snapshot of the status of the files in the playground.  The colour 
coding is (relatively) arbitrary (and chosen by me) and is intended to 
make it easier to detect the different file statuses.

My main problem is that I can't find a git file status command (and 
there are a lot of them to choose from) that gives a snapshot of the 
statuses of all files in a directory (including those not tracked or 
ignored).  A secondary problem is that, if I could cobble together 
statuses from various commands, mapping git statuses to the Mercurial 
ones for display would not be a good solution as they would not 
necessarily make sense to a git user.  (It's fairly clear to me from my 
inability to make sense of git's CLI that git users think differently to 
me, a Mercurial user, and it's unlikely that I can, without help, make a 
file tree display that makes sense to a git user.)

>