Dana How | 1 May 01:16 2007

[PATCH 0/8] git-repack --max-pack-size


The three most common ways of making large packfiles
are git-fast-import, the first git-repack, or git-repack -a.
The first already supports a "--max-pack-size=N" option,
which limits the resulting packfiles to N megabytes.
This patchset adds the same option, with the same
behavior, to git-repack to handle the other two cases.

This iteration fixes miscellaneous issues discussed on the list
and introduces no behavior not seen elsewhere in git.
It is based on "next" in order to incorporate Nicolas Pitre's work.
Dana How | 1 May 01:17 2007

[PATCH 1/8] git-repack --max-pack-size: add new file statics


This adds "pack_size_limit", which will contain the limit
specified by --max-pack-size, "written_list", the actual
list of objects written to the current pack, and "nr_written",
the number of objects in written_list.

Signed-off-by: Dana L. How <danahow <at> gmail.com>
---
 builtin-pack-objects.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index b827627..ac2c15e 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -52,7 +52,8 @@ struct object_entry {
  * nice "minimum seek" order.
  */
 static struct object_entry *objects;
-static uint32_t nr_objects, nr_alloc, nr_result;
+static struct object_entry **written_list;
+static uint32_t nr_objects, nr_alloc, nr_result, nr_written;

 static int non_empty;
 static int no_reuse_delta;
@@ -64,6 +65,7 @@ static char tmpname[PATH_MAX];
 static unsigned char pack_file_sha1[20];
 static int progress = 1;
 static int window = 10;
+static uint32_t pack_size_limit;

Dana How | 1 May 01:19 2007

[PATCH 2/8] git-repack --max-pack-size: code restructuring


Move write_index_file() call from cmd_pack_objects() to
write_pack_file() since only the latter will know how
many times to call write_index_file().  Add forward
declarations and make "base_name" file scope again.

Signed-off-by: Dana L. How <danahow <at> gmail.com>
---
 builtin-pack-objects.c |   64 ++++++++++++++++++++++++-----------------------
 1 files changed, 33 insertions(+), 31 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index ac2c15e..bc45ca6 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -62,6 +62,7 @@ static int incremental;
 static int allow_ofs_delta;
 static const char *pack_tmp_name, *idx_tmp_name;
 static char tmpname[PATH_MAX];
+static const char *base_name;
 static unsigned char pack_file_sha1[20];
 static int progress = 1;
 static int window = 10;
@@ -561,7 +562,11 @@ static off_t write_one(struct sha1file *f,
 	return offset + size;
 }

-static off_t write_pack_file(void)
+/* forward declarations for write_pack_file */
+static void write_index_file(off_t last_obj_offset, unsigned char *sha1);

Junio C Hamano | 1 May 01:19 2007

Re: [PATCH 1/5] Introduces for_each_revision() helper

"Shawn O. Pearce" <spearce <at> spearce.org> writes:

> But in_merge_base is heavyweight if the two commits are in the
> same object database, but aren't connected at all.  You'll need
> to traverse both histories before aborting and saying there is
> no merge base.  That ain't cheap on large trees.  But it's also a
> single line of code.

Who said anything about "a single line of code"?  That is quite
different from heaviness hidden in a control structure
lookalike.

Dana How | 1 May 01:20 2007

[PATCH 3/8] git-repack --max-pack-size: make close optional in sha1close()


sha1close() flushes, writes the checksum, and closes the file.
The checksum write can already be suppressed; make the close
suppressible as well, so that a caller can request a flush only.

Signed-off-by: Dana L. How <danahow <at> gmail.com>
---
 csum-file.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/csum-file.c b/csum-file.c
index 7c806ad..993c899 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -35,7 +35,10 @@ int sha1close(struct sha1file *f, unsigned char *result, int update)
 	if (offset) {
 		SHA1_Update(&f->ctx, f->buffer, offset);
 		sha1flush(f, offset);
+		f->offset = 0;
 	}
+	if (update < 0)
+		return 0;	/* only want to flush (no checksum write, no close) */
 	SHA1_Final(f->buffer, &f->ctx);
 	if (result)
 		hashcpy(result, f->buffer);
-- 
1.5.2.rc0.766.gba60-dirty
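
The three steps and the new flush-only mode can be modeled in isolation.
What follows is a minimal sketch, not the csum-file.c code: "struct sink"
and "sink_close" are hypothetical names, the struct only records which
steps ran, and the meaning given to non-negative "update" values (write the
checksum when nonzero) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sha1file: it only records which steps ran. */
struct sink {
	size_t pending;        /* bytes buffered but not yet flushed */
	int flushed;
	int checksum_written;
	int closed;
};

/*
 * Models the flush / checksum / close sequence.
 *   update < 0  : flush only (no checksum write, no close)
 *   update == 0 : flush and close, checksum not written (assumed meaning)
 *   update > 0  : flush, write checksum, close (assumed meaning)
 */
static int sink_close(struct sink *s, int update)
{
	assert(s != NULL);
	if (s->pending) {
		s->flushed = 1;
		s->pending = 0;    /* mirrors the added "f->offset = 0" */
	}
	if (update < 0)
		return 0;          /* caller keeps the sink open for more writes */
	if (update)
		s->checksum_written = 1;
	s->closed = 1;
	return 0;
}
```

Resetting the pending count after the flush matters in the flush-only
case: the sink stays usable, and a later call must not flush the same
bytes a second time.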

Junio C Hamano | 1 May 01:20 2007

Re: What's cooking in git.git (topics)

Linus Torvalds <torvalds <at> linux-foundation.org> writes:

> On Sun, 29 Apr 2007, Junio C Hamano wrote:
>>
>> * lt/objalloc (Mon Apr 16 22:13:09 2007 -0700) 1 commit
>>  - Make the object lookup hash use a "object index" instead of a
>>    pointer
>
> I think you should just drop this. 

Yeah.  I was mostly concentrating on maint/master for the past
several days, and blindly carrying it around was cheaper than
deciding to drop it in my workflow.

Dana How | 1 May 01:21 2007

[PATCH 4/8] git-repack --max-pack-size: add fixup_header_footer()


Add a local version of fixup_header_footer(), modeled on the one in
fast-import.c.  It is needed later to correct the object count in the
header of a split pack.

Signed-off-by: Dana L. How <danahow <at> gmail.com>
---
 builtin-pack-objects.c |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index bc45ca6..98066bf 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -562,6 +562,42 @@ static off_t write_one(struct sha1file *f,
 	return offset + size;
 }

+static void fixup_header_footer(int pack_fd, unsigned char *pack_file_sha1,
+				char *pack_name, uint32_t object_count)
+{
+	static const int buf_sz = 128 * 1024;
+	SHA_CTX c;
+	struct pack_header hdr;
+	char *buf;
+
+	if (lseek(pack_fd, 0, SEEK_SET) != 0)
+		die("Failed seeking to start: %s", strerror(errno));
+	if (read_in_full(pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr))
+		die("Unable to reread header of %s", pack_name);
+	if (lseek(pack_fd, 0, SEEK_SET) != 0)
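
The header correction can be illustrated on an in-memory copy of a pack
header.  This is a minimal sketch under stated assumptions: a pack starts
with a 12-byte header (the signature "PACK", a version, and an object
count, each 4 bytes in network byte order), and the names
"pack_header_sketch", "patch_object_count" and "read_object_count" are
hypothetical.  Recomputing the trailing SHA-1, which the real function
also has to do, is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl, ntohl */

/* Layout of the 12-byte pack header; all fields are in network byte order. */
struct pack_header_sketch {
	uint32_t signature;   /* "PACK" */
	uint32_t version;
	uint32_t entries;     /* number of objects in the pack */
};

/*
 * Overwrite the object count in an in-memory copy of a pack header.
 * Returns 0 on success, -1 if the buffer is too short or does not
 * start with "PACK".
 */
static int patch_object_count(unsigned char *buf, size_t len, uint32_t count)
{
	struct pack_header_sketch hdr;

	assert(buf != NULL);
	if (len < sizeof(hdr) || memcmp(buf, "PACK", 4) != 0)
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));
	hdr.entries = htonl(count);
	memcpy(buf, &hdr, sizeof(hdr));
	return 0;
}

/* Read the object count back out of the header. */
static uint32_t read_object_count(const unsigned char *buf)
{
	uint32_t n;

	memcpy(&n, buf + 8, sizeof(n));
	return ntohl(n);
}
```

A split pack needs this because the header is written before it is known
how many objects will fit under the size limit.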

Dana How | 1 May 01:22 2007

[PATCH 5/8] git-repack --max-pack-size: write_object() takes "limit" arg


Accept a new "limit" argument and check against it before each
group of writes.  Update the delta usability rules to allow for
the delta base being in a previously written pack.  Inline
sha1write_compress() so that the exact size of the written data
is known when the data needs to be compressed.

Signed-off-by: Dana L. How <danahow <at> gmail.com>
---
 builtin-pack-objects.c |  110 +++++++++++++++++++++++++++++++++++++----------
 1 files changed, 86 insertions(+), 24 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 98066bf..d3ebe1d 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -399,12 +399,14 @@ static int revalidate_loose_object(struct object_entry *entry,
 }

 static unsigned long write_object(struct sha1file *f,
-				  struct object_entry *entry)
+				  struct object_entry *entry,
+				  unsigned long limit)
 {
 	unsigned long size;
 	enum object_type type;
 	void *buf;
 	unsigned char header[10];
+	unsigned char dheader[10];
 	unsigned hdrlen;
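
The "check before each group of writes" idea can be sketched as a size test
that runs before any byte is emitted.  This is only an illustration:
"checked_write_size" is a hypothetical helper, the exact comparison used
by the patch is not visible in the excerpt above, and treating a limit of
0 as "no limit" is taken from how write_one() calls write_object() in the
following patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: decide whether a header plus payload fits.
 * limit == 0 means "no limit".  A return value of 0 means "does not
 * fit, nothing should be written"; otherwise the byte count is returned.
 */
static unsigned long checked_write_size(unsigned long hdrlen,
					unsigned long datalen,
					unsigned long limit)
{
	unsigned long total = hdrlen + datalen;

	assert(total >= hdrlen);   /* guard against wrap-around */
	if (limit && total > limit)
		return 0;
	return total;
}
```

Knowing the exact compressed size up front is what makes this test
possible, which is presumably why the compression step is inlined.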

Dana How | 1 May 01:23 2007

[PATCH 6/8] git-repack --max-pack-size: write_one() implements limits


If --max-pack-size is specified, generate the appropriate
write limit for each object and pass it to write_object().
Detect a write "failure" (a return value of 0) and return it
to the caller.

Signed-off-by: Dana L. How <danahow <at> gmail.com>
---
 builtin-pack-objects.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index d3ebe1d..b50de05 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -612,11 +612,17 @@ static off_t write_one(struct sha1file *f,
 		return offset;

 	/* if we are deltified, write out base object first. */
-	if (e->delta)
+	if (e->delta) {
 		offset = write_one(f, e->delta, offset);
+		if (!offset)
+			return 0;
+	}

 	e->offset = offset;
-	size = write_object(f, e, 0);
+	/* pass in write limit if limited packsize and not first object */
+	size = write_object(f, e, pack_size_limit && nr_written ? pack_size_limit - offset : 0);
+	if (!size)
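
The limit expression passed to write_object() can be pulled out into a
small function to make its three cases explicit.  This is a sketch only:
"limit_for_next_object" is a hypothetical name, and the assertion that the
current offset is below the limit is an assumption of the sketch, not
something the patch checks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Remaining room for the next object, or 0 for "no limit".
 * No limit applies when --max-pack-size was not given, or when
 * nothing has been written to the current pack yet.
 */
static unsigned long limit_for_next_object(uint32_t pack_size_limit,
					   uint32_t nr_written,
					   unsigned long offset)
{
	if (!pack_size_limit || !nr_written)
		return 0;
	assert(offset < pack_size_limit);   /* assumption of this sketch */
	return pack_size_limit - offset;
}
```

Exempting the first object of a pack presumably guarantees progress: an
object larger than the limit would otherwise never fit in any pack.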

Dana How | 1 May 01:24 2007

[PATCH 7/8] git-repack --max-pack-size: split packs as asked by write_object/write_one


Rewrite write_pack_file() to start a new packfile
whenever write_object/write_one request it, and
correct the header's object count in the previous packfile.
Change write_index_file() to write an index
for just the objects in the most recent packfile.

Signed-off-by: Dana L. How <danahow <at> gmail.com>
---
 builtin-pack-objects.c |  161 ++++++++++++++++++++++++++----------------------
 1 files changed, 87 insertions(+), 74 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index b50de05..328b3cb 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -623,6 +623,7 @@ static off_t write_one(struct sha1file *f,
 	size = write_object(f, e, pack_size_limit && nr_written ? pack_size_limit - offset : 0);
 	if (!size)
 		return e->offset = 0;
+	written_list[nr_written++] = e;

 	/* make sure off_t is sufficiently large not to wrap */
 	if (offset > offset + size)
@@ -631,7 +632,7 @@ static off_t write_one(struct sha1file *f,
 }

 static void fixup_header_footer(int pack_fd, unsigned char *pack_file_sha1,
-				char *pack_name, uint32_t object_count)
+				const char *pack_name, uint32_t object_count)
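
The splitting loop can be sketched on plain object sizes.  This is a
simplified model, not the rewritten write_pack_file(): the name
"split_into_packs" is hypothetical, deltas and their bases are ignored,
the pack trailer is not counted, and the limit is taken in bytes.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_SIZE 12UL   /* size of the pack header */

/*
 * Distribute objects of the given sizes over packs no larger than
 * `limit` bytes (0 = unlimited).  counts[i] receives the number of
 * objects placed in pack i, which is the value a header fixup would
 * record.  Returns the number of packs used; 0 means there were no
 * objects or max_packs was too small.
 */
static size_t split_into_packs(const unsigned long *sizes, size_t n,
			       unsigned long limit,
			       size_t *counts, size_t max_packs)
{
	size_t packs = 0, i = 0;

	while (i < n) {
		unsigned long offset = HDR_SIZE;
		size_t written = 0;

		if (packs == max_packs)
			return 0;
		while (i < n) {
			/* the first object of a pack is always accepted */
			if (limit && written && offset + sizes[i] > limit)
				break;
			offset += sizes[i++];
			written++;
		}
		counts[packs++] = written;
	}
	return packs;
}
```

With sizes {40, 40, 40, 200, 10} and a limit of 100, this model produces
four packs holding 2, 1, 1 and 1 objects; the 200-byte object exceeds the
limit but still lands in a pack of its own.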

