In builtin.h, there exists the distinctly lib-ish function
prune_packed_objects(). This function can currently only be called by
built-in commands but, unlike all of the other functions in the header,
it does not make sense to impose this restriction as the functionality
can be logically reused in libgit.
Extract this function into prune-packed.c so that related definitions
can exist clearly in their own header file.
While we're at it, clean up #includes that are unused.
This patch is best viewed with --color-moved.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The for_each_loose_object() and for_each_packed_object()
functions are meant to be part of a unified interface: they
use the same set of for_each_object_flags, and it's not
inconceivable that we might one day add a single
for_each_object() wrapper around them.
Let's put them together in a single file, so we can avoid
awkwardness like saying "the flags for this function are
over in cache.h". Moving the loose functions to packfile.h
is silly. Moving the packed functions to cache.h works, but
makes the "cache.h is a kitchen sink" problem worse. The
best place is the recently-created object-store.h, since
these are quite obviously related to object storage.
The for_each_*_in_objdir() functions do not use the same
flags, but they are logically part of the same interface as
for_each_loose_object(), and share callback signatures. So
we'll move those, as well, as they also make sense in
object-store.h.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert this function to take a pointer to struct object_id and rename
it has_object_pack for consistency with has_object_file.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to expose the full power of the delayed progress API to the
callers, so that they can specify, not just the message to show and
expected total amount of work that is used to compute the percentage
of work performed so far, the percent-threshold parameter P and the
delay-seconds parameter N. The progress meter starts to show at N
seconds into the operation only if we have not yet completed P per-cent
of the total work.
Most callers used either (0%, 2s) or (50%, 1s) as (P, N), but there
are oddballs that chose more random-looking values like 95%.
For a smoother workload, (50%, 1s) would allow us to start showing
the progress meter earlier than (0%, 2s), while keeping the chance
of not showing progress meter for long running operation the same as
the latter. For a task that would take 2s or more to complete, it
is likely that less than half of it would complete within the first
second, if the workload is smooth. But for a spiky workload whose
earlier part is easier, such a setting is likely to fail to show the
progress meter entirely and (0%, 2s) is more appropriate.
But that is merely a theory. Realistically, it is of dubious value
to ask each codepath to carefully consider smoothness of their
workload and specify their own setting by passing two extra
parameters. Let's simplify the API by dropping both parameters and
have everybody use (0%, 2s).
Oh, by the way, the percent-threshold parameter and the structure
member were consistently misspelled, which also is now fixed ;-)
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Loose object subdirectories have hexadecimal names based on the first
byte of the hash of contained objects, thus their numerical
representation can range from 0 (0x00) to 255 (0xff). Change the type
of the corresponding variable in for_each_file_in_obj_subdir() and
associated callback functions to unsigned int and add a range check.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert each_loose_object_fn and each_packed_object_fn to take a pointer
to struct object_id. Update the various callbacks. Convert several
40-based constants to use GIT_SHA1_HEXSZ.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:
- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This saves us from manually traversing the directory
structure ourselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We form all of our directories in a strbuf, but never release it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A/very/long/path/to/.git that becomes exactly PATH_MAX bytes long
after suffixed with /objects/??/??38-hex??, would have overflown
the on-stack pathname[] buffer.
Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b60daf0 (Make git-prune-packed a bit more chatty. - 2007-01-12)
changes the meaning of prune_packed_objects()'s argument, from "dry
run or not dry run" to a bitmap.
It however forgot to update prune_packed_objects() caller in
builtin/prune.c to use new DRY_RUN macro. It's fine (for a long time!)
but there is a risk that someday someone may change the value of
DRY_RUN to something else and builtin/prune.c suddenly breaks. Avoid
that possibility.
While at there, change "opts == VERBOSE" to "opts & VERBOSE" as there
is no obvious reason why we only be chatty when DRY_RUN is not set.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both git-prune and git-repack (and thus, git-gc) try to rmdir while
holding a DIR* handle on the directory. This can leave dangling
empty directories in the .git/objects on platforms where directory
cannot be removed while they are open.
First call closedir() and then rmdir(); that is more logical ordering.
Reported-by: John Chen <john0312@gmail.com>
Reported-by: Stefan Naewe <stefan.naewe@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Improved-and-Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shrinks the top-level directory a bit, and makes it much more
pleasant to use auto-completion on the thing. Instead of
[torvalds@nehalem git]$ em buil<tab>
Display all 180 possibilities? (y or n)
[torvalds@nehalem git]$ em builtin-sh
builtin-shortlog.c builtin-show-branch.c builtin-show-ref.c
builtin-shortlog.o builtin-show-branch.o builtin-show-ref.o
[torvalds@nehalem git]$ em builtin-shor<tab>
builtin-shortlog.c builtin-shortlog.o
[torvalds@nehalem git]$ em builtin-shortlog.c
you get
[torvalds@nehalem git]$ em buil<tab> [type]
builtin/ builtin.h
[torvalds@nehalem git]$ em builtin [auto-completes to]
[torvalds@nehalem git]$ em builtin/sh<tab> [type]
shortlog.c shortlog.o show-branch.c show-branch.o show-ref.c show-ref.o
[torvalds@nehalem git]$ em builtin/sho [auto-completes to]
[torvalds@nehalem git]$ em builtin/shor<tab> [type]
shortlog.c shortlog.o
[torvalds@nehalem git]$ em builtin/shortlog.c
which doesn't seem all that different, but not having that annoying
break in "Display all 180 possibilities?" is quite a relief.
NOTE! If you do this in a clean tree (no object files etc), or using an
editor that has auto-completion rules that ignores '*.o' files, you
won't see that annoying 'Display all 180 possibilities?' message - it
will just show the choices instead. I think bash has some cut-off
around 100 choices or something.
So the reason I see this is that I'm using an odd editory, and thus
don't have the rules to cut down on auto-completion. But you can
simulate that by using 'ls' instead, or something similar.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This matches the behavior of other git programs, and helps
keep cruft out of things like cron job output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add long options for dry run and quiet to be more consistent with the
rest of git.
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps to notice when something's going wrong, especially on
systems which lock open files.
I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
called from such a function
- it is in a static function, returning void and the function is only
called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A progress indicator is used to count through the 256 object fan-out
directories while unused object files are removed. (However, it becomes
visible only if this process takes long enough.)
Previously, display_progress() was only called if object files were
actually removed. But if directories towards the end (fd/, fe/, ff/) did
not exist, this could leave a strange line
Removing duplicate objects: 99% (255/256), done.
in the terminal instead of the expected "100% (256/256)".
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the callers of this function except only one pass NULL to its last
parameter, ignore_packed.
Introduce has_sha1_kept_pack() function that has the function signature
and the semantics of this function, and convert the sole caller that does
not pass NULL to call this new function.
All other callers and has_sha1_pack() lose the ignore_packed parameter.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form. So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.
This patch makes git commands show the dash-less version.
For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the pack-files are now always created stably on disk, there is no
need to sync() before pruning lose objects or old stale pack-files.
[jc: with Nico's clean-up]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 0e54913796 so to return
to the same state as commit b5d72f0a4c.
On Wed, 31 Oct 2007, Shawn O. Pearce wrote:
> During my testing with a 40,000 loose object case (yea, I fully
> unpacked a git.git clone I had laying around) my system stalled
> hard in the first object directory. A *lot* longer than 1 second.
> So I got no progress meter for a long time, and then a progress
> meter appeared on the second directory.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since it is now OK to pass a null pointer to display_progress() and
stop_progress() resulting in a no-op, then we can simplify the code
and remove a bunch of lines by not making those calls conditional all
the time.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows for better management of progress "object" existence,
as well as making the progress display implementation more independent
from its callers.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The progress count is per fanout directory, so it is useless to call
it for every file as the count doesn't change that often.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than reimplementing the progress meter logic and always
showing 100 lines of output while pruning already packed objects
we now use a delayed progress meter and only show it if there are
enough objects to make us take a little while.
Most users won't see the message anymore as it usually doesn't take
very long to delete the already packed loose objects. This neatens
the output of a git-gc or git-repack execution, which is especially
important for a `git gc --auto` triggered from within another
command.
We perform the display_progress() call from within the very innermost
loop in case we spend more than 1 second within any single object
directory. This ensures that a progress_update event from the
timer will still trigger in a timely fashion and allow the user to
see the progress meter.
While I'm in here I changed the message to be more descriptive of
its actual task. "Removing unused objects" is a little scary for
new users as they wonder where these unused objects came from and
how they should avoid them. Truth is these objects aren't unused
in the sense of what git-prune would call a dangling object, these
are used but are just duplicates of things we have already stored
in a packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Steven Grimm noticed that git-repack's verbosity is inconsistent
because pack-objects is chatty and prune-packed is not. This
makes the latter a bit more chatty and gives -q option to
squelch it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The dryrun variable was made local instead of static by the previous
commit, and local variables aren't initialized to zero.
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Both the git-prune manpage and everday.txt say that git-prune should also prune
unpacked objects that are also found in packs, by running git prune-packed.
Junio thought this was "a regression when prune was rewritten as a built-in."
So modify prune to call prune-packed again.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Incremental repack without -a essentially boils down to:
rev-list --objects --unpacked --all |
pack-objects $new_pack
which picks up all loose objects that are still live and creates
a new pack.
This implements --unpacked=<existing pack> option to tell the
revision walking machinery to pretend as if objects in such a
pack are unpacked for the purpose of object listing. With this,
we could say:
rev-list --objects --unpacked=$active_pack --all |
pack-objects $new_pack
instead, to mean "all live loose objects but pretend as if
objects that are in this pack are also unpacked". The newly
created pack would be perfect for updating $active_pack by
replacing it.
Since pack-objects now knows how to do the rev-list's work
itself internally, you can also write the above example by:
pack-objects --unpacked=$active_pack --all $new_pack </dev/null
Signed-off-by: Junio C Hamano <junkio@cox.net>
These commands are converted to run from a subdirectory.
commit-tree convert-objects merge-base merge-index mktag
pack-objects pack-redundant prune-packed read-tree tar-tree
unpack-file unpack-objects update-server-info write-tree
Signed-off-by: Junio C Hamano <junkio@cox.net>
The git philosophy when it comes to disk accesses is "Laugh in the face of
danger".
Notably, since we never modify an existing object, we don't really care
that deeply about flushing things to disk, since even if the machine
crashes in the middle of a git operation, you can never really have lost
any old work. At most, you'd need to figure out the proper heads (which
git-fsck-objects can do for you) and re-do the operation.
However, there's two exceptions to this: pruning and repacking. Those
operations will actually _delete_ old objects that they know about in
other ways (ie that they just repacked, or that they have found in other
places).
However, since they actually modify old state, we should thus be a bit
more careful about them. If the machine crashes and the duplicate new
objects haven't been flushed to disk, you can actually be in trouble.
This is trivially stupid about it by calling "sync" before removing the
objects. Not very smart, but we're talking about special operations than
are usually done once a week if that.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the unoptimization. The previous round does not mind
missing fan-out directories, but still makes sure they exist, lest
older versions choke on a repository created/packed by it.
This round does not play that nicely anymore -- empty fan-out
directories are not created by init-db, and will stay removed by
prune-packed. The prune command also removes empty fan-out directories.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This will be removed when merging the second phase of Linus' "Create
object subdirectories on demand" change anyway, but the code to
recreate the empty .git/objects/??/ directory was confused.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes it possible to have a "sparse" git object subdirectory
structure, something that has become much more attractive now that people
use pack-files all the time.
As a result of pack-files, a git object directory doesn't necessarily have
any individual objects lying around, and in that case it's just wasting
space to keep the empty first-level object directories around: on many
filesystems the 256 empty directories will be aboue 1MB of diskspace.
Even more importantly, after you re-pack a project that _used_ to be
unpacked, you could be left with huge directories that no longer contain
anything, but that waste space and take time to look through.
With this change, "git prune-packed" can just do an rmdir() on the
directories, and they'll get removed if empty, and re-created on demand.
This patch also tries to fix up "write_sha1_from_fd()" to use the new
common infrastructure for creating the object files, closing a hole where
we might otherwise leave half-written objects in the object database.
[jc: I unoptimized the part that really removes the fan-out directories
to ease transition. init-db still wastes 1MB of diskspace to hold 256
empty fan-outs, and prune-packed rmdir()'s the grown but empty directories,
but runs mkdir() immediately after that -- reducing the saving from 150KB
to 146KB. These parts will be re-introduced when everybody has the
on-demand capability.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>