On Tue, 14 Feb 2006, Junio C Hamano wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
>
> > If somebody is interested in making the "lots of filename changes" case go
> > fast, I'd be more than happy to walk them through what they'd need to
> > change. I'm just not horribly motivated to do it myself. Hint, hint.
>
> In case anybody is wondering, I share the same feeling. I
> cannot say I'd be "more than happy to" clean up potential
> breakages during the development of such changes, but if the
> change eventually would help certain use cases, I can be
> persuaded to help debugging such a mess ;-).
Actually, I got interested in seeing how hard this is, and wrote a simple
first cut at doing a tree-optimized merger.
Let me shout a bit first:
THIS IS WORKING CODE, BUT BE CAREFUL: IT'S A TECHNOLOGY DEMONSTRATION
RATHER THAN THE FINAL PRODUCT!
With that out of the way, let me descibe what this does (and then describe
the missing parts).
This is basically a three-way merge that works entirely on the "tree"
level, rather than on the index. A lot of the _concepts_ are the same,
though, and if you're familiar with the results of an index merge, some of
the output will make more sense.
You give it three trees: the base tree (tree 0), and the two branches to
be merged (tree 1 and tree 2 respectively). It will then walk these three
trees, and resolve them as it goes along.
The interesting part is:
- it can resolve whole sub-directories in one go, without actually even
looking recursively at them. A whole subdirectory will resolve the same
way as any individual files will (although that may need some
modification, see later).
- if it has a "content conflict", for subdirectories that means "try to
do a recursive tree merge", while for non-subdirectories it's just a
content conflict and we'll output the stage 1/2/3 information.
- a successful merge will output a single stage 0 ("merged") entry,
potentially for a whole subdirectory.
- it outputs all the resolve information on stdout, so something like the
recursive resolver can pretty easily parse it all.
Now, the caveats:
- we probably need to be more careful about subdirectory resolves. The
trivial case (both branches have the exact same subdirectory) is a
trivial resolve, but the other cases ("branch1 matches base, branch2 is
different" probably can't be silently just resolved to the "branch2"
subdirectory state, since it might involve renames into - or out of -
that subdirectory)
- we do not track the current index file at all, so this does not do the
"check that index matches branch1" logic that the three-way merge in
git-read-tree does. The theory is that we'd do a full three-way merge
(ignoring the index and working directory), and then to update the
working tree, we'd do a two-way "git-read-tree branch1->result"
- I didn't actually make it do all the trivial resolve cases that
git-read-tree does. It's a technology demonstration.
Finally (a more serious caveat):
- doing things through stdout may end up being so expensive that we'd
need to do something else. In particular, it's likely that I should
not actually output the "merge results", but instead output a "merge
results as they _differ_ from branch1"
However, I think this patch is already interesting enough that people who
are interested in merging trees might want to look at it. Please keep in
mind that tech _demo_ part, and in particular, keep in mind the final
"serious caveat" part.
In many ways, the really _interesting_ part of a merge is not the result,
but how it _changes_ the branch we're merging into. That's particularly
important as it should hopefully also mean that the output size for any
reasonable case is minimal (and tracks what we actually need to do to the
current state to create the final result).
The code very much is organized so that doing the result as a "diff
against branch1" should be quite easy/possible. I was actually going to do
it, but I decided that it probably makes the output harder to read. I
dunno.
Anyway, let's think about this kind of approach.. Note how the code itself
is actually quite small and short, although it's prbably pretty "dense".
As an interesting test-case, I'd suggest this merge in the kernel:
git-merge-tree $(git-merge-base 4cbf876 7d2babc) 4cbf876 7d2babc
which resolves beautifully (there are no actual file-level conflicts), and
you can look at the output of that command to start thinking about what
it does.
The interesting part (perhaps) is that timing that command for me shows
that it takes all of 0.004 seconds.. (the git-merge-base thing takes
considerably more ;)
The point is, we _can_ do the actual merge part really really quickly.
Linus
PS. Final note: when I say that it is "WORKING CODE", that is obviously by
my standards. IOW, I tested it once and it gave reasonable results - so it
must be perfect.
Whether it works for anybody else, or indeed for any other test-case, is
not my problem ;)
Signed-off-by: Junio C Hamano <junkio@cox.net>
If Git is compiled with NO_CURL=YesPlease and one tries to
clone a http repository, git-clone tries to call the curl
binary. This trivial patch prints an error instead in such
situation.
Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With the current Makefile we don't use the shell chosen by the
platform specific defines when we invoke GIT-VERSION-GEN.
Signed-off-by: Fredrik Kuivinen <freku045@student.liu.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This revamps the git-status command to take the same set of
parameters as git commit. It gives a preview of what is being
committed with that command. With -v flag, it shows the diff
output between the HEAD commit and the index that would be
committed if these flags were given to git-commit command.
git-commit also acquires -v flag (it used to mean "verify" but
that is the default anyway and there is --no-verify to turn it
off, so not much is lost), which uses the updated git-status -v
to seed the commit log buffer. This is handy for writing a log
message while reviewing the changes one last time.
Now, git-commit and git-status are internally share the same
implementation.
Unlike previous git-commit change, this uses a temporary index
to prepare the index file that would become the real index file
after a successful commit, and moves it to the real index file
once the commit is actually made. This makes it safer than the
previous scheme, which stashed away the original index file and
restored it after an aborted commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
In a workflow that employs relatively long lived topic branches,
the developer sometimes needs to resolve the same conflict over
and over again until the topic branches are done (either merged
to the "release" branch, or sent out and accepted upstream).
This commit introduces a new command, "git rerere", to help this
process by recording the conflicted automerge results and
corresponding hand-resolve results on the initial manual merge,
and later by noticing the same conflicted automerge and applying
the previously recorded hand resolution using three-way merge.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is essentially 'git whatchanged -n1 --always --cc "$@"'.
Just like whatchanged takes default flags from
whatchanged.difftree configuration, this uses show.difftree
configuration.
Signed-off-by: Junio C Hamano <junkio@cox.net>
A new option '-c' to diff-tree changes the way a merge commit is
displayed when generating a patch output. It shows a "combined
diff" (hence the option letter 'c'), which looks like this:
$ git-diff-tree --pretty -c -p fec9ebf1 | head -n 18
diff-tree fec9ebf... (from parents)
Merge: 0620db3... 8a263ae...
Author: Junio C Hamano <junkio@cox.net>
Date: Sun Jan 15 22:25:35 2006 -0800
Merge fixes up to GIT 1.1.3
diff --combined describe.c
@@@ +98,7 @@@
return (a_date > b_date) ? -1 : (a_date == b_date) ? 0 : 1;
}
- static void describe(char *arg)
- static void describe(struct commit *cmit, int last_one)
++ static void describe(char *arg, int last_one)
{
+ unsigned char sha1[20];
+ struct commit *cmit;
There are a few things to note about this feature:
- The '-c' option implies '-p'. It also implies '-m' halfway
in the sense that "interesting" merges are shown, but not all
merges.
- When a blob matches one of the parents, we do not show a diff
for that path at all. For a merge commit, this option shows
paths with real file-level merge (aka "interesting things").
- As a concequence of the above, an "uninteresting" merge is
not shown at all. You can use '-m' in addition to '-c' to
show the commit log for such a merge, but there will be no
combined diff output.
- Unlike "gitk", the output is monochrome.
A '-' character in the nth column means the line is from the nth
parent and does not appear in the merge result (i.e. removed
from that parent's version).
A '+' character in the nth column means the line appears in the
merge result, and the nth parent does not have that line
(i.e. added by the merge itself or inherited from another
parent).
The above example output shows that the function signature was
changed from either parents (hence two "-" lines and a "++"
line), and "unsigned char sha1[20]", prefixed by a " +", was
inherited from the first parent.
The code as sent to the list was buggy in few corner cases,
which I have fixed since then.
It does not bother to keep track of and show the line numbers
from parent commits, which it probably should.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Needs iconv and third party lib/headers are inside /usr/local
Signed-off-by: Alecs King <alecsk@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Alas, not all shells named sh are capable enough to run
GIT-VERSION-GEN.
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The recent Cygwin defines DT_UNKNOWN although it does not have d_type
in struct dirent. Give an option to tell us not to use d_type on such
platforms. Hopefully this problem will be transient.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The d_ino field is only used for performance reasons in
fsck-objects. On a typical filesystem, i-number tends to have a
strong correlation with where the actual bits sit on the disk
platter, and we sort the entries to allow us scan things that
ought to be close together together.
If the platform lacks support for it, it is not a big deal.
Just do not use d_ino for sorting, and scan them unsorted.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Disable USE_SYMLINK_HEAD by default. Recommend using it only for
compatibility with older software.
Treat USE_SYMLINK_HEAD like other optional defines - check whether it's
defined, not its value.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The earlier change to separate $(gitexecdir) from $(bindir) had
the installation location of the git wrapper and the rest of the
commands the wrong way (right now, both of them point at the
same location so there is no real harm).
Also gitk needs to be installed in $(bindir).
Signed-off-by: Junio C Hamano <junkio@cox.net>
The git suite may not be in PATH (and thus programs such as
git-send-pack could not exec git-rev-list). Thus there is a need for
logic that will locate these programs. Modifying PATH is not
desirable as it result in behavior differing from the user's
intentions, as we may end up prepending "/usr/bin" to PATH.
- git C programs will use exec*_git_cmd() APIs to exec sub-commands.
- exec*_git_cmd() will execute a git program by searching for it in
the following directories:
1. --exec-path (as used by "git")
2. The GIT_EXEC_PATH environment variable.
3. $(gitexecdir) as set in Makefile (default value $(bindir)).
- git wrapper will modify PATH as before to enable shell scripts to
invoke "git-foo" commands.
Ideally, shell scripts should use the git wrapper to become independent
of PATH, and then modifying PATH will not be necessary.
[jc: with minor updates after a brief review.]
Signed-off-by: Michal Ostrowski <mostrows@watson.ibm.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is not invoked by any other target (most notably, "make
install" does not), but is provided as a convenience for people
who are building from the source.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When producing a release tarball, include a "version" file, which
GIT-VERSION-GEN can then use to do the right thing when building from a
tarball.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The official maintainer is keeping up-to-date quite well, and now
the older Debian is supported with backports.org, there is no reason
for me to keep debian/ directory around here.
I have not been building and publishing debs since 1.0.4 anyway.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Note: with this commit, the GIT maintainer workflow must change.
GIT-VERSION-GEN is now the file to munge when the default
version needs to be changed, not Makefile. The tag needs to be
pushed into the repository to build the official tarball and
binary package beforehand.
Signed-off-by: Junio C Hamano <junkio@cox.net>
It shows you the most recent tag that is reachable from a particular
commit is.
Maybe this is something that "git-name-rev" should be taught to do,
instead of having a separate command for it. Regardless, I find it useful.
What it does is to take any random commit, and "name" it by looking up the
most recent commit that is tagged and reachable from that commit. If the
match is exact, it will just print out that ref-name directly. Otherwise
it will print out the ref-name, followed by the 8-character "short SHA".
IOW, with something like Junios current tree, I get:
[torvalds@g5 git]$ git-describe parent
refs/tags/v1.0.4-g2414721b
ie the current head of my "parent" branch (ie Junio) is based on v1.0.4,
but since it has a few commits on top of that, it has added the git hash
of the thing to the end: "-g" + 8-char shorthand for the commit
2414721b19.
Doing a "git-describe" on a tag-name will just show the full tag path:
[torvalds@g5 git]$ git-describe v1.0.4
refs/tags/v1.0.4
unless there are _other_ tags pointing to that commit, in which case it
will just choose one at random.
This is useful for two things:
- automatic version naming in Makefiles, for example. We could use it in
git itself: when doing "git --version", we could use this to give a
much more useful description of exactly what version was installed.
- for any random commit (say, you use "gitk <pathname>" or
"git-whatchanged" to look at what has changed in some file), you can
figure out what the last version of the repo was. Ie, say I find a bug
in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do:
[torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6
refs/tags/v2.6.14-rc4-g39ca371c
and I now know that it was _not_ in v2.6.14-rc4, but was presumably in
v2.6.14-rc5.
The latter is useful when you want to see what "version timeframe" a
commit happened in.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We want to record the version of the tools the patch was generated with.
While these tools could be rebuilt, git-format-patch stayed the same and
report the wrong version.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
HPA suggests it is simply silly to imitate Linux versioning
scheme where the leading "2" does not mean anything anymore, and
I tend to agree.
The first feature release after 1.0.0 will be 1.1.0, and the
development path leading to 1.1.0 will carry 1.0.GIT as the
version number from now on. Similarly, the third maintenance
release that follows 1.0.0 will not be 1.0.0c as planned, but
will be called 1.0.3. The "maint" branch will merge in fixes
and immediately tagged, so there is no need for 1.0.2.GIT that
is in between 1.0.2 (aka 1.0.0b) and 1.0.3.
Signed-off-by: Junio C Hamano <junkio@cox.net>
- Avoid misleading success message on error (Johannes)
- objects/info/packs: work around bug in http-fetch.c::fetch_indices()
- http-fetch.c: fix objects/info/pack parsing.
- An off-by-one bug found by valgrind (Pavel)
Signed-off-by: Junio C Hamano <junkio@cox.net>
We still advertise "git resolve" as a standalone command, but never
"git octopus", so nobody should be using it and it is safe to
retire it. The functionality is still available as a strategy
backend.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Split out the functions that deal with the socketpair after
finishing git protocol handshake to receive the packed data into
a separate file, and use it in fetch-pack to keep/explode the
received pack data. We earlier had something like that on
clone-pack side once, but the list discussion resulted in the
decision that it makes sense to always keep the pack for
clone-pack, so unpacking option is not enabled on the clone-pack
side, but we later still could do so easily if we wanted to with
this change.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Added an AIX clause in the Makefile; that clause likely
will be wrong for any AIX pre-5.2, but I can only test
on 5.3. mailinfo.c was missing the compat header file,
and convert-objects.c needs to define a specific
_XOPEN_SOURCE as well as _XOPEN_SOURCE_EXTENDED.
Signed-off-by: E. Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This attempts to clean up the way various compatibility
functions are defined and used.
- A new header file, git-compat-util.h, is introduced. This
looks at various NO_XXX and does necessary function name
replacements, equivalent of -Dstrcasestr=gitstrcasestr in the
Makefile.
- Those function name replacements are removed from the Makefile.
- Common features such as usage(), die(), xmalloc() are moved
from cache.h to git-compat-util.h; cache.h includes
git-compat-util.h itself.
Signed-off-by: Junio C Hamano <junkio@cox.net>
There is no setenv() in Solaris 5.8. The trivial calls to
setenv() were replaced by putenv() in a much earlier patch,
but setenv() was used again in git.c. This patch just adds
a compat/setenv.c.
The rule for building git$(X) also needs to include compat.
objects and compiler flags. Those are now in makefile vars
COMPAT_OBJS and COMPAT_CFLAGS.
Signed-off-by: E. Jason Riedy <ejr@cs.berkeley.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Alex Riesen wants to keep extra makefile targets in config.mak, but
the file is included before any of our real targets. Having this
at the beginning allows you to do so.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes it possible to define WITH_SEND_EMAIL etc. in config.mak.
Also remove GIT_LIST_TWEAK because it isn't used anymore.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Otherwise we would end up linking all the unneeded stuff into git-daemon
only to link with git_default_config.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Remove $(SIMPLE_PROGRAMS) from $(PROGRAMS) so buildrules don't have
to be overridden.
Put $(SCRIPTS) with the other target-macros so it doesn't get lonely.
Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Also rearrange some path settings in the Makefile in the process.
Signed-off-by: Ryan Anderson <ryan@michonline.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is meant for the end user, who cannot be expected to edit
.git/config by hand.
Example:
git-config-set core.filemode true
will set filemode in the section [core] to true,
git-config-set --unset core.filemode
will remove the entry (failing if it is not there), and
git-config-set --unset diff.twohead ^recar
will remove the unique entry whose value matches the regex "^recar"
(failing if there is no unique such entry).
It is just a light wrapper around git_config_set() and
git_config_set_multivar().
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The decision about whether to build http-push or not belongs in the
Makefile. This follows Junio's suggestion to determine whether curl
is new enough to support http-push.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Move shared HTTP request functionality out of http-fetch and http-push,
and replace the two fwrite_buffer/fwrite_buffer_dynamic functions with
one fwrite_buffer function that does dynamic buffering. Use slot
callbacks to process responses to fetch object transfer requests and
push transfer requests, and put all of http-push into an #ifdef check
for curl multi support.
Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When HPA added Cygwin target, it ran just fine without NO_MMAP for him,
but recently we are getting reports that for some people things break
without it. For now, just suggest it in the Makefile without actually
updating the default.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch provides a C implementation of the 'git' program and
introduces support for putting the git-* commands in a directory
of their own. It also saves some time on executing those commands
in a tight loop and it prints the currently available git commands
in a nicely formatted list.
The location of the GIT_EXEC_PATH (name discussion's closed, thank gods)
can be obtained by running
git --exec-path
which will hopefully give porcelainistas ample time to adapt their
heavy-duty loops to call the core programs directly and thus save
the extra fork() / execve() overhead, although that's not really
necessary any more.
The --exec-path value is prepended to $PATH, so the git-* programs
should Just Work without ever requiring any changes to how they call
other programs in the suite.
Some timing values for 10000 invocations of git-var >&/dev/null:
git.sh: 24.194s
git.c: 9.044s
git-var: 7.377s
The git-<tab><tab> behaviour can, along with the someday-to-be-deprecated
git-<command> form of invocation, be indefinitely retained by adding
the following line to one's .bash_profile or equivalent:
PATH=$PATH:$(git --exec-path)
Experimental libraries can be used by either setting the environment variable
GIT_EXEC_PATH, or by using
git --exec-path=/some/experimental/exec-path
Relative paths are properly grok'ed as exec-path values.
Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
A while ago, a rename-detection limit logic was implemented as a
response to this thread:
http://marc.theaimsgroup.com/?l=git&m=112413080630175
where gitweb was found to be using a lot of time and memory to
detect renames on huge commits. git-diff family takes -l<num>
flag, and if the number of paths that are rename destination
candidates (i.e. new paths with -M, or modified paths with -C)
are larger than that number, skips rename/copy detection even
when -M or -C is specified on the command line.
This commit makes the rename detection limit easier to use. You
can have:
[diff]
renamelimit = 30
in your .git/config file to specify the default rename detection
limit. You can override this from the command line; giving 0
means 'unlimited':
git diff -M -l0
We might want to change the default behaviour, when you do not
have the configuration, to limit it to say 20 paths or so. This
would also help the diffstat generation after a big 'git pull'.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch renames the tarball "git" rather than "git-core", and changes
the names of various packages from git-core-foo to git-foo. git-core is
still the true core package; an empty RPM package named "git" pulls in
ALL the git packages -- this makes updates work correctly, and allows
"yum install git" to do the obvious thing.
It also renames the git-(core-)tk package to gitk.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Stuffing -L flag and friends meant for the linking phase into
ALL_CFLAGS is not right; honor LDFLAGS and introduce ALL_LDFLAGS
to separate them out.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Because we use "lost-found" as the directory name to hold
dangling object names, it is confusing to call the command
git-lost+found, although it makes sense and is even cute ;-).
Signed-off-by: Junio C Hamano <junkio@cox.net>