kernel/git - git - PowerEL Git System

Commit Graph

Author	SHA1	Message	Date
Shawn O. Pearce	dc49cd769b	Cast 64 bit off_t to 32 bit size_t Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Shawn O. Pearce	3a55602eec	General const correctness fixes We shouldn't attempt to assign constant strings into char*, as the string is not writable at runtime. Likewise we should always be treating unsigned values as unsigned values, not as signed values. Most of these are very straightforward. The only exception is the (unnecessary) xstrdup/free in builtin-branch.c for the detached head case. Since this is a user-level interactive type program and that particular code path is executed no more than once, I feel that the extra xstrdup call is well worth the easy elimination of this warning. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Nicolas Pitre	21666f1aae	convert object type handling from a string to a number We currently have two parallel notations for dealing with object types in the code: a string and a numerical value. One of them is obviously redundant, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	599065a3bb	prefixcmp(): fix-up mechanical conversion. Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	cc44c7655f	Mechanical conversion to use prefixcmp() This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) { s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|; } if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|; } and running: $ git grep -l strncmp -- '*.c' | xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Tommi Kyntola	f44213258d	git-blame: prevent argument parsing segfault The 3rd branch in builtin-blame.c should also check for lacking arguments. Running that in top dir does not trigger the problem because the 'prefix' is NULL. Signed-off-by: Tommi Kyntola <tommi.kyntola@ray.fi> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	870b39c15f	blame: --show-stats for easier optimization work. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	e68989a739	annotate: fix for cvsserver. git-cvsserver does not want the boundary commits shown any differently. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	06e75a7237	blame: document --contents option Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	005f85d9ae	Use pretend_sha1_file() in git-blame and git-merge-recursive. git-merge-recursive wants a null tree as the fake merge base while producing the merge result tree. The null tree does not have to be written out in the object store as it won't be part of the result, and it is a prime example for using the new pretend_sha1_file() function. git-blame needs to register arbitrary data in the in-core index while annotating a working tree file (or standard input), but git-blame is a read-only application and the user of it could even lack the privilege to write into the object store; it is another good example for pretend_sha1_file(). Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	1cfe77333f	git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Pavel Roskin	3dff5379bf	Assorted typo fixes Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	1732a1fd94	git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
René Scharfe	4f0219a4c7	git-blame --incremental: don't use pager Starting a pager defeats the purpose of the incremental output mode. This changes git-blame to only paginate if --incremental was not given. git -p blame --incremental still starts the pager, though. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	46e5e69d5f	git-blame --porcelain: quote filename in c-style when needed. Otherwise a pathname that has funny characters such as LF would screw up the parsing programs of the output. Strictly speaking, this is not backward compatible, but the current output for pathnames that have embedded LF and such cannot be sanely parsed anyway, and pathnames that only use characters from the portable pathname character set won't be affected. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Linus Torvalds	717d1462ba	git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	85023577a8	simplify inclusion of system header files. This is a mechanical clean-up of the way *.c files include system header files. (1) sources under compat/, platform sha-1 implementations, and xdelta code are exempt from the following rules; (2) the first #include must be "git-compat-util.h" or one of our own header file that includes it first (e.g. config.h, builtin.h, pkt-line.h); (3) system headers that are included in "git-compat-util.h" need not be included in individual C source files. (4) "git-compat-util.h" does not have to include subsystem specific header files (e.g. expat.h). Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	4c10a5caa7	blame: -b (blame.blankboundary) and --root (blame.showroot) When blame.blankboundary is set (or -b option is given), commit object names are blanked out in the "human readable" output format for boundary commits. When blame.showroot is not set (or --root is not given), the root commits are treated as boundary commits. The code still attributes the lines to them, but with -b their object names are not shown. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	b11121d9e3	git-blame: show lines attributed to boundary commits differently. When blaming with revision ranges, often many lines are attributed to different commits at the boundary, but they are not interesting for the purpose of finding project history during that revision range. This outputs the lines blamed on boundary commits differently. When showing "human format" output, their SHA-1 are shown with '^' prefixed. In "porcelain format", the commit will be shown with an extra attribute line "boundary". Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Alex Riesen	6bee4e408c	git-blame: fix rev parameter handling. We lacked "--" termination in the underlying init_revisions() call which made it impossible to specify a revision that happens to have the same name as an existing file. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	ab3bb800b4	git blame -C: fix output format tweaks when crossing file boundary. We used to get the case that more than two paths came from the same commit wrong when computing the output width and deciding to turn on --show-name option automatically. When we find that lines that came from a path that is different from what we started digging from, we should always turn --show-name on, and we should count the name length for all files involved. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	8eaf79869f	git-annotate: fix -S on graft file with comments. The graft file can contain comment lines and read_graft_line can return NULL for such an input, which should be skipped by the reader. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	acca687fa9	git-pickaxe: retire pickaxe Just make it take over blame's place. Documentation and command have all stopped mentioning "git-pickaxe". The built-in synonym is left in the command table, so you can still say "git pickaxe", but it probably is a good idea to retire it as well. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	7bd9641df5	git-pickaxe: allow "-L <something>,+N" With this, git pickaxe -L '/--progress/,+20' v1.4.0 -- pack-objects.c gives you 20 lines starting from the first occurrence of '--progress' in pack-objects, digging from v1.4.0 version. You can also say git pickaxe -L '/--progress/,-5' v1.4.0 -- pack-objects.c Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	931233bc66	git-pickaxe: -L /regexp/,/regexp/ With this change, you can specify the beginning and the ending line of the range you wish to inspect with pattern matching. For example, these are equivalent with the git.git sources: git pickaxe -L 7,21 v1.4.0 -- commit.c git pickaxe -L '/^struct sort_node/,/^}/' v1.4.0 -- commit.c git pickaxe -L '7,/^}/' v1.4.0 -- commit.c git pickaxe -L '/^struct sort_node/,21' v1.4.0 -- commit.c Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	c2e525d97f	git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it is the parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	854b97f6eb	git-pickaxe: fix origin refcounting When we introduced the cached origin per commit, we gave up proper garbage collecting because it meant that commits hold onto their cached copy. There is no need to do so. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	334947843c	git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	650e2f6752	git-pickaxe: re-scan the blob after making progress with -M Otherwise we would miss copied lines that are contained in the parts before or after the part that we find after splitting the blame_entry (i.e. split[0] and split[2]). Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	0421d9f812	git-pickaxe: simplify Octopus merges further If more than one parent in an Octopus merge has the same origin, ignore the later ones because they would not make any difference in the outcome. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	2f3f8b218a	git-pickaxe: rename detection optimization The idea is that we are interested in renaming into only one path, so we do not care about renames that happen elsewhere. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Jeff King	20239bae94	git-pickaxe: work properly in a subdirectory. We forgot to add prefix to the given path. [jc: interestingly enough, Jeff King had the same idea after I pushed mine out to "pu", and his patch was cleaner, so I dropped mine.] Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	0d981c67d8	git-pickaxe: cache one already found path per commit. Depending on how bushy the commit DAG is, this saves calls to the internal diff-tree for fork-point commits. For example, annotating Makefile in the kernel repository saves about a third of such diff-tree calls. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	f69e743d97	git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the version from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	ae86ad6575	git-pickaxe: tighten sanity checks. When compiled for debugging, make sure that refcnt sanity check code detects underflows in origin reference counting. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	f5f75c652b	git-pickaxe: refcount origin correctly in find_copy_in_parent() This makes "git-pickaxe -C master -- revision.c" to finish with proper refcounts for all origins. I am reasonably happy with it. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	2c40f98439	git-pickaxe: allow -Ln,m as well as -L n,m The command rejects -L1,10 as an invalid line range specifier and I got frustrated enough by it, so this makes it allow both forms of input. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	54a4c6173e	git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path, and as the code traverses down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	aec8fa1f58	git-pickaxe: swap comparison loop used for -C When assigning blames for code movements across file boundaries, we used to iterate over blame entries (i.e. groups of lines to be blamed) in the outer loop and compared each entry with paths in the parent commit in an inner loop. This meant that we opened the blob data from each path a number of times. Reorganize the loop so that we read the same path only once, and compare it against all relevant blame entries. This should perform better, but seems to give mixed results, though. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	f6c0e19102	git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	46014766bd	git-pickaxe: do not confuse two origins that are the same. It used to be that we could compare the address of the origin structure to determine if they are the same because they were always registered with scoreboard. After introduction of the loop to try finding the best split, that is not true anymore. The current code has rather serious leaks with origin structure, but more importantly it gets confused when two origins point at the same commit and same path. We might eventually have to refcount and gc origin, but let's fix the correctness issue first. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	612702e8ea	git-pickaxe: do not keep commit buffer. We need the commit buffer data while generating the final result, but until then we do not need them. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	4a0fc95f18	git-pickaxe: introduce heuristics to avoid "trivial" chunks This adds scoring logic to blame_entry to prevent blames on very trivial chunks (e.g. lots of empty lines, indent followed by a closing brace) from being passed down to unrelated lines in the parent. The current heuristics are quite simple and may need to be tweaked later, but we need to start somewhere. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	5ff62c3002	git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	1ca6ca876e	git-pickaxe: fix nth_line() We would want to be able to refer to the end of the file as "the beginning of Nth line" for a file that is N lines long. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	18abd745a0	git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	d24bba8008	git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent: A B C D E F G static foo() { ... } H; child: static foo() { ... } A B C D ... call foo(); E F G H. With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
Junio C Hamano	cee7f245dc	git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one path to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net>	18 years ago
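
The prefixcmp() conversion described in commits cc44c7655f and 599065a3bb can be illustrated with a small C sketch. It assumes prefixcmp(str, prefix) compares only the first strlen(prefix) bytes of str and returns 0 on a match; that definition and the three has_foo_prefix_* helper names are illustrative assumptions, not code taken from the commits. The sketch shows why the mechanical step emitted a negation (the arguments swap places, so the sign of the result flips) and why the negation can be dropped when the result is used only as a boolean.

```c
#include <string.h>

/*
 * Assumed definition for illustration: compare only the first
 * strlen(prefix) bytes of str. Returns 0 when str starts with prefix.
 */
int prefixcmp(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix));
}

/* Original idiom: literal first, with a length that must be kept in sync. */
int has_foo_prefix_old(const char *arg)
{
	return !strncmp("foo", arg, 3);
}

/*
 * Output of the mechanical step: the arguments are swapped, so the
 * script negates the result to preserve the sign of the comparison.
 */
int has_foo_prefix_mechanical(const char *arg)
{
	return !(-prefixcmp(arg, "foo"));
}

/* Hand-cleaned form: the sign is irrelevant when testing for zero. */
int has_foo_prefix_clean(const char *arg)
{
	return !prefixcmp(arg, "foo");
}
```

All three helpers agree for any input, because negating a value does not change whether it is zero. The cleaned form also removes the hand-counted length, which is the main maintenance hazard of the strncmp() idiom.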


81 Commits (7fe4a728a16cf4e873702f8478fa3e28e8ae89ce)
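
Commit dc49cd769b notes that xsize_t() should eventually detect wrapping and die on platforms where off_t is wider than size_t. A minimal sketch of such a check follows; the names fits_in_size_t() and xsize_t_checked(), the error text, and the exit code are hypothetical and are not taken from the commit.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: 1 if len is representable as size_t, else 0. */
int fits_in_size_t(off_t len)
{
	if (len < 0)
		return 0;
	/* Compare in the widest unsigned type so neither side is truncated. */
	return (uintmax_t)len <= (uintmax_t)SIZE_MAX;
}

/*
 * Hypothetical checked conversion: terminate instead of silently
 * wrapping when a 64-bit off_t does not fit in a 32-bit size_t.
 */
size_t xsize_t_checked(off_t len)
{
	if (!fits_in_size_t(len)) {
		fprintf(stderr, "fatal: file too large for this platform\n");
		exit(128);
	}
	return (size_t)len;
}
```

On a platform where sizeof(off_t) == sizeof(size_t), the only value the check rejects is a negative length; the overflow branch matters only where off_t is the wider type.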