There's a cross-site scripting problem in gitweb, where it will print
URLs generated by its href() helper without further quoting. This allows
an attacker to point a victim to a specially crafted gitweb URL and
inject arbitrary HTML into the resulting page (which the victim sees as
coming from gitweb).
The base of the URL comes from evaluate_uri(), which pulls the value of
$REQUEST_URI via the CGI module. It tries to strip off $PATH_INFO, but
fails to do so in some cases (including ones that contain special
characters, like "+"). Most of the uses of the URL end up being passed
to "$cgi->a(-href = href())", which will get quoted properly by the CGI
module. But in a few places, we output them ourselves as part of
manually-generated HTML, and whatever was in the original URL will
appear unquoted in the output.
Given that all of the nearby variables placed into this manual HTML
_are_ quoted, it seems like the authors assumed that these URLs would
not need quoting. So it's possible that the bug is actually in
evaluate_uri(), which should be doing a more careful job of stripping
$PATH_INFO. There's some discussion in a comment in that function, as
well as the commit message in 81d3fe9f48 (gitweb: fix wrong base URL
when non-root DirectoryIndex, 2009-02-15). But I'm not sure I understand
it.
Regardless, it's a good idea to quote these values at the point of
insertion into the HTML output:
1. Even if there is a bug in evaluate_uri(), this would give us
belt-and-suspenders protection.
2. evaluate_uri() is only handling the base. Some generated URLs will
also mention arbitrary refs or filenames in the repositories, and
these should be quoted anyway.
3. It should never _hurt_ to quote (and that's what all of the
$cgi->a() calls are doing already).
So there may be further work here, but this patch at least prevents the
XSS vulnerability, and shouldn't make anything worse.
The test here covers the calls in print_feed_meta(), but I manually
audited every call to href() to see how its output was used, and quoted
appropriately. Most of them are esc_attr(), as they're used in tag
attributes, but I used esc_html() when the URLs were printed bare. The
distinction is largely academic, as one is implemented as a wrapper for
the other.
Reported-by: NAKAYAMA DAISUKE <nakyamad@icloud.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Gitweb has several hard-coded 40 values throughout it to check for
values that are passed in or acquired from Git. To simplify the code,
introduce a regex variable that matches either exactly 40 or exactly 64
hex characters, and use this variable anywhere we would have previously
hard-coded a 40 in a regex.
Add some helper functions which allow us to write tighter regexes that
match exactly the number of hex characters we're expecting.
Similarly, switch the code that looks for deleted diffinfo information
to look for either 40 or 64 zeros, and update one piece of code to use
this function. Finally, when formatting a log line, allow an
abbreviated describe output to contain up to 64 characters.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since my d48b284183 ("perl: bump the required Perl version to 5.8 from
5.6.[21]", 2010-09-24), we've depended on 5.8, so there's no reason to
conditionally require Digest::MD5 anymore. It was released with perl
v5.7.3[1]
The initial introduction of the dependency in
e9fdd74e53 ("gitweb: (gr)avatar support", 2009-06-30) says as much,
this also undoes part of the later 2e9c8789b7 ("gitweb: Mention
optional Perl modules in INSTALL", 2011-02-04) since gitweb will
always be run on at least 5.8, so there's no need to mention
Digest::MD5 as a required module in the documentation, let's instead
say that we require perl 5.8.
1. $ corelist Digest::MD5
Data for 2015-02-14
Digest::MD5 was first released with perl v5.7.3
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 46a1385 (gitweb: skip unreadable subdirectories, 2017-07-18)
we forgot to handle non-unix ACLs as well. Fix this.
Signed-off-by: Guillaume Castagnino <casta@xwing.info>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For people that work with very large plain text files it may be easier
if one can bypass viewing the htmlized blob and instead click directly
to the raw file (rather then click through 'blob' and then to 'raw').
Signed-off-by: Job Snijders <job@instituut.net>
Reviewed-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitweb terminates and shows no project list, if it can not access a
sub-directory in the project root directory while looking for projects
to show.
Work it around by skipping unreadable directories.
Signed-off-by: Hielke Christian Braun <hcb@unco.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Sven Strickroth <email@cs-ware.de>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the log formatting function to know about "git describe" output
such as "v2.8.0-4-g867ad08", in addition to just plain "867ad08".
There are still many valid refnames that we don't link to
e.g. v2.10.0-rc1~2^2~1 is also a valid way to refer to
v2.8.0-4-g867ad08, but I'm not supporting that with this commit,
similarly it's trivially possible to create some refnames like
"æ/var-gf6727b0" or which won't be picked up by this regex.
There's surely room for improvement here, but I just wanted to address
the very common case of sticking "git describe" output into commit
messages without trying to link to all possible refnames, that's going
to be a rather futile exercise given that this is free text, and it
would be prohibitively expensive to look up whether the references in
question exist in our repository.
There was on-list discussion about how we could do better than this
patch. Junio suggested to update parse_commits() to call a new
"gitweb--helper" command which would pass each of the revision
candidates through "rev-parse --verify --quiet". That would cut down
on our false positives (e.g. we'll link to "deadbeef"), and also allow
us to be more aggressive in selecting candidate revisions.
That may be too expensive to work in practice, or it may
not. Investigating that would be a good follow-up to this patch.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the minimum length of an abbreviated object identifier in the
commit message gitweb tries to turn into link from 8 hexchars to 7.
This arbitrary minimum length of 8 was introduced in bfe2191 ("gitweb:
SHA-1 in commit log message links to "object" view", 2006-12-10), but
the default abbreviation length is 7, and has been for a long time.
It's still possible to reference SHA-1s down to 4 characters in length,
see v1.7.4-1-gdce9648's MINIMUM_ABBREV, but I can't see how to make
git actually produce that, so I doubt anyone is putting that into log
messages in practice, but people definitely do put 7 character SHA-1s
into log messages.
I think it's fairly dubious to link to things matching [0-9a-fA-F]
here as opposed to just [0-9a-f], that dates back to the initial
version of gitweb from 161332a ("first working version",
2005-08-07). Git will accept all-caps SHA-1s, but didn't ever produce
them as far as I can tell.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a typo'd MIME type in a comment. The Content-Type is
application/xhtml+xml, not application/xhtm+xml.
Fixes up code originally added in 53c4031 ("gitweb: Strip
non-printable characters from syntax highlighter output", 2011-09-16).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "highlight" binary can, in some cases, determine the language type
by the means of file contents, for example the shebang in the first line
for some scripting languages. Make use of this autodetection for files
which syntax is not known by gitweb. In that case, pass the blob
contents to "highlight --force"; the parameter is needed to make it
always generate HTML output (which includes HTML-escaping).
Although we now run highlight on files which do not end up highlighted,
performance is virtually unaffected because when we call highlight, it
is used for escaping HTML. In the case that highlight is used, gitweb
calls sanitize() instead of esc_html(), and the latter is significantly
slower (it does more, being roughly a superset of sanitize()). Simple
benchmark comparing performance of 'blob' view of files without syntax
highlighting in gitweb before and after this change indicates ±1%
difference in request time for all file types. Benchmark was performed
on local instance on Debian, using Apache/2.4.23 web server and CGI.
Document the feature and improve syntax highlight documentation, add
test to ensure gitweb doesn't crash when language detection is used.
Signed-off-by: Ian Kelling <ian@iankelling.org>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a case where an html link can be generated from unescaped input
resulting in invalid strict xhtml or potentially injected code.
An overview of a repo with a tag "1.0.0&0.0.1" would previously result
in an unescaped ampersand in the link body.
Signed-off-by: Andreas Brauchli <a.brauchli@elementarea.net>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some multi-byte character encodings (such as Shift_JIS and GBK) have
characters whose final bytes is an ASCII '\' (0x5c), and they
will be displayed as funny-characters even if $fallback_encoding is
correct. This is because `highlight` command always expects UTF-8
encoded strings from STDIN.
$ echo 'my $v = "申";' | highlight --syntax perl | w3m -T text/html -dump
my $v = "申";
$ echo 'my $v = "申";' | iconv -f UTF-8 -t Shift_JIS | highlight \
--syntax perl | iconv -f Shift_JIS -t UTF-8 | w3m -T text/html -dump
iconv: (stdin):9:135: cannot convert
my $v = "
This patch prepare git blob objects to be encoded into UTF-8 before
highlighting in the manner of `to_utf8` subroutine.
Signed-off-by: Shin Kojima <shin@kojima.org>
Reviewed-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_object() chomps $type that is read from "cat-file -t", but
it does so before checking if $type is defined, resulting in
a Perl warning in the server error log:
gitweb.cgi: Use of uninitialized value $type in scalar chomp at
[...]/gitweb.cgi line 7579., referer: [...]
when trying to access a non-existing commit, for example:
http://HOST/?p=PROJECT.git;a=commit;h=NON_EXISTING_COMMIT
Check the value in $type before chomping. This will cause us to
call href with its action parameter set to undef when formulating
the URL to redirect to, but that is harmless, as the function treats
a parameter that set to undef as if it does not exist.
Signed-off-by: Øyvind A. Holm <sunny@sunbase.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of CGI.pm's 4.08 release, the behavior to call
CGI::param() in a list context is deprecated (because it can
be potentially unsafe if called inside a hash constructor).
This causes gitweb to issue a warning for some of our code,
which in turn causes the tests to fail.
Our use is in fact _not_ one of the dangerous cases, as we
are intentionally using a list context. The recommended
route by 4.08 is to use the new CGI::multi_param() call to
make it explicit that we know what we are doing.
However, that function is only available in 4.08, which is
about a month old; we cannot rely on having it.
One option would be to set $CGI::LIST_CONTEXT_WARN globally,
which turns off the warning. However, that would eliminate
the protection these newer releases are trying to provide.
We want to annotate each site as OK using the new function.
So instead, let's check whether CGI provides the
multi_param() function, and if not, provide an
implementation that just wraps param(). That will work on
both old and new versions of CGI. Sadly, we cannot just
check defined(\&CGI::multi_param), because CGI uses the
autoload feature, which claims that all functions are
defined. Instead, we just do a version check.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CGI.pm 4.04 removed the startform method, which had previously been
deprecated in favour of start_form. Changes file for CGI.pm says:
4.04 2014-09-04
[ REMOVED / DEPRECATIONS ]
- startform and endform methods removed (previously deprecated,
you should be using the start_form and end_form methods)
Signed-off-by: Roland Mas <lolando@debian.org>
Reviewed-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When displaying a blob in gitweb, if it's an image, specify constraints for
maximum display width and height to prevent the image from overflowing the
frame of the enclosing page_body div.
This change assumes that it is more desirable to see the whole image without
scrolling (new behavior) than it is to see every pixel without zooming
(previous behavior).
Signed-off-by: Andrew Keller <andrew@kellerfarm.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Given two branches residing in refs/heads/master and refs/wip/feature
the list-of-branches view will present them in following way:
master
feature (wip)
When getting a snapshot of a 'feature' branch, the tarball is going to
have name like 'project-wip-feature-<short hash>.tgz'.
Signed-off-by: Krzesimir Nowak <krzesimir@endocode.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow extra-branch-refs feature to tell gitweb to show refs from
additional hierarchies in addition to branches in the list-of-branches
view.
Signed-off-by: Krzesimir Nowak <krzesimir@endocode.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users of validate_* passing "0" might get failures on correct name
because of coercion of "0" to false in code like:
die_error(500, "invalid ref") unless (check_ref_format ("0"));
Also, the validate_foo subs are renamed to is_valid_foo.
Signed-off-by: Krzesimir Nowak <krzesimir@endocode.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This check will be used in more than one place later.
Signed-off-by: Krzesimir Nowak <krzesimir@endocode.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the @author_initials feature Jakub added in
v1.6.4-rc2-14-ga36817b to match non-ASCII author initials as intended.
The regexp Jakub added was intended to match
non-ASCII (/\b([[:upper:]])\B/g). But in Perl this doesn't actually
match non-ASCII upper-case characters unless the string being matched
against has the UTF8 flag.
So when we open a pipe to "git blame" we need to mark the file
descriptor we're opening as utf8 explicitly.
So as a result it abbreviates me to "AB" not "ÆAB", entirely because "Æ"
isn't /[[:upper:]]/ unless the string being matched against has the UTF8
flag.
Here's something that demonstrates the issue:
#!/usr/bin/env perl
use strict;
use warnings;
binmode STDOUT, ':utf8' if $ENV{UTF8};
open my $fd, "-|", "git", "blame", "--incremental", "--", "Makefile" or die "Can't open: $!";
binmode $fd, ":utf8" if $ENV{UTF8};
while (my $line = <$fd>) {
next unless my ($author) = $line =~ /^author (.*)/;
my @author_initials = ($author =~ /\b([[:upper:]])\B/g);
printf "%s (%s)\n", join("", @author_initials), $author;
}
When that's run with and without UTF8 being true in the environment it
gives, on git.git:
$ UTF8=0 perl author-initials.pl | sort | uniq -c |
sort -nr | head -n 5
99 JH (Junio C Hamano)
35 JN (Jonathan Nieder)
35 JK (Jeff King)
20 JS (Johannes Schindelin)
16 AB (Ævar Arnfjörð Bjarmason)
$ UTF8=1 perl author-initials.pl | sort | uniq -c |
sort -nr | head -n 5
99 JH (Junio C Hamano)
35 JN (Jonathan Nieder)
35 JK (Jeff King)
20 JS (Johannes Schindelin)
16 ÆAB (Ævar Arnfjörð Bjarmason)
Acked-by: Jakub Narębski <jnareb@gmail.com>
Tested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Tested-by: Simon Ruderich <simon@ruderich.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The search help link was a superscript question mark right next to
a drop-down menu, which looks misaligned and is a cramped and
awkward click target. Remove the superscript tags and add some
spacing to fix these nits. Add a title attribute to provide an
explanatory mouseover.
Signed-off-by: Tony Finch <dot@dotat.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On the repository summary page, leave the owner line out if the
repo does not have an owner, rather than displaying a labelled empty
field. This does not affect the owner column in the projects list
page, which is present unless $omit_owner is true.
Signed-off-by: Tony Finch <dot@dotat.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are often parent pages logically above the gitweb projects
list, e.g. home pages of the organization and department that host
the gitweb server. This change allows you to include links to those
pages in gitweb's breadcrumb trail.
Signed-off-by: Tony Finch <dot@dotat.at>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bug is manifest when running gitweb in a persistent process (e.g.
FastCGI, PSGI), and it's easy to reproduce. If a gitweb request
includes the searchtext parameter (i.e. s), subsequent requests using
the project_list action--which is the default action--and without
a searchtext parameter will be filtered by the searchtext value of the
first request. This is because the value of the $search_regexp global
(the value of which is based on the searchtext parameter) is currently
being persisted between requests.
Instead, clear $search_regexp before dispatching each request.
Signed-off-by: Charles McGarvey <chazmcgarvey@brokenzipper.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of these were found using Lucas De Marchi's codespell tool.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the current code, the images from picon and gravatar are
requested over http://, and browsers give mixed contents warning
when gitweb is served over https://.
Just drop the scheme: part from the URL, so that these external
sites are accessed over https:// in such a case.
Signed-off-by: Andrej E Baranov <admin@andrej-andb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
$1 becomes undef by internal regex, since it has no capture groups.
Match against accpetable control characters using index() instead of a regex.
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sorting gitweb's project list by age ('Last Change') currently shows
projects with undefined ages at the head of the list. This gives a less
useful result when there are a number of projects that are missing or
otherwise faulty and one is trying to see what projects have been
updated recently.
Fix by sorting these projects with undefined ages at the bottom of the
list when sorting by age.
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git configuration items can not contain underscores in their section
and bottom-level variable name; the 'remote_heads' feature can not
be enabled on a per-repository basis with that name.
This changes the git-config option to be `gitweb.remoteheads` but does
not change the gitweb.conf option, to avoid backwards compatibility
issues. We strip underscores from keys before looking through
git-config output for them.
An existing check on keynames was overly eager to reject non-word
letters, but if we ever start using three-level names, the middle
level string can contain almost anything, so fix that as well while
we are in the vicinity.
Signed-off-by: Phil Pennock <phil@apcera.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The title of an RSS feed is generated from many components,
including the filename provided as a query parameter, but we
failed to quote it. Besides showing the wrong output, this
is a vector for XSS attacks.
Signed-off-by: Jeff King <peff@peff.net>
When commit 592ea41 refactored the list of extensions for
syntax highlighting, it failed to take into account perl's
operator precedence within lists. As a result, we end up
creating a dictionary of one-to-one elements when the intent
was to map mutliple related types to one main type (e.g.,
bash, ksh, zsh, and sh should all map to sh since they share
similar syntax, but we ended up just mapping "bash" to
"bash" and so forth).
This patch adds parentheses to make the mapping as the
original change intended. It also reorganizes the list to
keep mapped extensions together.
Signed-off-by: Richard Hubbell <richard_hubbe11@lavabit.com>
Signed-off-by: Jeff King <peff@peff.net>
gitweb's feeds sometimes contained committer timestamps in the wrong timezone
due to a misspelling.
Signed-off-by: Dylan Simon <dylan@dylex.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When gitweb is used as a DirectoryIndex, it attempts to strip
PATH_INFO on its own, as $cgi->url() fails to do so.
However, it fails to account for the fact that PATH_INFO has
already been URL-decoded by the web server, but the value
returned by $cgi->url() has not been. This causes the stripping
to fail whenever the URL contains encoded characters.
To see this in action, setup gitweb as a DirectoryIndex and
then use it on a repository with a directory containing a
space in the name. Navigate to tree view, examine the gitweb
generated html and you'll see a link such as:
<a href="/test.git/tree/HEAD:/directory with spaces">directory with spaces</a>
When clicked on, the browser will URL-encode this link, giving
a $cgi->url() of the form:
/test.git/tree/HEAD:/directory%20with%20spaces
While PATH_INFO is:
/test.git/tree/HEAD:/directory with spaces
Fix this by calling unescape() on both $my_url and $my_uri before
stripping PATH_INFO from them.
Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tip tree is the one of major subsystem tree in the
Linux kernel project. On the tip tree, the Link: (or
similar Buglink:) tag is used for tracking the original
discussion or context. Since it's ususally in the S-o-b
area, it'd be better using same style with others.
Also as it tends to contain a message-id sent from git
send-email, a part of the line would set a wrong hyperlink
like [1]. Fix it by not using format_log_line_html().
[1] git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commit;h=08942f6d5d992e9486b07653fd87ea8182a22fa0
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many types of tags used in S-o-b area [1].
Update the regex to handle them properly. It requires
the tag should be started by a capital letter and ended
by '-by: ' or '-By: '. The only exception is 'Cc: '.
[1] http://lwn.net/Articles/503829/
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we see a signed-off-by line (and its friends), we set $signoff
to true, but then we process the next line after we are done without
giving control to the rest of the loop. And when the line we saw is
not a signed-off-by line, we reset $signoff to false before running
the remainder of the loop.
Hence, the check for $signoff that attempts to remove an extra empty
line between two signed-off-by line was not doing anything useful.
Rename $empty to a more explicit name $skip_blank_line to tell us to
skip a blank line when we see one, set it after we see and emit a
blank line (to avoid showing more than one empty lines in a raw) or
after we handle a signed-off-by line (to avoid empty lines after
such a line), to fix this bug, and get rid of the $signoff variable
that is not useful.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some setups the repository owner is not a well defined concept
and administrator can prefer it to be not shown. This commit add
and an option that enable to reach this effect.
Signed-off-by: Kacper Kornet <draenog@pld-linux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generating information about last change for a large number of git
repositories can be very time consuming. This commit add an option to
omit 'Last Change' column when presenting the list of repositories.
Signed-off-by: Kacper Kornet <draenog@pld-linux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prevent setting owner to an empty value if it is not specified in
projects.list file. Otherwise it stops retrieving information about the
owner from other files.
Signed-off-by: Kacper Kornet <draenog@pld-linux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The highlightning of combined diffs is currently disabled. This is
because output from a combined diff is much harder to highlight because
it is not obvious which removed and added lines should be compared.
Current code requires that the number of added lines is equal to the
number of removed lines and only skips first +/- character, treating
second +/- as a line content, Thus, it is not possible to simply use
existing algorithm unchanged for combined diffs.
Let's start with a simple case: only highlight changes that come from
one parent, i.e. when every removed line has a corresponding added line
for the same parent. This way the highlightning cannot get wrong. For
example, following diffs would be highlighted:
- removed line for first parent
+ added line for first parent
context line
-removed line for second parent
+added line for second parent
or
- removed line for first parent
-removed line for second parent
+ added line for first parent
+added line for second parent
but following output will not:
- removed line for first parent
-removed line for second parent
+added line for second parent
++added line for both parents
In other words, we require that pattern of '-'-es in pre-image matches
pattern of '+'-es in post-image.
Further changes may introduce more intelligent approach that better
handles combined diffs.
Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reading diff output is sometimes very hard, even if it's colored,
especially if lines differ only in few characters. This is often true
when a commit fixes a typo or renames some variables or functions.
This commit teaches gitweb to highlight characters that are different
between old and new line with a light green/red background. This should
work in the similar manner as in Trac or GitHub.
The algorithm that compares lines is based on contrib/diff-highlight.
Basically, it works by determining common prefix/suffix of corresponding
lines and highlightning only the middle part of lines. For more
information, see contrib/diff-highlight/README.
Combined diffs are not supported but a following commit will change it.
Since we need to pass esc_html()'ed or esc_html_hl_regions()'ed lines to
format_diff_lines(), so it was taught to accept preformatted lines
passed as a reference.
Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now lines are formatted closer to place where we actually use HTML
formatted output.
This means that we put raw lines in the @chunk accumulator, rather than
formatted lines. Because we still need to know class (type) of line
when accumulating data to post-process and print, process_diff_line()
subroutine was retired and replaced by diff_line_class() used in
git_patchset_body() and new restructured format_diff_line() used in
print_diff_chunk().
As a side effect, we have to pass \%from and \%to down to callstack.
This is a preparation patch for diff refinement highlightning. It's not
meant to change gitweb output.
[jn: wrote commit message]
Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This renames print_sidebyside_diff_chunk() to print_diff_chunk() and
makes use of it for both side-by-side and inline diffs. Now diff lines
are always accumulated before they are printed. This opens the
possibility to preprocess diff output before it's printed, which is
needed for diff refinement highlightning (implemented in incoming
patches).
If print_diff_chunk() was left as is, the new function
print_inline_diff_lines() could reorder diff lines. It first prints all
context lines, then all removed lines and finally all added lines. If
the diff output consisted of mixed added and removed lines, gitweb would
reorder these lines. This is true for combined diff output, for
example:
- removed line for first parent
+ added line for first parent
-removed line for second parent
++added line for both parents
would be rendered as:
- removed line for first parent
-removed line for second parent
+ added line for first parent
++added line for both parents
To prevent gitweb from reordering lines, print_diff_chunk() calls
print_diff_lines() as soon as it detects that both added and removed
lines are present and there was a class change, and at the end of chunk.
Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>