urlmatch: add underscore to URL_HOST_CHARS

When parsing a URL to normalize it, we allow hostnames to contain only
alphanumerics, dot ("."), or dash ("-"), plus brackets and colons for
IPv6 literals.
This matches the old URL standard in RFC 1738, which says:

  host           = hostname | hostnumber
  hostname       = *[ domainlabel "." ] toplabel
  domainlabel    = alphadigit | alphadigit *[ alphadigit | "-" ] alphadigit

But this was later updated by RFC 3986, which is more liberal:

  host        = IP-literal / IPv4address / reg-name
  reg-name    = *( unreserved / pct-encoded / sub-delims )
  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

While names with underscore in them are not common and possibly violate
some DNS rules, they do work in practice, and we will happily contact
them over http://, git://, or ssh://. It seems odd to ignore them for
purposes of URL matching, especially when the URL RFC seems to allow
them.
There shouldn't be any downside here. Underscore is not a syntactically
significant character in a URL, so we won't be confused about parsing;
we'd have simply rejected such a URL previously (the test here checks
the url code directly, but the obvious user-visible effect would be
failing to match credential.http://foo_bar.example.com.helper, or
similar config in http.<url>.*).
Arguably we'd want to allow tilde ("~") here, too. There's likewise
probably no downside, but I didn't add it simply because it seems like
an even less likely character to appear in a hostname.
Reported-by: Alex Waite <alex@waite.eu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
			
			
maint
parent af6d1d602a
commit e4c497a194

@@ -47,7 +47,7 @@ test_expect_success 'url authority' '
 	test-tool urlmatch-normalization "scheme://@host" &&
 	test-tool urlmatch-normalization "scheme://%00@host" &&
 	! test-tool urlmatch-normalization "scheme://%%@host" &&
-	! test-tool urlmatch-normalization "scheme://host_" &&
+	test-tool urlmatch-normalization "scheme://host_" &&
 	test-tool urlmatch-normalization "scheme://user:pass@host/" &&
 	test-tool urlmatch-normalization "scheme://@host/" &&
 	test-tool urlmatch-normalization "scheme://host/" &&

@@ -5,7 +5,7 @@
 #define URL_DIGIT "0123456789"
 #define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
 #define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
-#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-_[:]" /* IPv6 literals need [:] */
 #define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F,0x7F-0xFF */
 #define URL_GEN_RESERVED ":/?#[]@"
 #define URL_SUB_RESERVED "!$&'()*+,;="
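
For illustration only, the effect of widening the host character set can be sketched with a plain strspn() membership check. This is a minimal sketch, not the normalization code touched by the patch: the helper host_chars_ok() and the names OLD_HOST_CHARS and NEW_HOST_CHARS are hypothetical, and the definition of URL_ALPHA is an assumption, since it lies outside the hunk shown.

```c
#include <stddef.h>
#include <string.h>

/* Assumed definition; URL_ALPHA is not visible in the hunk. */
#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
#define URL_DIGIT "0123456789"
#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT

/* Hypothetical names for the set before and after the change. */
#define OLD_HOST_CHARS URL_ALPHADIGIT ".-[:]"
#define NEW_HOST_CHARS URL_ALPHADIGIT ".-_[:]"

/*
 * Hypothetical helper: return 1 if every byte of host is a member of
 * the allowed set, 0 otherwise. It checks characters only and does not
 * validate the overall shape of the host.
 */
static int host_chars_ok(const char *host, const char *allowed)
{
	return strspn(host, allowed) == strlen(host);
}
```

Under these assumptions, host_chars_ok("foo_bar.example.com", OLD_HOST_CHARS) returns 0 and the same call with NEW_HOST_CHARS returns 1, while a host containing "~" is still rejected by both sets.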