warbo-utilities: 56d3233102b3696b3e5754b23c3d3f050445d7be
1: From: Chris Warburton
2: Date: Tue, 23 Jan 2018 15:23:53 +0000
3: Subject: Re: Fix HTML escaping when rendering README
4: Message-Id: <a0f5591b1a0c9954-fca2827cd6fd3da2-artemis@nixos>
5: References: <a0f5591b1a0c9954-0-artemis@nixos>
6: In-Reply-To: <a0f5591b1a0c9954-0-artemis@nixos>
7:
8: Regarding the 'Contents of follows', this was a simple typo, so now (in
9: uncommitted changes) it says 'Contents of README.md follows' (or
10: whatever the filename is).
11:
12: Regarding the rendering, this is a bit trickier. We would like markdown
13: to be rendered to HTML, and spliced into the page. Yet these READMEs may
14: come from external sources, so we don't want to allow XSS attacks.
15:
16: From doing a little research, it looks like sanitising markdown is
17: rather hopeless: on the one hand, it allows raw HTML tags, like
18: '<script>', which is dangerous. On the other, there are ways to make
19: markdown render to dangerous HTML, like links to 'javascript:' URLs, and
20: tricky issues like '> ' being used to mark up quotes. For example:
21:
22: > This will be treated as a quotation. What happens if we put an <a
23: > onclick="maliciousJavascript">anchor</a> here?
24:
25: If we try to sanitise the '<a>' tag, we might parse it as '<a
26: >'
27:
28: when in fact that leading '>' is a quote indicator, and the actual
29: anchor includes the malicious JavaScript.
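
To make the quote-marker problem concrete, here is a small sketch in Python. The regex-based tag finder and the quote-stripping step are deliberately naive stand-ins (assumptions for illustration, not how any real markdown renderer works); they only show that the same text yields different tags before and after the '> ' markers are removed.

```python
import re

# The quoted example from above, as it appears in the markdown source.
source = (
    "> This will be treated as a quotation. What happens if we put an <a\n"
    '> onclick="maliciousJavascript">anchor</a> here?'
)

# Naive tag finder: everything from '<' up to the next '>'.
TAG = re.compile(r"<[^>]*>")

# Scanning the raw markdown: the first "tag" ends at the quote marker on
# the second line, so it appears to carry no attributes at all.
raw_tags = TAG.findall(source)

# Crude stand-in for what a renderer does: drop the leading '> ' markers.
rendered = re.sub(r"^> ?", "", source, flags=re.MULTILINE)

# Scanning after the markers are gone: the onclick attribute is now
# part of the anchor tag.
rendered_tags = TAG.findall(rendered)

print(raw_tags)
print(rendered_tags)
```

A check performed on the markdown source would see a harmless-looking anchor, while the rendered output carries the handler attribute.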
30:
31: Gah!
32:
33: Anyway, it looks like the best way to handle this is to render to HTML,
34: then sanitise the HTML. There are various solutions to doing this, but
35: the general advice is to use a whitelist of allowed tags and attributes;
36: everything else should be stripped. This way, any future extensions to
37: HTML (e.g. 'onFoo' handler attributes), or anything that we didn't think
38: of, will just be silently stripped rather than having to play cat and
39: mouse.
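
As a rough illustration of the whitelist idea, the following sketch uses only Python's standard library 'html.parser'. The names ALLOWED_TAGS, ALLOWED_ATTRS, DROP_CONTENT, WhitelistSanitiser and sanitise are hypothetical, and the particular tags and URL schemes allowed are assumptions. It is not a vetted sanitiser; it only shows the shape of "keep what is listed, strip everything else".

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist; anything not listed here is stripped.
ALLOWED_TAGS = {"p", "a", "em", "strong", "code", "pre", "blockquote",
                "ul", "ol", "li", "h1", "h2", "h3"}
ALLOWED_ATTRS = {"a": {"href"}}
# Elements whose text content is dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}


def safe_url(value):
    """Allow only absolute http(s) URLs (rejects 'javascript:' etc.)."""
    return value.strip().lower().startswith(("http://", "https://"))


class WhitelistSanitiser(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            value = value or ""
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # e.g. onclick, or some future 'onFoo'
            if name == "href" and not safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Re-escape text so it cannot reintroduce markup.
            self.out.append(escape(data, quote=False))


def sanitise(html_text):
    parser = WhitelistSanitiser()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


print(sanitise('<p>Hi <a href="javascript:alert(1)" onclick="evil()">x</a>'
               '<script>alert(1)</script></p>'))
```

The point of the design is that unknown tags and attributes fall through to the "strip" branch by default, so nothing has to be added to a blacklist when new handler attributes appear.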
40:
41: Unfortunately I can't find a standalone Linux command which will do this
42: sanitising. As with most security stuff, I'd rather not implement it
43: myself...