Thursday, 27 March 2014

IDNA2008 - take 2

OK, second attempt at international domain name support.

Branch: chris/idna2008

I've had to import some unrestricted code from elsewhere, because
IDNA2008 needs Unicode normalisation among other things. It works
and, as far as I can tell from my reading of the spec, conforms to it.

A couple of minor issues/todos:
1. If an invalid URL is encountered during page layout/box conversion,
NetSurf gives a BoxConvert warning and the page is never displayed.
This is caused by my new code making nsurl_create return
NSERROR_BAD_URL when an IDN fails the compliance checks.
I've not been able to work out where in the core this error code is
terminating page layout.
Page showing this problem:
http://blogs.msdn.com/b/shawnste/archive/2006/09/14/idn-test-urls.aspx

2. If a frontend wants to display the UTF-8 version of an IDN then
currently the URL needs splitting into its component parts, the host
running through idna_decode() and the whole thing putting back
together again. This should probably be handled by nsurl but I'm not
sure of the best way to implement it.

3. There are some to-dos noted in code comments for further compliance
checking. They are optional in the spec, and I don't see any need to
implement them - anything invalid will be rejected by DNS. Most of
the mandatory checks seem overkill anyway, given that there is
stricter checking at DNS registration time.
I have included the optional decode-reencode check for already encoded
addresses, to weed out any undecodable nonsense the user might have
typed in. It doesn't bother to normalise or validity-check the decoded
address before re-encoding it. Maybe it should; I'm not sure, as the
spec was vague on this point.
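
To make points 2 and 3 a bit more concrete, here is a rough sketch in
Python rather than NetSurf's C. It is not the code on the branch: the
function names (decode_label, encode_label, roundtrip_ok, display_url)
are made up for illustration, and Python's built-in "punycode" codec
stands in for idna_decode(). It does no IDNA2008 validity checking or
normalisation, so treat it as an outline of the split/decode/reassemble
and decode-reencode ideas only.

```python
from urllib.parse import urlsplit, urlunsplit

ACE_PREFIX = "xn--"  # prefix marking a Punycode-encoded (A-)label


def decode_label(label: str) -> str:
    """Decode one DNS label; labels without the ACE prefix pass through."""
    if label.lower().startswith(ACE_PREFIX):
        # Raises a ValueError subclass on undecodable input.
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def encode_label(label: str) -> str:
    """Re-encode one label; pure-ASCII labels pass through."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def roundtrip_ok(label: str) -> bool:
    """Decode-reencode check: does the label survive a round trip?"""
    try:
        decoded = decode_label(label)
    except ValueError:
        return False
    # Hypothetical policy: compare case-insensitively, no normalisation.
    return encode_label(decoded).lower() == label.lower()


def display_url(url: str) -> str:
    """Return the URL with its host shown in Unicode, if decodable."""
    parts = urlsplit(url)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not hostport or hostport.startswith("["):
        return url  # no host, or an IPv6 literal: leave untouched
    host, colon, port = hostport.partition(":")
    try:
        decoded = ".".join(decode_label(l) for l in host.split("."))
    except ValueError:
        return url  # fall back to the encoded form
    netloc = userinfo + at + decoded + colon + port
    return urlunsplit(parts._replace(netloc=netloc))
```

With this sketch, display_url("http://xn--bcher-kva.example/path?q=1")
gives "http://bücher.example/path?q=1", and a label such as "xn--!!!"
fails roundtrip_ok(). Doing the equivalent inside nsurl would save each
frontend from repeating the split and reassembly.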

Chris
