On 27 Mar 2014 20:54:57 +0100, Chris Young wrote:
> OK, second attempt at international domain name support.
>
> Branch: chris/idna2008
>
> I've had to import some unrestricted code from elsewhere, due to the
> necessity of Unicode normalisation and other things. It is working
> and conforming to the spec, as far as I read it.
>
> A couple of minor issues/todos:
> 1. If an invalid URL is encountered during page layout/box conversion,
> NetSurf gives a BoxConvert warning and the page is never displayed.
> This is caused by my new code making nsurl_create return
> NSERROR_BAD_URL when an IDN fails the compliance checks.
> I've not been able to work out where in the core this error code is
> terminating page layout.
> Page showing this problem:
> http://blogs.msdn.com/b/shawnste/archive/2006/09/14/idn-test-urls.aspx
>
> 2. If a frontend wants to display the UTF-8 version of an IDN then
> currently the URL needs stripping into component parts, the host run
> through idna_decode() and the whole thing put back together again.
> This should probably be handled by nsurl but I'm not sure of the best
> way to implement it.
>
> 3. There are some to-dos noted in code comments for further compliance
> checking. They are optional in the spec, and I don't see any need to
> implement them - anything invalid will be rejected by DNS. Most of
> the mandatory checks seem overkill anyway, given that there is
> stricter checking at DNS registration time.
> I have included the optional decode-reencode check for already encoded
> addresses to weed out any undecodable nonsense the user might have
> typed in, but it doesn't bother to do normalisation or validity
> checking of the decoded address before re-encoding it (maybe it
> should, I'm not sure, the spec was vague on this point).

Is there any interest in reviewing/merging this now that 3.1 is out of
the way?

I'm thinking point 2 above might also tie in with Vince's
proposed changes to extract escaped path elements from an nsurl, as it
is a similar challenge.
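
To make point 2 concrete, here is a rough sketch of the
split/decode/rejoin step a frontend currently has to do by hand. It is
not code from the branch: url_display_form(), host_decode_fn and
demo_decode() are hypothetical names invented for illustration, and
demo_decode() is only a stand-in that recognises a single hard-coded
host where a real frontend would call idna_decode(). It assumes a URL
containing "://" and ignores IPv6 literals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: converts an ASCII (xn--) host to its
 * display form.  Returns 0 on success, non-zero on failure. */
typedef int (*host_decode_fn)(const char *host, size_t len,
		char *out, size_t out_size);

/* Stand-in decoder for illustration only: it recognises one hard-coded
 * host and copies anything else through unchanged.  A real frontend
 * would call the IDNA decoder here instead. */
static int demo_decode(const char *host, size_t len,
		char *out, size_t out_size)
{
	static const char ace[] = "xn--bcher-kva.example";
	static const char utf8[] = "b\xc3\xbc" "cher.example";

	if (len == sizeof(ace) - 1 && memcmp(host, ace, len) == 0) {
		if (sizeof(utf8) > out_size)
			return -1;
		memcpy(out, utf8, sizeof(utf8));
		return 0;
	}
	if (len + 1 > out_size)
		return -1;
	memcpy(out, host, len);
	out[len] = '\0';
	return 0;
}

/* Split url into prefix / host / suffix, decode the host, and rejoin.
 * Does not handle IPv6 literals or URLs without "://". */
static int url_display_form(const char *url, host_decode_fn decode,
		char *out, size_t out_size)
{
	const char *auth, *end, *host, *host_end, *p;
	char decoded[256];
	int n;

	assert(url != NULL && decode != NULL && out != NULL);

	auth = strstr(url, "://");
	if (auth == NULL)
		return -1;
	auth += 3;

	/* The authority ends at the first '/', '?' or '#'. */
	end = auth + strcspn(auth, "/?#");

	/* Skip any userinfo ("user:pass@") inside the authority. */
	host = auth;
	for (p = auth; p < end; p++) {
		if (*p == '@')
			host = p + 1;
	}

	/* The host ends at the port separator or at the authority end. */
	host_end = host + strcspn(host, ":/?#");

	if (decode(host, (size_t)(host_end - host),
			decoded, sizeof(decoded)) != 0)
		return -1;

	/* Everything before the host, the decoded host, everything after. */
	n = snprintf(out, out_size, "%.*s%s%s",
			(int)(host - url), url, decoded, host_end);
	return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

If nsurl grew an accessor along these lines, frontends would not need
to repeat the splitting logic, and the same shape might suit the
escaped-path case too.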
Chris