Sunday, 10 February 2013

Re: font_split change, action required

In article
<OUT-51182ED5.MD-1.4.17.chris.young@unsatisfactorysoftware.co.uk>,
Chris Young <chris.young@unsatisfactorysoftware.co.uk> wrote:
> On Sun, 10 Feb 2013 19:55:31 +0000 (GMT), Michael Drake wrote:

> > The implementation requirements for font_split have changed slightly.
> > The split point (char_offset) does not now have to be a space.

> I'm clearly missing something here, as I took this literally initially
> and just set the split point where x would fall

It's up to the nsfont_split implementation to pick somewhere sensible to
split at. The requirement that the split point be a space has been
removed, because I fixed HTML layout so that it no longer fails when the
split point is not a space. This allows a front end to implement Unicode
line breaking. For example, see:

http://www.netsurf-browser.org/welcome/index.ja

If you only split at spaces, that paragraph will only ever be one long
line. (There are no spaces to split on.)

At the moment only the GTK front end has Unicode line breaking. All the
other front ends only know how to split on spaces. [*]

> expecting the core to seek back to the previous character that was a
> valid split point. Instead, I ended up with lines split mid-word.

All the front ends that only handle splitting on spaces (everything but
the GTK front end) do this (with a few variations):

1. Find how much text can fit in the available width.
2. Store the char_offset of that split point (the first char that
   doesn't fit).
3. If char_offset is a space, return that split point.
4. If char_offset is not a space, search back towards the start of the
   string, and stop at the first space found.
5. If we got a space (didn't reach the start of the string), return the
   char_offset of the space we found.
6. If we didn't find a space before the char_offset split point, search
   towards the end of the string, looking for a space.
7. The split point is either the first space found or, if there is none,
   the length of the string.
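
As a rough illustration of the steps above, here is a minimal sketch in C.
It is not the code from any actual front end: the function name
split_on_space, the fixed CHAR_W glyph width used in place of real font
measurement, and the handling of the "everything fits" case are all
assumptions made to keep the example self-contained. It also treats
offset 0 as "reached the start of the string", so a leading space is not
used as a split point.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: every byte is CHAR_W pixels wide. A real front end would
 * ask its font system how much text fits in the available width. */
#define CHAR_W 8

/* Hypothetical helper: returns the byte offset to split at, for a string
 * s of len bytes and an available width of x pixels. */
size_t split_on_space(const char *s, size_t len, int x)
{
    size_t fit, i;

    assert(s != NULL);

    /* Steps 1-2: offset of the first char that doesn't fit. */
    fit = (x <= 0) ? 0 : (size_t)x / CHAR_W;
    if (fit >= len)
        return len; /* Assumption: whole string fits, nothing to split. */

    /* Steps 3-5: check the split point itself, then search back towards
     * the start. Offset 0 counts as "reached the start". */
    for (i = fit; i > 0; i--) {
        if (s[i] == ' ')
            return i;
    }

    /* Steps 6-7: no space before the split point, so search forwards.
     * If there is no space at all, the split point is the length. */
    for (i = fit + 1; i < len; i++) {
        if (s[i] == ' ')
            return i;
    }
    return len;
}
```

With these assumptions, splitting "hello world foo" at x = 64 backs up from
offset 8 to the space at offset 5, while a string with no spaces at all
comes back as one long line (the returned offset is its length).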

> I've slightly bodged it now, so it only splits on spaces, but when it
> doesn't find a space (which seems to be happening in some instances
> when a space is clearly present, not sure why yet), it just splits
> where x falls. This is working without any words getting split across
> lines.

That sounds peculiar.

> If the intention is to split at not only spaces,

Yes, to allow Unicode line breaking.

> is there any reason why the frontend can't just return the character
> that falls at x?

The nsfont_split implementation is meant to find a place for a line break.


[*] I'll have a good look at how we can do Unicode line breaking on all
platforms after I'm done with textareas, and a few other things.

--

Michael Drake (tlsa) http://www.netsurf-browser.org/
