Monday, 11 February 2013

Re: font_split change, action required

On Mon, 11 Feb 2013 00:10:08 +0000 (GMT), Michael Drake wrote:

> It's up to the nsfont_split implementation to pick somewhere sensible to
> split at. The necessity for the split point to be a space has been
> removed, because I removed the failure of HTML layout to handle the split
> point not being a space. This allows a front end to implement Unicode
> line breaking. For example, see:
>
> http://www.netsurf-browser.org/welcome/index.ja
>
> If you only split at spaces, that paragraph will only ever be one long
> line. (There are no spaces to split on.)

Which it is here. Actually with some window resizing I can see it's
two lines overlapping in the middle, which is clearly wrong and
probably a bug in my font code.

> All the front ends that only handle splitting on spaces (everything but
> the GTK front end) do this (with a few variations):

I don't see why all this needs to be in the frontend.

> 1. Find how much text can fit in the available width
> 2. Store the char_offset of that split point (the first char that
> doesn't fit).

That should be as much as the frontend needs to do.

> 3. If char_offset is a space, return that split point.
> 4. If char_offset is not a space, search back towards the start of the
> string, and break on the first space.
> 5. If we got a space (didn't reach the start of the string) return the
> char_offset of the space we found.
> 6. If we didn't find a space before the char_offset split point, search
> towards the end of the string, looking for a space.
> 7. Split point is either the first space found, or if none, then it's
> the length of the string.

The core can do all that.

Advantages:
* nsfont_split is no longer needed (core can use
  nsfont_position_in_string and nsfont_width)
* Consistent splitting behaviour across all platforms - splitting on
  characters other than space only needs to be added once
* More complex splitting could be added, for example splitting
  mid-word and hyphenating depending on language rules

Disadvantage:
* nsfont_width might need to be called in addition to
  nsfont_position_in_string. Depending on how nsfont_split is
  currently implemented in each front end, there's a potential speed
  penalty if the core needs the actual post-split width, although I
  can't see why it would need that information very frequently (only
  if it reaches step 6?)
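
As a rough illustration, the quoted steps could be moved into the core
along these lines. This is only a sketch under assumptions: fit_fn is a
hypothetical stand-in for whatever measurement call the front end
provides (it is not the real nsfont_position_in_string signature),
fit_fixed_width is a toy measurer that treats every byte as 10 units
wide, and core_split is an invented name. It splits on ASCII spaces
only, as in the quoted description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical front end callback: returns the byte offset of the
 * first character that does not fit in `width` (steps 1 and 2). */
typedef size_t (*fit_fn)(const char *text, size_t len, int width);

/* Toy measurer for illustration only: every byte is 10 units wide. */
static size_t fit_fixed_width(const char *text, size_t len, int width)
{
    size_t fit = width > 0 ? (size_t)width / 10 : 0;

    (void)text; /* a real measurer would inspect the glyphs */
    return fit < len ? fit : len;
}

/* Core-side version of steps 3 to 7: returns the offset to split at. */
static size_t core_split(const char *text, size_t len, int width,
                         fit_fn fit)
{
    size_t offset, i;

    assert(text != NULL && fit != NULL);

    offset = fit(text, len, width);
    if (offset >= len)
        return len;            /* the whole string fits */

    if (text[offset] == ' ')
        return offset;         /* step 3 */

    /* Steps 4 and 5: search back for a space. Index 0 is skipped,
     * since reaching the start counts as not finding one. */
    for (i = offset; i > 1; i--) {
        if (text[i - 1] == ' ')
            return i - 1;
    }

    /* Step 6: search forward for a space. */
    for (i = offset + 1; i < len; i++) {
        if (text[i] == ' ')
            return i;
    }

    return len;                /* step 7: no space anywhere */
}
```

With something like this in the core, a front end would only have to
answer "how much fits in this width", and a smarter break rule would
replace the two space searches in one place.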

> > I've slightly bodged it now, so it only splits on spaces, but when it
> > doesn't find a space (which seems to be happening in some instances
> > when a space is clearly present, not sure why yet), it just splits
> > where x falls. This is working without any words getting split across
> > lines.
>
> That sounds peculiar.

That's what I thought. I need to study that code a bit more closely;
I'm not entirely convinced it is doing what it is supposed to. It gets
a bit confusing with UTF-8 and UTF-16 strings operating in parallel.
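
One place where parallel UTF-8 and UTF-16 strings can go wrong is in
mixing up their offsets: a UTF-16 index counts 16-bit code units, while
the UTF-8 string is indexed in bytes. The helper below is a minimal
sketch of mapping one to the other. It is not taken from any front end;
the function name is made up, and it assumes the input is well-formed
UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given an index in UTF-16 code units, return the
 * matching byte offset in the equivalent UTF-8 string.
 * Assumes `utf8` is well-formed. */
static size_t utf8_offset_for_utf16_index(const char *utf8, size_t len,
                                          size_t units)
{
    size_t byte = 0; /* bytes consumed from the UTF-8 string */
    size_t seen = 0; /* UTF-16 code units those bytes represent */

    assert(utf8 != NULL);

    while (byte < len && seen < units) {
        unsigned char lead = (unsigned char)utf8[byte];
        size_t bytes, width;

        if (lead < 0x80) {
            bytes = 1; width = 1;  /* ASCII */
        } else if (lead < 0xE0) {
            bytes = 2; width = 1;  /* U+0080 to U+07FF */
        } else if (lead < 0xF0) {
            bytes = 3; width = 1;  /* rest of the BMP */
        } else {
            bytes = 4; width = 2;  /* needs a surrogate pair */
        }

        if (seen + width > units)
            break; /* index points inside a surrogate pair */

        byte += bytes;
        seen += width;
    }

    return byte < len ? byte : len;
}
```

For example, in the UTF-8 string "a\xC3\xA9z" ("a", e-acute, "z"),
UTF-16 index 2 corresponds to byte offset 3, so using the UTF-16 index
directly as a byte offset would land in the middle of a character.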

Chris
