Monday, 14 July 2014

Re: libhubbub: branch rupindersingh/libhubbub updated. release/0.3.0-30-g1b8977f

There is no way the tokeniser could switch to a state in the script domain by itself; the only way it can do so is through the treebuilder signalling it. The change can be treated as if it were the initial state of the tokeniser, as when it first began tokenising. This could be moved into the content model flags, but that would mean more code in the tokeniser and would do nothing to change the black-boxiness of the tokeniser. Instead, it would add work, as there would then be many overlapping and conflicting states within the script content flag itself. I have already changed the tokeniser to accommodate the change of state, and it works fine with the script tags ... A precautionary measure that could be taken is to allow changing only to a set of pre-defined states.
If this doesn't look fine, then I'll try scrapping some commits and redoing it using content model flags.
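
To illustrate the precautionary measure mentioned above, here is a minimal sketch of a setter that only accepts a fixed list of entry states. All names here (tok_state, tokeniser, tokeniser_request_state, the state names) are hypothetical and are not the actual libhubbub identifiers; it only shows the idea of a whitelist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of tokeniser states; illustrative names only,
 * not the real libhubbub state enumeration. */
typedef enum {
	TOK_STATE_DATA,
	TOK_STATE_TAG_OPEN,
	TOK_STATE_ATTR_NAME,
	TOK_STATE_RCDATA,
	TOK_STATE_RAWTEXT,
	TOK_STATE_SCRIPT_DATA,
	TOK_STATE_PLAINTEXT
} tok_state;

/* Hypothetical tokeniser object holding only the current state. */
typedef struct {
	tok_state state;
} tokeniser;

/* States the treebuilder may request. Anything else (for example a
 * mid-tag state such as TOK_STATE_ATTR_NAME) stays internal. */
static const tok_state allowed_entry_states[] = {
	TOK_STATE_DATA,
	TOK_STATE_RCDATA,
	TOK_STATE_RAWTEXT,
	TOK_STATE_SCRIPT_DATA,
	TOK_STATE_PLAINTEXT
};

/* Switch to `wanted` only if it is on the whitelist.
 * Returns true on success; on failure the state is left untouched. */
static bool tokeniser_request_state(tokeniser *tok, tok_state wanted)
{
	size_t i;
	size_t n = sizeof(allowed_entry_states) /
			sizeof(allowed_entry_states[0]);

	assert(tok != NULL);

	for (i = 0; i < n; i++) {
		if (allowed_entry_states[i] == wanted) {
			tok->state = wanted;
			return true;
		}
	}

	return false;
}
```

With something like this, the treebuilder could ask for the script data state after seeing a script start tag, but a request for an internal state such as an attribute-name state would be refused.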


On Mon, Jul 14, 2014 at 8:53 PM, John-Mark Bell <jmb@netsurf-browser.org> wrote:
On Sat, Jul 12, 2014 at 10:47:48PM +0100, NetSurf Browser Project wrote:
>
> - Log -----------------------------------------------------------------
> commitdiff http://git.netsurf-browser.org/libhubbub.git/commit/?id=1b8977f12ae4b9a88b6bea16540661c31a5bb326
> commit 1b8977f12ae4b9a88b6bea16540661c31a5bb326
> Author: Rupinder Singh Khokhar <rsk1coder99@gmail.com>
> Commit: Rupinder Singh Khokhar <rsk1coder99@gmail.com>
>
>     Added provision for the treebuilder to change tokeniser's state. Additionally, in every loop of the dispatcher, it will be checked whether it is safe for tokeniser to process CDATA, and corresponding opts on the tokeniser will be set. this may slow the library down because of repeated checking in every loop.

I don't understand this changeset. Why have you exposed the tokeniser's
internal state to the outside world? It is extremely dangerous to do
this, as it makes it utterly unclear what triggers a state transition.
Additionally, only the tokeniser needs to know what state it is in; the
client of the tokeniser must treat it as a black box.


J.

