Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

White space management - DO NOT MERGE #287

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

marclaval
Copy link
Contributor

This is a proposition to solve issues #68 and #285 which exist because hashspace strips white space and line return characters. Before moving forward by optimizing and fixing all failing tests (76 in compiler, 100 in RT !!!), I'd like to collect some feedback.

The basic idea behind this PR is to keep all these characters and to let browsers manage them.
To do so, the PEGJS parser now keeps everything and creates many text or eol blocks.
Optimizations (i.e. skipping some blocks) is done by the tree builder. Only one rule for the moment: EOL and withe spaces are skipped in a line if it contains only hashspace blocks which won't appear in the DOM ("if", "elseif", "else", "endif", "comment", "foreach", "endforeach", "eol", "log", "let").

With that changes, the following template behaves as expected:

{template test()}
    <div class="msg">
        <span>One</span><span>Two</span>
    </div>
    <div class="msg">
        <span>One</span>
        <span>Two</span>
    </div>
    <pre>
    abc
  def
            ghij  
    </pre>
{/template}

The main drawback is that compiled templates are bigger due to extra text nodes ...

Please discuss this approach.

@piuccio
Copy link
Contributor

piuccio commented Sep 11, 2014

http://www.w3.org/TR/html401/struct/text.html#h-9.1
http://perfectionkills.com/experimenting-with-html-minifier/#collapse_whitespace

My feeling is that the best solution would be to collapse every sequence of white space into one, however this requires an option to opt out of this behavior, in case of <pre> or elements with white-space css property.

This might be one of the trickiest problem

{template whatever()}
Component in action:
<#another-template />

Source:
<pre>
<#another-template />
</pre>
{/template}

It's very reasonable to keep all white spaces and let the browser do it's job, and as you propose, lines with only hashspace blocks can be safely removed.

@benouat
Copy link
Member

benouat commented Sep 11, 2014

We could optimize a bit the compiled template by introducing a kind of smaller syntax at textnode level.

As of today implementation (it might slightly change in the near future due to @PK1A changes to textnodes expression management) they are handled like this:

n.$text({e1:[9,"person.name",false]},["",1,"!"])

  • the first object represent the expressions list, all expressions that are used from within the text node.
  • the second argument is an array of odd/even pair of pure text / expression reference. (here we have empty text, reference to expression 1, and "!" text.

My proposal would be to split text nodes based on those white space / EOL characters
First, for white spaces we could inject them at the beginning and at the end of the node

n.$text(12, {e1:[9, "person.name", false]}, ["", 1, "!"], 8)

  • 12 means 12 white space characters before this text node
  • 8 means 8 white space characters to be re-injected after this text node.

Then for the EOL, we could simply introduce a second text node method that would append a EOL character after it (this method would also use the same technique as described above for the white spaces)

n.$textnl(3, {e1:[9, "foo['bar']", false]}, ["baz", 1], 2)

  • $textnl means that we want a text node with a new line after
  • 3 white spaces need to be added before the node
  • 2 white spaces need to be added after it and before the line break character.

What do you think ?

@olaf-k
Copy link

olaf-k commented Sep 11, 2014

It's very reasonable to keep all white spaces and let the browser do its job, and as you propose, lines with only hashspace blocks can be safely removed.

I'm with @piuccio here. There are too many tricky cases and the optimization is not worth the complexity. Let's keep it simple.

@benouat
Copy link
Member

benouat commented Sep 11, 2014

FYI This is what they have put in place for handlebars

@marclaval
Copy link
Contributor Author

Interesting to see that they keep all white spaces by default, but that they also offer an option to remove them completely without any kind of collapsing or stripping logic.
It would interesting to know if this option is actually used!

@b-laporte
Copy link
Member

I am with @piuccio as well - i.e.

  • collapse all white spaces into 1 - by default - because the compiled JS must have a default level of optimization (i.e. can't be too big as in 95% cases one space is enough)
  • support a new 'keep-spaces' attribute to let the compiler know that all the white spaces belonging to an elemnent have to be kept - e.g.
<tempalte id="foo">
   <div> 
       White spaces are collapsed into 1 here
   </div>
   <div keep-spaces>
        <span> White spaces are    kept     here </span>
   </div>
</template>

PS: of course 'keep-spaces' should not be passed to the runtime as an attribute as it won't need it...

@marclaval
Copy link
Contributor Author

I measured the possible optimization gain based on https://github.com/ariatemplates/hashspace/blob/gh-pages/playground/playground-samples-all.js
This is the file which contains all the samples of the playground.

I compared the size of the file with and without this PR:

  • without compression, it increases from 130kB to 138kB, that's 6.2% more.
  • with compression (local zip), it increases from 31.7kB to 32.3kB, that's 1.9% more.

I'm really not sure that we need this kind of optimization.

@piuccio
Copy link
Contributor

piuccio commented Sep 12, 2014

Just for fun I created a tiny repository.

This page contains non optimized HTML, while this one is optimized using google page speed.

Their optimization is to collapse all white spaces into a single blank character (either a space or a new line).
<pre> tags are not optimized.

This has no impact on most of the elements except for the last two blocks, they have a naughty css property.

If you check the score from page speed insight you'll see that the non optimized page scores 99%, so I guess google does not give a very high priority to minifying HTML.

@Mlaval The sample you're using looks a bit small, is it really only 130kB??

@marclaval
Copy link
Contributor Author

It is 130.45kB according to Github. I used it for the measuring as it is the biggest file of compiled templates I could find.

It is also important to note that the proposed optimization will not bring back the size from 138kB to 130kB since there will be still some additional text nodes in compiled templates. It would be somewhere in the middle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants