[OPTIMISATION] Improving argument parsing #2283

Open
Tracked by #2250
denis-migdal opened this issue Oct 20, 2023 · 58 comments

Comments

@denis-migdal
Contributor

denis-migdal commented Oct 20, 2023

Current tasklist: #2283 (comment)

See [2275] for other optimizations about function calls.

========================================================

I started working on the optimisation of $B.args0().

I will edit this first message later to link to a GitHub repository dedicated to this project, and to summarise the results and the current progress.

I will use the messages below to discuss this issue.

Some ideas here.

@denis-migdal
Contributor Author

denis-migdal commented Oct 20, 2023

@PierreQuentel I started to rewrite $B.args0. Currently, it only supports function calls with positional and *t parameters.

Note: We can't pass positional and *t arguments to a **args parameter (at least with the Python implementation I tried).

It currently seems x2 to x5 faster than the current implementation.
Tomorrow I will post a GitHub repository with the code I use to test it.

Do you see any issues with this parser (see below)?
Is there something I did not take into account, or do you see a set of parameters that would produce bad results?

function SimpleParser(fct, args) {

    const result = {};

    const kargs  = args[args.length-2];
    const kwargs = args[args.length-1];

    const args_names    = fct.$infos.arg_names;
    const args_defaults = fct.$infos.__defaults__;
    const default_offset = args_names.length - args_defaults.length;
    const varargs_name  = fct.$infos.vararg;
    const kwargs_name   = fct.$infos.kwarg;

    // number of positional parameters actually filled by positional arguments
    const min = Math.min( args.length-2, args_names.length );

    // positional parameters...
    let offset = 0;
    for( ; offset < min ; ++offset)
        result[ args_names[offset] ] = args[offset];

    // vararg parameter
    if( varargs_name !== null )
        result[varargs_name] = args.slice( args_names.length, -2 );

    // positional only
    if( kargs === null && kwargs === null ) {

        // fewer positional arguments than parameters without a default value
        if( offset < default_offset )
            throw new Error('Not enough positional arguments');

        // default parameters
        for(let i = offset - default_offset;
                i < args_defaults.length;
                ++i)
            result[ args_names[offset++] ] = args_defaults[i];

        if( kwargs_name !== null )
            result[kwargs_name] = __BRYTHON__.obj_dict({});

        return result;
    }


    //TODO: named / **args arguments
    //TODO: *arg  / **args parameters

    return result;
}

@denis-migdal
Contributor Author

denis-migdal commented Oct 21, 2023

@PierreQuentel I should have a complete version now (?). I haven't fully tested it.
I do not show the proper error messages yet (but that won't affect performance).

Do you see an issue with this implementation?

In the very limited tests I made, I am generally x1.7 faster than the current implementation. It is actually even faster, as my measurements include the time taken by the loop itself (so I am really ~x1.85 faster).
Note: this only shows when timing more than 10,000,000 argument parsings; below that the results are too unstable, because browser time measurements aren't very precise for short durations.
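
The correction for loop overhead described above can be sketched as follows. This is only an illustration under assumptions: `timeLoop` and `compareParsers` are hypothetical helpers (they are not the code in test.html.zip), and `performance.now()` is assumed to be available, as it is in browsers and recent Node.js.

```javascript
// Hypothetical micro-benchmark helpers; none of these names come from Brython.
function timeLoop(iterations, body) {
    let sink = 0;                       // keep a result so the loop is less likely to be optimised away
    const start = performance.now();
    for (let i = 0; i < iterations; ++i)
        sink += body(i);
    return { elapsed: performance.now() - start, sink };
}

function compareParsers(iterations, currentParser, newParser) {
    // Cost of the loop alone, measured with a near-empty body.
    const baseline = timeLoop(iterations, (i) => i & 1).elapsed;
    // Subtract the loop cost so that only the parsing time is compared.
    const currentTime = timeLoop(iterations, currentParser).elapsed - baseline;
    const newTime = timeLoop(iterations, newParser).elapsed - baseline;
    return { baseline, currentTime, newTime, speedup: currentTime / newTime };
}
```

With a small iteration count the subtraction mostly yields noise (possibly negative values), which is consistent with the remark that only very large counts give stable ratios.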

Some variables can still be precomputed to speed things up, and some small tweaks might increase performance a little more:

  • fct.$infos => const $infos = fct.$infos
  • $infos.karg_names doesn't seem to exist. Some parameters can't accept positional arguments and therefore shouldn't be in $infos.arg_names, e.g. in def foo(a, *t, b):, b can't be positional.
  • I use an array for const keys; maybe other structures would perform better, potentially preallocated:
    • Map() with values initially set to true (not yet set), then switched to false once set (but initialization may be quite costly).
    • {} with values initially set to true (not yet set), then switched to false once set (but initialization may be quite costly).
    • a {} of already defined values and a {} of values not defined yet (again, can be costly).
    • the same with 2 Set().
  • For functions having a **kargs parameter, we can optimize things (we don't need keys as an array).

There are some changes for function calls. The last argument always holds the named arguments, so we don't need the strange {$kw:[]} structure, only an array, or null if there are no named arguments. This could give us a very small speed increase too.

Here is the web page I used to test it: [test.html.zip](https://github.com/brython-dev/brython/files/13060472/test.html.zip)
It is quite dirty but it does the trick.

And here is my argument parser:

function SimpleParser(fct, args) {

    const result = {};

    //TODO: rename :
    // - args   = when calling   the function.
    // - params = when declaring the function.

    const args_names    = fct.$infos.arg_names;
    const nb_pos_params = args_names.length;
    const varargs_name  = fct.$infos.vararg;
    const kwargs_name   = fct.$infos.kwarg;
    const nb_pos_args   = args.length-1;

    const min = Math.min( nb_pos_args, nb_pos_params );

    const kargss  = args[nb_pos_args];

    // positional parameters...
    let offset = 0;
    for( ; offset < min ; ++offset)
        result[ args_names[offset] ] = args[offset];

    // vararg parameter
    //TODO: NOT SURE FOR kargss !!!!
    if( varargs_name !== null )
        result[varargs_name] = args.slice( nb_pos_params, -1 );
    else if( nb_pos_args > nb_pos_params ) {
        throw new Error('Too many positional arguments');
    }

    // positional only
    if( kargss === null ) {

        const args_defaults = fct.$infos.__defaults__;
        const nb_defaults   = args_defaults.length;
        const default_offset= nb_pos_params - nb_defaults;

        // fewer positional arguments than parameters without a default value
        if( offset < default_offset )
            throw new Error('Not enough positional arguments');

        // default parameters
        for(let i = offset - default_offset;
                i < nb_defaults;
                ++i)
            result[ args_names[offset++] ] = args_defaults[i];

        if( kwargs_name !== null )
            result[kwargs_name] = __BRYTHON__.obj_dict({});

        return result;
    }

    const kargs_names   = fct.$infos.karg_names ?? args_names.slice(); // TODO: missing !!!

    const keys  = kargs_names.slice(offset);
    // other structs possibles ???

    const extra = {};

    for(let id = 0; id < kargss.length; ++id ) {

        const kargs = kargss[id];
        for(let argname in kargs) {

            let i = keys.indexOf(argname);

            if( i === -1) {
                if( kwargs_name === null )
                    throw new Error('Unfound named parameters or duplicate');

                // not quite optimized for a **kwargs parameter...
                // no need for keys.indexOf() if there is a **kwargs parameter.
                if(argname in result || argname in extra)
                    throw new Error('Defined many times !');

                extra[argname] = kargs[argname];
                continue;
            }

            result[ argname ] = kargs[argname];
            //delete keys[i];
            keys[i] = null;
        }
    }

    if( kwargs_name !== null )
        result[kwargs_name] = __BRYTHON__.obj_dict(extra);

    const kargs_defaults= fct.$infos.__kwdefaults__;

    // checks default values...
    for(let ioffset = 0; ioffset < keys.length; ++ioffset) {

        const key = keys[ioffset];
        if( key !== null ) {
            if( ! (key in kargs_defaults ))
                throw new Error('missing values !!');

            result[key] = kargs_defaults[key];
        }

    }

    return result;
}

@denis-migdal
Contributor Author

Hmmm... it seems I can be even faster for named argument processing:

  • if there is a **args parameter: we don't need keys anyway, so do an optimized parsing (?).
  • if there is no **args parameter: I don't really need keys, I only need to count the number of assignments I made. Then I count the number of parameters not found when searching for their default value (likely to add a small cost), and compare the two numbers to determine whether an argument was duplicated or a nonexistent argument was given.

This could be tested. But I think we should first determine whether the proposed implementation is correct.
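
One possible reading of the counting idea above, as a minimal sketch. It assumes a function with no *t and no **args parameter; `parseWithCounting`, `paramNames`, `named` and `defaults` are hypothetical names used only for illustration, not Brython's actual data layout.

```javascript
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical sketch: detect duplicate or unknown named arguments by
// comparing two counters instead of maintaining a `keys` array.
function parseWithCounting(paramNames, positional, named, defaults) {
    if (positional.length > paramNames.length)
        throw new TypeError('too many positional arguments');

    const result = {};
    const nbPos = positional.length;
    for (let i = 0; i < nbPos; ++i)
        result[paramNames[i]] = positional[i];

    // 1. Blindly assign every named argument and count the assignments.
    let assigned = 0;
    for (const name in named) {
        result[name] = named[name];
        ++assigned;
    }

    // 2. Walk the parameters not filled positionally: count those already
    //    set by a named argument, fill the others from the defaults.
    let matched = 0;
    for (let i = nbPos; i < paramNames.length; ++i) {
        const name = paramNames[i];
        if (hasOwn(result, name)) { ++matched; continue; }
        if (!hasOwn(defaults, name))
            throw new TypeError(`missing argument '${name}'`);
        result[name] = defaults[name];
    }

    // 3. A mismatch means a named argument hit a positionally-filled
    //    parameter (duplicate) or a name that is not a parameter at all.
    if (matched !== assigned)
        throw new TypeError('unexpected or duplicate named argument');
    return result;
}
```

For example, `parseWithCounting(['a', 'b', 'c'], [1], {c: 3}, {b: 2})` binds a, b and c, while passing `{a: 5}` or `{z: 1}` as named arguments raises an error.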

@denis-migdal
Contributor Author

denis-migdal commented Oct 21, 2023

Still doing some tests (but I'll try to stop now xD).

  • The new version (see below) seems more optimized when there are named arguments (I didn't optimize it for a **kwargs parameter though).
  • Testing is hard. Not enough tests = unstable results. Too many tests = strange memory issues, it seems (they started to appear when I put the functions into different files).
  • Tests might be slightly skewed against my new parsing methods as I test them first (running them second seems to speed them up a little ???).

Sources: parse_args.zip

See also : https://jfmengels.net/optimizing-javascript-is-hard/

@PierreQuentel
Contributor

Denis, I'm sorry but I need a Pull Request with a complete replacement for argument parsing, that is, a new version of the functions in py_utils.js that passes all the tests in the built-in test suite (/tests/index.html).

Otherwise it's impossible to know if your version correctly covers all the cases, and comparing its speed to that of the current implementation won't be useful.

@denis-migdal
Contributor Author

denis-migdal commented Oct 21, 2023

Denis, I'm sorry but I need a Pull Request with a complete replacement for argument parsing, that is, a new version of the functions in py_utils.js that passes all the tests in the built-in test suite (/tests/index.html).

Something isn't right.

I downloaded the zip from GitHub, replaced $B.args0 in py_utils.js, ran python make_dist.py, opened /tests/index.html ... and I passed all the tests the first time... This isn't normal.

Do the tests rely more on $B.parse_args?
Do you have tests for all possible combinations of parameters and arguments?
Is there another script I have to call to build the Brython files?

EDIT: I throw an exception in my new function and still pass all the tests.
It seems clear that my function isn't called.
EDIT: did the same in parse_args.

@denis-migdal
Contributor Author

Also :

$ python3.12 make_release.py 
/tmp/brython-master/scripts/make_release.py:54: SyntaxWarning: invalid escape sequence '\d'
  content = re.sub("npm/brython@\d\.\d+\.\d+", "npm/brython@" + vname,
/tmp/brython-master/scripts/make_release.py:56: SyntaxWarning: invalid escape sequence '\d'
  content = re.sub("npm/brython@\d\.\d+\s", "npm/brython@" + vname2,
/tmp/brython-master/scripts/make_release.py:58: SyntaxWarning: invalid escape sequence '\d'
  content = re.sub("npm/brython@\d\.\d+\.x", "npm/brython@" + vname2 + '.x',
/tmp/brython-master/scripts/make_release.py:60: SyntaxWarning: invalid escape sequence '\d'
  content = re.sub("npm/brython@\d\s", "npm/brython@" + vname1,
/tmp/brython-master/scripts/make_release.py:62: SyntaxWarning: invalid escape sequence '\d'
  content = re.sub("npm/brython@\d\.x\.y", "npm/brython@" + vname1 + '.x.y',
/tmp/brython-master/scripts/make_release.py:64: SyntaxWarning: invalid escape sequence '\.'
  content = re.sub("3\.\d+\.x", f'3.{version.version[1]}.x', content)
Brython [3, 12]
CPython (3, 12)
Traceback (most recent call last):
  File "/tmp/brython-master/scripts/make_release.py", line 17, in <module>
    import make_ast_classes       # generates /src/py_ast_classes.js
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/brython-master/scripts/make_ast_classes.py", line 12, in <module>
    f = open('Python.asdl', encoding='utf-8')
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'Python.asdl'

@PierreQuentel
Contributor

I downloaded the zip from GitHub, replaced $B.args0 in py_utils.js, ran python make_dist.py, opened /tests/index.html ... and I passed all the tests the first time... This isn't normal.

I don't know what is wrong here.

You don't have to run make_dist.py, tests/index.html uses the individual Brython scripts (brython_builtins.js, py_utils.js, etc...).

Can you open /src/py_utils.js in the address bar to check that it is the version you modified?

The error message in make_release.py does not surprise me. It is not a script meant to be used by Brython users, only the release manager :-). It uses files downloaded from various places by the script scripts/downloads.py.

@denis-migdal
Contributor Author

It seems the files are cached. I have to go to src/py_utils.js, then refresh the page, in order to update it after modifying it.

Now I get some errors :)
Gonna see if I can fix them quick ;).

@denis-migdal
Contributor Author

denis-migdal commented Oct 21, 2023

It seems to work quite well (for now I get an error at line 529 of the basic test suite).

I have a few small errors to fix:

  • some stuff linked to the input/precomputed data.
  • transforming the *args into a tuple.
  • I confused < and > in one condition (my bad xD).

My code may be a little slower as I don't have access to some precomputed data in the format I'd need, so I have to compute it when parsing arguments.

For now, I also use the original $B.args0() to build the expected exception.

So we'll have to discuss a little once I pass all the tests ;).

I also have one question:

  • What are the differences between $defaults and $infos.__kwdefaults__?

@denis-migdal
Contributor Author

Another question (sorry I am not quite used to Python) :

  • in def format(self, /, **kwargs):, what is / ?

@denis-migdal
Contributor Author

denis-migdal commented Oct 22, 2023

Yeah I'm stuck at 1043 of the basic test suite because of this /.
I don't understand this behavior :

>>> class X:
...     def foo(self, **args):
...             print(args)
... 
>>> x = X()
>>> x.foo(self=x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: X.foo() got multiple values for argument 'self'
>>> class X:
...     def foo(self, /, **args):
...             print(args)
... 
>>> x = X()
>>> x.foo(self=x)
{'self': <__main__.X object at 0x7feb070463e0>} 

EDIT: I think I understand it now.
I found no Python documentation on it, but it seems that everything to the left of "/" can only be set by a positional argument, and no error is raised when a key in **args has the same name as a parameter to the left of "/".
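
The behaviour observed in the session above can be mimicked with a small JavaScript sketch. It is a simplified model, not Brython's implementation: `bindWithPosOnly` and its parameters are hypothetical, and checks for missing or surplus positional arguments are deliberately left out.

```javascript
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical sketch. paramNames[0 .. nbPosOnly-1] are the parameters
// declared to the left of "/"; they can never be bound by name.
function bindWithPosOnly(paramNames, nbPosOnly, hasKwargs, positional, named) {
    const result = {};   // regular parameters
    const extra = {};    // content of the **args parameter
    // Assumes positional.length <= paramNames.length.
    for (let i = 0; i < positional.length; ++i)
        result[paramNames[i]] = positional[i];

    for (const name in named) {
        const idx = paramNames.indexOf(name);
        if (idx >= nbPosOnly) {
            // The name designates a parameter that accepts keywords.
            if (hasOwn(result, name))
                throw new TypeError(`got multiple values for argument '${name}'`);
            result[name] = named[name];
        } else if (hasKwargs) {
            // Unknown name, or name of a positional-only parameter:
            // it goes into **args without clashing with the parameter.
            extra[name] = named[name];
        } else {
            throw new TypeError(`unexpected keyword argument '${name}'`);
        }
    }
    return { result, extra };
}
```

With `nbPosOnly = 0` the call `x.foo(self=x)` is modelled by `bindWithPosOnly(['self'], 0, true, [x], {self: x})` and raises; with `nbPosOnly = 1` the same call succeeds and `self` ends up in `extra`.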

@PierreQuentel
Contributor

Your guess is correct. The reference here is the Language Reference, specifically the section about function definition.

@denis-migdal
Contributor Author

Thanks.

I'm struggling to fix little details like this xD.
Passed the first test suite though ^^.

@denis-migdal
Contributor Author

completed all tests in 44.59 s.
failed : []

^^

Normal time with the current implementation : ~45 s.

However, comparison isn't fair :

  • I need to clean my code a little.
  • When I detect an error, I call the original args0() function to build the expected error message.
  • I do not always use or have the pre-computed data I'd need. So this adds a little cost.

@denis-migdal
Contributor Author

Currently, before any code cleaning, I am (on the very limited tests I made):

  • slower than the current implementation for positional arguments (x0.8 to x0.9 of its speed).
  • x1.2 to x1.3 faster than the current implementation for named arguments (I was ~x1.7 faster before).

But as I said before, the tests are biased against my method as I test it first.

@denis-migdal
Contributor Author

denis-migdal commented Oct 22, 2023

Added a little shortcut, now x1.3 to x1.7 faster for positional arguments (of course, on the very limited speed tests I made).

It could be faster if I precomputed some data upon function creation.

EDIT: I could also remove some tests if functions had several args0() functions, depending on whether they have *t, **args, etc. parameters.

@denis-migdal
Contributor Author

@PierreQuentel What should be the next step, knowing that:

  • Currently, I pass all tests and should be faster than the current implementation (though without adaptations I'm still suboptimal, see below).
  • I only replaced the $B.args0() function, not the other functions.
  • I use the original $B.args0() function to raise exceptions when an error is detected.
  • To be more efficient, I'd need the last argument to always be null or the array of named arguments [{}, $B.obj_dict({"e": 32})] (no need for the {$kw:[...]} structure anymore).
  • To be more efficient, I'd need some precomputed data (this saves me from performing a copy at each argument parsing).
  • To be more efficient, we could have different $B.args0() functions called inside functions, depending on whether the function has e.g. a *arg parameter.

@PierreQuentel
Contributor

Great news Denis ! Can you submit a PR so that I can test what you have done so far ?

@denis-migdal
Contributor Author

Ok.

Tomorrow I'll rename some variables and add some comments, before submitting a PR.

denis-migdal added a commit to denis-migdal/brython-contribs- that referenced this issue Oct 24, 2023
@denis-migdal
Contributor Author

PR made.
See #2287

PierreQuentel added a commit that referenced this issue Oct 25, 2023
@denis-migdal
Contributor Author

denis-migdal commented Oct 25, 2023

To improve performance even more:

  • In functions, use different args0 functions depending on what the function takes as parameters (requires changes in the JS generation from AST).
  • Replace the other functions like parse_args with the new algorithm (?).
  • Precompute the list of named-only default values.
  • Is there a faster way to build a tuple from a subset of an array?
  • Modify the generated JS so that the last argument is always null or [{}, ...] (no need for the {$kw:[]} structure anymore).

To have cleaner code:

  • Generate the exceptions to be raised myself when an error is detected, instead of relying on the original args0.

@PierreQuentel
Contributor

Having a different version of $B.args0 if the function has default values or not is unfortunately not possible, because the default values can be set at run time, like in

def f(x):
  pass

try:
  f() # here, f() has no defaults
  raise Exception('should have raised TypeError')
except TypeError:
  pass

f.__defaults__ = (0,)
# f() now has a default value for x, so f() no longer raises TypeError
f()

@denis-migdal
Contributor Author

Having a different version of $B.args0 if the function has default values or not is unfortunately not possible, because the default values can be set at run time

WTF Python xD.

Well, it could be possible if __defaults__ was implemented using a setter that would also modify the args0 used by the function? Like fct.$args0(...). But I guess that would become a little tricky?

@PierreQuentel
Contributor

__defaults__ is already implemented as a setter in py_builtin_functions.js / $B.function.__setattr__ . It sets the attribute $defaults of the function object. The generated code could start with a test on f.$defaults and choose a different version of $B.args0 depending on its value ?

@denis-migdal
Contributor Author

denis-migdal commented Oct 26, 2023

The generated code could start with a test on f.$defaults and choose a different version of $B.args0 depending on its value ?

If we do such a check before each call of $B.args0(), that'd kill performance.

I think it'd be better to initially set fct.args0 = $B.args0_XXXX() at build time, and to update fct.args0 inside the __defaults__ setter whenever its value is modified.
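
A minimal sketch of that proposal, under stated assumptions: functions are modelled as plain objects with hypothetical `paramNames`, `defaults` and `args0` fields, and `setDefaults` stands in for the __defaults__ setter. None of these names are Brython's real ones, and only positional arguments are handled.

```javascript
// Hypothetical specialised parsers: one per "shape" of function.
function parseNoDefaults(fct, args) {
    const names = fct.paramNames;
    if (args.length !== names.length)
        throw new TypeError('wrong number of positional arguments');
    const result = {};
    for (let i = 0; i < args.length; ++i)
        result[names[i]] = args[i];
    return result;
}

function parseWithDefaults(fct, args) {
    const names = fct.paramNames, defaults = fct.defaults;
    const firstDefault = names.length - defaults.length;
    if (args.length > names.length || args.length < firstDefault)
        throw new TypeError('wrong number of positional arguments');
    const result = {};
    let i = 0;
    for (; i < args.length; ++i)
        result[names[i]] = args[i];
    for (; i < names.length; ++i)
        result[names[i]] = defaults[i - firstDefault];
    return result;
}

// Stand-in for the __defaults__ setter: the parser is chosen here, once,
// so that no test on the defaults is needed at call time.
function setDefaults(fct, defaults) {
    fct.defaults = defaults;
    fct.args0 = defaults.length === 0 ? parseNoDefaults : parseWithDefaults;
}
```

This mirrors the Python example above: with `f = {paramNames: ['x']}` and `setDefaults(f, [])`, `f.args0(f, [])` raises; after `setDefaults(f, [0])`, the same call binds x to 0.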

@PierreQuentel
Contributor

I think it'd be better to initially set a fct.args0 = $B.args0_XXXX() during build time, and to update the value of fct.args0 inside the __defaults__ setter whenever its value is modified.

Agreed. In fact, a specific, optimized argument parsing function could be created for each function and updated if __defaults__ or __kwdefaults__ is set.

@denis-migdal
Contributor Author

Thanks.

Mmm... I think I might be able to do it.
I'll test it another day as I am currently working on something else.

@denis-migdal
Contributor Author

denis-migdal commented Nov 9, 2023

Okay, here is my roadmap:

  • Modify the JS generation from AST to use the new parsing method on the benchmark, putting all functions in the same loop (prevents browser optimisation).
    • Potentially fix/optimize my output.
  • Modify the JS generation from AST to use the new parsing for all functions, to validate against the unit tests.
    • Will require modifying the default attributes getter (@PierreQuentel where is it defined?)
  • Optimize function creation by having a "cache" of parsing functions.
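
The "cache" of parsing functions in the last item could look like the sketch below. It is an assumption-laden illustration: parsers are generated with `new Function` and keyed only by the number of positional parameters and the presence of a *t parameter; `getParser`, `names` and `varargName` are hypothetical, and named arguments and defaults are ignored.

```javascript
// Hypothetical cache: one generated parser per parameter "shape".
const parserCache = new Map();

function getParser(nbParams, hasVararg) {
    const key = `${nbParams}:${hasVararg ? 1 : 0}`;
    let parser = parserCache.get(key);
    if (parser === undefined) {
        // Build the source of a parser specialised for this shape.
        let body = 'const result = {};\n';
        body += `if (args.length < ${nbParams}) throw new TypeError('missing positional arguments');\n`;
        for (let i = 0; i < nbParams; ++i)
            body += `result[names[${i}]] = args[${i}];\n`;   // unrolled assignments
        if (hasVararg)
            body += `result[varargName] = args.slice(${nbParams});\n`;
        else
            body += `if (args.length > ${nbParams}) throw new TypeError('too many positional arguments');\n`;
        body += 'return result;';
        // Parsing this source is the expensive step; the cache pays it once per shape.
        parser = new Function('names', 'varargName', 'args', body);
        parserCache.set(key, parser);
    }
    return parser;
}
```

Two functions with the same shape then share one parser: `getParser(2, false)` returns the same function object on every call, and only the parameter names are passed at parse time.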

@PierreQuentel
Contributor

What do you mean by "attribute getter" ?

If it is attribute resolution (compute the result of getattr(object, attr)) it is done in py_builtin_functions.js / $B.$getattr

@denis-migdal
Contributor Author

denis-migdal commented Nov 9, 2023

What do you mean by "attribute getter" ?

Sorry, it was a typo.

I meant the setter for __defaults__ and __kwdefaults__ (or something like that), which might change the parsing function to use when they are modified.

@PierreQuentel
Contributor

Great! I could reproduce the same results, around 25% faster than the current implementation.

The next step would be to set the parsing function, after the section in FunctionDef.prototype.to_js() commented with "//Set admin info". Something like

// Set parsing function
// Compute arguments required for generate_args0, based on those defined
// in this function (has_posonlyargs, _defaults, kw_defaults, etc.)
let hasPosOnly = ...
js += `${name2}.arg_parser = $B.generate_args0(${hasPosOnly}, ...)\n`

Setting function defaults is done by $B.make_function_defaults() in py_builtin_functions.js. With this new version it would also reset the function attribute arg_parser.

@denis-migdal
Contributor Author

denis-migdal commented Nov 10, 2023

The next step would be to set the parsing function, after the section in FunctionDef.prototype.to_js() commented with "//Set admin info". Something like

It doesn't work for the function run15 : I set run15.args_parser = ..., but inside the function, run15.args_parser isn't defined.

If I set run.args_parser = ..., I can access it inside the function. However, if I write js += this.name + ".args_parser = ....", I get another error for other functions.

EDIT: but if I put it in run15.$infos.parse_args it works...

@denis-migdal
Contributor Author

denis-migdal commented Nov 10, 2023

I passed all unit tests.
I still have some garbage to remove.

@PierreQuentel I have several questions:

  • Is make_function_defaults called at the creation of each function? Currently my getArgs0() is called twice at each function creation, once after $infos and once in make_function_defaults.
    EDIT: it seems to be the case when I look at the generated JS.
  • Python dictionaries with string keys/values are a little chaotic (for kwdefaults and kwargs):
    • The JS object can be in either $jsobj or $strings. It would be more performant if I could access it through a single attribute (and without having to call a function), e.g. by setting $strings when setting $jsobj.
    • $jsobj and $strings can be either undefined or null. Likewise, it would be more performant if it was only null or only undefined.

@denis-migdal
Contributor Author

Okay, I think I'm done for the current PR.

@PierreQuentel
Contributor

It's dangerous to access the dictionary keys / values from $jsobj or $strings, the internal implementation of dictionary might change in the future. It's better to use dict internal methods such as dict.$get_string or dict.$iter_items_with_hash(), which abstract the implementation.

I found another issue with your new implementation. In Python, an instance of a class with methods keys() and __getitem__ can be passed under the form **kwarg

def f(**kw):
    return kw

class D:

  def keys(self):
      yield 'a'

  def __getitem__(self, key):
      return 99

d = D()

result = f(**d)
assert result == {'a': 99}, result

@denis-migdal
Contributor Author

It's dangerous to access the dictionary keys / values from $jsobj or $strings, the internal implementation of dictionary might change in the future. It's better to use dict internal methods such as dict.$get_string or dict.$iter_items_with_hash(), which abstract the implementation.

OK, then I'll precompute a $kw_defaults for the default values, and see which abstraction method I'll use for **kwargs.

I found another issue with your new implementation. In Python, an instance of a class with methods keys() and __getitem__ can be passed under the form **kwarg

I assume dictionaries also have this keys() method?
Do I also have to verify that each key is a string, and raise an error if it isn't?
I'll have to start building the error messages myself.

It'll be slower when parsing **kwargs, but well, we can't have everything.
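
A rough JavaScript model of the unpacking discussed above, as a sketch only. The Python object is represented by a hypothetical JS object exposing `keys()` (an iterable) and `getitem(key)`; a real implementation would go through Brython's attribute lookup and call machinery instead.

```javascript
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical sketch: copy the items of a mapping-like object into
// `target`, checking key types and duplicates along the way.
function unpackMapping(mapping, target) {
    for (const key of mapping.keys()) {
        if (typeof key !== 'string')
            throw new TypeError('keywords must be strings');
        if (hasOwn(target, key))
            throw new TypeError(`got multiple values for keyword argument '${key}'`);
        target[key] = mapping.getitem(key);
    }
    return target;
}

// JS stand-in for the class D of the Python example above.
class D {
    *keys() { yield 'a'; }
    getitem(key) { return 99; }
}
```

`unpackMapping(new D(), {})` yields an object whose only entry is a = 99, which corresponds to the assertion in the Python test case.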

@denis-migdal
Contributor Author

Done.

Maybe not the most optimal way to loop over dict elements for **kwargs arguments.

I may be wrong, but I don't think you have a unit test for the case where a dict with non-string keys is passed as **kwargs.

@PierreQuentel
Contributor

Could you make a PR to fix this issue in the current development version ?

@denis-migdal
Contributor Author

Could you make a PR to fix this issue in the current development version ?

Wouldn't it be better to directly merge #2316, as it seems faster in all conditions (-5% to -10% in total execution time / -20% when there is no heavy browser optimization)?

I just need to remove the "USE_PERSO_ARGS0_EVERYWHERE" condition once you validate it.

@denis-migdal
Contributor Author

Done.
Question : do you still use fct.$defaults ?

@denis-migdal
Contributor Author

@PierreQuentel

@denis-migdal

This comment was marked as outdated.

@denis-migdal

This comment was marked as outdated.

@denis-migdal

This comment was marked as outdated.

@denis-migdal
Contributor Author

denis-migdal commented Nov 21, 2023

Tsk, the previous idea (I hid it) won't work, as functions can be substituted/redefined in Python...

Such a shame, this could have led to precomputing part of the result, which would make argument parsing almost free in some cases...

Python really has rules that prevent a lot of optimizations...

@denis-migdal
Contributor Author

denis-migdal commented Nov 21, 2023

Okay, still, I got a great optimization idea for calls using named arguments.

Such calls are written as:

$B.$call(....)(a,b, {$kw: [{}, ...]})

Meaning that, when we do not have a **kwarg parameter, we already have an object containing some of the function's parameters.

So, inside the parser, instead of creating a new result = {} and later filling it with the values of args[i].$kw[0], we can directly use args[i].$kw[0] as result. We just have to count the number of elements prefilled in our result.

This can possibly give a big speed-up.
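
The reuse of `$kw[0]` as the result object can be sketched as follows. The sketch makes several simplifying assumptions that are not in the original: the call site always passes a `{$kw: [...]}` object last and builds it fresh for each call, there are no ** arguments (only `$kw[0]` is used), and the function has no defaults, *t or **kwarg parameter. `parseReusingKw` is a hypothetical name.

```javascript
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical sketch: the object holding the named arguments becomes the result.
function parseReusingKw(paramNames, args) {
    const nbPos = args.length - 1;
    const result = args[nbPos].$kw[0];     // reused, not copied

    // Count the entries prefilled by the named arguments.
    let prefilled = 0;
    for (const _ in result) ++prefilled;
    if (nbPos + prefilled !== paramNames.length)
        throw new TypeError('wrong number of arguments');

    // Positional arguments must not collide with a prefilled name.
    for (let i = 0; i < nbPos; ++i) {
        const name = paramNames[i];
        if (hasOwn(result, name))
            throw new TypeError(`got multiple values for argument '${name}'`);
        result[name] = args[i];
    }
    // The counts match, so if every remaining parameter is present,
    // no unknown name can have been passed.
    for (let i = nbPos; i < paramNames.length; ++i)
        if (!hasOwn(result, paramNames[i]))
            throw new TypeError('missing or unexpected named argument');
    return result;
}
```

Because the returned object is the very one created at the call site, this only works if that object is never shared between calls.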

Of course, we could even rewrite function calls as :

$B.$call(....)(a,b,
                      2, // pre-compute number of named arguments
                      {a: 32, b:34}, // named arguments
                      null // **kargs arguments, can be [...] if there are some
                      )

Sure, it'll create an object even when we do not use named arguments, BUT we'd be able to recycle it if the function doesn't accept **kwargs, or if the precomputed number of named arguments is 0 (i.e. the object is empty).
The count could be negative when there are no named arguments but there are **kargs arguments, which would allow the check with a single comparison instead of 2.

Another optimization: Object.create(null) seems 1.18x faster than {} for creation, access, and the in operator. Almost identical for modification.
{} is used in 368 places in the Brython code.

EDIT: not a good idea, modification is forbidden in strict mode.
Creation with __proto__: null is very slow.
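
Independently of speed, the two kinds of object also behave differently with the in operator, which matters for checks such as `argname in result` in the parser shown earlier. A small illustration in plain JavaScript (the helper name `alreadyBound` is hypothetical):

```javascript
const plain = {};                  // inherits from Object.prototype
const bare = Object.create(null);  // has no prototype at all

// `in` also looks at inherited properties.
const inPlain = 'toString' in plain;   // true, found on Object.prototype
const inBare = 'toString' in bare;     // false

// An own-property test avoids the false positive on plain objects.
function alreadyBound(result, name) {
    return Object.prototype.hasOwnProperty.call(result, name);
}
const ownInPlain = alreadyBound(plain, 'toString');   // false
```

With a plain `{}`, a named argument called `toString` or `constructor` would therefore look as if it were already defined when tested with in.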

PierreQuentel added a commit that referenced this issue Nov 26, 2023
… (reduces size of generated Javascript). Adapt $B.make_function_defaults to JS coding style; remove f.$defaults. Related to issue #2283.
@PierreQuentel
Contributor

In the commit above and the previous ones I have reduced the size of generated Javascript code for functions by delegating the creation of f.$infos to functions $B.make_function_infos() and $B.make_code_attr(). I have also removed f.$defaults in $B.make_function_defaults()

@denis-migdal
Contributor Author

Cool.

I currently have some work to do plus some personal matters, so I'm not very available these days.

But once I have a little time, I'll add the small optimization I thought of for named arguments ;).

@PierreQuentel
Contributor

When publishing 3.12.1 and running the speed test script, I observed that function creation was slower than before, which is logical since the function argument parser is built on function creation, and it takes a little time.

In commit 4f5bc28 I have slightly modified the feature : the parser is only built the first time the function is called. With this change, function creation is now faster, and execution is the same.

@denis-migdal
Contributor Author

When publishing 3.12.1 and running the speed test script, I observed that function creation was slower than before, which is logical since the function argument parser is built on function creation, and it takes a little time.

There are several things:

  1. the creation of the parser function, which is indeed slow as we use Function. We could pre-create some parsers that are used very often, but I don't think this will be a huge gain.
  2. finding which parser to use. This isn't optimized.

Please note that Function is very slow as the JS code has to be parsed in order to create the function. This cost is "hidden" when you write the function directly in the code, as it is parsed before execution starts.

In commit 4f5bc28 I have slightly modified the feature : the parser is only built the first time the function is called. With this change, function creation is now faster, and execution is the same.

Hum, you add a condition (${name2}.$args_parser ?? $B.make_args_parser(${name2})) on each function call?
Conditions are very expensive for such code.

Maybe you can do it like this to remove the condition:

function make_args_parser_then_parse(fct, args) {
    // replace the initial fct.args_parser by the newly created parser
    fct.args_parser = $B.make_args_parser(fct);
    // you can do it in one line if you want
    return fct.args_parser(fct, args);
}

// create a new function:
function foo() {} // do stuff
// called upon first parsing: creates the parser, then parses
foo.args_parser = make_args_parser_then_parse;
// subsequent parsings call the parser directly

In the same way, maybe there is a way to build the function's $infos only the first time it is needed (and build the parser at the same time?):

// create a new function:
function foo() {} // do stuff
Object.defineProperty(foo, "$infos", {
    configurable: true, // required so that the getter can replace itself
    get: function() {
        const $infos = createInfos(foo);
        // replace the getter by the computed value
        Object.defineProperty(foo, "$infos", { value: $infos });
        return $infos;
    }
});

Something like that. createInfos could then also call $B.make_args_parser to avoid doing some operations twice.
We could then replace make_args_parser_then_parse by make_$infos_and_parser_then_parse.

@denis-migdal
Contributor Author

Hum, Object.defineProperty() seems quite slow, a little too much. So the best solution would be to use make_args_parser_then_parse.

Do we need $infos outside of args_parser() / setting kw defaults / etc.?

I'll also have to do some benchmarks to see where this extra cost comes from.
If it is due to Function, I do not think we really need to bother much. It's just that the cost was previously hidden, as the JS code parsing time wasn't taken into account. If it is something else, I'll have to look at the possible optimizations.

@denis-migdal
Contributor Author

Made a PR ( #2336 ) for the small optimization I thought of (x1.11 in the parsing of named arguments when there is no **kwargs parameter).
