Performance Bottleneck in nodejs-polars When Creating Multiple Expr Objects #265
Comments
I encountered the same issue. |
It's partly a |
My project highly depends on Node.js, it's hard to migrate to Bun or Deno... |
Any other operation using |
For me |
@maizhichao unfortunately, this is a known issue with nodejs. Their FFI implementation is incredibly slow compared to python or other js engines (deno, bun). The main bottleneck is sending values over n-api, which refactoring |
Have you tried latest version of polars?
What version of polars are you using?
0.0.15
What operating system are you using polars on?
MacOS 14.4.1 (M3 Pro)
What node version are you using?
v20.12.1
Describe your bug.
I have encountered a significant performance issue when using the nodejs-polars library. Specifically, the time required to create multiple Expr objects is considerably higher than in the Python version of polars.
What are the steps to reproduce the behavior?
To illustrate the issue, I conducted a performance test by generating one million Expr objects in both nodejs-polars and Python polars. The following code snippets demonstrate the test setup:
Python Code
Node.js Code
What is the actual behavior?
Python polars: Approximately 7 seconds to create 1,000,000 Expr objects.
Node.js polars: Approximately 1,000 seconds to create the same number of Expr objects.
Impact
This performance discrepancy presents a significant bottleneck when performing operations that require frequent creation of Expr objects in nodejs-polars. It substantially limits the library's usability for large-scale data processing tasks in a Node.js environment.
What is the expected behavior?
The performance of creating Expr objects in nodejs-polars should be closer to, or ideally match, the performance in the Python version of polars.
Possible Reason
The issue might be caused by _Expr, which creates a new Expr object every time it is executed. Each execution takes about 0.5 ms on my laptop, which adds up to considerable time when executed a million times. Moreover, in my test dataset the actual computation over millions of rows of data is very short; most of the time is spent creating the Expr objects.
There might be two ways to solve this problem:
Replace _Expr with a class, and return this in each expression method, avoiding the time-consuming creation of a complex new object in JavaScript.
Thanks for reading the issue. I hope my suggestion is helpful for the nodejs-polars library.
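To make the class suggestion concrete, here is a minimal sketch of the two patterns. It is not the nodejs-polars code: makeExpr, ClassExpr, timeMs and the plain-object "node" are hypothetical stand-ins, and no native calls are involved, so it only shows the JavaScript-side difference between building a fresh object with fresh closures per call and reusing prototype methods that return this.

```javascript
// Illustrative sketch only: these names are hypothetical and are not the
// nodejs-polars implementation. Plain objects stand in for native handles.

// Pattern A: factory function. Every call allocates a new object plus a
// fresh closure for each method.
function makeExpr(node) {
  return {
    node,
    add: (other) => makeExpr({ op: "add", left: node, right: other }),
    mul: (other) => makeExpr({ op: "mul", left: node, right: other }),
    alias: (name) => makeExpr({ op: "alias", input: node, name }),
  };
}

// Pattern B: class. Methods live once on the prototype, and each method
// updates the instance in place and returns `this`.
class ClassExpr {
  constructor(node) {
    this.node = node;
  }
  add(other) {
    this.node = { op: "add", left: this.node, right: other };
    return this;
  }
  mul(other) {
    this.node = { op: "mul", left: this.node, right: other };
    return this;
  }
  alias(name) {
    this.node = { op: "alias", input: this.node, name };
    return this;
  }
}

// Run fn `iterations` times and return the elapsed wall-clock milliseconds.
function timeMs(fn, iterations) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn(i);
  return performance.now() - start;
}

const N = 100_000;
const factoryMs = timeMs(
  (i) => makeExpr({ op: "col", name: "x" }).add(i).mul(2).alias("y"),
  N
);
const classMs = timeMs(
  (i) => new ClassExpr({ op: "col", name: "x" }).add(i).mul(2).alias("y"),
  N
);
console.log(`factory: ${factoryMs.toFixed(1)} ms, class: ${classMs.toFixed(1)} ms`);
```

Two caveats on the design. Returning this makes an expression mutable, so an expression that is reused in two places would be changed by either chain; an immutable API would still need a new object per step. Also, this sketch measures only JavaScript allocation, not the cost of sending values over n-api that is mentioned in the comments, so real gains in the library could be much smaller.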