Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

passing-input-with-finite-value to core._mass is being violated #1011

Open
NimaSarajpoor opened this issue Jul 21, 2024 · 6 comments
Open

passing-input-with-finite-value to core._mass is being violated #1011

NimaSarajpoor opened this issue Jul 21, 2024 · 6 comments

Comments

@NimaSarajpoor
Copy link
Collaborator

The function core.mass(Q, T, ...) calls core.preprocess(T, m, ...), which return the following values:

  • T
  • M_T
  • Σ_T
  • T_subseq_isconstant

We then pass these values to core._mass. However, these M_T and Σ_T can contain non-finite value if the original T has non-finite value. In fact, if you just follow the function core._mass and go the callee function and then continue till you reach the very end of the road, you can see that at the end, we use that in the function core._calculate_squared_distance to determine whether a distance should be infinite or not.

stumpy/stumpy/core.py

Lines 1093 to 1094 in 3077d0d

if np.isinf(M_T) or np.isinf(μ_Q):
D_squared = np.inf

In the function core._calculate_squared_distance, the flag is correctly set. Was wondering if should do the same for core._mass?

@seanlaw
Copy link
Contributor

seanlaw commented Jul 21, 2024

What exactly are you proposing? To set fastmath={"nsz", "arcp", "contract", "afn", "reassoc"} for core._mass? Maybe... I don't know.

@NimaSarajpoor
Copy link
Collaborator Author

Right!

I am now thinking what if we just avoid setting fastmath here for core._mass because this function does not do any arithmetic operations. I can try it out and check its impact on performance if there is any.

@NimaSarajpoor
Copy link
Collaborator Author

In the following, A --> B means function A calls function B.

# in stumpy.core
mass --> _mass
_mass --> calculate_distance_profile
calculate_distance_profile -->  _calculate_squared_distance_profile
_calculate_squared_distance_profile --> _calculate_squared_distance

I removed the fastmath flag for all expect the last one, which comes with fastmath={"nsz", "arcp", "contract", "afn", "reassoc"}.

I then compared the performance of core.mass for the following input:

seed = 0
np.random.seed(seed)
Q = np.random.rand(100)
T = np.random.rand(1_000_000)

I ran the following script:

import numpy as np
import time
import stumpy

def check_mass_performance():
    n_iter = 100

    seed = 0
    np.random.seed(seed)
    Q = np.random.rand(100)
    T = np.random.rand(1000000)

    t_lst = []
    for _ in range(n_iter):
        start = time.time()
        stumpy.mass(Q, T)
        t_lst.append(time.time() - start)
    
    return np.mean(t_lst[1:]), np.std(t_lst[1:])


if __name__ == "__main__":
    out = check_mass_performance()
    print(f'mean: {out[0]}, std: {out[1]}' )

And I ran it for several times.

Running time mean std
case 1 0.096-0.097 ~0.002
case 2 0.096-0.097 ~0.002

where case 1 is the existing version and case 2 is when the fastmath flags are removed except for core. _calculate_squared_distance. Although I did not do any statistical test, it seems that there is no certain gain in having fastmath flags on all those njit functions.

@seanlaw
Copy link
Contributor

seanlaw commented Aug 3, 2024

I am now thinking what if we just avoid setting fastmath here for core._mass because this function does not do any arithmetic operations. I can try it out and check its impact on performance if there is any.

I see. When there is no speed gain, I would rather be explicit rather than implicit. In this case, it appears that the inputs/outputs can all contain non-finite values. This means that fastmath=True is "wrong". I would simply make every level of the nested function calls fastmath={"nsz", "arcp", "contract", "afn", "reassoc"}. That would be my preference as this would be very explicit and one would never have to question whether each function was allowed to have non-finite values (the answer is "yes!").

I think this should be handled together with #708 though

@NimaSarajpoor
Copy link
Collaborator Author

This means that fastmath=True is "wrong". I would simply make every level of the nested function calls fastmath={"nsz", "arcp", "contract", "afn", "reassoc"}. That would be my preference as this would be very explicit

That's a good idea! Got it!!

I think this should be handled together with #708 though

Okay. So, the plan is to first solve #708 (and this issue, i.e. #1011), and then resume the work on #1012 as suggested in #1012 (comment).

@seanlaw
Copy link
Contributor

seanlaw commented Aug 4, 2024

Yes, thank you @NimaSarajpoor!

@NimaSarajpoor NimaSarajpoor mentioned this issue Aug 6, 2024
59 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants