Formulae does not respect original names in interactions and function calls #64

Open
tomicapretto opened this issue Feb 18, 2022 · 0 comments
Labels
enhancement New feature or request

Comments

@tomicapretto
Collaborator

It was a very bad choice to format tokens according to my personal taste. As a result, formulae changes the original names of terms. See x : y and f(x= z) below, and note the spaces around : and after =.

import pandas as pd
from formulae import design_matrices

df = pd.DataFrame(
    {
        "x": list("ab"),
        "y": list("cd"),
        "z": [1, 2]
    }
)

def f(x):
    return x

dm = design_matrices("0 + x : y + f(x= z)", df)
dm.common.terms
{'x:y': Term([Variable(x), Variable(y)]), 'f(x = z)': Term([Call(f(x = z))])}

But formulae reports two terms: x:y, with the spaces around : removed, and f(x = z), with spaces on both sides of =.

Let's see what other formula parsing libraries do.

Patsy

from patsy import dmatrix
dm = dmatrix("0 + x : y + f(x = z)", df)
dm.design_info.term_names
['x:y', 'f(x=z)']

Patsy also removes the spaces around : and reformats the function call, removing the spaces around the equals sign.

Formulaic

from formulaic import model_matrix
mm = model_matrix("0 + x : y + f(x= z)", df)
mm.model_spec.feature_names
['f(x= z)', 'x[T.a]:y[T.c]', 'x[T.a]:y[T.d]', 'x[T.b]:y[T.c]', 'x[T.b]:y[T.d]']

Formulaic removes the spaces around : but keeps the function call exactly as written.

I think that Formulaic is right. Operators such as : can be formatted, but we should not touch function calls.
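To make the proposed behavior concrete, here is a minimal sketch of a naming rule that normalizes whitespace around top-level operators while copying function calls verbatim. It is not formulae's parser: `split_top_level` and `term_names` are hypothetical helpers written for this illustration, and the sketch only handles `+` and `:` and assumes parentheses are balanced and do not appear inside string literals.

```python
def split_top_level(text, sep):
    """Split `text` on `sep`, ignoring separators nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def term_names(formula):
    """Return one name per term: ':' is normalized, call text is left untouched."""
    names = []
    for term in split_top_level(formula, "+"):
        # Only whitespace surrounding each component is stripped, so anything
        # inside a call's parentheses is kept exactly as the user wrote it.
        components = [c.strip() for c in split_top_level(term, ":")]
        names.append(":".join(components))
    return names


print(term_names("x : y + f(x= z)"))  # ['x:y', 'f(x= z)']
```

With this rule the interaction is named x:y, as in all three libraries, while the call keeps the user's spelling f(x= z), which matches what Formulaic does.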

@tomicapretto tomicapretto added the enhancement New feature or request label Feb 24, 2022