Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let sympy use log2(x) instead of log(x)/log(2) #712

Merged
merged 5 commits into from
Dec 6, 2024

Conversation

nerai
Copy link

@nerai nerai commented Sep 4, 2024

Fixes #711

The case of log10 is also included.

I was not immediately able to run the unit tests, but I tested it with this code:

import numpy as np
from pysr import PySRRegressor
import sympy
from sympy.codegen.cfunctions import log2

X = np.abs(np.random.randn(500, 1)) + 2
y = np.log2(X[:, 0] ** 2) + 10 / np.log10(X[:, 0])
model = PySRRegressor(
	unary_operators=[
		"log2",
		"log10",
	],
	maxsize = 20,
)
model.fit(X, y)
print(model.equations_["sympy_format"])

Equation: y = (10.003 / log10(x₀)) + log2(x₀ * x₀)
Translated to sympy: log2(x0*x0) + 10.003168/log10(x0)

@MilesCranmer
Copy link
Owner

codegen looks like an internal utility? https://docs.sympy.org/latest/modules/utilities/codegen.html Is there anything simpler we can use?

@coveralls
Copy link

coveralls commented Sep 4, 2024

Pull Request Test Coverage Report for Build 12207024711

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 3 of 3 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.01%) to 93.581%

Totals Coverage Status
Change from base Build 12206594525: 0.01%
Covered Lines: 1385
Relevant Lines: 1480

💛 - Coveralls

@nerai
Copy link
Author

nerai commented Sep 6, 2024

codegen looks like an internal utility? https://docs.sympy.org/latest/modules/utilities/codegen.html Is there anything simpler we can use?

I'm not familiar with sympy, so I'm not sure. The docs indicate that the regular log function is translated into the paired representation, though it is not explicit:

To get a logarithm of a different base b, use log(x, b), which is essentially short-hand for log(x)/log(b).

Grepping the sympy source code for instances of log2 did not reveal anything of note, though I may have missed it.

@MilesCranmer
Copy link
Owner

MilesCranmer commented Sep 7, 2024

I am a bit worried about using this version over just log(x, 2) because codegen looks to be pretty uncommon in user code: https://github.com/search?q=/sympy.codegen.cfunctions/+language:python+&type=code. Is there anything in the sympy docs that recommend this approach over just log(x, 2)?

@nerai
Copy link
Author

nerai commented Sep 10, 2024

Good point. I asked over at sympy, and it seems it's a rare but supported function. It is documented at https://docs.sympy.org/latest/modules/codegen.html#sympy.codegen.cfunctions.log2 .

@MilesCranmer
Copy link
Owner

Thanks! Okay I’m on board then. Now, one other thing is we need the codegen import:

The codegen callable is not in the sympy namespace automatically, to use it you must first import codegen from sympy.utilities.codegen

This seems to have been done by some other package, but ideally we should also run this within PySR.

@nerai
Copy link
Author

nerai commented Sep 11, 2024

Thanks! Okay I’m on board then. Now, one other thing is we need the codegen import:

The codegen callable is not in the sympy namespace automatically, to use it you must first import codegen from sympy.utilities.codegen

This seems to have been done by some other package, but ideally we should also run this within PySR.

I think this refers only to the callable, i.e. the function codegen, not the namespace? That's how I read the example at https://docs.sympy.org/latest/modules/codegen.html#codegen-sympy-utilities-codegen . It is a bit confusing for sure.

@MilesCranmer
Copy link
Owner

Just checking in on this – could you add the line

from sympy.utilities.codegen import codegen  # noqa: F401

to the imports? This is needed according to the docs page.

@nerai
Copy link
Author

nerai commented Sep 30, 2024

Can you check if that's correct? My understanding of the docs page is that it refers to the codegen callable, not the codegen namespace. The code only uses the namespace, so the import should not be not required.

@MilesCranmer

This comment was marked as outdated.

@MilesCranmer
Copy link
Owner

MilesCranmer commented Dec 6, 2024

Seems like you need to import directly from the codegen.cfunctions for it to work:

[ins] In [1]: import sympy

[ins] In [2]: sympy.codegen.cfunctions.log2(4.0)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 sympy.codegen.cfunctions.log2(4.0)

AttributeError: module 'sympy' has no attribute 'codegen'

compared with:

[ins] In [1]: from sympy.codegen.cfunctions import log2

[ins] In [2]: log2(4.0)
Out[2]: 2.00000000000000

So I think the reason this currently works at all is because some dependency is doing the from sympy.codegen.cfunctions import .... However, to be safe, I think we should do it here as well.

@MilesCranmer MilesCranmer merged commit 4ed3570 into MilesCranmer:master Dec 6, 2024
17 checks passed
@nerai
Copy link
Author

nerai commented Dec 7, 2024

Neat, thank you :)

@nerai nerai deleted the sympy_log2 branch December 7, 2024 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Let sympy use log2(x) instead of log(x)/log(2)
3 participants