Sub-process output parsing assumes UTF-8 but is not always UTF-8 #3591

Klaim · 2024-11-12T09:59:51Z

Troubleshooting docs

My problem is not solved in the Troubleshooting docs

Anaconda default channels

I do NOT use the Anaconda default channels (pkgs/* etc.)

How did you install Mamba?

Micromamba

Search tried in issue tracker

yes

Latest version of Mamba

My problem is not solved with the latest version

Tried in Conda?

I have this problem with Conda as well, without using Mamba

Describe your issue

While investigating #3584, we realized that when micromamba (or mamba) calls Python and then parses its output, the code assumes that the output is UTF-8. However, Python is designed to write its output using the current system/console encoding. When that encoding is not UTF-8, one of two things happens: if the data is detected as not being valid UTF-8, we get errors; otherwise we are processing incorrect data without any explicit error.
This issue is most visible on Windows, whose default encoding is not UTF-8 (it can be set to UTF-8, which makes the issue disappear), but it can also appear on any other system whose default encoding is not UTF-8.
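
To illustrate the two failure modes described above, here is a minimal sketch in Python. The path string and the choice of Windows-1252 (`cp1252`) as the system encoding are assumptions made only for illustration; they are not taken from #3584.

```python
# Hypothetical text, as a child process on a Windows-1252 system might emit it.
original = "C:\\Users\\Zoé\\envs"
raw = original.encode("cp1252")  # bytes as they arrive on the pipe

# Failure mode 1: the bytes are not valid UTF-8, so a strict decode fails loudly.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Failure mode 2: the bytes happen to be valid UTF-8, so the decode succeeds
# but yields different text than what the child process actually wrote.
ambiguous = "Ã©".encode("cp1252")    # the two bytes 0xC3 0xA9
misread = ambiguous.decode("utf-8")  # decodes without error, to "é"
```

The second case is the more dangerous one, since nothing signals that the parsed data is wrong.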

So far, that problem has been worked around by adding environment variables to the CI scripts that request Python to explicitly output UTF-8. This is why our CI didn't detect the issue when new Python-calling code was added to mamba/micromamba, while users can hit it.

#3584 demonstrates that we could always set that variable through the sub-process launching command instead of asking users to set it externally. At that point we know that we are calling Python, and we also know what encoding we expect to receive.
We need to generalize this solution to the other sub-process launches, including Python but also the other programs. When parsed, the output of these sub-processes should always be treated as system-encoded (reproc apparently doesn't change that), and we need to make sure that any such output we parse is either understood as-is or converted.
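
The two approaches above could look roughly like the following Python sketch. It is not mamba's actual code (which is C++ and uses reproc); the function names `run_python_utf8` and `decode_system_output` are hypothetical, and the use of `PYTHONIOENCODING` is an assumption about which variable is meant. For programs other than Python, the sketch assumes the child writes in the locale's preferred encoding, which may not hold for every tool or console.

```python
import locale
import os
import subprocess
import sys


def run_python_utf8(code: str) -> str:
    """Run a Python snippet in a child process and return its stdout as text.

    The launcher sets the encoding itself, so the caller's environment
    does not need to be configured externally.
    """
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = "utf-8"  # assumption: the variable in question
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    # We requested UTF-8 above, so decoding as UTF-8 is now justified.
    return result.stdout.decode("utf-8")


def decode_system_output(raw: bytes) -> str:
    """Decode output of an arbitrary child process as system-encoded text."""
    # Assumption: the child used the locale's preferred encoding.
    encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding, errors="replace")
```

For example, `run_python_utf8("print('Zo\\u00e9')")` should return the text `Zoé` followed by a newline regardless of the system encoding, because the child was told to emit UTF-8.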

Once that is done, we can remove the CI script flags/environment variables that hide the problem.

mamba info / micromamba info

micromamba 2.0.3 exhibits this faulty behavior

Logs

N/A

environment.yml

N/A

~/.condarc

N/A


Klaim added the type::bug Something isn't working label Nov 12, 2024

Klaim self-assigned this Nov 12, 2024
