Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel: bypass unicode translation in group for latin1 putc_requests #9013

Conversation

frazze-jobb
Copy link
Contributor

Group checks that user_drv is in raw output mode and latin1 encoding mode
if true then the request is sent as latin1 and outputted by user_drv as is
if false then the request is converted to a unicode binary before its sent to user_drv.

@frazze-jobb frazze-jobb added team:VM Assigned to OTP team VM enhancement labels Nov 1, 2024
@frazze-jobb frazze-jobb requested a review from garazdawi November 1, 2024 13:45
@frazze-jobb frazze-jobb self-assigned this Nov 1, 2024
Copy link
Contributor

github-actions bot commented Nov 1, 2024

CT Test Results

    3 files    155 suites   2h 10m 40s ⏱️
3 599 tests 3 312 ✅ 287 💤 0 ❌
4 234 runs  3 894 ✅ 340 💤 0 ❌

Results for commit 27e7d05.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

Comment on lines 675 to 676
Latin1 = get_unicode_state(Drv) =:= false,
Raw = get_output_state(Drv) =:= raw,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is my understanding correct that this adds two extra round-trip RPC calls between group and the driver for every print request?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your understanding was correct, but that has been removed, I think we can assume that user_drvers output mode is raw when we are in noshell mode. And we then only have to check encoding is latin1. which is now done in setopts. I have changed so that we dont need the extra roundtrips.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can assume that user_drvers output mode is raw when we are in noshell mode.

No, you cannot. The shell =:= noshell in group is used to determine if a shell process is attached to the specific group, that is if the group is the user process or not. If the system has a shell, there will be one or more other group processes that handles the shells. Also, the change of encoding request can happen from any of the groups in the system.

You can however assume that it is only the process with shell =:= noshell that we can do the optimization with, so if you want to avoid doing the calls to user_drv, then user_drv needs to notify the user group of the encoding change. user_drv can also switch from being in raw to cooked mode if anyother issues shell:start_interactive/0, so user_drv also need to notify user of that change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, then I will revert back to asking the user_drv for now. But make it into a single question instead of two.

@frazze-jobb frazze-jobb force-pushed the frazze/kernel/group_latin1_bypass/OTP-19296 branch from 2f69015 to c4c6ca1 Compare November 4, 2024 10:39
@@ -39,7 +39,8 @@
overflow/1,
verify_sections/4,
unicode/1,
bad_io_server/1
bad_io_server/1,
bypass_unicode_conversion/1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bypass_unicode_conversion/1
bypass_unicode_conversion/1

Comment on lines 675 to 676
Latin1 = get_unicode_state(Drv) =:= false,
Raw = get_output_state(Drv) =:= raw,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can assume that user_drvers output mode is raw when we are in noshell mode.

No, you cannot. The shell =:= noshell in group is used to determine if a shell process is attached to the specific group, that is if the group is the user process or not. If the system has a shell, there will be one or more other group processes that handles the shells. Also, the change of encoding request can happen from any of the groups in the system.

You can however assume that it is only the process with shell =:= noshell that we can do the optimization with, so if you want to avoid doing the calls to user_drv, then user_drv needs to notify the user group of the encoding change. user_drv can also switch from being in raw to cooked mode if anyother issues shell:start_interactive/0, so user_drv also need to notify user of that change.

@frazze-jobb frazze-jobb force-pushed the frazze/kernel/group_latin1_bypass/OTP-19296 branch 4 times, most recently from 96bf978 to ec11a53 Compare November 7, 2024 09:03
@frazze-jobb frazze-jobb added the testing currently being tested, tag is used by OTP internal CI label Nov 7, 2024
io_request({put_chars_sync, latin1, Chars, Reply}, TTY) ->
case {prim_tty:unicode(TTY), prim_tty:output_mode(TTY)} of
{false, raw} ->
Bin = if is_binary(Chars) -> Chars; true -> list_to_binary(Chars) end,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the list_to_binary crashes, an error should be returned to the user.

@frazze-jobb frazze-jobb force-pushed the frazze/kernel/group_latin1_bypass/OTP-19296 branch from ec11a53 to 9d6974f Compare November 11, 2024 11:23
garazdawi
garazdawi previously approved these changes Nov 11, 2024
latin1 requests are sent to user_drv, if its in latin1 mode, and raw
output mode, then just output the latin1 binary. Otherwise user_drv
will convert the output to unicode.
@frazze-jobb frazze-jobb force-pushed the frazze/kernel/group_latin1_bypass/OTP-19296 branch from 9d6974f to 27e7d05 Compare November 12, 2024 10:04
@frazze-jobb frazze-jobb merged commit e423a9e into erlang:master Nov 13, 2024
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement team:VM Assigned to OTP team VM testing currently being tested, tag is used by OTP internal CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Very slow file:write to stdout with latin1 encoding on OTP 26.2.3
3 participants