Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(agent): Win32 pseudo-console-based agent launcher executable #378

Merged
merged 13 commits into from
Nov 7, 2023

Conversation

CarlosNihelton
Copy link
Contributor

One can never find the end of the joy of playing with Win32.

Not sure whether this should be marked as feature or bugfix. The aim is to prevent WSL API to cause terminal windows popping up when the agent uses it to take an action inside a WSL instance. That happened because WSL API launches console processes, so if the parent is not a console process, a new one is created. There is no public API to control that behavior.

So the solution is to compile the agent as a console application, so it hosts the WSL API processes. To make it still invisible, another layer of indirection is added: a native Win32 window process is created, but instead of showing a window, it creates a pseudo-console device and hosts the agent under it, as if the launcher was a skinny version of conhost.exe. Being a pseudo-console, we can fully control how it's presented to the user, or not presented at all, as done in this PR.

The logic behind creating a pseudo-console device and using it to host child processes is quite convoluted, in my opinion, lots of possible error paths requiring freeing resources, thus I made it more C++-ish, with RAII like semantics, either by leveraing classes or being creative with std::unique_ptr.

For more information about this thingy, please check out:

https://learn.microsoft.com/en-us/windows/console/creating-a-pseudoconsole-session
and
https://devblogs.microsoft.com/commandline/windows-command-line-introducing-the-windows-pseudo-console-conpty/

The GUI, its tests and the appxmanifest were also updated to reflect that the "agent's background process" must be the launcher.

A nice side effect is that, if the launcher is killed, let's say by the task manager, the agent quits cleanly, deleting the addr file as if it received Ctrl-C. That was not (and still is not) true for killing the agent itself, it has no chance to react.

@CarlosNihelton CarlosNihelton changed the title Agent launcher feat(agent): Win32 pseudo-console-based agent launcher executable Oct 31, 2023
@CarlosNihelton CarlosNihelton marked this pull request as ready for review October 31, 2023 12:44
when building the MSIX
That allows enforcing the final location in the package
Both agent and launcher will stay in the same directory.
GUI applications rely on some sort of event loop to mediate the
unpredictable timing user input will happen.

The Windows OS models this by means of the window message loop
and the WndProc function that says how the window responds to
messages it receives.

We want to wait on ReadFile to drain the console pipes, so the
child process won't block on the next printf after the pipe becomes
full.
We also want to wait on the agent process so we can quit if it quits.
We also want to respond to messages sent by the OS (such as WM_QUIT).

Threads without actual windows still have the message loop.
We can combine the window approach with some waiting functions and
create a custom message loop that allows our thread to sleep while
we don't receive any relevant input neither on the thread queue,
nor on the console pipe or the agent process termination.

Creating the pseudo-console should be declarative.
Starting the agent process under the pseudo-console should be a function
call.
Reacting to the agent's process termination or new input from the
console pipe should resemble reactive programming by passing functions.

I don't want to see tons of Win32 in my main()

I hope you enjoy this small spec.

:)
Leveraging FormatMessage function to generate a string message
out of the system's catalog, based on the HRESULT value.
Plus some sugar with std::source_location.

That will come handy in avoiding freeing memory and closing handles
in multiple code paths.

A word on `unique_string`: Because FormatMessage allocates a buffer
and expects the caller to later free it using Windows Heap functions
this small wrapper based on std::unique_ptr makes it very handy to
deal with such confusing allocate/deallocate scenarios.

Logging: we don't do it everywhere, this is a very tiny application.
But if an exception occur, than we should.
Thus, LogSingleShot(): opens a file, writes to it and closes it.
In a single shot. Not the best performance, but meant to be called just
once, before quitting.
Destructor is just a bunch of checked CloseHandle calls
and the most important ClosePseudoConsole call.

Construction is more interesting: creates anonymous pipes to
serve as the console stdio HANDLEs.

The console geometry seems irrelevant for our use case, but I
opted by passing it for further exploration.
By far the most complicated part of this feature.
Starting a child process under a pseudo console requires:
1. a STARTUPINFOEX structure filled with the handles the pseudo console
   uses for stdio
2. a LPPROC_THREAD_ATTRIBUTE_LIST containing the key PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE
   mapped to the HPCON (pseudo-console device handle).
   This list has to be allocated in the heap, since the size of its
   elements is hidden from the application code.
   We need to try to initialize it first to fail deterministically, just
   to learn how many bytes needs to be allocated.
   Then allocate the required amount of memory and try again, which
   should succeed.
   Then we fill the key-value pair we desire by calling
   UpdateProcThreadAttribute.
   Notice that's not related to the current process or thread,
   but for the one we'll start.
3. Then we CreateProcessW
4. Care must be taken to avoid leaking the process/thread attribute
   list.
   Disposing it requires calling DeleteProcThreadAttributeList followed
   by freeing the memory.
   Again, the unique_ptr with custom deleter idiom comes in handy.
   This is scoped to the process creation, once done we no longer need
   such structure.
   Since this program is meant to start only one process and live in
   sync with it, we don't need to either cache that thing nor make it
   visible outside of this routine.
Although we never create or show a window, this is still a GUI
application
Thus it's powered by the window/thread message loop I mentioned before.
The need for the event loop might still not be clear enough.
Let me detail it:

1. Pseudo-console stdio HANDLEs are anonymous pipes HANDLEs.
2. The child process inherits the parts it needs to implement stdio.
3. The parent (this program) holds the other end of the pipes.
4. If stdout pipe becomes full, the child process will block at the next
   attempt to printf.
5. Thus we need to periodically read from the pipe to drain it.
6. We can do whatever we want with the bytes we read, nothing inclusive.

7. But we must read it.

8. Calling ReadFile on an empty pipe blocks the caller until new
   data arrives.
9. The child process might quit without writing anything new.

10. The parent will wait forever.

11. Users could kill this program (via task manager, for example)
12. That causes the process to receive WM_QUIT (amongst other related
    messages)

13. We need to respond to that message (and others maybe).

A lot of asynchrony.

MsgWaitForMultipleObjectsEx to the rescue. Seems made for this problem.
We can wait on multiple HANDLEs that the OS can interpret as
synchronizeable, i.e. that has some state that can change meaningfully,
such as a process that terminates or a pipe that receives new data.
Not to be confused with Overlapped IO. That's a whole other world.
It also waits on new messages arriving to the window or thread message
queue.
If configured accordingly it blocks until one of the interesting
activities happen, letting us know which one, so we can handle it
accordingly.

The association between which HANDLE to watch and which behavior to
adopt could be hardcoded in the event loop itself, but I found so
beautiful it passed as a parameter at the event loop construction.
Otherwise the event loop would be part of the PseudoConsole class,
making it like a God.
As the fulltrust process and startup task
in the appxmanifest.
instead of the real agent, which is now a CLI application

Also updates test code that builds the agent

The GUI expects the launcher now.
Copy link
Contributor

@EduardGomezEscandell EduardGomezEscandell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, but good stuff in general.

Regarding the pipes: can't we simply pass the same STDIO from the parent process without pipes? To reduce complexity.

msix/ubuntu-pro-agent-launcher/console.hpp Show resolved Hide resolved
msix/ubuntu-pro-agent-launcher/error.cpp Outdated Show resolved Hide resolved
msix/ubuntu-pro-agent-launcher/main.cpp Show resolved Hide resolved
msix/ubuntu-pro-agent-launcher/main.cpp Show resolved Hide resolved
msix/ubuntu-pro-agent-launcher/console.cpp Show resolved Hide resolved
msix/ubuntu-pro-agent-launcher/console.cpp Outdated Show resolved Hide resolved
msix/ubuntu-pro-agent-launcher/error.cpp Outdated Show resolved Hide resolved
@CarlosNihelton CarlosNihelton self-assigned this Oct 31, 2023
@CarlosNihelton
Copy link
Contributor Author

Regarding the pipes: can't we simply pass the same STDIO from the parent process without pipes? To reduce complexity.

There is no STDIO for a Windows application unless we create a console by calling AllocConsole, which defeats the purpose of not showing a console at all.

Copy link
Contributor

@EduardGomezEscandell EduardGomezEscandell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there

msix/ubuntu-pro-agent-launcher/error.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@EduardGomezEscandell EduardGomezEscandell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎆 Great!

squashing commits to keep the history clean.

Deletes copy constructor and operator from Process

Consequently, applies rule-of-five.
And springle some noexcepts in the special member functions.

Add a catch(...) clause

log path directory to always under LOCALAPPDATA

Thus, renaming MakePathRelativeToEnvDir to UnderLocalAppDataPath
Throws an exception if the LOCALAPPDATA path is too big to fit in
a MAX_PATH sized buffer.

Uses at instead of [] to index vectors

The at method is checked and throw exceptions if out-of-bounds.

Avoid overwrite previous logs

operator= must return a value ;)

Plus some pedantic header inclusion
@CarlosNihelton CarlosNihelton merged commit 870c5eb into main Nov 7, 2023
33 of 34 checks passed
@CarlosNihelton CarlosNihelton deleted the agent-launcher branch November 7, 2023 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants