Skip to content

Commit

Permalink
feat(windows-agent): Allow users to run it via CLI (#941)
Browse files Browse the repository at this point in the history
We modify the MSIX packaging to create an [app execution
alias](https://learn.microsoft.com/en-us/uwp/schemas/appxpackage/uapmanifestschema/element-uap3-appexecutionalias),
which causes a dummy binary to be created in a well-known location (by
default `%LOCALAPPDATA%\Microsoft\WindowsApps`) which is included in
user's `%PATH%` such that it becomes possible for the user (or scripts
running in user's behalf) to run the agent via CLI. That's preferrable
over declaring the agent as an
[Application](https://learn.microsoft.com/en-us/uwp/schemas/appxpackage/uapmanifestschema/element-application)
because it won't be "promoted" in the start menu as an application
would. Users can still find the CLI executable by searching for its
exact name ( `ubuntu-pro-agent` in this case - the `.exe` is optional in
the search but required in the appx manifest declaration).

Since we didn't remove the startup task declaration, starting the agent
via the CLI app execution alias is also considered by the OS as a user
interaction with the app, thus the startup task is enabled for that user
such that the subsequent logons will have the agent starting
automatically.

The agent is made into a single-instance application so we prevent a
second instance to overwrite logs, replace the gRPC service bindings and
corrupt the state of an already running process. We achieve that with a
simple mechanism of writing a lock file in the agent's private
directory, opened with exclusive access, so a process attempting to open
it a second time won't succeed. Since the already-running instance could
be child of the ubuntu-pro-agent-launcher.exe (which is an invisible
GUI) it's better to not implement the common pattern of GUI apps of
activating the already running process window when a instance detects
itself as the second one.

The end result looks like the following:

![Screenshot 2024-10-14
151735](https://github.com/user-attachments/assets/ca017a9f-f40a-46a4-b013-d4155b6d3667)


![image](https://github.com/user-attachments/assets/3d98fcd9-c71d-4ac8-b8b2-07209c2fb076)

- `ubuntu-pro-agent.exe` in PATH
- `ubuntu-pro-agent` can be found in the start menu
- A terminal window running it is and remains visible
- clean, completion, help and version commands still work on a second
instance
- the default command does not work and exits early enough before
touching logs, disk or network resources.

--- 

UDENG-4623

P.S.: I sneaked one minor unrelated fix in
6c1f70e because I was passing nearby.
Also 114d656 is temporary and should be
fixed after I workout the documentation PR (coming next) related to this
one.
  • Loading branch information
CarlosNihelton authored Nov 13, 2024
2 parents 9e1963c + 7344beb commit 9419e90
Show file tree
Hide file tree
Showing 6 changed files with 185 additions and 17 deletions.
7 changes: 6 additions & 1 deletion msix/UbuntuProForWSL/Package.appxmanifest
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<Package xmlns="http://schemas.microsoft.com/appx/manifest/foundation/windows10" xmlns:desktop="http://schemas.microsoft.com/appx/manifest/desktop/windows10" xmlns:desktop2="http://schemas.microsoft.com/appx/manifest/desktop/windows10/2" xmlns:desktop6="http://schemas.microsoft.com/appx/manifest/desktop/windows10/6" xmlns:virtualization="http://schemas.microsoft.com/appx/manifest/virtualization/windows10" xmlns:uap="http://schemas.microsoft.com/appx/manifest/uap/windows10" xmlns:uap5="http://schemas.microsoft.com/appx/manifest/uap/windows10/5" xmlns:rescap="http://schemas.microsoft.com/appx/manifest/foundation/windows10/restrictedcapabilities" IgnorableNamespaces="desktop desktop2 desktop6 virtualization uap uap5 rescap">
<Package xmlns="http://schemas.microsoft.com/appx/manifest/foundation/windows10" xmlns:desktop="http://schemas.microsoft.com/appx/manifest/desktop/windows10" xmlns:desktop2="http://schemas.microsoft.com/appx/manifest/desktop/windows10/2" xmlns:desktop6="http://schemas.microsoft.com/appx/manifest/desktop/windows10/6" xmlns:virtualization="http://schemas.microsoft.com/appx/manifest/virtualization/windows10" xmlns:uap="http://schemas.microsoft.com/appx/manifest/uap/windows10" xmlns:uap3="http://schemas.microsoft.com/appx/manifest/uap/windows10/3" xmlns:uap5="http://schemas.microsoft.com/appx/manifest/uap/windows10/5" xmlns:rescap="http://schemas.microsoft.com/appx/manifest/foundation/windows10/restrictedcapabilities" IgnorableNamespaces="desktop desktop2 desktop6 virtualization uap uap3 uap5 rescap">
<Identity Name="CanonicalGroupLimited.UbuntuPro" Publisher="CN=23596F84-C3EA-4CD8-A7DF-550DCE37BCD0" Version="0.0.0.0" />
<Properties>
<DisplayName>Ubuntu Pro for WSL</DisplayName>
Expand Down Expand Up @@ -33,6 +33,11 @@
<uap5:Extension Category="windows.startupTask" Executable="agent\ubuntu-pro-agent-launcher.exe" EntryPoint="Windows.FullTrustApplication">
<uap5:StartupTask TaskId="UbuntuPro" Enabled="true" DisplayName="Ubuntu Pro for WSL background agent" />
</uap5:Extension>
<uap3:Extension Category="windows.appExecutionAlias" Executable="agent\ubuntu-pro-agent.exe" EntryPoint="Windows.FullTrustApplication">
<uap3:AppExecutionAlias>
<desktop:ExecutionAlias Alias="ubuntu-pro-agent.exe" />
</uap3:AppExecutionAlias>
</uap3:Extension>
<desktop:Extension Category="windows.fullTrustProcess" Executable="agent\ubuntu-pro-agent-launcher.exe">
<desktop:FullTrustProcess>
<desktop:ParameterGroup GroupId="agent" Parameters="" />
Expand Down
55 changes: 47 additions & 8 deletions windows-agent/cmd/ubuntu-pro-agent/agent/agent.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import (
"os"
"path/filepath"
"runtime"
"strconv"
"strings"

"github.com/canonical/ubuntu-pro-for-wsl/common"
Expand Down Expand Up @@ -86,15 +87,29 @@ func New(o ...option) *App {
return nil
},
RunE: func(cmd *cobra.Command, args []string) error {
var opt options
for _, f := range o {
f(&opt)
}

cleanup, err := a.ensureSingleInstance(opt)
if err != nil {
// We won't serve(), so let's close the ready channel right now.
// Otherwise callers of WaitReady will block forever.
close(a.ready)
return err
}
defer cleanup()

ctx := context.Background()

cleanup, err := a.setUpLogger(ctx)
cleanup, err = a.setUpLogger(ctx)
if err != nil {
log.Warningf(ctx, "could not set logger output: %v", err)
}
defer cleanup()

return a.serve(ctx, o...)
return a.serve(ctx, opt)
},
// We display usage error ourselves
SilenceErrors: true,
Expand All @@ -112,12 +127,7 @@ func New(o ...option) *App {
}

// serve creates new GRPC services and listen on a TCP socket. This call is blocking until we quit it.
func (a *App) serve(ctx context.Context, args ...option) error {
var opt options
for _, f := range args {
f(&opt)
}

func (a *App) serve(ctx context.Context, opt options) error {
publicDir, err := a.publicDir(opt)
if err != nil {
close(a.ready)
Expand Down Expand Up @@ -266,3 +276,32 @@ func (a *App) setUpLogger(ctx context.Context) (func(), error) {

return func() { _ = f.Close() }, nil
}

// ensureSingleInstance creates a lock file to ensure that only one instance of the agent is running.
// It returns a cleanup function to release that file or an error if the lock file could not be flushed to disk.
func (a *App) ensureSingleInstance(opt options) (cleanup func(), err error) {
priv, err := a.privateDir(opt)
if err != nil {
return nil, fmt.Errorf("could not access the agent's private dir: %v", err)
}

// We deliberately create a new file instead of reusing the address file, for example, because that file has many other reasons for being recreated.
// No other file the agent creates match the semantics of exclusive ownership needed here.
path := filepath.Join(priv, "ubuntu-pro-agent.lock")
f, err := createLockFile(path)
if err != nil {
return nil, err
}

pid := strconv.Itoa(os.Getpid())
if _, err := f.WriteString(pid); err != nil {
return nil, fmt.Errorf("could not write PID to lock file %s: %v", path, errors.Join(err, f.Close()))
}
if err := f.Sync(); err != nil {
return nil, fmt.Errorf("could not flush lock file %s: %v", path, errors.Join(err, f.Close()))
}

return func() {
log.Warningf(context.Background(), "when releasing the lock file: %v", f.Close())
}, nil
}
90 changes: 82 additions & 8 deletions windows-agent/cmd/ubuntu-pro-agent/agent/agent_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ package agent_test

import (
"bytes"
"errors"
"fmt"
"io"
"os"
Expand All @@ -23,6 +24,64 @@ func init() {
daemontestutils.DefaultNetworkDetectionToMock()
}

func TestSingleInstance(t *testing.T) {
t.Parallel()

testcases := map[string]struct {
args []string
anotherInstanceRunning bool
breakLockFile bool

wantError bool
}{
"Success when single instance": {args: []string{"version"}},

// Testing the 'clean' verb cannot run in parallel because it tries to delete files still opened by other test cases.
"Completion succeeds with another instance running": {args: []string{"completion", "bash"}, anotherInstanceRunning: true},
"Completion succeeds with broken lock file": {args: []string{"completion", "bash"}, breakLockFile: true},
"Help succeeds with another instance running": {args: []string{"help"}, anotherInstanceRunning: true},
"Help succeeds with broken lock file": {args: []string{"help"}, breakLockFile: true},
"Version succeeds with another instance running": {args: []string{"version"}, anotherInstanceRunning: true},
"Version succeeds with broken lock file": {args: []string{"version"}, breakLockFile: true},

"Default (serve) fails with another instance running": {anotherInstanceRunning: true, wantError: true},
"Default (serve) fails with broken lock file": {breakLockFile: true, wantError: true},
}

for name, tc := range testcases {
t.Run(name, func(t *testing.T) {
t.Parallel()

private := t.TempDir()
if tc.anotherInstanceRunning {
path := filepath.Join(private, "ubuntu-pro-agent.lock")

f, err := agent.CreateLockFile(path)
require.NoError(t, err, "Setup: couldn't create lock file")

defer f.Close()
}

if tc.breakLockFile {
path := filepath.Join(private, "ubuntu-pro-agent.lock")
err := os.MkdirAll(path, 0700)
err2 := os.WriteFile(filepath.Join(path, "breaking-lock.txt"), []byte{}, 0600)
require.NoError(t, errors.Join(err, err2), "Setup: couldn't break the lock file")
}

a := agent.NewForTesting(t, "", private)
a.SetArgs(tc.args...)

err := a.Run()
if tc.wantError {
require.Error(t, err, "Run should return an error")
return
}
require.NoError(t, err, "Run should not return an error")
})
}
}

func TestHelp(t *testing.T) {
a := agent.NewForTesting(t, "", "")
a.SetArgs("--help")
Expand Down Expand Up @@ -218,8 +277,8 @@ func TestAppRunFailsOnComponentsCreationAndQuit(t *testing.T) {
invalidLocalAppData bool
invalidUserProfile bool
}{
"Invalid private directory": {invalidPublicDir: true},
"Invalid public directory": {invalidPrivateDir: true},
"Invalid private directory": {invalidPrivateDir: true},
"Invalid public directory": {invalidPublicDir: true},
"Invalid LocalAppData": {invalidLocalAppData: true},
"Invalid UserProfile": {invalidUserProfile: true},
}
Expand Down Expand Up @@ -248,12 +307,12 @@ func TestAppRunFailsOnComponentsCreationAndQuit(t *testing.T) {
privateDir = badDir
}

a := agent.New(agent.WithPublicDir(publicDir), agent.WithPrivateDir(privateDir), agent.WithRegistry(registry.NewMock()))
a.SetArgs()

err := os.WriteFile(badDir, []byte("I'm here to break the service"), 0600)
require.NoError(t, err, "Failed to write file")

a := agent.New(agent.WithPublicDir(publicDir), agent.WithPrivateDir(privateDir), agent.WithRegistry(registry.NewMock()))
a.SetArgs("")

err = a.Run()
require.Error(t, err, "Run should exit with an error")
a.Quit()
Expand Down Expand Up @@ -372,6 +431,8 @@ func TestLogs(t *testing.T) {
}()

a.WaitReady()
// TODO: Implement the real fix for WaitReady() per UDENG-4900
<-time.After(1 * time.Second)

select {
case <-ch:
Expand Down Expand Up @@ -407,12 +468,14 @@ func TestClean(t *testing.T) {
// Not parallel because we modify the environment

testCases := map[string]struct {
emptyUserProfile bool
emptyLocalAppDir bool
emptyUserProfile bool
emptyLocalAppDir bool
anotherInstanceRunning bool

wantErr bool
}{
"Success": {},
"Success": {},
"Success with another instance running": {anotherInstanceRunning: true},

"Error when %UserProfile% is empty": {emptyUserProfile: true, wantErr: true},
"Error when %LocalAppData% is empty": {emptyLocalAppDir: true, wantErr: true},
Expand Down Expand Up @@ -457,6 +520,17 @@ func TestClean(t *testing.T) {
require.NoError(t, err, "Setup: could not write file outside the private directory")
}

if tc.anotherInstanceRunning {
path := filepath.Join(appData, common.LocalAppDataDir, "ubuntu-pro-agent.lock")

f, err := agent.CreateLockFile(path)
require.NoError(t, err, "Setup: couldn't create lock file")
// Since the 'clean' verb kills all other agent processes, by the time the agent attempts to clean the private directory the lock file would have been released.
// So for this test case the only thing that matters is whether the agent proceeds with cleaning (as it should) or not.
// We don't want the test to fail because the file is still open.
require.NoError(t, f.Close(), "Setup: couldn't close the fake lock file")
}

a := agent.New(agent.WithRegistry(registry.NewMock()))
a.SetArgs("clean")

Expand Down
3 changes: 3 additions & 0 deletions windows-agent/cmd/ubuntu-pro-agent/agent/export_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -46,3 +46,6 @@ func NewForTesting(t *testing.T, publicDir, privateDir string) *App {
func (a App) Config() daemonConfig {
return a.config
}

// CreateLockFile tries to create or open an empty file with given name with exclusive access.
var CreateLockFile = createLockFile
28 changes: 28 additions & 0 deletions windows-agent/cmd/ubuntu-pro-agent/agent/lockfile_linux.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
package agent

import (
"errors"
"fmt"
"os"
"syscall"

"github.com/ubuntu/decorate"
)

// createLockFile tries to create or open an empty file with given name with exclusive access.
// If the file already exists AND is still locked, it will fail.
func createLockFile(path string) (f *os.File, err error) {
decorate.OnError(&err, "could not create lock file %s: %v", path, err)

f, err = os.OpenFile(path, os.O_WRONLY|os.O_TRUNC|os.O_CREATE, 0600)
if err != nil {
return nil, err
}
// This would only fail if the file is locked by another process.
err = syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB)
if err != nil {
return nil, fmt.Errorf("could not lock file: %v", errors.Join(err, f.Close()))
}

return f, nil
}
19 changes: 19 additions & 0 deletions windows-agent/cmd/ubuntu-pro-agent/agent/lockfile_windows.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
package agent

import (
"os"

"github.com/ubuntu/decorate"
)

// createLockFile tries to create or open an empty file with given name with exclusive access.
func createLockFile(path string) (f *os.File, err error) {
decorate.OnError(&err, "could not create lock file %s: %v", path, err)

// On Windows removing fails if the file is opened by another process with ERROR_SHARING_VIOLATION.
if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
return nil, err
}
// If this process is the only instance of this program, then the file won't exist.
return os.OpenFile(path, os.O_CREATE|os.O_EXCL, 0600)
}

0 comments on commit 9419e90

Please sign in to comment.