Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Password disclosure with Jenkins Swarm plugin #15

Open
paddy-hack opened this issue Nov 30, 2017 · 30 comments
Open

Password disclosure with Jenkins Swarm plugin #15

paddy-hack opened this issue Nov 30, 2017 · 30 comments

Comments

@paddy-hack
Copy link
Contributor

I've been following along to set up a Docker swarm of self registering Jenkins agents using your docker-jenkins-slave-dind project as my starting point. I didn't like passing the password on the command-line (trivially retrievable via the ps command) so I had a look at your Docker secrets solution. That doesn't address the ps command issue but at least it's an improvement.

Then I noticed the -passwordEnvVariable option in the agent's CLI documentation. That looked better so I tried that. I had to export the variable after reading it from the Docker secrets to make things work. So far, so good. The disappointment came when I ran the env command in a test job. The password is still trivially retrievable. Do you have any suggestions to improve the password confidentiality?

Personally, I kind of expected the password to be read from the environment variable into (secure) memory by the agent (and kept in memory for as long as necessary) and then immediately unset the environment variable before accepting any jobs.

# I realize this is probably an upstream issue, but I am loathe to create yet another account to report an issue. Apologies if that bothers you. 🙇

@vfarcic
Copy link
Owner

vfarcic commented Dec 1, 2017

I don't think there is a solution to the problem without making changes to the plugin itself. At least, I could not come up with a better one than to use a Docker secret which, as you said, exposes the password through ps command.

@vfarcic
Copy link
Owner

vfarcic commented Dec 1, 2017

Now, when I think about it (after reading your comment one more time), there might be a better solution. We could use passwordEnvVariable, run the java process in background, unset the env. var., and move the process back to the foreground. I think that would prevent people from discovering the value using env in their jobs. The password would still be discoverable if a) a person manages to enter inside the container and outputs contents of the secret file or b) if he creates a job with a command like cat /run/secrets/PASSWORD_SECRET_FILE. A more secure solution would probably need to involve something like Vault.

What do you think?

@paddy-hack
Copy link
Contributor Author

Thanks for your replies. I've thought of a "background-unset-foreground" work-around in the mean time as well. I really think the plugin should do this, though. I also realized, as you did, that you can very likely still get at the password via the file in /run/secrets/, definitely when running the Swarm plugin CLI as root (as your Dockerfile and docker-compose files seem to do by default even though you create a jenkins user 🤔).

I run the CLI via an entrypoint as the jenkins user (after adding a group with the same gid as the docker engine on the host and adding jenkins to that group so it can access /var/run/docker.sock). That setup has no problem reading the secret.

I haven't checked anything yet, but perhaps we can remove the /run/secrets/ file once read or drop privileges so it becomes unreadable? Not sure whether the Docker engine will let you or would restore privileges. If the jenkins user cannot do this, my entrypoint or your run.sh command could be run as root and run the CLI as jenkins via a su -c "java ..." jenkins.

WDYT?

@paddy-hack
Copy link
Contributor Author

[re clearing the password environment variable] I really think the plugin should do this, though.

I had a look at the plugin code and really wanted to do this ⬇️ but it seems Java has its own ideas about programatically changing a process' environment variables (as in: you very likely cannot 😒).

--- a/client/src/main/java/hudson/plugins/swarm/Client.java
+++ b/client/src/main/java/hudson/plugins/swarm/Client.java
@@ -83,6 +83,7 @@ public class Client {
         // password from the env and set as password.
         if (options.passwordEnvVariable != null) {
             options.password = System.getenv(options.passwordEnvVariable);
+            System.unsetenv(options.passwordEnvVariable);
         }
 
         // Only look up the hostname if we have not already specified

Now, strictly speaking, it is not the swarm CLI process that needs to drop the passwordEnvVariable but the slave-side process that runs the job. Unfortunately, I have no idea how to achieve that, if at all possible.

I've also looked at that "background-unset-foreground" work-around but seem to be running into job control issues courtesy of the busybox shell in the docker base image. I'm not sure, but I think I just may be able to work that out.

perhaps we can remove the /run/secrets/ file

I've tinkered a bit with the /run/secrets/. From within the container, root cannot remove or modify any files there. Changing permissions from the inside does not work either. But none of that is necessary. You can change the owner, group and permissions from the docker-compose.yml file using the long syntax.

So given that, picking suitable owner, group and permissions and running the swarm CLI as a non-root user, you can make the /run/secrets/PASSWORD_SECRET_FILE unreadable. I've had good success with the default owner and group (both root) and changing the permissions from the default 0444 to 0440 or 0400.

If you're interested, I'll see if I can pull all this together in a pair of PRs for docker-jenkins-slave-dind and the jenkins-swarm-agent-secrets file.

@vfarcic
Copy link
Owner

vfarcic commented Dec 5, 2017

Pushing PRs sound great!

@paddy-hack
Copy link
Contributor Author

I just submitted two PRs addressing much, but not all, of this issue. It handles the removal of the password from the ps output and protects the Docker secrets from prying eyes. The latter requires that the swarm-agent runs as a non-root user which is achieved as part of the first PR above.

I have not been successful is removing the environment variable from the swarm-plugin CLI process' context 😭. The run.sh script does not have access to a controlling terminal, preventing access to job control, see this Busybox FAQ. Rather than spending time trying to find a way around that, I think time is better spent fixing the swarm-plugin CLI part so that it removes the environment variable as soon as it is read or, alternatively, keep it out of context of all jobs that the slave runs (whichever is most easily achieved in Java).

@vfarcic
Copy link
Owner

vfarcic commented Dec 13, 2017

Have some workshop tomorrow that uses this image. Not enough time to test it before that. I'll merge the code on Friday if that's OK.

@paddy-hack
Copy link
Contributor Author

No problem.

@vfarcic vfarcic closed this as completed Dec 16, 2017
@vfarcic
Copy link
Owner

vfarcic commented Dec 16, 2017

PR worked :). Closing the issue...

@paddy-hack
Copy link
Contributor Author

Your reverts in 62202380 and 82f86062 effectively nullified what I was trying to achieve, so please reopen this issue. At present, the only thing that is achieved is removing the password from the ps output. It is still very visible in the env output and, if using Docker secrets, cat /run/secrets/jenkins-pass

@vfarcic
Copy link
Owner

vfarcic commented Dec 17, 2017

Sorry for the revert. I had a few failure in my own cluster and a few existing users complained that things stopped working. I reverted it only as a temporary measure until we figure out how to make it work in all the cases.

The problem with the PR is that it causes permission issues with artifacts created through containers. Most images are using the root user 1. I think we should change it in a way that security hardening is optional. For example, we can move parts from Dockerfile to run.sh. If an environment variable is set (e.g., SECURE=true|false), it could create the jenkins user with or without putting it into the root group. The same applies to the rest of the changes from this PR.

There might be a better way than what I described.

I guess that the major problem is to make it flexible enough to cover both users with security concerns as well as those who run things as root (e.g., containers), and the existing users.

What do you think?

@paddy-hack
Copy link
Contributor Author

No need to apologize 😉 If my PRs break stuff without documenting that, they shouldn't be merged or, as you did, (partially) reverted. My point is with the state of this issue. After reverting, this issue should have been reopened. That's all.

I did note in my PRs that ownership of/in the workspace needs to be changed. However, I have noticed permission related problems in my own cluster at the office as well. I am not sure if these are the result of left-overs from earlier experiments or real but for the moment I have:

  • completely removed the jenkins user from the Docker images that I use
  • changed the swarm plugin CLI to run as the root user in the container
  • removed everything below the /workspace/ directory
  • set things up to use a /workspace/$HOSTNAME/workspace/ for the -fsroot of each replica (because I'm not sure what would happen when different jobs for the same project run concurrently on different agents that share the /workspace/)

At present, I still have very little feedback on how the above works.

In the mean time, I have realized that my setup is probably pretty demanding in terms of permissions. I run a Docker Swarm spread out over two physical servers (identically configured save for their RAID levels and partition sizes). Both physical servers act as manager nodes (I'm looking for a third 😄). A dedicated user, ID 998, runs docker-machine VMs that have been added to the swarm as worker nodes. That same user creates the docker stack that runs the Jenkins agents on all worker nodes. The swarm agent CLI runs as root, ID 0. The docker user and group have ID 999. In addition, the Jenkins jobs run tasks on containers with varying user IDs, typically 0 and 1000 but other IDs are not unheard of (Red Hat based images have a habit of creating the first "normal" user with an ID of 500, IIRC).

Anyway, I agree with your suggestion that parts of the Dockerfile should probably be moved to run.sh to allow for more flexibility, via environment variables (or command-line arguments if used as an entrypoint), as to how things start up. As to making security hardening an opt-in, I respectfully disagree. It should be a cop-out 😜

@vfarcic vfarcic reopened this Dec 18, 2017
@vfarcic
Copy link
Owner

vfarcic commented Dec 18, 2017

Sorry for that. I missunderstood the initial message. You're right. Issue reopened :)

I also agree that security should be cop-out. However, the problem is that I made a mistake not including it from the start and now there are users who expect certain behavior with certain setup. How about a temporary opt-in with a deprecation notice. After a while, we can change it to cop-out. What do you think?

P.S. I have too much work today so I only skimmed through your message. I'll go through it fully and might post more comments in a day or two. Sorry for that.

@paddy-hack
Copy link
Contributor Author

Thanks for reopening and no problem at all with not following up right-away (as long as you don't forget and eventually do follow up 😉) because we're all busy. Also thanks for agreeing that security should be cop-out.

In the mean time, I discovered that

fixing the swarm-plugin CLI part so that it [...] keep[s] [the environment variable] out of context of all jobs that the slave runs

is not sufficient because the node's System Info merrily lists it in the Environment Variables section 😒. Hmm, maybe I should just forget about the environment variable approach, switch back to -password and remove execute permissions on the ps command in the container image 🤔. Nope, no good, anyone can still cat /proc/*/cmdline .

More info on user and group IDs:

  • /var/run/docker.sock is UID 0, GID 1000 inside the (VirtualBox) docker-machines
  • /workspace/ is UID 1000, GID 50 inside the docker-machines (both 998 on the physical host)

Running everything as root (see #15 (comment)) appears to work fine at the office. The only issue I've seen (so far?) is an occasional git fetch (or git config?) error when I hit Build Now in quick succession (with parallel builds enabled). Waiting a second or two between clicks does not trigger the error. Not sure what this is all about, looks like a timeout, could be IO speed related (slave side or git repo side). I don't think it's permission related.

@paddy-hack
Copy link
Contributor Author

paddy-hack commented Dec 23, 2017

remove execute permissions on the ps command in the container image 🤔

Seeing that /bin/ps is a symlink to /bin/busybox, that would prevent run.sh from doing its job to begin with so the above is a dumb idea. Removing /bin/ps won't help much because you can still simply run busybox ps as part of the job. Besides cat /proc/*/cmdline still remains a viable "attack" vector. No, the Jenkins swarm plugin really needs to remove the environment variable as soon as possible (along the lines in the diff in #15 (comment)).

@vfarcic
Copy link
Owner

vfarcic commented Dec 23, 2017

I agree.

The only real solution is to fix the problem at its core (the plugin itself). Did you report it to plugins' repo?

@paddy-hack
Copy link
Contributor Author

No, I'm not sure how to go about that seeing that issues are not enabled for the plugin's GitHub repo.

@vfarcic
Copy link
Owner

vfarcic commented Dec 26, 2017

Jenkins issues are managed through JIRA. Those related to swarm-plugin are in https://issues.jenkins-ci.org/browse/JENKINS-48477?jql=project%20%3D%20JENKINS%20AND%20component%20%3D%20swarm-plugin.

@paddy-hack
Copy link
Contributor Author

Thanks, I found Jenkins-26620. Looks like this is an old issue ... guess I could create an account (what! no hyphens allowed 😢) and chime in. Should also check if that @-mark trick (still) works.

@vfarcic
Copy link
Owner

vfarcic commented Dec 28, 2017

That indeed looks like a similar issue. It also looks like one no one cares about.

@paddy-hack
Copy link
Contributor Author

Bit the bullet, created an account and submitted https://issues.jenkins-ci.org/browse/SECURITY-687.

@vfarcic
Copy link
Owner

vfarcic commented Jan 28, 2018

Do you think that we should close this issue and let it take it's course in the one you opened @paddy-hack ?

@paddy-hack
Copy link
Contributor Author

No, we still have the issue that the files in /run/secrets/ are visible if root runs the swarm-client CLI.

Based on what's in the issue I opened, it looks like the fix will involve a "secret" that the Jenkins master issues and that a "swarm" will have to present. I would put that in a Docker secret (like you do with the password now). Still, if the user that runs the swarm-client has read access to the /run/secrets/ files, a job can get at the secret.

I suggest we keep this as pending and get back to it once the upstream issue is resolved. At present, there is just too much speculation as to how things will be fixed.

@vfarcic
Copy link
Owner

vfarcic commented Jan 29, 2018

Agreed

@darenjacobs
Copy link

I need jenkins-swarm-agent with java compiler. Can you please point me in the right direction.

@vfarcic
Copy link
Owner

vfarcic commented Apr 15, 2018

Sorry for not getting back to you earlier @darenjacobs . I somehow missed the notification...

I'm not sure what you mean by jenkins-swarm-agent. Do you mean https://github.com/vfarcic/docker-jenkins-slave-dind?

@srinivasmanthena
Copy link

srinivasmanthena commented Oct 18, 2018

I ran into an issue where jenkins agent couldn't connect to master if the password has special characters. I posted my issue on stackoverflow, but want to bring to your attention as this could be a new bug. I don't think this is an issue with docker secrets as /run/secrets/ has the correct password mounted to container. here is the stackoverflow
https://stackoverflow.com/questions/52768542/docker-secrets-not-working-when-password-has-special-characters

@vfarcic
Copy link
Owner

vfarcic commented Oct 22, 2018

Sorry for not getting back to this earlier. I'm traveling and won't be able to go through it before the next week. If you manage to figure out the solution earlier, I'd appreciate a PR.

@srinivasmanthena
Copy link

I couldn't figure out where the issue is occurring. I am just a admin, not a DEV, so I am not confident in reading through the jenkins-slave jar java code. Its either docker code that is not reading the special characters or jenkins-slave-jar java code.

@vfarcic
Copy link
Owner

vfarcic commented Nov 1, 2018

That does seem to be an issue with Jenkins JNLP agent. If that's the case, I don't think there's much that can be done outside fixing the plugin itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants