Configuration of SL_WIN_LATEST_64 is a mystery #24

Closed
lonniev opened this issue Aug 9, 2014 · 27 comments

@lonniev

lonniev commented Aug 9, 2014

The description of the Windows boxes at https://vagrantcloud.com/ju2wheels does not provide enough clues to deduce how to talk with the Windows VM once SL has instantiated it.

  • What password is used for Administrator? Apparently the randomized one assigned by Windows.
  • Is an sshd service installed and running? Apparently not: connection refused to port 22.
  • Was a vagrant user created as part of the box? Apparently not.
  • Was the WinRM service configured and started? Apparently not.
  • Is RDP available? Yes. I stumbled into how to see the Administrator password in SL and was able to get in that way. I now can't find the Show Me option in SL for those passwords again.

I would be helped if the description of the box at vagrantcloud stated what core services and authentication means are offered in the box, particularly if the box or the provider don't offer password-less sudo vagrant/vagrant and the insecure vagrant key pair tactics.

Can you give me a bit more newbie advice on how to vagrant up the Windows boxes at ju2wheels with the softlayer provider?

@lonniev
Author

lonniev commented Aug 9, 2014

The password reveal is available on the Device Lists page: it is hidden in the dropdown that is revealed by clicking the disabled-looking |> to the left of the device's name. Very counterintuitive.

It still will help others if the description on the boxes at vagrantcloud states which administration interfaces are baked into each box.

@lonniev
Author

lonniev commented Aug 10, 2014

The ju2wheels/SL_WIN_* boxes need a wee bit more prep work:

  • add the user vagrant with password vagrant as an Administrator
  • perform the WinRM service configuration
  • install an rsync client
  • install an sshd server
  • import the vagrant insecure rsa key into the sshd
  • tweak up the Vagrantfile so that it properly uses rsync, ssh, and winrm to connect and perform file and command operations

Currently, I can use winrm on the host (MacOS) side to vagrant up and provision the SL Windows instance and I can vagrant ssh into the vagrant account. What I don't have working is rsync from the host to the guest of the crucial ./ directories into /vagrant -- and as I write that I wonder where that unixy /vagrant directory will be created or should exist on the Windows guest.

This is the error of the moment:

Rsyncing folder: /Users/lonniev/Vagrants/softlayer-windows-jazzclm/ => /vagrant                                    
/Users/lonniev/.vagrant.d/gems/gems/vagrant-softlayer-0.3.2/lib/vagrant-softlayer/action/sync_folders.rb:89:in `block in call': uninitialized constant VagrantPlugins::SoftLayer::Errors::RsyncError (NameError)

(If I am reinventing wheels by trying to automate this, if there is an SL box that already has this all pre-provisioned, just point that out and I'll gladly give up this homework.)

@lonniev
Author

lonniev commented Aug 10, 2014

The cwRsync app on Windows uses the cygwin.dll for unixy I/O in a DOS world, so it is going to want paths like "/cygdrive/c/..." to access outside the home directory. The following line in the Vagrantfile resolves the above RsyncError and allows chef-solo to get all the cookbook and environment/role files it wants.

  config.vm.synced_folder ".", "/cygdrive/c/vagrant", type: "rsync", rsync__exclude: [ "Vagrantfile", ".git/" ]

@ju2wheels
Contributor

A few things on the boxes:

  • I'll update the default admin users to the stock OS templates, but if we do sshd/winrm I wouldn't add them as part of the stock OS template boxes. I would do them as post_install PowerShell scripts, as part of contrib, one for each service, since doing both services seems redundant and customers will probably choose only one or the other.
  • I feel very reluctant to add the vagrant account and insecure key to cloud boxes by default, for security reasons. In my environment we block access to boxes with a firewall so they can't be reached until rules are explicitly opened, but if a customer's environment isn't set up this way then doing this leaves a security gap, however short-lived, if the box is publicly accessible. I would rather leave this up to the end user to do if that's what they want and leave the boxes with the stock SoftLayer admin users (root/Administrator).
  • Finding a secure way to get into the Windows boxes with an ssh key or winrm would, however, be worth generalizing so they can be used like the Linux boxes, but I'm not sure what the best option is there.
  • Not sure yet whether I will add rsync as part of the box or just document it; I need to use it in action first to figure out whether that's use-case specific.

If you need some examples, let's reference existing veewee and vagrant-windows work to cannibalize into standalone PowerShell scripts that work across all the versions. I'll start looking into this as well.

@lonniev
Author

lonniev commented Aug 10, 2014

I agree that having to mix in unix apps (rsync, ssh) to a Windows box
smells unnatural. I like your “post_install” suggestion—but I haven’t used
or learned it yet. I’ll read up on what post_install might offer.

What about a collection of boxes that have the vagrant/vagrant account—with
a “remove vagrant” recipe? That would make these Windows deployments more
admin-friendly for those of us who just wish Windows would go away?

If I tweak up a generic SL box with cwRsync, sshd, a vagrant account, and
the vagrant.pub in %USER%.ssh, what’s the best brief gist on how to
snapshot that modified box into a new box on vagrantcloud?

@lonniev
Author

lonniev commented Aug 10, 2014

My thought about the rsync failure was malarkey. The real issue was that rsync's ssh session wasn't authenticating. The line above actually just creates a directory "c:\cygdrive\c\vagrant" with a copy of the local current working directory within it. It is totally useless because the later chef specs for environments, roles, and cookbooks do the proper rsync into c:\users\vagrant.

cwRsync doesn't need "/cygdrive/c" as the root of a path to "C:".

@lonniev
Author

lonniev commented Aug 11, 2014

Is it correct that the SL provider calls the post_install script (that can be pulled in from external URI) after installation of the SL image and before the provisioning phase (e.g. chef-solo) is run?

If so, I need to craft a script that adds the vagrant user, obtains and installs the rsync and sshd apps, copies vagrant's keys, and configures winrm, rsync, and sshd. That doesn't seem to be too impractical.

@lonniev
Author

lonniev commented Aug 11, 2014

except for pulling in rsync and sshd, this is the gist of the sought script:

# get a reference to the local OS configurator
$computer = [ADSI]"WinNT://."

# create the vagrant user with password vagrant
$user = $computer.Create("User","vagrant")
$user.setpassword("vagrant")
$user.put("Fullname", "Vagrant User")
$user.SetInfo()

# ADS_UF_DONT_EXPIRE_PASSWD flag is 0x10000
$user.UserFlags[0] = $user.UserFlags[0] -bor 0x10000
$user.SetInfo()

# add the vagrant user to the local Administrators group
net localgroup Administrators /add "vagrant"

# configure WinRM (-q suppresses the interactive confirmation prompt)

winrm quickconfig -q

Set-Item WSMAN:\LocalHost\MaxTimeoutms -Value "1800000"
# the guest is the WinRM server here, so relax the Service settings, not the Client ones
Set-Item WSMAN:\LocalHost\Service\AllowUnencrypted -Value $true
Set-Item WSMAN:\LocalHost\Service\Auth\Basic -Value $true

Set-Service WinRM -startuptype "automatic"
Start-Service WinRM

@lonniev
Author

lonniev commented Aug 12, 2014

You can see the entire script at https://github.com/lonniev/softlayer-windows-jazzclm/tree/master/post_install/windows

Apparently, post_install can handle either *.bat or *.ps1 files but can only run the powershell ones if the execution policy is set to unrestricted (or the admin goes through the trouble of signing the scripts with a cert). Therefore, to run the powershell script, I have to wrap it in a batch script that says, no, really, just run the powershell file.
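
The wrapper idea can be sketched as a small Ruby helper (Ruby being the language of the Vagrantfile) that generates the text of such a batch file. This is only an illustration: the helper name `batch_wrapper_for` and the file name passed to it are hypothetical, and it assumes the .bat and .ps1 files end up in the same directory on the guest. `-NoProfile`, `-ExecutionPolicy Bypass`, and `-File` are standard powershell.exe switches, and `%~dp0` is the batch idiom for the wrapper's own directory.

```ruby
# Build the text of a .bat wrapper that launches a PowerShell script
# sitting next to it, overriding the execution policy for that one run.
# The helper name and file names are illustrative, not part of the plugin.
def batch_wrapper_for(ps1_name)
  <<~BAT
    @echo off
    rem Run the PowerShell script regardless of the machine-wide policy.
    powershell.exe -NoProfile -ExecutionPolicy Bypass -File "%~dp0#{ps1_name}"
    exit /b %ERRORLEVEL%
  BAT
end
```

Calling `File.write("post_install.bat", batch_wrapper_for("post_install.ps1"))` on the host would produce the wrapper. Bypassing the policy per invocation avoids changing the machine-wide setting or signing the script.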

I'm still debugging the powershell syntax and the sequencing of the script but it's a nearly there solution.

@lonniev lonniev closed this as completed Aug 12, 2014
@ju2wheels
Contributor

Is that working without issues for you using WinRM against the vagrant account? I'm curious whether you have run into any of the issues described here, as I hit one of the errors shown here but haven't yet determined whether it's due to my environment's use of GPO.

@lonniev
Author

lonniev commented Aug 12, 2014

@ju2wheels is there a post_install protocol expectation that the script, if it exists, should conclude by requesting an OS restart? How does the SL API otherwise see the state change from not ready to running?

@lonniev
Author

lonniev commented Aug 12, 2014

When I set up WinRM on the instance by RDPing to the VM using the Administrator account and the randomized password, I can then return to the host and run "vagrant provision" using a winrm communicator. That works.

It is, however, not a proper solution because the goal is to run "vagrant up --provider softlayer" once and yield an up and running, fully provisioned server.

If I can get through the trial and error of getting the powershell post_install to work, I'll have the VM correctly evolved to have both ssh and winrm communications and a vagrant/vagrant password-free administrator account.

Then, I can return to the point of the exercise: provisioning the desired server apps for the Windows server.

I'm nearly there. The only delay is that I keep typing in chaos monkey typos into my scripts and then getting to wait 20 minutes for the post_install(er) to timeout. (How about making that timeout(1200) parameterized?)

@lonniev
Author

lonniev commented Aug 12, 2014

I'm going to conclude my post_install with an explicit "shutdown /r" command. I can't find documentation that says this is required but it sure seems to be what the host-side SL provider code is expecting. It won't hurt to try it in one of these 20-minute cycles. ;-)

@ju2wheels
Contributor

The timeout is parameterized in one of the recent releases, either dev or the pending pull request, but not in the current stable, I think.

@ju2wheels
Contributor

The generic script will have to install the Windows Management Framework update (http://support.microsoft.com/kb/968930) on pre-Win2008 R2 systems in order to avoid having to create automation for, and deal with the fallout from, the WinRM support matrix nightmare (http://technet.microsoft.com/en-us/library/ff520073(WS.10).aspx).

After looking at it some more, I'm also not going to split it out as originally thought, as it does make more sense to provide rsync functionality along with winrm even if the customer doesn't intend to use winrm. Instead of having a standalone rsync agent and cygwin/sshd, we will just drop in a minimal sshd.

@lonniev
Author

lonniev commented Aug 13, 2014

Getting a windows sshd onto the image during the post_install phase is a
real chore: most of the sshd apps are lousy and either their download sites
or their installers demand interactive entry. Bitvise WinSshd doesn’t
require interactivity but one has to feed it a “here” file to get it to
synch authorized_keys and to allow password-less logins. It's messy.
MobaSSHD was promising because it bundles wget, chown, chmod, and rsync
along with sshd. However, it requires a GUI installer. I futzed with WASP
Select-Window | Select-Control | Send-Click but couldn’t get the timing
right.

Also a pain is getting Windows to use an existing directory as the user’s
home and profile path (two separate folders that are typically unioned). If
you try to create a homedir with an existing .ssh within it, Windows puts
the profile folder in ~user.DOMAIN.

If the box image is manually prepped with an sshd server, rsync, and the
vagrant user—all installed and created with Windows GUI apps—then the
resulting vagrant box is much, much easier to use.

Let me know in my morning (about 6 hours from now) if you spin a new
ju2wheels/SL_WIN_LATEST_64 with vagrant, sshd, and rsync in it.

Thanks.

@lonniev
Author

lonniev commented Aug 13, 2014

Just found a trick to force the OS to create the user directory along with
the profile path without having to exit the post_install and have that user
login.

That is:
http://timrayburn.net/blog/start-a-process-as-another-user-in-powershell/

One can “net user vagrant vagrant /add” and then use the above tactic to
create the user directory. Afterwards, the .ssh directory can be created
and populated with the .ssh/authorized_keys/vagrant public key.

What a pain. ;-)

@ju2wheels
Contributor

The api_timeout relates to the calls made directly to the API; the one you are probably more interested in raising is provision_timeout (default 20 minutes), which has already been committed to the develop branch. My provision script didn't seem to need a reboot, but yes, it's definitely taking on the order of 45 min to 1 hr to build a Windows machine (the portal reports the estimated time to complete for Win 2012 STD w/4 GB RAM at 71 min for me), so you will have to increase that timeout.
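
For illustration, raising that timeout in a Vagrantfile might look like the fragment below. It is a sketch under assumptions: the option name `provision_timeout` comes from this thread and, per the comment above, is only on the develop branch; the unit is assumed to be seconds, by analogy with the `timeout(1200)` mentioned earlier for the 20-minute default.

```ruby
# Vagrantfile fragment (evaluated by Vagrant, not runnable on its own).
Vagrant.configure("2") do |config|
  config.vm.provider :softlayer do |sl|
    # Assumed option name and unit (seconds), taken from this thread:
    # allow 90 minutes so a 45-70 minute Windows build can finish.
    sl.provision_timeout = 90 * 60
  end
end
```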

@lonniev
Author

lonniev commented Aug 13, 2014

ok. yes, it would be the provision_timeout.

I wonder if we could modify that wait loop to spit out any intermediate
state changes that are exposed through the SL api? 60+ minutes is a long
time to stare at a shell script wondering if it is hung or is just waiting
because it hasn’t remarked since it last said, “this might take a Few
minutes”. ;-)

@ju2wheels
Contributor

It will currently output every 10 seconds that it's not done yet if logging is set to debug, but not the actual state steps shown on the portal.

@lonniev
Author

lonniev commented Aug 13, 2014

Starvation or forced feeding. ;-) Not only would all the unnecessary logging appear, but I would get some 360+ updates while waiting. I would say that once every "few" minutes a message like "still waiting for the Running state, state is currently Foo. Having waited xx minutes, I'll give up in 20-xx minutes from now" would be soothing.
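
A throttled wait loop along those lines could be sketched like this. It is not the plugin's actual code: `wait_for_state` and its parameters are hypothetical, and `fetch_state` stands in for whatever SL API query reports the instance state. The clock and sleep functions are injectable only so the loop can be exercised without really waiting.

```ruby
# Poll until the instance reports the wanted state, printing a progress
# line only every `report_every` seconds instead of on every poll.
# `fetch_state` is a hypothetical callable standing in for an SL API query.
def wait_for_state(fetch_state, wanted: "Running", timeout: 1200,
                   poll_every: 10, report_every: 180,
                   clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                   sleeper: ->(s) { sleep(s) }, out: $stdout)
  started = clock.call
  last_report = started
  loop do
    state = fetch_state.call
    return true if state == wanted

    now = clock.call
    waited = now - started
    return false if waited >= timeout

    # Report at most once per `report_every` seconds.
    if now - last_report >= report_every
      out.puts format("Still waiting for %s; state is currently %s. " \
                      "Waited %d min, giving up in %d min.",
                      wanted, state, (waited / 60).floor,
                      ((timeout - waited) / 60).ceil)
      last_report = now
    end
    sleeper.call(poll_every)
  end
end
```

With the defaults this polls every 10 seconds but speaks up only every 3 minutes, so a 60-minute build would produce about 20 lines rather than 360.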

@lonniev
Author

lonniev commented Aug 13, 2014

Trying to offer ssh as a communicator for vagrant provisioning is a goose chase: after getting an sshd server in place, then an rsync command, it then wants a bash shell. That bash shell has to return $?==0 for printf $SSH_AUTH_SOCK, and so it goes. It is presuming that ssh present implies a proper unix environment.

I will leave my post_install with enough of ssh present that one can vagrant ssh into the cmd shell. That is useful by itself.

I then fixed a (nother) typo in the script on the winrm configuration. I now have winrm working as a communicator.

So I have reached the goal of taking a provided ju2wheels box for SL and Windows and post installing enough ssh and winrm configuration for doing business provisioning of the VM with chef-solo. Whew.
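
For readers after the same end state, the host-side Vagrantfile settings for a WinRM communicator might look roughly like the fragment below. It assumes Vagrant 1.6 or later (where the `winrm` communicator and `:windows` guest are built in) and that the post_install script has already created the vagrant/vagrant account on the guest. The SoftLayer provider settings and the chef-solo provisioner block are omitted.

```ruby
# Vagrantfile fragment (evaluated by Vagrant, not runnable on its own).
Vagrant.configure("2") do |config|
  config.vm.box = "ju2wheels/SL_WIN_LATEST_64"
  config.vm.guest = :windows

  # Talk to the guest over WinRM instead of ssh.
  config.vm.communicator = "winrm"
  # Assumes post_install created this account on the guest.
  config.winrm.username = "vagrant"
  config.winrm.password = "vagrant"
end
```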

@ju2wheels
Contributor

@lonniev have you experienced an abnormally high rate of post provision hook script failures (it fails to download the post_install script) at random, like every 2 to 3 provisions, which then work fine on rebuild?

@lonniev
Author

lonniev commented Aug 15, 2014

The last few days have been rough like that. However, I am trying to debug
why powershell works in one environment and then not when called remotely
and I’m off in the jungle of weird cmdlets, policies, and registry hacks.
Along the way, I keep making unix-not-windows typos. So, I blame most of
the strange behavior on my ignorance.

I use “iwr” (the PowerShell alias for Invoke-WebRequest, roughly the Windows
wget) to pull web resources into the box. Yes, several times yesterday the
iwrs would time out and then suddenly work on a retry.
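
Since the downloads tend to succeed on a second try, wrapping them in a bounded retry with a growing pause is one way to ride out the flakiness. The sketch below shows the pattern in Ruby for consistency with the Vagrantfile; on the guest the same logic would have to be written in PowerShell or batch. `with_retries` is an illustrative helper, not something vagrant-softlayer provides.

```ruby
# Retry a flaky operation a few times, doubling the pause between attempts.
# `with_retries` is an illustrative helper, not part of vagrant-softlayer.
def with_retries(attempts: 4, base_delay: 2, sleeper: ->(s) { sleep(s) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError
    # Give up and re-raise once the attempt budget is spent.
    raise if tries >= attempts
    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end
```

A download would go inside the block, for example `with_retries { fetch_resource(url) }`, where `fetch_resource` is a placeholder for the actual transfer call.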

@ju2wheels
Contributor

I've come up with a way to get us passwordless ssh but these long build times are slowing dev ;-( . Aiming for sometime next week to have generalized scripts, at least for WinRM, that work across all the Win versions.

FYI use of iwr is not portable (Invoke-WebRequest only exists in PowerShell 3.0 and later). Check out the change in this gist: https://gist.github.com/ju2wheels/d4d4a767c535977b231c

The current plan (#29):

  1. Provide generalized scripts and instructions on creating custom post_install to select components wanted for the following services:
  2. Windows Management Framework Normalization (brings older Win variants up to WinRM 2.0/Powershell 2.0, will be required for WinRM enablement to simplify automation due to the number of versions)
  3. WinRM 2.0 w/HTTP (optional flags for AD cert based HTTPS and self-signed HTTPS; I have scripts and an idea but am not sure yet whether self-signed will work in the end)
  4. Cygwin (setup of Cygwin with cyg-apt for post build package management and optional flag for Cygwin Ports enablement and added package enablement)
  5. vagrant-softlayer will be enhanced with an option to append selected SSH keys to API user_data and a post provision script will take this and config Cygwin ssh for the Admin user only.
  6. provide the scripts for creating vagrant user for standard vagrant box but do not include it in the default post_install scripts, user will have to create their own and pull it in themselves and assume responsibility for shooting themselves in the foot security wise.
  7. The above creates a "pluggable" framework for post_install based on your bat script.
  8. It allows for the addition of alternative process scripts to be pluggable as well (i.e. pulling scripts from vagrant-softlayer followed by custom stuff like pulling internal scripts from a private network to change the admin password).

In the end this should allow us a flexible means to do passwordless ssh and reset of WinRM password to something non random allowing better out of the box usage of ssh and WinRM.

@lonniev
Author

lonniev commented Aug 15, 2014

I like the concept—except for (4). Cygwin is very useful once it is in
place but it has a nasty installer. If you have a way to make including and
maintaining it easy, ok. Having it there makes life easier for those of us
used to and preferring unix admin.

I was considering setting aside the SL provider to work with a local
virtual box Windows image to resolve the unexpected challenges with doing
relatively minor things (like “sudo vagrant mkdir -p
~/.ssh/authorized_keys”) in bat, powershell, remotely, with Windows
security policies.

The overhead of waiting for SL to bring up a new image kills the
trial-and-error process.

@ju2wheels
Contributor

That's why cyg-apt is included, to make package installs easier. I haven't used it myself but it's effectively an apt-like interface.

http://stackoverflow.com/questions/9751845/apt-get-for-cygwin
