Bug: SAM CLI from DevContainer fails either on Windows or Mac #5922

Closed
ffMathy opened this issue Sep 12, 2023 · 31 comments
Labels
area/docker stage/needs-investigation Requires a deeper investigation

Comments

@ffMathy

ffMathy commented Sep 12, 2023

Description:

When I run sam local start-api inside a DevContainer, I encounter different failures depending on which platform my host machine runs and which configuration I use.

Steps to reproduce:

There are two variants (or scenarios) of the configuration. One fails consistently on Windows, and the other fails consistently on Mac. I have not found a configuration that works on both.

Scenario 1

Repro: https://github.com/ffMathy/aws-sam-cli-repro/tree/main (main branch)

  • On Windows, the DevContainer itself crashes and disconnects (see attached screenshot below).
  • On Mac, it works as expected.

Scenario 2

Repro: https://github.com/ffMathy/aws-sam-cli-repro/tree/windows_fix (windows_fix branch)

Diff from Scenario 1: ffMathy/aws-sam-cli-repro@main...windows_fix

  • On Windows, the SAM CLI gives a runtime error (error 500) and can't find the application entrypoint, but at least it doesn't crash the whole container (that's what this issue is about).
  • On Mac, I get Error: Lambda functions containers initialization failed.

Observed result:

Scenario 1 failure (Windows)

[Screenshot: MicrosoftTeams-image (6)]

Scenario 2 failure (Mac)

2023-09-12 09:40:25,608 | Config file location: /workspace/samconfig.toml                                                                                                              
2023-09-12 09:40:25,672 | Loading configuration values from [default.['local', 'start-api'].parameters] (env.command_name.section) in config file at '/workspace/samconfig.toml'...    
2023-09-12 09:40:25,673 | Configuration values successfully loaded.                                                                                                                    
2023-09-12 09:40:25,673 | Configuration values are: {'stack_name': 'lambda-nodejs18.x', 'warm_containers': 'EAGER'}                                                                    
2023-09-12 09:40:25,843 | Using config file: samconfig.toml, config environment: default                                                                                               
2023-09-12 09:40:25,843 | Expand command line arguments to:                                                                                                                            
2023-09-12 09:40:25,844 | --template_file=/workspace/template.yaml --host=0.0.0.0 --container_host=host.docker.internal --container_host_interface=0.0.0.0                             
--docker_volume_basedir=/Users/dkMaLyLo/Documents/src/aws-sam-cli-repro --port=3000 --static_dir=public --layer_cache_basedir=/root/.aws-sam/layers-pkg --warm_containers=EAGER        
2023-09-12 09:40:25,946 | local start-api command is called                                                                                                                            
2023-09-12 09:40:25,965 | No Parameters detected in the template                                                                                                                       
2023-09-12 09:40:25,974 | There is no customer defined id or cdk path defined for resource HelloWorldFunction, so we will use the resource logical id as the resource id               
2023-09-12 09:40:25,974 | There is no customer defined id or cdk path defined for resource ServerlessRestApi, so we will use the resource logical id as the resource id                
2023-09-12 09:40:25,975 | 0 stacks found in the template                                                                                                                               
2023-09-12 09:40:25,975 | No Parameters detected in the template                                                                                                                       
2023-09-12 09:40:25,981 | There is no customer defined id or cdk path defined for resource HelloWorldFunction, so we will use the resource logical id as the resource id               
2023-09-12 09:40:25,982 | There is no customer defined id or cdk path defined for resource ServerlessRestApi, so we will use the resource logical id as the resource id                
2023-09-12 09:40:25,982 | 2 resources found in the stack                                                                                                                               
2023-09-12 09:40:25,982 | Found Serverless function with name='HelloWorldFunction' and CodeUri='/workspace/hello-world/'                                                               
2023-09-12 09:40:26,028 | watch resource /workspace/template.yaml                                                                                                                      
2023-09-12 09:40:26,029 | Create Observer for resource /workspace/template.yaml with recursive True                                                                                    
2023-09-12 09:40:26,030 | watch resource /workspace/template.yaml's parent /workspace                                                                                                  
2023-09-12 09:40:26,030 | Create Observer for resource /workspace with recursive False                                                                                                 
2023-09-12 09:40:26,092 | Initializing the lambda functions containers.                                                                                                                
2023-09-12 09:40:26,093 | Async execution started                                                                                                                                      
2023-09-12 09:40:26,093 | Invoking function functools.partial(<function InvokeContext._initialize_all_functions_containers.<locals>.initialize_function_container at 0xffffa8047ba0>,  
Function(function_id='HelloWorldFunction', name='HelloWorldFunction', functionname='HelloWorldFunction', runtime='nodejs18.x', memory=None, timeout=3, handler='app.lambdaHandler',    
imageuri=None, packagetype='Zip', imageconfig=None, codeuri='/workspace/hello-world/', environment=None, rolearn=None, layers=[], events={'HelloWorld': {'Type': 'Api', 'Properties':  
{'Path': '/hello', 'Method': 'get', 'RestApiId': 'ServerlessRestApi'}}}, metadata={'SamResourceId': 'HelloWorldFunction'}, inlinecode=None, codesign_config_arn=None,                  
architectures=['x86_64'], function_url_config=None, function_build_info=<FunctionBuildInfo.BuildableZip: ('BuildableZip', 'Regular ZIP function which can be build with SAM CLI')>,    
stack_path='', runtime_management_config=None))                                                                                                                                        
2023-09-12 09:40:26,096 | Waiting for async results                                                                                                                                    
2023-09-12 09:40:26,100 | No environment variables found for function 'HelloWorldFunction'                                                                                             
2023-09-12 09:40:26,101 | Loading AWS credentials from session with profile 'None'                                                                                                     
2023-09-12 09:40:26,276 | Resolving code path. Cwd=/Users/dkMaLyLo/Documents/src/aws-sam-cli-repro, CodeUri=/workspace/hello-world/                                                    
2023-09-12 09:40:26,277 | Resolved absolute path to code is /workspace/hello-world/                                                                                                    
2023-09-12 09:40:26,393 | watch resource /workspace/hello-world/                                                                                                                       
2023-09-12 09:40:26,393 | Create Observer for resource /workspace/hello-world/ with recursive True                                                                                     
2023-09-12 09:40:26,459 | watch resource /workspace/hello-world/'s parent /workspace                                                                                                   
2023-09-12 09:40:26,460 | Code /workspace/hello-world/ is not a zip/jar file                                                                                                           
2023-09-12 09:40:29,986 | Local image is out of date and will be updated to the latest runtime. To skip this, pass in the parameter --skip-pull-image                                  
Building image...........................................................................................................................................................................................................................................................................................................................................................................................................................
2023-09-12 09:41:10,808 | Using local image: public.ecr.aws/lambda/nodejs:18-rapid-x86_64.                                                                                             
                                                                                                                                                                                       
2023-09-12 09:41:10,810 | Mounting /workspace/hello-world/ as /var/task:ro,delegated, inside runtime container                                                                         
2023-09-12 09:41:11,120 | Exception raised during the execution                                                                                                                        
2023-09-12 09:41:11,121 | Lambda functions containers initialization failed because of 500 Server Error for                                                                            
http+docker://localhost/v1.35/containers/d55e542097167385c2c5d52cd3651592b12a3ccff5ac62ea4586d39f1cbce068/start: Internal Server Error ("error while creating mount source path        
'/host_mnt/workspace/hello-world': mkdir /host_mnt/workspace: read-only file system")                                                                                                  
2023-09-12 09:41:11,123 | Terminating all running warm containers                                                                                                                      
2023-09-12 09:41:11,123 | Terminate running warm container for Lambda Function 'HelloWorldFunction'                                                                                    
2023-09-12 09:41:11,273 | Cleaning all decompressed code dirs                                                                                                                          
2023-09-12 09:41:11,280 | Telemetry endpoint configured to be https://aws-serverless-tools-telemetry.us-west-2.amazonaws.com/metrics                                                   
2023-09-12 09:41:11,857 | Telemetry endpoint configured to be https://aws-serverless-tools-telemetry.us-west-2.amazonaws.com/metrics                                                   
2023-09-12 09:41:11,857 | Sending Telemetry: {'metrics': [{'commandRun': {'requestId': '228a789b-748b-4bc2-84b8-c7eb8680e85b', 'installationId':                                       
'ac71eba1-d337-48e5-960e-714e9eda4f7e', 'sessionId': 'd9be9397-3c79-42a9-9394-490ec4bc799f', 'executionEnvironment': 'CLI', 'ci': False, 'pyversion': '3.11.2', 'samcliVersion':       
'1.97.0', 'awsProfileProvided': False, 'debugFlagProvided': True, 'region': '', 'commandName': 'sam local start-api', 'metricSpecificAttributes': {'projectType': 'CFN', 'gitOrigin':  
None, 'projectName': '21a3230e03772a58aff1b3709a9e232850916337e1fba95c434076b6668c6e08', 'initialCommit': None}, 'duration': 45436, 'exitReason': 'ContainersInitializationException', 
'exitCode': 1}}]}                                                                                                                                                                      
2023-09-12 09:41:11,858 | Unable to find Click Context for getting session_id.                                                                                                         
2023-09-12 09:41:11,860 | Sending Telemetry: {'metrics': [{'events': {'requestId': '5d1e658a-a542-4ab2-9d04-71573b0bcec7', 'installationId': 'ac71eba1-d337-48e5-960e-714e9eda4f7e',   
'sessionId': 'd9be9397-3c79-42a9-9394-490ec4bc799f', 'executionEnvironment': 'CLI', 'ci': False, 'pyversion': '3.11.2', 'samcliVersion': '1.97.0', 'commandName': 'sam local           
start-api', 'metricSpecificAttributes': {'events': [{'event_name': 'SamConfigFileExtension', 'event_value': '.toml', 'thread_id': '045bb4ff0e3646108ff1d543c4af58c8', 'time_stamp':    
'2023-09-12 09:40:25.607', 'exception_name': None}]}}}]}                                                                                                                               
2023-09-12 09:41:12,616 | HTTPSConnectionPool(host='aws-serverless-tools-telemetry.us-west-2.amazonaws.com', port=443): Read timed out. (read timeout=0.1)                             
2023-09-12 09:41:12,617 | HTTPSConnectionPool(host='aws-serverless-tools-telemetry.us-west-2.amazonaws.com', port=443): Read timed out. (read timeout=0.1)                             
Error: Lambda functions containers initialization failed

Expected result:

I expected the SAM CLI to be launchable from inside Docker. Docker is heavily used by most developers, especially through dev containers.

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

  1. OS: Windows or Mac
  2. sam --version: SAM CLI, version 1.97.0
  3. AWS region: eu-west-1
{
  "version": "1.97.0",
  "system": {
    "python": "3.11.2",
    "os": "Linux-5.15.49-linuxkit-pr-aarch64-with-glibc2.36"
  },
  "additional_dependencies": {
    "docker_engine": "24.0.5",
    "aws_cdk": "Not available",
    "terraform": "Not available"
  },
  "available_beta_feature_env_vars": [
    "SAM_CLI_BETA_FEATURES",
    "SAM_CLI_BETA_BUILD_PERFORMANCE",
    "SAM_CLI_BETA_TERRAFORM_SUPPORT",
    "SAM_CLI_BETA_RUST_CARGO_LAMBDA"
  ]
}

ffMathy added the stage/needs-triage label on Sep 12, 2023
@sriram-mv
Contributor

> error while creating mount source path '/host_mnt/workspace/hello-world': mkdir /host_mnt/workspace: read-only file system

This seems to be the culprit. To dig deeper: is this about trying to run Docker in Docker?

sriram-mv added the area/docker label and removed the stage/needs-triage label on Sep 12, 2023
@ffMathy
Author

ffMathy commented Sep 13, 2023

Well, essentially we just want to use the SAM CLI in a devcontainer in some way.

But no matter what we do, something (the SAM CLI, the devcontainer, or VS Code) crashes or disconnects in some weird manner, one way or another.

We find it almost impossible to use, and we've spent many weeks trying out many different options.

I think it's really important that this becomes officially supported and well-tested by the SAM team. Perhaps even documented with an example.

@ffMathy
Author

ffMathy commented Sep 13, 2023

I forgot to mention that to repro, you must:

  • open the devcontainer in VS Code.
  • execute "start.sh" to start the lambda.
  • execute "execute.sh" to invoke the running lambda with curl.

hawflau added the stage/needs-investigation label on Sep 18, 2023
@jysheng123
Contributor

Hi, thanks for specifying the reproduction steps. I was able to run sam local start-api in the dev container using the repo you provided for scenario 2, and I had no issues starting or invoking it. I also confirmed that my local files were the ones in use by editing them and verifying the changes after restarting and invoking. Could you tell me more about how you reproduced scenario 2 and, more specifically, attach the output of sam --info from inside the dev container? I have some theories about what may have happened, but I need more information about reproducing it to verify them. Thanks

jysheng123 added the blocked/more-info-needed label on Sep 26, 2023
@ffMathy
Author

ffMathy commented Sep 26, 2023

Scenario 2 only fails on Mac. Scenario 1 only fails on Windows when Docker is used via WSL2.

Did you use the right OS?

We verified with 2 Windows machines and 2 Mac machines.

@ffMathy
Author

ffMathy commented Sep 26, 2023

Also note that the repro links for scenarios 1 and 2 are not the same. They point to two separate branches.

jysheng123 removed the blocked/more-info-needed label on Sep 27, 2023
@jysheng123
Contributor

I have verified using the correct scenario 2 link, and I can still run my local server fine with ./start.sh. Just to verify: you are failing at that script, and not when running execute.sh, right? Also, could you please send the sam --info output from the dev container regardless? I am still having issues reproducing it.

@ffMathy
Author

ffMathy commented Sep 27, 2023

I see. I'll provide the SAM information soon when I'm at my computer.

Start.sh is not enough 🙂 You need to also execute. It's only once it executes that the container disconnects.

@ffMathy
Author

ffMathy commented Sep 27, 2023

Here is my sam --info output. I ran it inside the dev-container on my Mac machine:

{
  "version": "1.97.0",
  "system": {
    "python": "3.11.2",
    "os": "Linux-5.15.49-linuxkit-pr-aarch64-with-glibc2.36"
  },
  "additional_dependencies": {
    "docker_engine": "24.0.5",
    "aws_cdk": "Not available",
    "terraform": "Not available"
  },
  "available_beta_feature_env_vars": [
    "SAM_CLI_BETA_FEATURES",
    "SAM_CLI_BETA_BUILD_PERFORMANCE",
    "SAM_CLI_BETA_TERRAFORM_SUPPORT",
    "SAM_CLI_BETA_RUST_CARGO_LAMBDA"
  ]
}

@ffMathy
Author

ffMathy commented Sep 27, 2023

My Mac is an M1 Mac by the way, so ARM architecture. Not sure if it matters.

@jysheng123
Contributor

Hi, our team is still having issues reproducing your error in scenario 2 with the windows_fix branch. The path host_mnt/workspace is interesting because nothing in our codebase creates new paths like this. Combined with the fact that we can't replicate it in the same environment, this points towards an issue with Docker. A couple of Google searches for similar issues indicate that it may be caused by not installing and updating Docker through the officially supported sources (moby/moby#34427 and https://stackoverflow.com/questions/45764477/docker-compose-error-while-creating-mount-source-path). Could you try uninstalling and reinstalling Docker through the official channels and let us know what happens? How is Docker installed on your machine?

@ffMathy
Author

ffMathy commented Sep 27, 2023

I'll investigate tomorrow.

As for scenario 1 on Mac, is that possible to replicate for you?

@jysheng123
Contributor

jysheng123 commented Sep 27, 2023

Scenario 1 on Mac works properly for me. For Windows, I will get back to you; I am having some issues with my virtual machine for now.

@ffMathy
Author

ffMathy commented Sep 28, 2023

Ah yes I meant scenario 1 on Windows.

Sounds good!

@jysheng123
Contributor

For Windows, Docker in Docker is not working properly for me: Docker is not installed in the dev container, so I cannot verify any of the scenarios on the Windows machine. Was Docker set up correctly inside the dev container on your Windows machine?

@stefanalexandru02

Docker was set up on the Windows machine itself and exposed through WSL2.

In the mac branch, the container connection was somehow broken, while in the windows branch it was fine.

Docker itself was set up correctly, as other containers that do not use the SAM CLI are working fine.

@jysheng123
Contributor

Great. Did the result for scenario 2 on Mac change after fixing Docker?

@jysheng123
Contributor

Hey, just pinging to ask whether both scenarios are fixed after updating Docker. If so, I can close the ticket :).

@ffMathy
Author

ffMathy commented Oct 3, 2023

> Hey, just pinging to inquire about an update if you have a fix for both of the scenarios after updating the docker. If so, I can close the ticket :).

We will get back with a response tomorrow. Please don't close yet.

Cc @stefanalexandru02 🙏

@stefanalexandru02

stefanalexandru02 commented Oct 4, 2023

No, it's still failing with the latest docker version. Same exact devcontainer crash on Windows

@lucashuy
Contributor

Hi, sorry about the delay in response. I can reproduce this issue inside of WSL2 (instead of on Windows). We'll need to investigate a bit deeper as to why this happens on Windows, and why the provided fix works on one OS and not the other, before we can provide a definitive fix.

The provided project does work on Windows itself, though (not WSL2), so this could be a potential workaround for now while we investigate.

@mildaniel
Contributor

I did some investigating here and came to a similar conclusion to @lucashuy.

On Mac, I was able to run commands successfully both with binding the host socket and without. Without the bind it worked out-of-the-box. With the bind, I had to update the file sharing settings on the host machine that were adopted by the dev container.

On Windows, I am able to get SAM CLI working on the dev container when starting VS Code from Windows itself, not WSL. With WSL, I am seeing the same issue being reported. My question is why use WSL as an intermediary here? It seems to me as though the end goal of using the dev container is the same whether using Windows or WSL.

Is there a use-case we're missing that requires you to run dev container with WSL? If not, I think we can resolve this issue.

@stefanalexandru02

The main reason is performance. When running Docker with the WSL backend, starting it from Windows is much slower than starting it from WSL, especially for larger projects.

@ffMathy
Author

ffMathy commented Nov 9, 2023

Yes, and WSL is also default for everyone. It's what most people use nowadays.

@mildaniel
Contributor

Did some more digging today, and we were able to get it to work with WSL. There were a few issues that we needed to resolve:

  1. Since we bind the host Docker service, subsequent containers started by the dev container are run as side-car containers on WSL. This means that the CodeUri property should match the path corresponding to the WSL filesystem, not the dev container filesystem.
  2. Setting "remoteUser": "root" in the .devcontainer.json causes an issue with sam build, since it creates an .aws-sam directory owned by root and subsequent commands won't have sufficient permissions. You can either update the user (recommended) or change the permissions of the .aws-sam directory after build.
  3. I'm not entirely sure why, but Docker kept asking for credentials. I needed to remove the existing credential store config located in ~/.docker/config and then login to Docker with docker login.
  4. Run local emulation commands with the --container-host host.docker.internal flag like you have in your example.

There are a lot of levels of virtualization here that complicate things so let us know if these steps help!
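
Points 1 and 4 above can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: it assumes the project directory on the host (or WSL) filesystem is bind-mounted into the dev container at /workspace, as in the repro logs. The helper names `to_host_path` and `start_api_args` and the example host directory are hypothetical. The two flags are the ones visible in the debug output earlier in this thread.

```python
from pathlib import PurePosixPath


def to_host_path(container_path: str, container_root: str, host_root: str) -> str:
    """Map a path inside the dev container to the same file on the Docker host.

    Assumes host_root is bind-mounted at container_root. Raises ValueError
    if container_path is not located under container_root.
    """
    relative = PurePosixPath(container_path).relative_to(container_root)
    return str(PurePosixPath(host_root) / relative)


def start_api_args(host_root: str) -> list[str]:
    """Build an argument list for `sam local start-api` (hypothetical helper)."""
    return [
        "sam", "local", "start-api",
        # Lets the CLI reach Lambda containers started by the host daemon.
        "--container-host", "host.docker.internal",
        # Directory on the Docker host where the project files live.
        "--docker-volume-basedir", host_root,
    ]


if __name__ == "__main__":
    host_root = "/home/me/src/aws-sam-cli-repro"  # hypothetical host directory
    print(to_host_path("/workspace/hello-world/", "/workspace", host_root))
    print(" ".join(start_api_args(host_root)))
```

The idea is that, with the host Docker socket bound into the dev container, the daemon resolves mount sources on the host side, so a container-only path such as /workspace/hello-world does not necessarily exist there.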

@ffMathy
Author

ffMathy commented Nov 10, 2023

Thank you for looking into it. Have not yet verified that.

It would be great if there were some documentation on an approach to SAM in WSL, that was verified working on WSL + Mac. It's a maze right now, and it's really hard to get working for both at the same time.

@mildaniel
Contributor

We can look into adding some more documentation about development environments and some of the nuances with WSL and dev container.

I am going to remove the bug tag since this is a system configuration issue and nothing we can really do on the SAM CLI side.

@mildaniel
Contributor

Resolving for now. Please create a new issue if anything else comes up!

Contributor

github-actions bot commented Dec 4, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@ffMathy
Author

ffMathy commented Dec 4, 2023

I must say I am quite disappointed in this. Wouldn't currently recommend the SAM CLI to anyone in its current state. I hope things will improve.

@mancinifm

mancinifm commented Feb 21, 2024

> 3. I'm not entirely sure why, but Docker kept asking for credentials. I needed to remove the existing credential store config located in `~/.docker/config` and then login to Docker with `docker login`.

This resolved my issue. Thanks heaps for your investigation, @mildaniel !

So just to confirm: WSL + DevContainer + sam local start-lambda is working here.
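
For anyone trying the credential step, here is a hedged Python sketch of removing the credential store entry from the Docker client config. It rests on assumptions the thread does not confirm: that the file in question is the Docker CLI's usual `~/.docker/config.json`, and that the entry to remove is the `credsStore` key. The helper names `drop_creds_store` and `rewrite_config` are hypothetical, and the sample value is only illustrative. Back up the file first, and run `docker login` afterwards.

```python
import json
from pathlib import Path


def drop_creds_store(config: dict) -> dict:
    """Return a copy of a Docker client config without the 'credsStore' key."""
    return {key: value for key, value in config.items() if key != "credsStore"}


def rewrite_config(path: Path) -> bool:
    """Rewrite the config file at `path` without 'credsStore'.

    Returns True if the file was changed, False if it is missing or
    already has no 'credsStore' entry.
    """
    if not path.exists():
        return False
    original = json.loads(path.read_text())
    cleaned = drop_creds_store(original)
    if cleaned == original:
        return False
    path.write_text(json.dumps(cleaned, indent=2) + "\n")
    return True


if __name__ == "__main__":
    # In-memory demo only; no file on disk is touched here.
    sample = {"auths": {}, "credsStore": "desktop"}  # illustrative sample config
    print(drop_creds_store(sample))
    # For a real run, one might call (assumed default location):
    # rewrite_config(Path.home() / ".docker" / "config.json")
```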
