abctl: install, uninstall, install fails #41992

Closed
jonseymour opened this issue Jul 16, 2024 · 21 comments
Labels: area/abctl, community, team/deployments, type/bug

Comments

@jonseymour

jonseymour commented Jul 16, 2024

What happened?

I can consistently cause abctl to fail by:

  • creating a new AWS EC2 instance
  • running abctl local install once
  • running abctl local uninstall once
  • running abctl local install again
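
The steps above can be sketched as a small script. This is only an illustration of the sequence in the report: the `ABCTL` override variable and the function name `repro_install_cycle` are hypothetical, added so the sequence can be dry-run against a stub instead of the real CLI.

```shell
#!/bin/sh
# Sketch of the install / uninstall / install cycle described in the report.
# ABCTL is an illustrative override (not an abctl feature) for dry runs.
ABCTL="${ABCTL:-abctl}"

repro_install_cycle() {
  # First install: succeeds on a fresh instance according to the report.
  "$ABCTL" local install || { echo "first install failed" >&2; return 1; }
  # Uninstall the cluster created by the first install.
  "$ABCTL" local uninstall || { echo "uninstall failed" >&2; return 2; }
  # Second install: the step that fails in the report.
  "$ABCTL" local install || { echo "second install failed" >&2; return 3; }
}
```

Calling `repro_install_cycle` on a fresh instance and checking for exit status 3 would confirm that the failure is specific to the second install.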

The second invocation of abctl local install consistently fails with:

abctl local install
  INFO    Using Kubernetes provider:
            Provider: kind
            Kubeconfig: /opt/airbyte/.airbyte/abctl/abctl.kubeconfig
            Context: kind-airbyte-abctl
 SUCCESS  Found Docker installation: version 25.0.3
 SUCCESS  Port 8000 appears to be available
  INFO    No existing cluster found, cluster 'airbyte-abctl' will be created
  ERROR   Cluster 'airbyte-abctl' could not be created
  ERROR   unable to create kind cluster: failed to init node with kubeadm: command "docker exec --privileged airbyte-abctl-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

A gist containing the logs of the control plane container can be found here:

https://gist.github.com/jonseymour/3e3e4e52c928ddde351b1244028228f6

AFAICT there are no other error messages of any kind. There don't appear to be any messages in the logs of the airbyte-abctl-control-plane container.

Things I have tried:

  • removing the .airbyte directory
  • restarting the docker service
  • rebooting the server
  • running docker system prune
  • confirmed that there is no docker volume associated with the original install

It isn't clear to me where the state that is preventing the second install from working is stored.
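
One way to look for leftover state is to list the Docker objects that a kind cluster can create. The sketch below is not a diagnosis and does not claim that any of these objects holds the state in question. The label and network name are taken from the `docker run` command quoted later in this thread; the `DOCKER` and `CLUSTER` variables and the function name are hypothetical.

```shell
#!/bin/sh
# List Docker objects that a kind cluster named airbyte-abctl may leave behind.
# DOCKER is an illustrative override for dry runs, not a Docker feature.
DOCKER="${DOCKER:-docker}"
CLUSTER="${CLUSTER:-airbyte-abctl}"

list_kind_leftovers() {
  echo "== containers labelled for cluster $CLUSTER =="
  "$DOCKER" ps -a --filter "label=io.x-k8s.kind.cluster=$CLUSTER"
  echo "== networks with 'kind' in the name =="
  "$DOCKER" network ls --filter "name=kind"
  echo "== volumes =="
  "$DOCKER" volume ls
}
```

If all three listings are empty after an uninstall, the remaining state is probably outside Docker's object store, for example in the host kernel or OS configuration.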

What did you expect to happen?

Both installs work as expected

Abctl Version

$ abctl version
abctl version
version: v0.7.1

  INFO    A new release of abctl is available: v0.7.1 -> v0.7.2
          Updating to the latest version is highly recommended

Docker Version

$ docker version
Client:
 Version:           25.0.3
 API version:       1.44
 Go version:        go1.20.12
 Git commit:        4debf41
 Built:             Wed Feb 28 00:29:45 2024
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          25.0.3
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.20.12
  Git commit:       f417435
  Built:            Wed Feb 28 00:30:22 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.11
  GitCommit:        64b8a811b07ba6288238eefc14d898ee0b5b99ba
 runc:
  Version:          1.1.11
  GitCommit:        4bccb38cc9cf198d52bebf2b3a90cd14e7af8c06
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

OS Version

# On Linux:
$ cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"
@marcosmarxm
Member

Thanks for reporting the issue. @airbytehq/platform-deployments can someone take a look and try to reproduce the steps?

@colesnodgrass
Member

@jonseymour could you run the docker command that is failing directly in your terminal and provide the results?

docker exec --privileged airbyte-abctl-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6

The --config=/kind/kubeadm.conf also leads me to believe that you may have a pre-existing kind configuration that might be interfering. Could you also provide the contents of that file?

@colesnodgrass colesnodgrass self-assigned this Jul 16, 2024
@jonseymour
Author

jonseymour commented Jul 17, 2024

Thanks for the reply. I am not exactly sure which file system I am meant to be looking at.

  • I am assuming it is not the host file system (and certainly, no such file exists even on a working installation).
  • There are no docker volumes at this point either.
  • It isn't in ~/.airbyte because I have removed that directory.
  • kind get clusters and kind get nodes both report empty.

Where would I find the persistent storage associated with this path?
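
Because the failing command runs kubeadm through `docker exec`, the `/kind/kubeadm.conf` path is most likely resolved inside the airbyte-abctl-control-plane container rather than on the host. A minimal sketch for printing it, assuming the container still exists when the command runs (the `DOCKER` and `NODE` variables and the function name are hypothetical):

```shell
#!/bin/sh
# Print the kubeadm config from inside the kind node container, if it is running.
# DOCKER is an illustrative override for dry runs.
DOCKER="${DOCKER:-docker}"
NODE="${NODE:-airbyte-abctl-control-plane}"

show_kubeadm_conf() {
  # docker ps -q prints nothing when no running container matches the name.
  if [ -z "$("$DOCKER" ps -q --filter "name=$NODE")" ]; then
    echo "container $NODE is not running" >&2
    return 1
  fi
  "$DOCKER" exec "$NODE" cat /kind/kubeadm.conf
}
```

If the container is removed as soon as cluster creation fails, the function will report that it is not running, so it would have to be called while the install is still in progress.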

@jonseymour
Author

jonseymour commented Jul 17, 2024

I'll have to create another test system, but I ran that docker command directly in the past

docker exec --privileged airbyte-abctl-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6

and it failed because at that point the airbyte-abctl-control-plane container no longer existed.

I did have the log from its execution at one point but it didn't show any interesting (to me, anyway) errors. I will create another instance when I get a chance.

@jonseymour
Author

jonseymour commented Jul 17, 2024

I have added a link to a gist that contains the logs of the control plane Docker container, and amended the issue description to include that link.

I confirmed that kind get clusters is empty, as is kind get nodes.

@justinmiller61

justinmiller61 commented Jul 19, 2024

I ran into this too. When I run the docker command directly, I get a permission denied error when accessing the ~/.airbyte directory. Looking at the permissions, the directory is missing the user-read bit, so the mode is d-wxr-xr-x. Adding the user-read bit allows the cluster to be created and everything starts up, but the install command never returns. Or rather, it eventually does, but it says some condition failed without saying what the condition is, even though I can access the Airbyte UI and things seem to be running fine.

This is the error helm eventually returns:

Helm release airbyte not installed. Installing...
Error: INSTALLATION FAILED: failed post-install: 1 error occurred:
	* timed out waiting for the condition
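
The permission check described in this comment can be sketched as follows. The helper names are hypothetical; the code only inspects the mode string printed by `ls -ld` and adds the owner read bit when it is missing.

```shell
#!/bin/sh
# Inspect a directory's mode string and restore the owner read bit if absent.

# Print the ten-character mode string, e.g. d-wxr-xr-x.
dir_mode() { ls -ld "$1" | cut -c1-10; }

# Succeed when the second character (owner read) is 'r'.
owner_can_read() {
  case "$(dir_mode "$1")" in
    ?r*) return 0 ;;
    *) return 1 ;;
  esac
}

# Add the owner read bit only when it is missing.
fix_owner_read() {
  owner_can_read "$1" || chmod u+r "$1"
}
```

For example, `fix_owner_read "$HOME/.airbyte"` would turn a mode of d-wxr-xr-x into drwxr-xr-x and leave a directory that is already readable untouched.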

@tturkenitz

tturkenitz commented Jul 19, 2024

abctl failing for me as well when performing an install. Running latest version of abctl (0.8.1).

➜  ~ abctl local install
  INFO    Using Kubernetes provider:
            Provider: kind
            Kubeconfig: /Users/nsa_agent/.airbyte/abctl/abctl.kubeconfig
            Context: kind-airbyte-abctl
 SUCCESS  Found Docker installation: version 27.0.3
 SUCCESS  Port 8000 appears to be available
  INFO    No existing cluster found, cluster 'airbyte-abctl' will be created
  ERROR   Cluster 'airbyte-abctl' could not be created
  ERROR   unable to create kind cluster: command "docker run --name airbyte-abctl-control-plane --hostname airbyte-abctl-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER --detach --tty --label io.x-k8s.kind.cluster=airbyte-abctl --net kind --restart=on-failure:1 --init=false --cgroupns=private --volume=/Users/nsa_agent/.airbyte/abctl/data:/var/local-path-provisioner --publish=0.0.0.0:8000:80/TCP --publish=127.0.0.1:52919:6443/TCP -e KUBECONFIG=/etc/kubernetes/admin.conf kindest/node:v1.29.4@sha256:3abb816a5b1061fb15c6e9e60856ec40d56b7b52bcea5f5f1350bc6e2320b6f8" failed with error: exit status 126

Running the command directly shows a permission error related to mounts:

739e4c52c450dfafd18494fd07315d9af157797edbaee29c4e3fd69f1378360d
docker: Error response from daemon: error while creating mount source path '/host_mnt/Users/nsa_agent/.airbyte/abctl/data': mkdir /host_mnt/Users/nsa_agent/.airbyte/abctl: permission denied.

I love blackboxes that you can't configure and control. So much better than Docker Compose!

@justinmiller61

Yep. Same here. See my response about changing the permissions and re-running the install.

It still ultimately fails, but for different/unknown reasons.

@tturkenitz

tturkenitz commented Jul 19, 2024

That seems to be the case from your example, yes. I corrected the permissions on the directory and was able to progress further with the installation. I'll edit the comment with the results of the command once it completes.

Edit: Installation completed successfully and Airbyte is accessible through the UI. The permission issue on the hidden folder is the cause for the initial failure as @justinmiller61 mentioned.

Second Edit: Correcting the permissions on ~/.airbyte/ assisted in getting the service up and running. However, there seem to be core issues in the installation that cannot be resolved. Specifically, while the service is up, attempting to create any connection fails with this error:

  File "/usr/local/lib/python3.10/site-packages/airbyte_cdk/connector.py", line 71, in write_config
    with open(config_path, "w") as fh:
PermissionError: [Errno 13] Permission denied: 'source_config.json'

I tried setting ~/.airbyte to 777 but the error remains.

@billy-cocomo

Can confirm this issue remains on macOS. I am trying the solutions above and will report back if I encounter a bug when setting up a source.

@justinmiller61

Setting the permissions on the .airbyte directory allows everything to deploy, at least as far as get pods reports. And I can connect to the UI, set up connections and sync them. But the install command still reports failure after a timeout. It’s not clear what condition it’s waiting on and get events isn’t particularly helpful.

If this isn’t related to the original issue, then let me know and I can open a separate issue.
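
For reference, the get pods and get events checks mentioned above could be run against the abctl cluster roughly like this. It is a sketch under assumptions: the kubeconfig path is the one printed by abctl in the install output quoted in this thread, all namespaces are listed because the thread does not name the namespace, and the `KUBECTL` and `KUBECONFIG_PATH` variables are hypothetical.

```shell
#!/bin/sh
# Show pods and recent events for the abctl-managed kind cluster.
# KUBECTL is an illustrative override for dry runs.
KUBECTL="${KUBECTL:-kubectl}"
KUBECONFIG_PATH="${KUBECONFIG_PATH:-$HOME/.airbyte/abctl/abctl.kubeconfig}"

show_cluster_state() {
  # Pod status across all namespaces.
  "$KUBECTL" --kubeconfig "$KUBECONFIG_PATH" get pods --all-namespaces
  # Events sorted so the most recent ones come last.
  "$KUBECTL" --kubeconfig "$KUBECONFIG_PATH" get events --all-namespaces --sort-by=.lastTimestamp
}
```

Sorting events by timestamp makes it easier to see what happened just before the post-install timeout.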

@td-heinm

I experienced the same issue with abctl local install. Here are the steps I have taken to get it working:

Issue:

❯ abctl local install
  INFO    Using Kubernetes provider:
            Provider: kind
            Kubeconfig: /Users/heinm/.airbyte/abctl/abctl.kubeconfig
            Context: kind-airbyte-abctl
 SUCCESS  Found Docker installation: version 27.0.3
 SUCCESS  Port 8000 appears to be available
  INFO    No existing cluster found, cluster 'airbyte-abctl' will be created
  ERROR   Cluster 'airbyte-abctl' could not be created
  ERROR   unable to create kind cluster: command "docker run --name airbyte-abctl-control-plane \
  --hostname airbyte-abctl-control-plane --label io.x-k8s.kind.role=control-plane \
  --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp \
  --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER \
  --detach --tty --label io.x-k8s.kind.cluster=airbyte-abctl --net kind --restart=on-failure:1 --init=false \
  --cgroupns=private --volume=/Users/heinm/.airbyte/abctl/data:/var/local-path-provisioner \
  --publish=0.0.0.0:8000:80/TCP --publish=127.0.0.1:58566:6443/TCP -e KUBECONFIG=/etc/kubernetes/admin.conf \
  kindest/node:v1.29.4@sha256:3abb816a5b1061fb15c6e9e60856ec40d56b7b52bcea5f5f1350bc6e2320b6f8" \
  failed with error: exit status 126

I used the

abctl local uninstall

command to ensure everything was clean (I was hoping/assuming it would help...), then changed the directory permissions using:

sudo chmod 755 /Users/heinm/.airbyte/abctl

After these changes, reinstalling using the steps explained in the Quickstart (abctl local install) guide was successful.
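
The workaround above, written as one sequence. This is a sketch, not an official procedure: the `ABCTL` and `ABCTL_DIR` variables and the function name are hypothetical, and `chmod` may need `sudo` (as in the comment above) if the directory is owned by another user.

```shell
#!/bin/sh
# Uninstall, repair the directory permissions, then install again.
# ABCTL is an illustrative override for dry runs.
ABCTL="${ABCTL:-abctl}"
ABCTL_DIR="${ABCTL_DIR:-$HOME/.airbyte/abctl}"

reinstall_with_fixed_perms() {
  "$ABCTL" local uninstall || return 1
  # Only touch the directory if it exists; uninstall may leave it in place.
  if [ -d "$ABCTL_DIR" ]; then
    chmod 755 "$ABCTL_DIR" || return 1
  fi
  "$ABCTL" local install
}
```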

@SoheilSalmani

@td-heinm Thanks! It works now.

@colesnodgrass
Member

I believe this should be fixed in the release I just pushed v0.9.1.

Note the callout in the release notes about removing the ~/.airbyte directory if that directory was already created with the wrong permissions.

@justinmiller61

FWIW this is the cause of my helm timeout issue that I reported above: #38598

@jprpai

jprpai commented Jul 25, 2024

@colesnodgrass can we please have someone look at #38598? I opened a PR to fix this months ago and it hasn't been looked at.

@jonseymour
Author

FWIW @colesnodgrass, AFAICT the problem that I reported in this issue (see description above) isn't fixed with v0.13.1 of abctl.

The issue persists even if I delete the .airbyte directory completely (which matches the initial condition of the directory).

So, this issue appears to be independent of whatever issue was fixed in v0.9.1

@erpadmin

erpadmin commented Sep 3, 2024

Try with the -v switch during install. I have a similar issue at #45105, but I only see that same error in the verbose output. It looks like the control plane fails a health check and is then deleted (at least on my end, for the specific issue I'm having).
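
A minimal sketch for capturing the verbose output so it can be attached to an issue. The placement of `-v` after `local install` is an assumption based on the suggestion above, and the `ABCTL` variable, the function name, and the default log file name are hypothetical.

```shell
#!/bin/sh
# Run a verbose install, keep the full output in a log, and show the ERROR lines.
# ABCTL is an illustrative override for dry runs.
ABCTL="${ABCTL:-abctl}"

install_verbose() {
  log="${1:-abctl-install.log}"
  rc=0
  # Capture both stdout and stderr; remember the exit status of the install.
  "$ABCTL" local install -v >"$log" 2>&1 || rc=$?
  # Print matching lines with line numbers; no match is not treated as a failure.
  grep -n "ERROR" "$log" || true
  return "$rc"
}
```

Running `install_verbose /tmp/abctl.log` leaves the complete output in the log file and returns the install's own exit status.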

@jonseymour
Author

FWIW: I have upgraded from Amazon Linux 2 to Amazon Linux 2023 and the cycling issues with abctl seem to have abated, which may indicate that there is a kernel or at least OS/distro factor involved.

I am now having separate issues with pre-upgrade hook failures, but I will raise a separate issue for that.

@tom-mont

tom-mont commented Oct 1, 2024

Another potential fix is to temporarily upgrade the size of the EC2 instance. As suggested by Erik:

  • Upgraded from a t3.xlarge to a t3.2xlarge instance
  • ran abctl local uninstall (but there were no changes)
  • ran sudo chmod 755 /home/ec2-user/.airbyte/abctl

After this I was able to run abctl local install.

@tom-mont

Hi all, I was also able to resolve this error by rebooting the EC2 instance.
