Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes local URI for deployments #1695

Closed
wants to merge 4 commits into from
Closed

Conversation

LucasTrg
Copy link
Contributor

Hey ! I've noticed that the network setup for self hosted agenta had to be fixed in my case, I'm not quite sure this is a shared issue for everyone.
The changes in this PR fixes issues in my case with the celery worker not able to reach deployments during evaluations.

The reasoning is that the deployment URI that are stored in the deploymentDB are of the following form http://{host}/{container_name}/{app_name}/{variant_name} and are resolved when using traefik.
When creating a new deployment, the following rule is added f"traefik.http.routers.{container_name}.rule": f"Host('{os.environ['BARE_DOMAIN_NAME']}') && PathPrefix('/{uri_path')", which in the case of a self host adds a Host("localhost") rule. This will of course fail to resolve from an agenta-backend as it's trying to reach the deployment not from localhost but host.docker.internal.

In any case, I don't believe it's necessary to complicate the traefik networking even more as all the deployments, services, etc... are in the same docker network. Instead we can rely on the DNS entries that are automatically managed by docker in a network.

I'm not too familiar with the architectures of the other ways of deploying agenta so feel free to steer this PR any way you'd like

I also took liberty to fix a typo I came across while investigating.

Copy link

vercel bot commented May 22, 2024

@LucasTrg is attempting to deploy a commit to the mmabrouk's projects Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label May 22, 2024
@dosubot dosubot bot added Backend bug Something isn't working labels May 22, 2024
@aakrem
Copy link
Collaborator

aakrem commented May 22, 2024

Hi @LucasTrg, thank you for the PR.
May I ask where are you trying to host agenta? I am asking this because we don't face such issue so I'd like to have more context there.

@LucasTrg
Copy link
Contributor Author

LucasTrg commented May 22, 2024

Hey @aakrem ! I appreciate the help
I've struggled a little to get everything running correctly locally so this might definitely be a setup issue on my end !
I've been using agenta on a ubuntu 24.04 VM using virtualbox and hyper V using the docker-compose.gh.yml as described in the doc. What about you ?
I had some pretty odd issues, for instance I had to downgrade MongoDB to v4.4.6 as v5 is not supported under hyper-V.

I still find it quite odd that the networking would work on your end given the explanation above but I'm quite new to traefik !
Let me know if there's any more detail I'm missing.

@aakrem
Copy link
Collaborator

aakrem commented May 22, 2024

@LucasTrg thanks for the details. Could you please try the command above to run agenta and see if you are still facing the same issues ? I would like to be sure whether the issue is related to the environment where you are running Agenta or not.

docker compose -f "docker-compose.yml" up -d --build

@LucasTrg
Copy link
Contributor Author

LucasTrg commented May 22, 2024

Yep, I can report the exact same problem. The URI using host.docker.internal is still not working, plugging in my fix does the job on my system.
Would you mind sharing your traefik dashboard and a docker inspect of your agenta-network so I can try to figure out what's wrong ?
Thanks !

@bharath97-git
Copy link

@LucasTrg I've been facing the same issue.
Traefik, exposed on port 80 is not accessible from any domain other than localhost.
I added a custom domain to my /etc/hosts to check that and I wasn't able to access.

The fix above has resolved my issue.

@mmabrouk mmabrouk requested a review from aybruhm May 27, 2024 18:29
@LucasTrg
Copy link
Contributor Author

@bharath97-git When you're saying you can't even access traefik itself, do you mean traefik's dashboard or the reverse proxy ?
If it's the latter are you sure that it's a case of not being able to access traefik and not "simply" not matching any of the traefik rule ?
Not being able to access traefik would be very odd as it's minimally configured and is not integrating any of the agenta component as far as I know. Do you run into the same issue if you're only running this ?

services:
    reverse-proxy:
        image: traefik:v2.10
        command: --api.dashboard=true --api.insecure=true --providers.docker --entrypoints.web.address=:80
        ports:
            - "80:80"
            - "8080:8080"
        volumes:
            - /var/run/docker.sock:/var/run/docker.sock
        networks:
            - agenta-network
        restart: always
   hw:
       image: crccheck/hello-world
       ports:"8000:8000"
       labels:
            - "traefik.http.routers.hw.rule=PathPrefix(`/`)"
            - "traefik.http.routers.hw.entrypoints=web"
            - "traefik.http.services.hw.loadbalancer.server.port=8000"

@bharath97-git
Copy link

Hello @LucasTrg,
I was able to access the traefik dasboard, but no the applications for which traefik is acting as proxy. This is happening specifically when the celery worker tries to access the openapi.json using the host.docker.internal DNS. I tried changing the traefik host rules, to include host.docker.internal:
- "traefik.http.routers.backend.rule=Host(${BARE_DOMAIN_NAME}) && PathPrefix(/api/) || HostRegexp(^.+\.docker\.internal$) && PathPrefix(/api/)"

This still doesn't allow celery to access the container using the host.docker.internal domain.

Copy link
Member

@aybruhm aybruhm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR, @LucasTrg! Due to the failing tests, I won't be able to merge it. That being said, I'm going to:

  • close your PR,
  • create a new PR to have the tests run,
  • and ensure the PR is properly QA'ed by the team.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 10, 2024
@aybruhm aybruhm closed this Sep 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Backend bug Something isn't working lgtm This PR has been approved by a maintainer size:S This PR changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants