Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow extending HttpHook with requests adapters #44285

Open
1 of 2 tasks
kiaradlf opened this issue Nov 22, 2024 · 2 comments · May be fixed by #44302
Open
1 of 2 tasks

allow extending HttpHook with requests adapters #44285

kiaradlf opened this issue Nov 22, 2024 · 2 comments · May be fixed by #44302
Assignees
Labels
area:core-operators Operators, Sensors and hooks within Core Airflow good first issue kind:feature Feature Requests

Comments

@kiaradlf
Copy link

Description

One of the factors making Python's requests library quite flexible is its use of transport adapters, allowing the user to further configure behavior.
Airflow's built-in HttpHook tho unfortunately does not currently expose such functionality to the user.
So far, I've found myself working around this by inheriting the hook with one adding such functionality, see below.
It would be nice tho to see the actual HttpHook extended with an extra parameter to expose this functionality to the user, foregoing the need for such sub-classes.

# mountable_http_hook.py
from typing import Any
from urllib.parse import urlparse

from airflow.providers.http.hooks.http import HttpHook
from requests import Session
from requests.adapters import BaseAdapter
from requests_toolbelt.adapters.socket_options import (  # type: ignore[import-untyped]
    TCPKeepAliveAdapter,
)

class MountableHttpHook(HttpHook):
    """Version of AirFlow's HttpHook that allows mounting custom `requests` adapters."""

    _adapter: BaseAdapter | None

    def __init__(
        self,
        *args,
        adapter: BaseAdapter | None = None,
        tcp_keep_alive: bool = True,
        tcp_keep_alive_idle: int = 120,
        tcp_keep_alive_count: int = 20,
        tcp_keep_alive_interval: int = 30,
        **kwargs,
    ) -> None:
        super().__init__(*args, **kwargs)

        # ensure HttpHook won't override our mounts.
        # set manually instead of by constructor in case overrides
        # pass all params on to a class that doesn't know this parameter,
        # such as is the case with Oauth2HttpHook for requests.Session.
        self.tcp_keep_alive = False

        if adapter is None and tcp_keep_alive:
            # default to the HttpHook's adapter
            adapter = TCPKeepAliveAdapter(
                idle=tcp_keep_alive_idle,
                count=tcp_keep_alive_count,
                interval=tcp_keep_alive_interval,
            )
        self._adapter = adapter

    def get_conn(self, headers: dict[Any, Any] | None = None) -> Session:
        """Add our adapter to the `requests.Session`."""
        session = super().get_conn(headers=headers)
        if self._adapter:
            scheme = urlparse(self.base_url).scheme if self.base_url else "https"
            session.adapters = {scheme: self._adapter}  # type: ignore
        return session

Use case/motivation

Adapters may help handle business logic surrounding HTTP requests, including:

  • retries
  • timeouts
  • HTTP status codes, e.g. to specify which error codes to retry (potentially using say exponential back-off) before marking the task as failed
  • SSL version
  • ...

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@kiaradlf kiaradlf added kind:feature Feature Requests needs-triage label for new issues that we didn't triage yet labels Nov 22, 2024
@dosubot dosubot bot added the area:core-operators Operators, Sensors and hooks within Core Airflow label Nov 22, 2024
@potiuk
Copy link
Member

potiuk commented Nov 22, 2024

Feel free to work and contribute it @kiaradlf , otherwise I am marking it as good first issue and hopefully someone will.

@potiuk potiuk added good first issue and removed needs-triage label for new issues that we didn't triage yet labels Nov 22, 2024
@jieyao-MilestoneHub
Copy link

Hi @kiaradlf and @potiuk,

I’ve submitted a PR (#44302) to address issue #44285 by adding an adapter parameter to the HttpHook. This enhancement allows users to mount custom requests adapters, making it easier to manage use cases like retries and timeouts. I’ve also included thorough tests to ensure smooth functionality and maintain backward compatibility.

I appreciate your time and consideration in reviewing this—thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core-operators Operators, Sensors and hooks within Core Airflow good first issue kind:feature Feature Requests
Projects
None yet
3 participants