Update dependency apache-airflow to v2.7.3 [SECURITY] #694

renovate-bot

Mend Renovate

This PR contains the following updates:

Package                              Change
apache-airflow (source, changelog)   ==2.2.5 -> ==2.7.3

GitHub Vulnerability Alerts

CVE-2023-25754

Privilege Context Switching Error vulnerability in Apache Software Foundation Apache Airflow. This issue affects Apache Airflow: before 2.6.0.

CVE-2022-46651

Apache Airflow, versions before 2.6.3, is affected by a vulnerability that allows an unauthorized actor to gain access to sensitive information in Connection edit view. This vulnerability is considered low since it requires someone with access to Connection resources specifically updating the connection to exploit it. Users should upgrade to version 2.6.3 or later which has removed the vulnerability.

CVE-2023-22887

Apache Airflow, versions before 2.6.3, is affected by a vulnerability that allows an attacker to perform unauthorized file access outside the intended directory structure by manipulating the run_id parameter. This vulnerability is considered low since it requires an authenticated user to exploit it. It is recommended to upgrade to a version that is not affected.

CVE-2023-35908

Apache Airflow, versions before 2.6.3, is affected by a vulnerability that allows unauthorized read access to a DAG through the URL. It is recommended to upgrade to a version that is not affected.

CVE-2023-36543

Apache Airflow, versions before 2.6.3, has a vulnerability where an authenticated user can use crafted input to make the current request hang. It is recommended to upgrade to a version that is not affected.

CVE-2023-39508

Execution with Unnecessary Privileges and Exposure of Sensitive Information to an Unauthorized Actor vulnerability in Apache Software Foundation Apache Airflow. The "Run Task" feature enables an authenticated user to bypass some of the restrictions put in place. It allows the user to execute code in the webserver context and to bypass the limits on the access the user has to certain DAGs. The "Run Task" feature is considered dangerous and has been removed entirely in Airflow 2.6.0.

This issue affects Apache Airflow: before 2.6.0.

CVE-2023-40273

The session fixation vulnerability allowed the authenticated user to continue accessing Airflow webserver even after the password of the user has been reset by the admin - up until the expiry of the session of the user. Other than manually cleaning the session database (for database session backend), or changing the secure_key and restarting the webserver, there were no mechanisms to force-logout the user (and all other users with that).

With this fix implemented, when using the database session backend, the existing sessions of the user are invalidated when the password of the user is reset. When using the securecookie session backend, the sessions are NOT invalidated and still require changing the secure key and restarting the webserver (and logging out all other users), but the user resetting the password is informed about it with a flash message warning displayed in the UI. Documentation is also updated explaining this behaviour.

Users of Apache Airflow are advised to upgrade to version 2.7.0 or newer to mitigate the risk associated with this vulnerability.

CVE-2023-37379

Apache Airflow, in versions prior to 2.7.0, contains a security vulnerability that can be exploited by an authenticated user possessing Connection edit privileges. This vulnerability allows the user to access connection information and exploit the test connection feature by sending many requests, leading to a denial of service (DoS) condition on the server. Furthermore, malicious actors can leverage this vulnerability to establish harmful connections with the server.

Users of Apache Airflow are strongly advised to upgrade to version 2.7.0 or newer to mitigate the risk associated with this vulnerability. Additionally, administrators are encouraged to review and adjust user permissions to restrict access to sensitive functionalities, reducing the attack surface.

CVE-2023-39441

Apache Airflow SMTP Provider before 1.3.0, Apache Airflow IMAP Provider before 3.3.0, and Apache Airflow before 2.7.0 are affected by the Validation of OpenSSL Certificate vulnerability.

The default SSL context with SSL library did not check a server's X.509 certificate.  Instead, the code accepted any certificate, which could result in the disclosure of mail server credentials or mail contents when the client connects to an attacker in a MITM position.

Users are strongly advised to upgrade to Apache Airflow version 2.7.0 or newer, Apache Airflow IMAP Provider version 3.3.0 or newer, and Apache Airflow SMTP Provider version 1.3.0 or newer to mitigate the risk associated with this vulnerability.

CVE-2023-40611

Apache Airflow, versions before 2.7.1, is affected by a vulnerability that allows authenticated and DAG-view authorized Users to modify some DAG run detail values when submitting notes. This could have them alter details such as configuration parameters, start date, etc.

Users should upgrade to version 2.7.1 or later which has removed the vulnerability.

CVE-2023-40712

Apache Airflow, versions before 2.7.1, is affected by a vulnerability that allows authenticated users who have access to see the task/dag in the UI, to craft a URL, which could lead to unmasking the secret configuration of the task that otherwise would be masked in the UI.

Users are strongly advised to upgrade to version 2.7.1 or later which has removed the vulnerability.

CVE-2023-42663

Apache Airflow, versions before 2.7.2, has a vulnerability that allows an authorized user with access to read specific DAGs only to read information about task instances in other DAGs. Users of Apache Airflow are advised to upgrade to version 2.7.2 or newer to mitigate the risk associated with this vulnerability.

CVE-2023-42792

Apache Airflow, in versions prior to 2.7.2, contains a security vulnerability that allows an authenticated user with limited access to some DAGs, to craft a request that could give the user write access to various DAG resources for DAGs that the user had no access to, thus, enabling the user to clear DAGs they shouldn't.

Users of Apache Airflow are strongly advised to upgrade to version 2.7.2 or newer to mitigate the risk associated with this vulnerability.

CVE-2023-42780

Apache Airflow, versions prior to 2.7.2, contains a security vulnerability that allows authenticated users of Airflow to list warnings for all DAGs, even if the user had no permission to see those DAGs. It would reveal the dag_ids and the stack-traces of import errors for those DAGs with import errors. Users of Apache Airflow are advised to upgrade to version 2.7.2 or newer to mitigate the risk associated with this vulnerability.

CVE-2023-47037

Apache Airflow, versions before 2.7.3, is affected by a vulnerability that allows authenticated and DAG-view authorized Users to modify some DAG run detail values when submitting notes. This could have them alter details such as configuration parameters, start date, etc.  Users should upgrade to version 2.7.3 or later which has removed the vulnerability.

CVE-2023-42781

Apache Airflow, versions before 2.7.3, has a vulnerability that allows an authorized user who has access to read specific DAGs only, to read information about task instances in other DAGs. This is a different issue than CVE-2023-42663 but leads to a similar outcome. Users of Apache Airflow are advised to upgrade to version 2.7.3 or newer to mitigate the risk associated with this vulnerability.
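
The advisories above each name a first fixed version. As a rough aid for reviewing this bump, the sketch below collects those versions into a lookup and reports which advisories still apply to a given Airflow version. The names `FIXED_IN`, `parse_version` and `open_cves` are made up for this illustration, the version parser only handles plain numeric versions, and CVE-2023-39441 additionally depends on the SMTP and IMAP provider versions, which this sketch does not check.

```python
# First fixed Airflow version per advisory, copied from the text above.
FIXED_IN = {
    "CVE-2023-25754": "2.6.0",
    "CVE-2022-46651": "2.6.3",
    "CVE-2023-22887": "2.6.3",
    "CVE-2023-35908": "2.6.3",
    "CVE-2023-36543": "2.6.3",
    "CVE-2023-39508": "2.6.0",
    "CVE-2023-40273": "2.7.0",
    "CVE-2023-37379": "2.7.0",
    "CVE-2023-39441": "2.7.0",
    "CVE-2023-40611": "2.7.1",
    "CVE-2023-40712": "2.7.1",
    "CVE-2023-42663": "2.7.2",
    "CVE-2023-42792": "2.7.2",
    "CVE-2023-42780": "2.7.2",
    "CVE-2023-47037": "2.7.3",
    "CVE-2023-42781": "2.7.3",
}


def parse_version(text):
    """Turn '2.7.3' into (2, 7, 3). Plain numeric versions only."""
    return tuple(int(part) for part in text.split("."))


def open_cves(installed):
    """Return the advisories whose fix is newer than the installed version."""
    current = parse_version(installed)
    return sorted(
        cve for cve, fixed in FIXED_IN.items() if current < parse_version(fixed)
    )


if __name__ == "__main__":
    print(len(open_cves("2.2.5")), "advisories apply to 2.2.5")
    print(len(open_cves("2.7.3")), "advisories apply to 2.7.3")
```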


Release Notes

apache/airflow (apache-airflow)

v2.7.3

Compare Source

Significant Changes
^^^^^^^^^^^^^^^^^^^

No significant changes.

Bug Fixes
"""""""""

  • Fix premature evaluation of tasks in mapped task group (#​34337)
  • Add TriggerRule missing value in rest API (#​35194)
  • Fix Scheduler crash looping when dagrun creation fails (#​35135)
  • Fix test connection with codemirror and extra (#​35122)
  • Fix usage of cron-descriptor since BC in v1.3.0 (#​34836)
  • Fix get_plugin_info for class based listeners. (#​35022)
  • Some improvements/fixes for dag_run and task_instance endpoints (#​34942)
  • Fix the dags count filter in webserver home page (#​34944)
  • Return only the TIs of the readable dags when ~ is provided as a dag_id (#​34939)
  • Fix triggerer thread crash in daemon mode (#​34931)
  • Fix wrong plugin schema (#​34858)
  • Use DAG timezone in TimeSensorAsync (#​33406)
  • Mark tasks with all_skipped trigger rule as skipped if any task is in upstream_failed state (#​34392)
  • Add read only validation to read only fields (#​33413)

Misc/Internal
"""""""""""""

  • Improve testing harness to separate DB and non-DB tests (#​35160, #​35333)
  • Add pytest db_test markers to our tests (#​35264)
  • Add pip caching for faster build (#​35026)
  • Upper bound pendulum requirement to <3.0 (#​35336)
  • Limit sentry_sdk to 1.33.0 (#​35298)
  • Fix subtle bug in mocking processor_agent in our tests (#​35221)
  • Bump @babel/traverse from 7.16.0 to 7.23.2 in /airflow/www (#​34988)
  • Bump undici from 5.19.1 to 5.26.3 in /airflow/www (#​34971)
  • Remove unused set from SchedulerJobRunner (#​34810)
  • Remove warning about max_tis per query > parallelism (#​34742)
  • Improve modules import in Airflow core by moving some of them into a type-checking block (#​33755)
  • Fix tests to respond to Python 3.12 handling of utcnow in sentry-sdk (#​34946)
  • Add connexion<3.0 upper bound (#​35218)
  • Limit Airflow to < 3.12 (#​35123)
  • update moto version (#​34938)
  • Limit WTForms to below 3.1.0 (#​34943)

Doc Only Changes
""""""""""""""""

  • Fix variables substitution in Airflow Documentation (#​34462)
  • Added example for defaults in conn.extras (#​35165)
  • Update datasets.rst issue with running example code (#​35035)
  • Remove mysql-connector-python from recommended MySQL driver (#​34287)
  • Fix syntax error in task dependency set_downstream example (#​35075)
  • Update documentation to enable test connection (#​34905)
  • Update docs errors.rst - Mention sentry "transport" configuration option (#​34912)
  • Update dags.rst to put SubDag deprecation note right after the SubDag section heading (#​34925)
  • Add info on getting variables and config in custom secrets backend (#​34834)
  • Document BaseExecutor interface in more detail to help users in writing custom executors (#​34324)
  • Fix broken link to airflow_local_settings.py template (#​34826)
  • Fixes python_callable function assignment context kwargs example in params.rst (#​34759)
  • Add missing multiple_outputs=True param in the TaskFlow example (#​34812)
  • Remove extraneous '>' in provider section name (#​34813)
  • Fix imports in extra link documentation (#​34547)

v2.7.2

Compare Source

Significant Changes
^^^^^^^^^^^^^^^^^^^

No significant changes

Bug Fixes
"""""""""

  • Check if the lower of provided values are sensitives in config endpoint (#​34712)
  • Add support for ZoneInfo and generic UTC to fix datetime serialization (#​34683, #​34804)
  • Fix AttributeError: 'Select' object has no attribute 'count' during the airflow db migrate command (#​34348)
  • Make dry run optional for patch task instance (#​34568)
  • Fix non deterministic datetime deserialization (#​34492)
  • Use iterative loop to look for mapped parent (#​34622)
  • Fix is_parent_mapped value by checking if any of the parent taskgroup is mapped (#​34587)
  • Avoid top-level airflow import to avoid circular dependency (#​34586)
  • Add more exemptions to lengthy metric list (#​34531)
  • Fix dag warning endpoint permissions (#​34355)
  • Fix task instance access issue in the batch endpoint (#​34315)
  • Correcting wrong time showing in grid view (#​34179)
  • Fix www cluster_activity view not loading due to standaloneDagProcessor templating (#​34274)
  • Set loglevel=DEBUG in 'Not syncing DAG-level permissions' (#​34268)
  • Make param validation consistent for DAG validation and triggering (#​34248)
  • Ensure details panel is shown when any tab is selected (#​34136)
  • Fix issues related to access_control={} (#​34114)
  • Fix not found ab_user table in the CLI session (#​34120)
  • Fix FAB-related logging format interpolation (#​34139)
  • Fix query bug in next_run_datasets_summary endpoint (#​34143)
  • Fix for TaskGroup toggles for duplicated labels (#​34072)
  • Fix the required permissions to clear a TI from the UI (#​34123)
  • Reuse _run_task_session in mapped render_template_fields (#​33309)
  • Fix scheduler logic to plan new dag runs by ignoring manual runs (#​34027)
  • Add missing audit logs for Flask actions add, edit and delete (#​34090)
  • Hide Irrelevant Dag Processor from Cluster Activity Page (#​33611)
  • Remove infinite animation for pinwheel, spin for 1.5s (#​34020)
  • Restore rendering of provider configuration with version_added (#​34011)

Doc Only Changes
""""""""""""""""

  • Clarify audit log permissions (#​34815)
  • Add explanation for Audit log users (#​34814)
  • Import AUTH_REMOTE_USER from FAB in WSGI middleware example (#​34721)
  • Add information about drop support MsSQL as DB Backend in the future (#​34375)
  • Document how to use the system's timezone database (#​34667)
  • Clarify what landing time means in doc (#​34608)
  • Fix screenshot in dynamic task mapping docs (#​34566)
  • Fix class reference in Public Interface documentation (#​34454)
  • Clarify var.value.get and var.json.get usage (#​34411)
  • Schedule default value description (#​34291)
  • Docs for triggered_dataset_event (#​34410)
  • Add DagRun events (#​34328)
  • Provide tabular overview about trigger form param types (#​34285)
  • Add link to Amazon Provider Configuration in Core documentation (#​34305)
  • Add "security infrastructure" paragraph to security model (#​34301)
  • Change links to SQLAlchemy 1.4 (#​34288)
  • Add SBOM entry in security documentation (#​34261)
  • Added more example code for XCom push and pull (#​34016)
  • Add state utils to Public Airflow Interface (#​34059)
  • Replace markdown style link with rst style link (#​33990)
  • Fix broken link to the "UPDATING.md" file (#​33583)

Misc/Internal
"""""""""""""

  • Update min-sqlalchemy version to account for latest features used (#​34293)
  • Fix SesssionExemptMixin spelling (#​34696)
  • Restrict astroid version < 3 (#​34658)
  • Fail dag test if defer without triggerer (#​34619)
  • Fix connections exported output (#​34640)
  • Don't run isort when creating new alembic migrations (#​34636)
  • Deprecate numeric type python version in PythonVirtualEnvOperator (#​34359)
  • Refactor os.path.splitext to Path.* (#​34352, #​33669)
  • Replace = by is for type comparison (#​33983)
  • Refactor integer division (#​34180)
  • Refactor: Simplify comparisons (#​34181)
  • Refactor: Simplify string generation (#​34118)
  • Replace unnecessary dict comprehension with dict() in core (#​33858)
  • Change "not all" to "any" for ease of readability (#​34259)
  • Replace assert by if...raise in code (#​34250, #​34249)
  • Move default timezone to except block (#​34245)
  • Combine similar if logic in core (#​33988)
  • Refactor: Consolidate import and usage of random (#​34108)
  • Consolidate importing of os.path.* (#​34060)
  • Replace sequence concatenation by unpacking in Airflow core (#​33934)
  • Refactor unneeded 'continue' jumps around the repo (#​33849, #​33845, #​33846, #​33848, #​33839, #​33844, #​33836, #​33842)
  • Remove [project] section from pyproject.toml (#​34014)
  • Move the try outside the loop when this is possible in Airflow core (#​33975)
  • Replace loop by any when looking for a positive value in core (#​33985)
  • Do not create lists we don't need (#​33519)
  • Remove useless string join from core (#​33969)
  • Add TCH001 and TCH002 rules to pre-commit to detect and move type checking modules (#​33865)
  • Add cancel_trigger_ids to to_cancel dequeue in batch (#​33944)
  • Avoid creating unnecessary list when parsing stats datadog tags (#​33943)
  • Replace dict.items by dict.values when key is not used in core (#​33940)
  • Replace lambdas with comprehensions (#​33745)
  • Improve modules import in Airflow core by moving some of them into a type-checking block (#​33755)
  • Refactor: remove unused state - SHUTDOWN (#​33746, #​34063, #​33893)
  • Refactor: Use in-place .sort() (#​33743)
  • Use literal dict instead of calling dict() in Airflow core (#​33762)
  • remove unnecessary map and rewrite it using list in Airflow core (#​33764)
  • Replace lambda by a def method in Airflow core (#​33758)
  • Replace type func by isinstance in fab_security manager (#​33760)
  • Replace single quotes by double quotes in all Airflow modules (#​33766)
  • Merge multiple isinstance calls for the same object in a single call (#​33767)
  • Use a single statement with multiple contexts instead of nested statements in core (#​33769)
  • Refactor: Use f-strings (#​33734, #​33455)
  • Refactor: Use random.choices (#​33631)
  • Use str.splitlines() to split lines (#​33592)
  • Refactor: Remove useless str() calls (#​33629)
  • Refactor: Improve detection of duplicates and list sorting (#​33675)
  • Simplify conditions on len() (#​33454)

v2.7.1

Compare Source

Significant Changes
^^^^^^^^^^^^^^^^^^^

CronTriggerTimetable is now less aggressive when trying to skip a run (#​33404)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

When setting catchup=False, CronTriggerTimetable no longer skips a run if
the scheduler does not query the timetable immediately after the previous run
has been triggered.

This should not affect scheduling in most cases, but can change the behaviour if
a DAG is paused-unpaused to manually skip a run. Previously, the timetable (with
catchup=False) would only start a run after a DAG is unpaused, but with this
change, the scheduler would try to look a little bit back to schedule the
previous run that covers a part of the period when the DAG was paused. This
means you will need to keep a DAG paused longer (namely, for the entire cron
period to pass) to really skip a run.

Note that this is also the behaviour exhibited by various other cron-based
scheduling tools, such as anacron.

conf.set() becomes case insensitive to match conf.get() behavior (#​33452)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Also, conf.get() will now break if used with non-string parameters.

conf.set(section, key, value) used to be case sensitive, i.e. conf.set("SECTION", "KEY", value)
and conf.set("section", "key", value) were stored as two distinct configurations.
This was inconsistent with the behavior of conf.get(section, key), which was always converting the section and key to lower case.

As a result, configuration options set with upper case characters in the section or key were unreachable.
That's why we are now converting section and key to lower case in conf.set too.

We also slightly changed the behavior of conf.get(). It used to allow objects that are not strings in the section or key.
Doing this will now result in an exception. For instance, conf.get("section", 123) needs to be replaced with conf.get("section", "123").
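
To make the change concrete, here is a minimal sketch of the behaviour described above. `MiniConf` is a hypothetical stand-in written for this illustration, not Airflow's configuration class, and the choice of `TypeError` is an assumption, since the note only says that an exception is raised.

```python
class MiniConf:
    """Hypothetical stand-in that mimics the lookup rules described above."""

    def __init__(self):
        self._values = {}

    @staticmethod
    def _normalize(section, key):
        # Non-string sections or keys are rejected (exception type assumed).
        if not isinstance(section, str) or not isinstance(key, str):
            raise TypeError("section and key must be strings")
        # Both set() and get() lower-case the section and key.
        return section.lower(), key.lower()

    def set(self, section, key, value):
        self._values[self._normalize(section, key)] = value

    def get(self, section, key):
        return self._values[self._normalize(section, key)]


conf = MiniConf()
conf.set("SECTION", "KEY", "value")
# The upper-case write is reachable through a lower-case read.
assert conf.get("section", "key") == "value"
```

Under the old behaviour the upper-case write would have been stored as a separate entry that a lower-case read could never reach.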

Bug Fixes
"""""""""

  • Ensure that tasks wait for running indirect setup (#​33903)
  • Respect "soft_fail" for core async sensors (#​33403)
  • Differentiate 0 and unset as a default param values (#​33965)
  • Raise 404 from Variable PATCH API if variable is not found (#​33885)
  • Fix MappedTaskGroup tasks not respecting upstream dependency (#​33732)
  • Add limit 1 if required first value from query result (#​33672)
  • Fix UI DAG counts including deleted DAGs (#​33778)
  • Fix cleaning zombie RESTARTING tasks (#​33706)
  • SECURITY_MANAGER_CLASS should be a reference to class, not a string (#​33690)
  • Add back get_url_for_login in security manager (#​33660)
  • Fix 2.7.0 db migration job errors (#​33652)
  • Set context inside templates (#​33645)
  • Treat dag-defined access_control as authoritative if defined (#​33632)
  • Bind engine before attempting to drop archive tables (#​33622)
  • Add a fallback in case no first name and last name are set (#​33617)
  • Sort data before groupby in TIS duration calculation (#​33535)
  • Stop adding values to rendered templates UI when there is no dagrun (#​33516)
  • Set strict to True when parsing dates in webserver views (#​33512)
  • Use dialect.name in custom SA types (#​33503)
  • Do not return ongoing dagrun when a end_date is less than utcnow (#​33488)
  • Fix a bug in formatDuration method (#​33486)
  • Make conf.set case insensitive (#​33452)
  • Allow timetable to slightly miss catchup cutoff (#​33404)
  • Respect soft_fail argument when poke is called (#​33401)
  • Create a new method used to resume the task in order to implement specific logic for operators (#​33424)
  • Fix DagFileProcessor interfering with dags outside its processor_subdir (#​33357)
  • Remove the unnecessary <br> text in Provider's view (#​33326)
  • Respect soft_fail argument when ExternalTaskSensor runs in deferrable mode (#​33196)
  • Fix handling of default value and serialization of Param class (#​33141)
  • Check if the dynamically-added index is in the table schema before adding (#​32731)
  • Fix rendering the mapped parameters when using expand_kwargs method (#​32272)
  • Fix dependencies for celery and opentelemetry for Python 3.8 (#​33579)

Misc/Internal
"""""""""""""

Doc only changes
"""""""""""""""""

  • Add documentation explaining template_ext (and how to override it) (#​33735)
  • Explain how users can check if python code is top-level (#​34006)
  • Clarify that DAG authors can also run code in DAG File Processor (#​33920)
  • Fix broken link in Modules Management page (#​33499)
  • Fix secrets backend docs (#​33471)
  • Fix config description for base_log_folder (#​33388)

v2.7.0

Compare Source

Significant Changes
^^^^^^^^^^^^^^^^^^^

Remove Python 3.7 support (#​30963)
""""""""""""""""""""""""""""""""""
As of now, Python 3.7 is no longer supported by the Python community.
Therefore, to use Airflow 2.7.0, you must ensure your Python version is
either 3.8, 3.9, 3.10, or 3.11.
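
A deployment script could guard against an unsupported interpreter before attempting the upgrade. The sketch below is one way to do that; `SUPPORTED_MINORS` and `python_supported` are names invented for this example, and the supported list is taken from the paragraph above.

```python
import sys

# Interpreter versions listed as supported for Airflow 2.7.0 in the note above.
SUPPORTED_MINORS = {(3, 8), (3, 9), (3, 10), (3, 11)}


def python_supported(version_info=None):
    """Return True if the (major, minor) pair is in the supported set."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) in SUPPORTED_MINORS


if __name__ == "__main__":
    if not python_supported():
        raise SystemExit("Airflow 2.7.0 needs Python 3.8, 3.9, 3.10 or 3.11")
```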

Old Graph View is removed (#​32958)
""""""""""""""""""""""""""""""""""
The old Graph View is removed. The new Graph View is the default view now.

The trigger UI form is skipped in web UI if no parameters are defined in a DAG (#​33351)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

If you are using dag_run.conf dictionary and web UI JSON entry to run your DAG you should either:

  • Add params to your DAG <https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/params.html#use-params-to-provide-a-trigger-ui-form>_
  • Enable the new configuration show_trigger_form_if_no_params to bring back old behaviour

The "db init", "db upgrade" commands and "[database] load_default_connections" configuration options are deprecated (#​33136).
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Instead, you should use "airflow db migrate" command to create or upgrade database. This command will not create default connections.
In order to create default connections you need to run "airflow connections create-default-connections" explicitly,
after running "airflow db migrate".
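
For bootstrap scripts that previously called "airflow db init" or "airflow db upgrade", the replacement sequence can be sketched as follows. `bootstrap_commands` is a hypothetical helper that only builds the argument lists named in the note above; it does not run anything by itself.

```python
def bootstrap_commands(create_default_connections=False):
    """Build the replacement command sequence as argument lists."""
    commands = [["airflow", "db", "migrate"]]
    if create_default_connections:
        # Default connections are no longer created by the migration itself.
        commands.append(["airflow", "connections", "create-default-connections"])
    return commands


if __name__ == "__main__":
    for command in bootstrap_commands(create_default_connections=True):
        print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` in order, so that a failed migration stops the script before the connections step.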

In case of SMTP SSL connection, the context now uses the "default" context (#​33070)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
The "default" context is Python's default_ssl_context instead of the previously used "none". The
default_ssl_context provides a balance between security and compatibility but in some cases,
when certificates are old, self-signed or misconfigured, it might not work. This can be configured
by setting "ssl_context" in "email" configuration of Airflow.

Setting it to "none" brings back the "none" setting that was used in Airflow 2.6 and before,
but it is not recommended due to security reasons, as this setting disables validation of certificates and allows MITM attacks.
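
The practical difference between a verifying and a non-verifying context can be seen with Python's standard `ssl` module alone. This is an illustration of the two kinds of context, not Airflow's own code; the variable names are arbitrary, and how Airflow maps the "none" setting onto the SSL library is not shown here.

```python
import ssl

# A verifying context: checks the server certificate chain and the hostname.
verifying = ssl.create_default_context()

# A non-verifying context, built explicitly for comparison. It accepts any
# certificate, which is what makes MITM attacks possible.
non_verifying = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
non_verifying.check_hostname = False  # must be turned off before verify_mode
non_verifying.verify_mode = ssl.CERT_NONE

if __name__ == "__main__":
    print("default:", verifying.verify_mode, verifying.check_hostname)
    print("no verification:", non_verifying.verify_mode, non_verifying.check_hostname)
```

If a mail server only works with the second kind of context, the safer fix is usually to repair or trust its certificate rather than to disable validation.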

Disable default allowing the testing of connections in UI, API and CLI (#​32052)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
For security reasons, the test connection functionality is disabled by default across Airflow UI,
API and CLI. The availability of the functionality can be controlled by the
test_connection flag in the core section of the Airflow
configuration (airflow.cfg). It can also be controlled by the
environment variable AIRFLOW__CORE__TEST_CONNECTION.

The following values are accepted for this config param:

  1. Disabled: Disables the test connection functionality and
    disables the Test Connection button in the UI.
    This is also the default value set in the Airflow configuration.

  2. Enabled: Enables the test connection functionality and
    activates the Test Connection button in the UI.

  3. Hidden: Disables the test connection functionality and
    hides the Test Connection button in UI.

For more information on capabilities of users, see the documentation:
https://airflow.apache.org/docs/apache-airflow/stable/security/security_model.html#capabilities-of-authenticated-ui-users
It is strongly advised to not enable the feature until you make sure that only
highly trusted UI/API users have "edit connection" permissions.
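
The three accepted values and the default can be summarised in a few lines. The sketch below only looks at the AIRFLOW__CORE__TEST_CONNECTION environment variable and ignores airflow.cfg; the function names are hypothetical, and the case-insensitive comparison is an assumption made for this example rather than a statement about how Airflow parses the value.

```python
import os

# Values named in the release note above; "Disabled" is the default.
ALLOWED_MODES = ("Disabled", "Enabled", "Hidden")


def resolve_connection_test_mode(environ=None):
    """Return the configured mode, falling back to the documented default."""
    environ = os.environ if environ is None else environ
    raw = environ.get("AIRFLOW__CORE__TEST_CONNECTION", "Disabled")
    # Assumption: ignore case and surrounding whitespace when matching.
    for mode in ALLOWED_MODES:
        if raw.strip().lower() == mode.lower():
            return mode
    raise ValueError(f"unsupported test_connection value: {raw!r}")


def connection_test_allowed(environ=None):
    """Only the Enabled mode turns the functionality on."""
    return resolve_connection_test_mode(environ) == "Enabled"
```

Both "Disabled" and "Hidden" switch the functionality off; they differ only in whether the button is shown in the UI.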

The xcomEntries API disables support for the deserialize flag by default (#​32176)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
For security reasons, the /dags/*/dagRuns/*/taskInstances/*/xcomEntries/*
API endpoint now disables the deserialize option to deserialize arbitrary
XCom values in the webserver. For backward compatibility, server admins may set
the [api] enable_xcom_deserialize_support config to True to enable the
flag and restore backward compatibility.

However, it is strongly advised to not enable the feature, and perform
deserialization at the client side instead.

Change of the default Celery application name (#​32526)
""""""""""""""""""""""""""""""""""""""""""""""""""""""
Default name of the Celery application changed from airflow.executors.celery_executor to airflow.providers.celery.executors.celery_executor.

You should change both your configuration and Health check command to use the new name:

  • in configuration (celery_app_name configuration in celery section) use airflow.providers.celery.executors.celery_executor
  • in your Health check command use airflow.providers.celery.executors.celery_executor.app
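
As an example of the second point, a worker health check that pings Celery needs the new module path in its --app argument. The helper below is a sketch: `celery_ping_command` is a made-up name, and the `inspect ping` form is an assumption about what the health check looks like in a given deployment.

```python
# New default Celery application name, as stated above.
CELERY_APP_NAME = "airflow.providers.celery.executors.celery_executor"


def celery_ping_command(hostname):
    """Build a Celery ping command that targets a single worker."""
    return [
        "celery",
        "--app", f"{CELERY_APP_NAME}.app",
        "inspect", "ping",
        "-d", f"celery@{hostname}",
    ]


if __name__ == "__main__":
    print(" ".join(celery_ping_command("worker-1")))
```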

The default value for scheduler.max_tis_per_query is changed from 512 to 16 (#​32572)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
This change is expected to make the Scheduler more responsive.

scheduler.max_tis_per_query needs to be lower than core.parallelism.
If both were left to their default value previously, the effective default value of scheduler.max_tis_per_query was 32
(because it was capped at core.parallelism).

To keep the behavior as close as possible to the old config, one can set scheduler.max_tis_per_query = 0,
in which case it'll always use the value of core.parallelism.
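
The capping rule described above can be written down in a few lines. `effective_max_tis_per_query` is a hypothetical helper for reasoning about the numbers, not a function from Airflow; the value 32 for core.parallelism is the default implied by the text.

```python
def effective_max_tis_per_query(max_tis_per_query, parallelism):
    """Apply the rules from the note: 0 means 'use parallelism', else cap at it."""
    if max_tis_per_query == 0:
        return parallelism
    return min(max_tis_per_query, parallelism)


# Old defaults: 512 was capped at core.parallelism (32).
assert effective_max_tis_per_query(512, 32) == 32
# New default.
assert effective_max_tis_per_query(16, 32) == 16
# Setting 0 keeps the old effective behaviour.
assert effective_max_tis_per_query(0, 32) == 32
```

With untouched defaults the effective value therefore drops from 32 to 16 after the upgrade.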

Some executors have been moved to corresponding providers (#​32767)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
In order to use the executors, you need to install the providers:

  • for Celery executors you need to install apache-airflow-providers-celery package >= 3.3.0
  • for Kubernetes executors you need to install apache-airflow-providers-cncf-kubernetes package >= 7.4.0
  • for Dask executors you need to install apache-airflow-providers-daskexecutor package in any version

You can also achieve this by installing Airflow with the [celery], [cncf.kubernetes], [daskexecutor] extras respectively.

Users who base their images on the apache/airflow reference image (not slim) should be unaffected - the base
reference image comes with all the three providers installed.
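
Images that are not based on the reference image can verify the provider requirement at build or start-up time. The sketch below uses only `importlib.metadata` from the standard library; `REQUIRED_PROVIDERS` and `provider_ok` are names chosen for this example, and the version comparison is deliberately crude (it looks at the leading numeric components only and treats pre-release suffixes conservatively).

```python
from importlib import metadata

# Minimum provider versions from the list above; () means "any version".
REQUIRED_PROVIDERS = {
    "apache-airflow-providers-celery": (3, 3, 0),
    "apache-airflow-providers-cncf-kubernetes": (7, 4, 0),
    "apache-airflow-providers-daskexecutor": (),
}


def provider_ok(name, minimum):
    """Return True if the distribution is installed at or above the minimum."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    # Keep only purely numeric components among the first three.
    parts = tuple(int(p) for p in installed.split(".")[:3] if p.isdigit())
    return parts >= minimum


if __name__ == "__main__":
    for name, minimum in REQUIRED_PROVIDERS.items():
        print(name, "ok" if provider_ok(name, minimum) else "missing or too old")
```

Only the providers for the executor actually in use need to pass the check.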

Improvement Changes
^^^^^^^^^^^^^^^^^^^

PostgreSQL only improvement: Added index on taskinstance table (#​30762)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
This index seems to have a great positive effect in a setup with tens of millions of such rows.

New Features
""""""""""""

  • Add OpenTelemetry to Airflow (AIP-49 <https://github.com/apache/airflow/pulls?q=is%3Apr+is%3Amerged+label%3AAIP-49+milestone%3A%22Airflow+2.7.0%22>_)
  • Trigger Button - Implement Part 2 of AIP-50 (#​31583)
  • Removing Executor Coupling from Core Airflow (AIP-51 <https://github.com/apache/airflow/pulls?q=is%3Apr+is%3Amerged+label%3AAIP-51+milestone%3A%22Airflow+2.7.0%22>_)
  • Automatic setup and teardown tasks (AIP-52 <https://github.com/apache/airflow/pulls?q=is%3Apr+is%3Amerged+label%3AAIP-52+milestone%3A%22Airflow+2.7.0%22>_)
  • OpenLineage in Airflow (AIP-53 <https://github.com/apache/airflow/pulls?q=is%3Apr+is%3Amerged+milestone%3A%22Airflow+2.7.0%22+label%3Aprovider%3Aopenlineage>_)
  • Experimental: Add a cache to Variable and Connection when called at dag parsing time (#​30259)
  • Enable pools to consider deferred tasks (#​32709)
  • Allows to choose SSL context for SMTP connection (#​33070)
  • New gantt tab (#​31806)
  • Load plugins from providers (#​32692)
  • Add BranchExternalPythonOperator (#​32787, #​33360)
  • Add option for storing configuration description in providers (#​32629)
  • Introduce Heartbeat Parameter to Allow Per-LocalTaskJob Configuration (#​32313)
  • Add Executors discovery and documentation (#​32532)
  • Add JobState for job state constants (#​32549)
  • Add config to disable the 'deserialize' XCom API flag (#​32176)
  • Show task instance in web UI by custom operator name (#​31852)
  • Add default_deferrable config (#​31712)
  • Introducing AirflowClusterPolicySkipDag exception (#​32013)
  • Use reactflow for datasets graph (#​31775)
  • Add an option to load the dags from db for command tasks run (#​32038)
  • Add version of chain which doesn't require matched lists (#​31927)
  • Use operator_name instead of task_type in UI (#​31662)
  • Add --retry and --retry-delay to airflow db check (#​31836)
  • Allow skipped task state task_instance_schema.py (#​31421)
  • Add a new config for celery result_backend engine options (#​30426)
  • UI Add Cluster Activity Page (#​31123, #​32446)
  • Adding keyboard shortcuts to common actions (#​30950)
  • Adding more information to kubernetes executor logs (#​29929)
  • Add support for configuring custom alembic file (#​31415)
  • Add running and failed status tab for DAGs on the UI (#​30429)
  • Add multi-select, proposals and labels for trigger form (#​31441)
  • Making webserver config customizable (#​29926)
  • Render DAGCode in the Grid View as a tab (#​31113)
  • Add rest endpoint to get option of configuration (#​31056)
  • Add section query param in get config rest API (#​30936)
  • Create metrics to track Scheduled->Queued->Running task state transition times (#​30612)
  • Mark Task Groups as Success/Failure (#​30478)
  • Add CLI command to list the provider trigger info (#​30822)
  • Add Fail Fast feature for DAGs (#​29406)

Improvements
""""""""""""

  • Improve graph nesting logic (#​33421)
  • Configurable health check threshold for triggerer (#​33089, #​33084)
  • add dag_run_ids and task_ids filter for the batch task instance API endpoint (#​32705)
  • Ensure DAG-level references are filled on unmap (#​33083)
  • Add support for arrays of different data types in the Trigger Form UI (#​32734)
  • Always show gantt and code tabs (#​33029)
  • Move listener success hook to after SQLAlchemy commit (#​32988)
  • Rename db upgrade to db migrate and add connections create-default-connections (#​32810, #​33136)
  • Remove old gantt chart and redirect to grid views gantt tab (#​32908)
  • Adjust graph zoom based on selected task (#​32792)
  • Call listener on_task_instance_running after rendering templates (#​32716)
  • Display execution_date in graph view task instance tooltip. (#​32527)
  • Allow configuration to be contributed by providers (#​32604, #​32755, #​32812)
  • Reduce default for max TIs per query, enforce <= parallelism (#​32572)
  • Store config description in Airflow configuration object (#​32669)
  • Use isdisjoint instead of not intersection (#​32616)
  • Speed up calculation of leaves and roots for task groups (#​32592)
  • Kubernetes Executor Load Time Optimizations (#​30727)
  • Save DAG parsing time if dag is not schedulable (#​30911)
  • Updates health check endpoint to include dag_processor status. (#​32382)
  • Disable default allowing the testing of connections in UI, API and CLI (#​32052, #​33342)
  • Fix config var types under the scheduler section (#​32132)
  • Allow to sort Grid View alphabetically (#​32179)
  • Add hostname to triggerer metric [triggers.running] (#​32050)
  • Improve DAG ORM cleanup code (#​30614)
  • TriggerDagRunOperator: Add wait_for_completion to template_fields (#​31122)
  • Open links in new tab that take us away from Airflow UI (#​32088)
  • Only show code tab when a task is not selected (#​31744)
  • Add descriptions for celery and dask cert configs (#​31822)
  • PythonVirtualenvOperator termination log in alert (#​31747)
  • Migration of all DAG details to existing grid view dag details panel (#​31690)
  • Add a diagram to help visualize timer metrics (#​30650)
  • Celery Executor load time optimizations (#​31001)
  • Update code style for airflow db commands to SQLAlchemy 2.0 style (#​31486)
  • Mark uses of md5 as "not-used-for-security" in FIPS environments (#​31171)
  • Add pydantic support to serde (#​31565)
  • Enable search in note column in DagRun and TaskInstance (#​31455)
  • Save scheduler execution time by adding new Index idea for dag_run (#​30827)
  • Save scheduler execution time by caching dags (#​30704)
  • Support for sorting DAGs by Last Run Date in the web UI (#​31234)
  • Better typing for Job and JobRunners (#​31240)
  • Add sorting logic by created_date for fetching triggers (#​31151)
  • Remove DAGs.can_create on access control doc, adjust test fixture (#​30862)
  • Split Celery logs into stdout/stderr (#​30485)
  • Decouple metrics clients and validators into their own modules (#​30802)
  • Description added for pagination in get_log api (#​30729)
  • Optimize performance of scheduling mapped tasks (#​30372)
  • Add sentry transport configuration option (#​30419)
  • Better message on deserialization error (#​30588)

Bug Fixes
"""""""""

  • Remove user sessions when resetting password (#​33347)
  • Gantt chart: Use earliest/oldest ti dates if different than dag run start/end (#​33215)
  • Fix virtualenv detection for Python virtualenv operator (#​33223)
  • Correctly log when there are problems trying to chmod airflow.cfg (#​33118)
  • Pass app context to webserver_config.py (#​32759)
  • Skip served logs for non-running task try (#​32561)
  • Fix reload gunicorn workers (#​32102)
  • Fix future DagRun rarely triggered by race conditions when max_active_runs reached its upper limit. (#​31414)
  • Fix BaseOpe

Configuration

📅 Schedule: Branch creation - "" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Never, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Mend Renovate. View repository job log here.

@forking-renovate forking-renovate bot added the dependencies Dependency changes and updates label Dec 1, 2023

⚠ Artifact update problem

Renovate failed to update an artifact related to this branch. You probably do not want to merge this PR as-is.

♻ Renovate will retry this branch, including artifacts, only when one of the following happens:

  • any of the package files in this branch needs updating, or
  • the branch becomes conflicted, or
  • you click the rebase/retry checkbox if found above, or
  • you rename this PR's title to start with "rebase!" to trigger it manually

The artifact failure details are included below:

File name: poetry.lock
installing v2 tool python v3.8.12
[11:14:11.599] INFO (9): Installing tool python v3.8.12...
linking tool python v3.8.12
Python 3.8.12
pip 23.3.1 from /opt/containerbase/tools/python/3.8.12/lib/python3.8/site-packages/pip (python 3.8)
[11:14:18.498] INFO (9): Installed tool python in 6.8s.
[11:14:18.876] INFO (172): Installing tool poetry v1.2.2...
installing v2 tool poetry v1.2.2
linking tool poetry v1.2.2
Poetry (version 1.2.2)
[11:14:26.685] INFO (172): Installed tool poetry in 7.8s.
Creating virtualenv cloud-datasets-0Lmw95G8-py3.8 in /home/ubuntu/.cache/pypoetry/virtualenvs
Updating dependencies
Resolving dependencies...


Because cloud-datasets depends on apache-airflow (==2.7.3) which depends on sqlalchemy (>=1.4.28,<2.0), sqlalchemy is required.
So, because cloud-datasets depends on SQLAlchemy (==1.3.24), version solving failed.
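
The solver message above reduces to a range check: the project pins SQLAlchemy at 1.3.24, while apache-airflow 2.7.3 requires a version in `>=1.4.28,<2.0`. Below is a minimal sketch of that check in plain Python. It assumes simple dotted numeric versions only. The helpers `parse_version` and `satisfies` are hypothetical names used for illustration; they are not part of Poetry or Renovate, which apply full PEP 440 version handling.

```python
def parse_version(text, width=3):
    """Turn a dotted numeric version such as '1.4.28' into a comparable tuple.

    Short versions are zero-padded, so '2.0' compares equal to '2.0.0'.
    Assumption: no pre-release or local-version suffixes.
    """
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies(version, lower_inclusive, upper_exclusive):
    """Return True if lower_inclusive <= version < upper_exclusive."""
    v = parse_version(version)
    return parse_version(lower_inclusive) <= v < parse_version(upper_exclusive)


# Values copied from the solver message; the helpers are illustrative only.
pinned_sqlalchemy = "1.3.24"
airflow_requires = ("1.4.28", "2.0")

conflict = not satisfies(pinned_sqlalchemy, *airflow_requires)
print("conflict" if conflict else "compatible")
```

Under these assumptions the pinned version falls below the lower bound, which matches the "version solving failed" result reported by Poetry. Any SQLAlchemy pin inside the required range would pass this particular check, though other dependencies in the project may impose constraints that this sketch does not model.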

@renovate-bot renovate-bot deleted the renovate/pypi-apache-airflow-vulnerability branch January 1, 2024 17:42