Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLAlchemy 2.0 compatibility #12541

Open
26 of 28 tasks
jdavcs opened this issue Sep 23, 2021 · 3 comments
Open
26 of 28 tasks

SQLAlchemy 2.0 compatibility #12541

jdavcs opened this issue Sep 23, 2021 · 3 comments
Assignees
Labels
area/backend area/database Galaxy's database or data access layer kind/refactoring cleanup or refactoring of existing code, no functional changes planning

Comments

@jdavcs
Copy link
Member

jdavcs commented Sep 23, 2021

This builds on #10892 (SQLAlchemy 1.4 compatibility) and details the specific steps required for preparing Galaxy's codebase for migrating to SQLAlchemy's upcoming 2.0 release. The steps are documented here.

  1. First Prerequisite, step one:

    • A working 1.3 Application (ref)
  2. First Prerequisite, step two:

    • A Working 1.4 Application (ref):
      • Review the significant changes in 1.3 > 1.4 outlined here. (the first - URL object is now immutable - has been addressed)
      • Review and try to remove the root causes for currently emitted SAWarnings (there may be just a handful causes that lead to numerous warnings), unless those causes are non critical and would require an inordinate amount of effort to resolve
  3. Migration to 2.0 Step One:

    • Python 3 only (Python 3.6 minimum) (ref).
  4. Migration to 2.0 Step Two Prerequisites (Galaxy-specific)

  5. Migration to 2.0 Step Two - Use RemovedIn20Warnings (ref)(Enable setting SQLALCHEMY_WARN_20 #13171)
    (To enable warnings, set GALAXY_CONFIG_SQLALCHEMY_WARN_20=1)

    • Setup a warnings filter to handle RemovedIn20Warnings
    • Enable the SQLALCHEMY_WARN_20=1 environment variable for runtime:
      • in local dev
      • on main
    • Enable the SQLALCHEMY_WARN_20=1 environment variable for all tests:
      • local dev (subsets of tests can be run manually with the flag set)
      • all relevant testing workflows
  6. Migration to 2.0 Step Three - Resolve all RemovedIn20Warnings (ref)
    (Helpful documentation on how to filter warnings: sqlalchemy, pytest, python)

    • Resolve the underlying cause for each warning that suggests a future runtime error: add test exposing the issue (if possible within reason), fix it. (NOTE: Do not silence the warning before fixing the underlying cause.).
      See these examples of changes that may potentially have a significant impact on a running Galaxy:
  7. Migration to 2.0 Step Four - Use the future flag on Engine (ref)

    • Pass the future=True flag to the create_engine() function (there are multiple places where we call that function: check entire code base: lib/, scripts/, test/)
    • Consider using connection.commit() where applicable
    • Resolve any new errors or deprecation warnings
  8. Migration to 2.0 Step Four - Use the future flag on Session (ref)

    • Pass the future=True flag to Session wherever we are creating an instance (direct instantiation of Session or calling the sessionmaker factory)
    • Resolve any new errors or deprecation warnings
  9. Once all of the above is complete and no RemovedIn20Warning are emitted:

    • Review the documentation that details specific changes that are to be made to the code base with respect to all major API modifications (What's new in 2.0)
  10. Enable SA 2.0:

  11. Cleanup:

  12. Next steps:

    • Refactor data access code to take advantage of SQLAlchemy 2.0 improvements

UPDATE: Keeping track of fixing RemovedIn20Warnings:

Number of distinct warnings per github workflow (each may be triggered multiple times by different tests):
(last updated/corrected: Dec 11, 2023)

tests 10/26 11/15 11/29 12/04 12/11
api 90 85 85 17 0
dbindexes 6 1 1 0 0
framework 2 1 1 1 0
integration 326 165 142 37 0
integration sel 66 33 31 12 0
selenium 200 100 80 23 0
toolshed 2 1 1 1 0
unit 2 1 1 1 0
TOTAL 694 420 342 92 0

UPDATE: Keeping track of fixing test failures on the SQLAlchemy-2.0 PR (#17180):

(last updated/corrected: Feb 23, 2024)

tests 12/19 12/20 12/21 1/12 1/17 1/18 1/24 1/25 2/12 2/23
integration 10 10 10 10 10 10 10 10 0 0
integration sel 1 0 0 0 0 0 0 0 0 0
toolshed 75 75 75 75 75 75 75 75 75 0
unit w/postgres 32 0 0 0 0 0 0 0 0 0
mypy 1178 911 779 702 538 392 243 173 0 0
TOTAL 1296 996 864 787 623 477 328 258 75 0
@jdavcs jdavcs added kind/refactoring cleanup or refactoring of existing code, no functional changes area/database Galaxy's database or data access layer labels Sep 23, 2021
@jdavcs jdavcs added this to the 22.01 milestone Sep 23, 2021
@jdavcs jdavcs self-assigned this Sep 23, 2021
@jdavcs
Copy link
Member Author

jdavcs commented Oct 13, 2021

Running test/unit/data/model/test_mapping.py with the SQLALCHEMY_WARN_20=1 flag triggers 1561 RemovedIn20Warning warnings. It is likely not as problematic as it looks: the set of root causes that account for these warnings is, I'm sure, much smaller and, therefore, manageable. However, I think it makes sense to address these by enabling the flag manually for each test run: otherwise the warnings would pollute normal output.

@jdavcs
Copy link
Member Author

jdavcs commented Oct 15, 2021

So, we are switching to Alembic before this can be completed. I tried fixing the bound metadata issue (one of the "things" that are removed in 2.0), and that appears to be not possible with our current migration provider (e.g. a bound engine is assumed when SA Migrate executes Column.create(). So, Alembic comes first.

@jdavcs
Copy link
Member Author

jdavcs commented Jan 13, 2022

In 2.0 there will be no backref/back-populates cascading of child objects into the session (ref). In particular: "In 2.0, the default behavior will be that “cascade_backrefs” is False, and additionally there will be no “True” behavior as this is not generally a desirable behavior." This, potentially could wreak havoc on a running Galaxy. To address this, I will try adding an explicit cascade_backref=False to all relationships with back_populates and address any issues on a case-by-case basis.

Once we get to Step Four and enable the future=True flag on the Session, I'll remove these attributes. I think this is the safest way to do it (and manageable too).

@mvdbeek mvdbeek modified the milestones: 23.1, 23.2 Jul 21, 2023
@jdavcs jdavcs modified the milestones: 23.2, 24.0 Jan 10, 2024
@jdavcs jdavcs modified the milestones: 24.0, 24.1 Mar 1, 2024
@jdavcs jdavcs mentioned this issue Mar 18, 2024
4 tasks
@jdavcs jdavcs removed this from the 24.1 milestone May 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/backend area/database Galaxy's database or data access layer kind/refactoring cleanup or refactoring of existing code, no functional changes planning
Projects
None yet
Development

No branches or pull requests

3 participants