Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements TimerTasks using Promises. #425

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

hms
Copy link
Contributor

@hms hms commented Nov 24, 2024

Replaces the OG TimerTask with a Promises based implementation.

In combination with the previous commit, all of SQ has been moved
to the Promises infrastructure and makes using any of the new
concurrency functionality (Actors, Channels, etc.) more easily available
for future SQ enhancements.

This PR is a continuation of the not yet accepted PR #417. If desired / required,
I am happy to pull the 2 commits into a clean PR based on main.

* Adds to RVM .ruby-gemset to .gitignore for RVM and gemset holdouts.
* Updates Gemfile.lock to include arm64-darwin-24 (weird that it's
  missing)
* Adds 4 development dependencies to quiet deprecation warnings
* Updates rubocop to allow Ruby 3.3 syntax, matching .ruby-version
SolidQueue has excellent built-in error reporting. While this is
fantastic for SQ users, it is less than ideal for testing
SolidQueue because any test that deliberately uses or triggers an
exception produces voluminous error reporting.

This error reporting is hugely valuable when the exception is
not expected, but distracting and of limited value for expected
use-cases, especially when the test confirms the correct outcomes
via assertions.

This commit adds:
* A generic test-specific Exception class: ExpectedTestError
  This allows testing for specific exceptions while retaining
  all error reporting infrastructure for unexpected exceptions.

* Two helper methods for silencing on_thread_error output
  These methods accept an Exception or Array(Exception)
  and simply does not call the output mechanism if the
  exception passed to on_thread_error matches.
  This way, any unexpected error during test still
  reports in a highly visible manner while the exceptions
  being tested are validated via assertions.

* Replaces the stock on_thread_error with one that ignores
  ExpectedTextError.  Updated several tests from using
  the ruby stock RuntimeError to ExpectedTestError.

* Configures tests to run with YJIT enabled
  This is to test under likely production deployment configuration,
  not for performance reasons.

  Note: With the very recent reporting on M4's crashing on Mac's
  with YJIT enabled, we might want to either defer this change
  or add a conditional to opt in until the problem is resolved.
* Replaces a little Unix cleverness with a standard Ruby class.
  This pushes the responsibiity for meeting the SQ requirements
  from SQ to stock Ruby

* Delivers equivelent performance, identical API, and API behaviors
  with the original implementation (see note below on Futures)

* Mostly fixes a *platform / version dependent* issue with
  MySQL (see below)

* Meets 100% of SQ's functional requirements:

  * interruptible_sleep: a potentially blocking operation
    interruptible via either a "wake_event" (possibly requested
    prior to entering interruptible_sleep) or blocking until a
    timeout.

  * wake_up / interrupt: a Signal#trap and thread-safe method that
    does not require user-level synchronization (with the risk of
    not fully understanding all of the complexities required) code
    that either interrupts an inflight-interruptible_sleep or enqueues
    the event to processed in the invocation of interruptible_sleep

* Interruptible's API is trivially reproduceable via Thread::Queue
  * interruptible_sleep => Queue.pop(timeout:) where pushing anything
    into the queue acts as the interrupt event and timeout is reliable
    without any extra code or exception handling.

  * wake_up / interrupt => Queue.push(Anything) is thread, fiber, and
    Signal.trap safe (can be called from anywhere) and captures
    all wake_up events whenever requested, automaticall caching any
    "event" not processed by a currently executing interruptible_sleep
    matching existing functionality exactly.

Why the Future in #interruptible_sleep?

While Thread::Queue micro benchmarks as having the same performance on
the main thread Vs. any form of a sub-thread (or Fiber) and self-pipe,
when running the SQ test suite we see a 35% slow down Vs. the original
self-pipe implenentation.  One assumes this slowdown would manifest
in production. By moving the just the Queue#pop into a separate thread via
Concurrent::Promises.future we get +/- identical performance to the original
self-pipe implementation.

I'm assuming this root causes to Ruby main-thread only housekeeping and/or
possibly triggering a fast/slow path issue.

Why a Future Vs. Thread#new for each interruptible_sleep call?

Every other threaded operation in SQ is implemented using Concurrent
Ruby. Using a Future is for code and architectual consistency. There is
no difference in performance or functionality between the two.

MySQL *only* issues:

There seems to be a *platform specific* or *version specific* problem
with MySQL database connectivity and/or broken self-pipes leading to
randomly failing tests and a stream of distracting backtraces *even
with successful* tests.  Adding to the complexity sometimes, the lost
database connection can self-heal -- HOWEVER -- this takes time and given
how much of the test suite has time based assertions, leads to
additional random test failures.

These, or similar, issues have been observed in the past when changes to
the MySQL client library forced changes in the mysql2 gem.

With the Thread::Queue based implementation of the Interruptible concern,
the random failures and amount of spurious output are dramatically
improved (but not eliminated).
* Upgrade Worker.pool from Future to Promises.future

  Promises.future leverages an improved, non-blocking, and lock-free
  implementation of Concurrent Ruby's Async runtime, enhancing performance
  and future feature compatibility.  Promises.future was moved from
  edge to main in V1.1 (2018).

* Replace ClaimedExecution::Results with Concurrent::Maybe

  `Results` was underutilized and essentially mirrored `Maybe`'s
  functionality. This change simplifies code and reduces redundancy
  by leveraging the `Concurrent::Maybe` class.

* Centralize error reporting in AppExecutor.handle_thread_error

  This change was necessitated by the change to Promises.future.
  Concurrent Ruby has some very strong ideas about exceptions
  within a future with a little code rearranging, this change
  pulls the error / exception reporting responsibilities out
  of the Future execution code path and pushes it to
  AppExecutor#handle_thread_error.

  This change ensures that `Rails.error` is called exactly once
  per `handle_thread_error` invocation regardless of
  on_thread_error calling `Rails.error` or not.

* Update tests to accommodate these changes
Promises are the recommended infrastructure, replacing
several OG APIs, including TimerTasks.

SQ only uses TimerTasks in 3 places (currently) and
a very small subset of their overall functionality.
SolidQueue::TimerTask is a drop-in replacement.

This PR uses AtomicBoolean instead of the recommended
Concurrent::Cancellation to avoid a dependency on,
and the potential API stability issues of, edge features.

This completes the move from the old APIs to Promises and
makes all of the new concurrency features (Actors,
Channels, etc.) available for future SQ features and
enahancements.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant