SolaceIO.Write: add data classes, connector interface #31953

Merged (1 commit) on Jul 29, 2024

iht (Contributor) commented Jul 23, 2024

This is a follow-up PR to #31906, and part of the issue #31905.

This adds the interface of the Write connector and a few classes (data classes, POutput) that are used by the connector.

More changes are coming in a subsequent PR.

iht changed the title from "This is a followup PR to #31906, and part of the issue #31905" to "SolaceIO.Write: add data classes, connector interface" on Jul 23, 2024

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment "assign set of reviewers".

iht (Contributor, Author) commented Jul 23, 2024

assign set of reviewers

Assigning reviewers. If you would like to opt out of this review, comment "assign to next reviewer":

R: @robertwb for label java.
R: @shunping for label io.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

shunping (Contributor) commented

Code change looks good.

Are we expecting unit and integration tests to come in the follow-up PR?

iht (Contributor, Author) commented Jul 26, 2024

@shunping Yes, the integration tests and the actual code of the connector are prepared for an upcoming PR. Because the change is thousands of lines of code, I am sending it in "chunks" to make the review easier.

Abacn merged commit 493e0ba into apache:master on Jul 29, 2024 (19 checks passed)
Abacn (Contributor) commented Jul 29, 2024

Merged this after receiving an LGTM review above.

iht mentioned this pull request Aug 2, 2024
iht added a commit to iht/beam that referenced this pull request Aug 27, 2024
…31905.

This PR adds the actual writer functionality, and some additional
testing, including integration testing.

This should be the final PR for the SolaceIO write connector to be
complete.
bzablocki pushed a commit to iht/beam that referenced this pull request Nov 12, 2024
Abacn pushed a commit that referenced this pull request Nov 13, 2024
* This is a follow-up PR to #31953, and part of the issue #31905.

This PR adds the actual writer functionality, and some additional
testing, including integration testing.

This should be the final PR for the SolaceIO write connector to be
complete.

* Use static imports for Preconditions

* Remove unused method

* Logging has builtin formatting support

* Use TypeDescriptors to check the type used as input

* Fix parameter name

* Use interface + utils class for MessageProducer

* Use null instead of optional

* Avoid using ByteString just to create an empty byte array.

* Fix documentation, we are not using ByteString now.

* Not needed anymore, we are not using ByteString

* Defer transforming latency from nanos to millis.

The transform into millis is done at the presentation moment, when
the metric is reported to Beam.

* Avoid using top level classes with a single inner class.

A couple of DoFns are moved to their own files too, as the
abstract class for the UnboundedSolaceWriter was in practice a
"package".

This commit addresses a few comments about the structure of
UnboundedSolaceWriter and some base classes of that abstract
class.

* Remove using a state variable, there is already a timer.

This DoFn is a stateful DoFn to force a shuffling with a given
input key set cardinality.

* Properties must always be set.

The warnings are only shown if the user decided to set the
properties that are overridden by the connector.

This was changed in one of the previous commits but it is
actually a bug. I am reverting that change and changing this to a
switch block, to make it more clear that the properties need to be
set always by the connector.

* Add a new custom mode so no JCSMP property is overridden.

This lets the user fully control all the properties used by the connector,
instead of the connector making sensible choices on their behalf.

This also adds some logging to be more explicit about what the connector is
doing. This does not add much logging pressure; it only adds logging at
the producer creation moment.

* Add some more documentation about the new custom submission mode.

* Fix bug introduced with the refactoring of code for this PR.

I forgot to pass the submission mode when the write session is created, and I
called the wrong method in the base class because it was defined as public.

This makes sure that the submission mode is passed to the session when the
session is created for writing messages.

* Remove unnecessary Serializable annotation.

* Make the PublishResult class for handling callbacks non-static to handle pipelines with multiple write transforms.

* Rename maxNumOfUsedWorkers to numShards

* Use RoundRobin assignment of producers to process bundles.

* Output results in a GlobalWindow

* Add ErrorHandler

* Fix docs

* Remove PublishResultHandler class that was just a wrapper around a Queue

* small refactors

* Revert CsvIO docs fix

* Add withErrorHandler docs

* fix var scope

---------

Co-authored-by: Bartosz Zablocki
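
For readers unfamiliar with the "Defer transforming latency from nanos to millis" item in the commit message above, here is a minimal Java sketch of the general idea: keep measurements in nanoseconds while accumulating, and convert to milliseconds only when the value is reported. The class `LatencyTracker` and its methods are hypothetical names chosen for illustration; they are not taken from the connector code, which may be structured quite differently.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical latency tracker for illustration; not the connector's actual class. */
public class LatencyTracker {
  // Running totals are kept in nanoseconds so no precision is lost per sample.
  private long totalNanos = 0L;
  private long count = 0L;

  /** Records one latency sample from two System.nanoTime()-style readings. */
  public void record(long startNanos, long endNanos) {
    totalNanos += endNanos - startNanos;
    count++;
  }

  /** Converts to milliseconds only at reporting time; returns 0 if nothing was recorded. */
  public long meanMillis() {
    if (count == 0) {
      return 0L;
    }
    return TimeUnit.NANOSECONDS.toMillis(totalNanos / count);
  }
}
```

Converting each sample to milliseconds up front would truncate sub-millisecond latencies to zero, which is one plausible motivation for deferring the conversion.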
reeba212 pushed a commit to reeba212/beam that referenced this pull request Dec 4, 2024
…1905 (apache#31953)

This adds the interface of the Write connector and a few classes (data
classes, POutput) that are used by the connector.