[BEAM-3456] Re-enable JDBC performance test #4714

Merged
merged 3 commits into from
Feb 21, 2018

lgajowy
Contributor

@lgajowy lgajowy commented Feb 20, 2018

This PR re-enables the JDBC IO IT running on Jenkins and a Kubernetes cluster. It is best thought of as a follow-up to #4392; back then it was impossible to run kubectl on all Jenkins executors.

The Jenkins job:

  • Changes the number of processed records to 5,000,000
  • Connects to a Postgres database hosted in the Kubernetes cluster
  • Obtains the necessary Kubernetes credentials if needed
  • Stores the results in a BigQuery table
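
For illustration only, here is a minimal Python sketch of how options like the ones listed above could be assembled, assuming the test receives its options as a JSON array of `--name=value` strings. Only the record count comes from this description. The helper name, the server address, and the port value are hypothetical placeholders, and the option names are assumptions rather than a copy of this PR's diff.

```python
import json


def build_integration_test_options(options):
    """Render a dict of option names and values as a JSON array of --name=value strings."""
    return json.dumps([f"--{name}={value}" for name, value in options.items()])


# Hypothetical values: only numberOfRecords is taken from the PR description.
jdbc_it_options = {
    "numberOfRecords": 5_000_000,
    "postgresServerName": "10.0.0.1",  # placeholder address, not the real cluster
    "postgresPort": 5432,              # placeholder; leaving the port unset can fall back to a default
}

print(build_integration_test_options(jdbc_it_options))
```

Building the array from a single dict keeps each option defined in one place, which helps avoid duplicated or missing options.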

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand:
    • What the pull request does
    • Why it does it
    • How it does it
    • Why this approach
  • Each commit in the pull request should have a meaningful subject line and body.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

- Change the number of processed records to 5,000,000
- Connect to a Postgres database hosted in the Kubernetes cluster
- Obtain the necessary Kubernetes credentials if needed
- Store the results in a BigQuery table
@lgajowy
Contributor Author

lgajowy commented Feb 20, 2018

Run seed job

@lgajowy
Contributor Author

lgajowy commented Feb 20, 2018

Run Java JdbcIO Performance Test

Remove duplicate pipeline options. They are already defined in the pkb_config_local.yaml file.
@lgajowy
Contributor Author

lgajowy commented Feb 20, 2018

Run seed job

@lgajowy
Contributor Author

lgajowy commented Feb 20, 2018

Run Java JdbcIO Performance Test

Add missing postgresPort (the default value is 0)
@lgajowy
Contributor Author

lgajowy commented Feb 20, 2018

Run seed job

@lgajowy
Contributor Author

lgajowy commented Feb 20, 2018

Run Java JdbcIO Performance Test

@lgajowy
Contributor Author

lgajowy commented Feb 20, 2018

@chamikaramj @alanmyrvold could you take a look?

I utilized the io-datastores cluster as discussed here: https://issues.apache.org/jira/browse/BEAM-3561

Judging from the logs, the only thing missing right now is the BigQuery table (same problem as here: https://issues.apache.org/jira/browse/BEAM-3406 ).

@alanmyrvold, could we create the table? Its name is: beam_performance.jdbcioit_pkb_results

@chamikaramj
Contributor

chamikaramj commented Feb 21, 2018

LGTM

Nice, let's get this in once we have the BQ table.

@lgajowy
Contributor Author

lgajowy commented Feb 21, 2018

Thanks, Chamikara!

Actually, @alanmyrvold, I'll be adding a similar Jenkins job for HadoopInputFormatIOIT in the next few days, so if we could also create a table for the HadoopInputFormatIOIT results in one go, that would be super great for me. Proposed name for the second table: beam_performance.hadoopinputformatioit_pkb_results. Thanks a lot!

@DariuszAniszewski
Contributor

@lgajowy FYI, GoogleCloudPlatform/PerfKitBenchmarker#1554 was just merged, so the table should now be generated automatically.

@lgajowy
Contributor Author

lgajowy commented Feb 21, 2018

Run seed job

@lgajowy
Contributor Author

lgajowy commented Feb 21, 2018

Run Java JdbcIO Performance Test

@lgajowy
Contributor Author

lgajowy commented Feb 21, 2018

Success! 😁 The table got generated thanks to the --autodetect option in perfkit. Creating the tables manually is no longer needed, as @DariuszAniszewski said.

@chamikaramj
Contributor

Nice. Thanks :)

@chamikaramj chamikaramj merged commit 89afba1 into apache:master Feb 21, 2018
@lgajowy lgajowy deleted the re-enable-jdbc-io-it-on-jenkins branch March 14, 2018 11:37