Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring/7 use new bfs interface #9

Merged
merged 34 commits into from
May 23, 2024

Conversation

ahsimb
Copy link
Collaborator

@ahsimb ahsimb commented May 16, 2024

closes #7

ahsimb added 4 commits March 26, 2024 09:20
# Conflicts:
#	exasol/python_extension_common/deployment/language_container_deployer.py
#	exasol/python_extension_common/deployment/language_container_deployer_cli.py
#	pyproject.toml
#	test/unit/deployment/test_language_container_deployer.py
#	test/unit/deployment/test_language_container_deployer_cli.py
@ahsimb ahsimb added the refactoring Code improvement without behavior change label May 16, 2024
@ahsimb ahsimb self-assigned this May 16, 2024
Comment on lines +272 to +279
bucketfs_host: Optional[str] = None, bucketfs_port: Optional[int] = None,
bucketfs_name: Optional[str] = None, bucket: Optional[str] = None,
bucketfs_user: Optional[str] = None, bucketfs_password: Optional[str] = None,
bucketfs_use_https: bool = True,
saas_url: Optional[str] = None,
saas_account_id: Optional[str] = None, saas_database_id: Optional[str] = None,
saas_token: Optional[str] = None,
path_in_bucket: str = '',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an idea: Would it be better to have two different class methods: createOnPrem() and createSaas()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. The decision of which way to go needs to be taken somewhere. By making two different methods we are pushing this decision to the caller. From the caller's perspective, it might be easier to grab all available parameters, e.g. defined in the config, pass them over to the single point of call and let it make sense of these parameters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we would move the responsibility to the client. I'am not sure about the client's case. However, if the client knows that the connection is on-prem or SaaS, it would be cumbersome to pass all the unnecessary parameters.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to pass the unnecessary parameters. They are optional. For example, if you know that you work with SaaS the call might look like the following:

deployer = LanguageContainerDeployer.create(dsn, db_user, db_password, language_alias, saas_url='...', saas_acount_id='...', saas_database_id='...', saas_token='...')

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well :), except that the dsn, db_user and db_password are not really required for SaaS. They are derivatives of the other SaaS parameters. But we will sort it out in subsequent PRs. Need to first merge the saas-api-python with the helper function that works out the saas connection parameters.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe I should merge it first and then do things property in this PR,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, maybe have two different commands, one for OnPrem and one for SaaS

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I think from the caller's (a human in this case) point of view it's better to have one cli command and different parameters, rather than two. One less name to remember, the rest is the same.

Comment on lines +83 to +89
for extension in ARCHIVE_EXTENSIONS:
if bucket_file.endswith(extension):
bucket_file = bucket_file[: -len(extension)]
break

file_path = bucket_base_path / bucket_file
return PurePosixPath(file_path.as_udf_path())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, this might be actually logic that needs to go into PathLike.as_udf_path()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about that. The PathLike functions return either a string or another PathLike. The PathLike != PurePosixPath.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I didn't mean the PurePosixPath part, I meant the extension part for archives

Copy link
Collaborator Author

@ahsimb ahsimb May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused about what happens with an archive. It gets unzipped into the .dist directory. What do we actually see in the udf, the zipped or unzipped file? If it's the latter, then yes we should move that code to the backetfs. Does the dist get hidden in this case, somehow? I think it's better to do this fix in a separate PR because it requires a change in another repo.

default=lambda: os.environ.get(SAAS_ACCOUNT_ID_ENVIRONMENT_VARIABLE))
@click.option('--saas-database-id', type=str,
default=lambda: os.environ.get(SAAS_DATABASE_ID_ENVIRONMENT_VARIABLE))
@click.option('--saas-token', type=str,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the token and the passwords, we should maybe add an input prompt in another PR, when they are not provided as cli argument or environment variable

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values are quite long, entering them manually would be torture.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can copy them into the input prompt. The input prompt is usually used, when you don't want to leak your credentials in your shell history. However, as I said, this can be done later. I think, it simply enough to create a ticket for it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understood. According to the click doc, you can offer the user a choice, either to accept the default value stored in a config or env. variable, or enter it manually from a terminal. My point is that entering the saas parameters manually is not practical because they are too long. Hence we don't need a prompt here. It may be useful for the db password though.

pyproject.toml Show resolved Hide resolved
@ahsimb ahsimb merged commit db74636 into main May 23, 2024
15 checks passed
@ahsimb ahsimb deleted the refactoring/7-use-new-bfs-interface branch May 23, 2024 06:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Code improvement without behavior change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Start using a new interface in the bucketfs-python
4 participants