new command flepimop-push, flepimop-pull #296
Open: fang19911030 wants to merge 40 commits into main from python_script_resume
40 commits
d85f6e4 add new command flepimop-pull (fang19911030)
edfbb68 bug fix (fang19911030)
1cb9ef1 change format (fang19911030)
d9d59a2 add the test (fang19911030)
c18ecde change argument type (fang19911030)
ea59026 add file check (fang19911030)
aafc88b add new file for push command (fang19911030)
d3a4cac Merge branch 'main' into python_script_resume (fang19911030)
9e838b0 Merge branch 'main' into python_script_resume (fang19911030)
62c0979 add function creating file names for pushing (fang19911030)
5407c71 add body for flepimop-push (fang19911030)
e8c1c42 add command flepimop-push (fang19911030)
cde74d4 change error message (fang19911030)
f1a57fb fix wrong parameter (fang19911030)
8c6b65f rename file (fang19911030)
b0d8895 wrong file name (fang19911030)
fc8b4fa update doc and fix format (fang19911030)
5fce4d4 fix (fang19911030)
ce734c4 black fix format (fang19911030)
6cacb69 print message (fang19911030)
34b18cf clean (fang19911030)
534d932 correct variable name (fang19911030)
72ef61b correct tests (fang19911030)
1cbce6c Merge branch 'main' into python_script_resume (jcblemai)
d4ba408 Merge branch 'main' into python_script_resume (jcblemai)
cee4259 Merge branch 'main' into python_script_resume (fang19911030)
da0b989 address comments (fang19911030)
3c21a82 address comments 2 (fang19911030)
1d8879a Merge branch 'python_script_resume' of https://github.com/HopkinsIDD/… (fang19911030)
c9a2307 Merge branch 'main' into python_script_resume (fang19911030)
29cf95d change doc string of file_paths (fang19911030)
62c56f0 remove main (fang19911030)
8194214 remove main and relocate import (fang19911030)
14193ff add test file (fang19911030)
22f5188 change (fang19911030)
a417e38 new unit test (fang19911030)
f796ce2 add test (fang19911030)
25d79d3 change click type (fang19911030)
fea9449 change click string to path (fang19911030)
1522e8f format change (fang19911030)
@@ -0,0 +1,206 @@
import os
import click
import shutil
from .file_paths import create_file_name_for_push


@click.command()
@click.option(
    "--s3_upload",
    "s3_upload",
    envvar="S3_UPLOAD",
    help="push files to aws",
    required=True,
)
@click.option(
    "--data-path",
    "data_path",
    envvar="PROJECT_PATH",
    type=click.Path(exists=True),
    required=True,
)
@click.option(
    "--flepi_run_index",
    "flepi_run_index",
    envvar="FLEPI_RUN_INDEX",
    type=click.STRING,
    required=True,
)
@click.option(
    "--flepi_prefix",
    "flepi_prefix",
    envvar="FLEPI_PREFIX",
    type=click.STRING,
    required=True,
)
@click.option(
    "--flepi_block_index",
    "flepi_block_index",
    envvar="FLEPI_BLOCK_INDEX",
    type=click.STRING,
    required=True,
)
@click.option(
    "--flepi_slot_index",
    "flepi_slot_index",
    envvar="FLEPI_SLOT_INDEX",
    type=click.STRING,
    required=True,
)
@click.option(
    "--s3_results_path",
    "s3_results_path",
    envvar="S3_RESULTS_PATH",
    type=click.STRING,
    default="",
    required=False,
)
@click.option(
    "--fs_results_path",
    "fs_results_path",
    envvar="FS_RESULTS_PATH",
    type=click.Path(),
    default="",
    required=False,
)
def flepimop_push(
    s3_upload: str,
    data_path: str,
    flepi_run_index: str,
    flepi_prefix: str,
    flepi_slot_index: str,
    flepi_block_index: str,
    s3_results_path: str = "",
    fs_results_path: str = "",
) -> None:
    """
    Push files to AWS S3 and/or the local filesystem.

    This function generates a list of file names based on the provided parameters, checks which files
    exist locally, and uploads or copies these files to AWS S3 and/or the local filesystem based on
    the specified options.

    Parameters:
    ----------
    s3_upload : str
        String flag controlling the S3 upload. Files are uploaded to S3 only if this is exactly "true".
        Independently of this flag, files are copied to the local filesystem whenever
        `fs_results_path` is non-empty.

    data_path : str
        The local directory path where the data files are stored.

    flepi_run_index : str
        The index of the FLEPI run. This is used to uniquely identify the run and generate the corresponding file names.

    flepi_prefix : str
        A prefix string to be included in the file names. This is typically used to categorize or identify the files.

    flepi_slot_index : str
        The slot index used in the filename. This is formatted as a zero-padded nine-digit number, which helps in
        distinguishing different slots of data processing.

    flepi_block_index : str
        The block index used in the filename. This typically indicates a specific block or segment of the data being processed.

    s3_results_path : str, optional
        The S3 path (of the form `s3://bucket/prefix`) where the results should be uploaded.
        Default is an empty string, which raises an error if `s3_upload` is "true".

    fs_results_path : str, optional
        The local filesystem path where the results should be copied.
        Default is an empty string, which means no files are copied locally.

    Raises:
    ------
    ValueError
        If `s3_upload` is "true" and `s3_results_path` is not provided.

    ModuleNotFoundError
        If `boto3` is not installed when `s3_upload` is "true".

    Notes:
    -----
    - This function first checks for the existence of the files generated by `create_file_name_for_push`
      in the `data_path` directory. Only the files that exist will be pushed to AWS S3 or copied to the local filesystem.

    - S3 has no directories: object keys under `s3_results_path` are created implicitly by the upload.

    - Parent directories under `fs_results_path` are created if they do not already exist.

    - Upload and copy failures caught by this function are printed per file rather than raised.

    Example Usage:
    --------------
    ```bash
    flepimop-push --s3_upload true --data-path /path/to/data --flepi_run_index run_01 --flepi_prefix prefix_01 \
        --flepi_slot_index 1 --flepi_block_index 1 --s3_results_path s3://my-bucket/results/
    ```

    This would push the existing files generated by the `create_file_name_for_push` function to the specified S3 bucket.
    """
    file_name_list = create_file_name_for_push(
        flepi_run_index=flepi_run_index,
        prefix=flepi_prefix,
        flepi_slot_index=flepi_slot_index,
        flepi_block_index=flepi_block_index,
    )
    # Keep only the candidate files that actually exist under data_path
    exist_files = []
    for file_name in file_name_list:
        file_path = os.path.join(data_path, file_name)
        if os.path.exists(file_path):
            exist_files.append(file_name)
    print("flepimop-push found these existing files: " + " ".join(exist_files))
    # Track failed uploads/copies separately
    failed_s3_uploads = []
    failed_fs_copies = []
    if s3_upload == "true":
        try:
            import boto3
            from boto3.exceptions import S3UploadFailedError
            from botocore.exceptions import ClientError
        except ModuleNotFoundError:
            raise ModuleNotFoundError(
                (
                    "No module named 'boto3', which is required for "
                    "gempyor.flepimop_push.flepimop_push. Please install the aws target."
                )
            )
        if s3_results_path == "":
            raise ValueError(
                "--s3_upload is set to 'true'; you must provide --s3_results_path "
                "or set the environment variable S3_RESULTS_PATH."
            )
        s3 = boto3.client("s3")
        for file in exist_files:
            s3_path = os.path.join(s3_results_path, file)
            # Assumes s3_path has the form s3://<bucket>/<key>
            bucket = s3_path.split("/")[2]
            # Skip "s3://" (5 characters), the bucket name, and the following "/"
            object_name = s3_path[len(bucket) + 6 :]
            try:
                s3.upload_file(os.path.join(data_path, file), bucket, object_name)
                print(f"Uploaded {file} to S3 successfully.")
            # upload_file re-raises ClientError as S3UploadFailedError, so catch both
            except (ClientError, S3UploadFailedError) as e:
                print(f"Failed to upload {file} to S3: {e}")
                failed_s3_uploads.append(file)

    if fs_results_path != "":
        for file in exist_files:
            dst = os.path.join(fs_results_path, file)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            try:
                shutil.copy(os.path.join(data_path, file), dst)
                print(f"Copied {file} to local filesystem successfully.")
            except IOError as e:
                print(f"Failed to copy {file} to local filesystem: {e}")
                failed_fs_copies.append(file)

    # Print failed files for S3 uploads
    if failed_s3_uploads:
        print("The following files failed to upload to S3:")
        for file in failed_s3_uploads:
            print(file)

    # Print failed files for local filesystem copies
    if failed_fs_copies:
        print("The following files failed to copy to the local filesystem:")
        for file in failed_fs_copies:
            print(file)

    # Success message if no failures
    if not failed_s3_uploads and not failed_fs_copies:
        print("flepimop-push successfully pushed all existing files.")
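The S3 branch above derives the bucket and object key by splitting the path on "/" and slicing at a fixed offset, which only works when `s3_results_path` starts with `s3://`. A path without that prefix would yield a misleading bucket name or an `IndexError` rather than a clear message. As a minimal sketch of a stricter alternative, the helper below validates the prefix first. `split_s3_uri` is a hypothetical name used for illustration only; it is not part of this pull request or of boto3.

```python
from typing import Tuple


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/key' into ('bucket', 'some/key').

    Hypothetical helper for illustration; not part of the pull request.
    """
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"Expected a URI starting with 's3://', got {uri!r}")
    # Everything before the first '/' after the prefix is the bucket;
    # the remainder (possibly empty) is the object key.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"No bucket name found in {uri!r}")
    return bucket, key


if __name__ == "__main__":
    print(split_s3_uri("s3://my-bucket/results/model_output.csv"))
    # prints ('my-bucket', 'results/model_output.csv')
```

With a helper like this, the upload loop could call `bucket, object_name = split_s3_uri(s3_path)` and surface a malformed `--s3_results_path` as a `ValueError` before any upload is attempted.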
It's my intention that these commands will shortly be replaced by interacting with this capability via the core flepimop CLI. It makes sense to add them for the time being, but people should be advised that they will (ideally) migrate soon to the overall flepimop CLI.