cyhy-feeds consists of two parts: the extractor and the retriever.
cyhy-data-extract retrieves and compresses the specified data, signs the
compressed file, encrypts it, and optionally pushes the encrypted, compressed
file to an S3 bucket using the provided AWS credentials.
cyhy-data-retriever takes a provided file (optionally stored in an S3 bucket),
decrypts it, and then decompresses it to local storage.
Both cyhy-data-extract and cyhy-data-retriever require Python 3.
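The compress and decompress ends of this pipeline can be sketched with Python's standard library. This is only an illustration, not the tools' actual code: the helper names `compress_to_tbz` and `decompress_tbz` are hypothetical, and the GPG signing and encryption steps are left out.

```python
import tarfile
import tempfile
from pathlib import Path


def compress_to_tbz(source_files, archive_path):
    """Pack the given files into a bzip2-compressed tar archive."""
    with tarfile.open(archive_path, "w:bz2") as archive:
        for path in source_files:
            # Store each file under its base name only.
            archive.add(path, arcname=Path(path).name)
    return archive_path


def decompress_tbz(archive_path, output_dir):
    """Unpack a bzip2-compressed tar archive and return its member names."""
    with tarfile.open(archive_path, "r:bz2") as archive:
        names = archive.getnames()
        archive.extractall(output_dir)
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.json"
        sample.write_text('{"example": true}')
        archive = compress_to_tbz([sample], Path(tmp) / "sample.tbz")
        restored = Path(tmp) / "restored"
        restored.mkdir()
        print(decompress_tbz(archive, restored))
```

In the real tools the archive is additionally signed and encrypted, so the file on disk ends in `.tbz.gpg` rather than `.tbz`.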
To run the tool locally, first install the requirements:
pip install -r requirements.txt
cyhy-data-extract creates a compressed, encrypted, signed extract file
containing Federal CyHy data for integration with the Weathermap project.
Usage:
COMMAND_NAME --config CONFIG_FILE [--cyhy-config CYHY_CONFIG] [--scan-config SCAN_CONFIG] [--assessment-config ASSESSMENT_CONFIG] [-v | --verbose] [-a | --aws] [--cleanup-aws] [--date DATE] [--debug]
COMMAND_NAME (-h | --help)
COMMAND_NAME --version
Options:
-h --help                                                   Show this screen
--version                                                   Show version
-x CYHY_CONFIG --cyhy-config=CYHY_CONFIG                    CyHy MongoDB configuration to use
-y SCAN_CONFIG --scan-config=SCAN_CONFIG                    Scan MongoDB configuration to use
-z ASSESSMENT_CONFIG --assessment-config=ASSESSMENT_CONFIG  Assessment MongoDB configuration to use
-v --verbose                                                Show verbose output
-a --aws                                                    Output results to S3 bucket
--cleanup-aws                                               Delete old files from the S3 bucket
-c CONFIG_FILE --config=CONFIG_FILE                         Configuration file for this script
-d DATE --date=DATE                                         Specific date to export data from, in the form %Y-%m-%d (e.g. 2018-12-31). Note that this date is in UTC.
--debug                                                     Enable debug logging
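The `--date` option takes a `%Y-%m-%d` string that is interpreted in UTC. The sketch below shows one way such a value could be turned into a UTC day window. The function name `utc_day_window` and the half-open one-day window are assumptions made for illustration, not details taken from the script.

```python
from datetime import datetime, timedelta, timezone


def utc_day_window(date_string):
    """Return (start, end) datetimes covering one UTC day, end exclusive."""
    # Parse the date and pin it to UTC rather than the local time zone.
    start = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return start, start + timedelta(days=1)


start, end = utc_day_window("2018-12-31")
print(start.isoformat())  # 2018-12-31T00:00:00+00:00
print(end.isoformat())    # 2019-01-01T00:00:00+00:00
```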
Extract CyHy data for the current day using the MongoDB configuration in
cyhy.yml and the runtime configuration in cyhy-data-extract.cfg.
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --config cyhy-data-extract.cfg
Extract scan data for the current day using the MongoDB configuration in
scan.yml and the runtime configuration in cyhy-data-extract.cfg.
python3 cyhy-data-extract.py --scan-config scan.yml --config cyhy-data-extract.cfg
Extract assessment data for the current day using the MongoDB configuration in
assessment.yml and the runtime configuration in cyhy-data-extract.cfg.
python3 cyhy-data-extract.py --assessment-config assessment.yml --config cyhy-data-extract.cfg
Extract CyHy and scan data for the current day using the MongoDB configurations
in cyhy.yml and scan.yml, respectively, and use the runtime configuration in
cyhy-data-extract.cfg.
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --scan-config scan.yml --config cyhy-data-extract.cfg
Extract CyHy and scan data for the current day using the MongoDB configurations
in cyhy.yml and scan.yml, upload the results to AWS, and use the runtime
configuration in cyhy-data-extract.cfg.
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --scan-config scan.yml --aws --config cyhy-data-extract.cfg
Extract CyHy and scan data for January 25th, 2019 using the MongoDB
configurations in cyhy.yml and scan.yml, upload the results to AWS, and use the
runtime configuration in cyhy-data-extract.cfg.
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --scan-config scan.yml --aws --config cyhy-data-extract.cfg --date 2019-01-25
Extract CyHy, scan, and assessment data for January 25th, 2019 using the MongoDB
configurations in cyhy.yml, scan.yml, and assessment.yml, upload the results
to AWS, and use the runtime configuration in cyhy-data-extract.cfg.
python3 cyhy-data-extract.py --cyhy-config cyhy.yml --scan-config scan.yml --assessment-config assessment.yml --aws --config cyhy-data-extract.cfg --date 2019-01-25
cyhy-data-retriever retrieves a compressed, encrypted, signed extract file and
verifies, decrypts, and decompresses it.
NOTES:
* The required Python modules (see requirements.txt) must be installed for the script to work.
* This script expects to operate on a GPG-encrypted bzip2 tar file, e.g. filename.tbz.gpg
Usage:
COMMAND_NAME [-v | --verbose] [--filename EXTRACT_FILENAME] [-a | --aws] --config CONFIG_FILE
COMMAND_NAME (-h | --help)
COMMAND_NAME --version
Options:
-h --help                                        Show this screen
--version                                        Show version
-f EXTRACT_FILENAME --filename=EXTRACT_FILENAME  Name of extract file to retrieve
-v --verbose                                     Show verbose output
-c CONFIG_FILE --config=CONFIG_FILE              Configuration file for this script
-a --aws                                         Retrieve the extract file from the S3 bucket
Retrieve the data stored in file cyhy_extract_2019-01-25T000000+0000.tbz.gpg
residing on AWS using the runtime configuration in cyhy-data-retriever.cfg.
cyhy-data-retriever --filename cyhy_extract_2019-01-25T000000+0000.tbz.gpg --aws --config cyhy-data-retriever.cfg
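The extract file name in the example above embeds a UTC timestamp. Assuming that naming pattern holds in general (an inference from this single example, not a documented guarantee), the name for a given date could be built as follows. `extract_filename` is a hypothetical helper and is not part of either script.

```python
from datetime import datetime, timezone


def extract_filename(day, prefix="cyhy_extract"):
    """Build a name such as cyhy_extract_2019-01-25T000000+0000.tbz.gpg."""
    # %z renders the UTC offset as +0000 for a UTC-aware datetime.
    stamp = day.strftime("%Y-%m-%dT%H%M%S%z")
    return f"{prefix}_{stamp}.tbz.gpg"


day = datetime(2019, 1, 25, tzinfo=timezone.utc)
print(extract_filename(day))  # cyhy_extract_2019-01-25T000000+0000.tbz.gpg
```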
cyhy-data-extract configuration options (cyhy-data-extract.cfg):
* FED_ORGS_EXCLUDED - Orgs to exclude from the extract
* GNUPG_HOME - Location of the GnuPG database (e.g. /Users/bob/.gnupg)
* RECIPIENTS - Names on the GPG public key(s)
* SIGNER - GPG signer to ensure integrity
* SIGNER_PASSPHRASE - Passphrase for the signer's GPG key
* OUTPUT_DIR - Directory to output the extract to
* FILE_RETENTION_NUM_DAYS - Number of days to hold the extract
* ES_AWS_CONFIG_SECTION_NAME - Name of the AWS config file section containing the configuration to be used when accessing the Elasticsearch data
* ES_REGION - Region for DMARC bucket
* ES_URL - Elasticsearch URL
* ES_RETRIEVE_SIZE - Elasticsearch size
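As a rough illustration of how a few of the extractor options above might be read, the sketch below parses an INI-style file with Python's `configparser`. The INI layout, the `[DEFAULT]` section, the comma-separated `RECIPIENTS` format, and all sample values are assumptions; check the script itself for the format it actually expects.

```python
import configparser

# Hypothetical file contents; every value here is a placeholder.
SAMPLE_CONFIG = """
[DEFAULT]
GNUPG_HOME = /Users/bob/.gnupg
RECIPIENTS = alice@example.com, carol@example.com
OUTPUT_DIR = /tmp/extracts
FILE_RETENTION_NUM_DAYS = 7
"""


def load_extract_settings(text):
    """Parse a subset of the extractor options from INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["DEFAULT"]
    return {
        "gnupg_home": section["GNUPG_HOME"],
        # Assumed format: a comma-separated list of recipient names.
        "recipients": [r.strip() for r in section["RECIPIENTS"].split(",")],
        "output_dir": section["OUTPUT_DIR"],
        "retention_days": section.getint("FILE_RETENTION_NUM_DAYS"),
    }


settings = load_extract_settings(SAMPLE_CONFIG)
print(settings["recipients"])      # ['alice@example.com', 'carol@example.com']
print(settings["retention_days"])  # 7
```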
cyhy-data-retriever configuration options (cyhy-data-retriever.cfg):
* CLIENT_PRIVATE_KEY_FILE - Path to the GPG private key
* GNUPG_HOME - Location of the GnuPG database (e.g. /Users/bob/.gnupg)
* GPG_DECRYPTION_PASSPHRASE - Passphrase for the private GPG key
* AWS_ACCESS_KEY_ID - User ID used for AWS S3 bucket read access
* AWS_SECRET_ACCESS_KEY - Key for AWS S3 bucket read access
* PROXY_CONFIG - Only needed when a proxy is present
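Before running the retriever it can help to confirm that the configuration file is complete. The sketch below is a hypothetical pre-flight check, not part of the tool. It assumes an INI-style file with a `[DEFAULT]` section and treats every option except PROXY_CONFIG as required, which is an assumption (the AWS keys, for instance, may only matter when `--aws` is used).

```python
import configparser

# Assumed to be required; PROXY_CONFIG is left out because it is only
# needed when a proxy is present.
REQUIRED_KEYS = (
    "CLIENT_PRIVATE_KEY_FILE",
    "GNUPG_HOME",
    "GPG_DECRYPTION_PASSPHRASE",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_retriever_keys(text, section="DEFAULT"):
    """Return the required option names absent from the given config text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    settings = parser[section]
    return [key for key in REQUIRED_KEYS if key not in settings]


sample = """
[DEFAULT]
GNUPG_HOME = /Users/bob/.gnupg
CLIENT_PRIVATE_KEY_FILE = /Users/bob/private.key
"""
print(missing_retriever_keys(sample))
# ['GPG_DECRYPTION_PASSPHRASE', 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY']
```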
We welcome contributions! Please see here for details.
This project is in the worldwide public domain.
This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.
All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.