diff --git a/README.md b/README.md index 14e5faf..6cbfc20 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,6 @@ # MWA Demo -This is demonstation of a simple data processing pipeline for Murchison Widefield Array (MWA) data, -from downloading raw data to creating an image. +Demonstration pipeline for Murchison Widefield Array (MWA) data ## Flow @@ -103,7 +102,7 @@ running Docker withing WSL. It may be necessary to change ## Setup -Clone this repository to a macine that meets the [system requirements](#system-requirements). +Clone this repository to a machine that meets the [system requirements](#system-requirements). ```bash git clone https://github.com/MWATelescope/mwa-demo.git @@ -152,7 +151,7 @@ The demo has been tested on Windows 11 with Docker Desktop 4.33.1 on a Git Bash For optimal performance, you should compile the following software dependencies directly on your machine. -Advanced users can provide additional compiler flags during the build process to optimize for their specific CPU microarchitecture. e.g. `-march=native` for C/C++, or `-C target-cpu=native` for Rust. +Advanced users can provide additional compiler flags during the build process to optimize for their specific CPU micro-architecture. e.g. `-march=native` for C/C++, or `-C target-cpu=native` for Rust. The steps in the `Dockerfile` may be a useful guide. @@ -194,19 +193,39 @@ Linux users should also ensure they have permissions to run docker without root: quick start: pull the images from dockerhub. ```bash -docker pull d3vnull0/mwa-demo:latest +docker pull mwatelescope/mwa-demo:latest +``` + +When [running the demo](#running-the-demo), you should run the commands in an interactive Docker shell. 
+ ```bash +docker run -it --rm -v ${PWD}:${PWD} -w ${PWD} -e MWA_ASVO_API_KEY=$MWA_ASVO_API_KEY mwatelescope/mwa-demo:latest ``` #### Docker Troubleshooting +macOS users: if you see this error: `WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested`, you should pull the image for the correct platform. + +```bash +docker pull --platform linux/arm64 mwatelescope/mwa-demo:latest +``` + If you have any issues, you should delete all traces of the image that was pulled and build the image locally. (this may take a while) ```bash # first remove the image that was pulled from dockerhub -docker rmi d3vnull0/mwa-demo:latest +docker rmi mwatelescope/mwa-demo:latest docker builder prune --all docker buildx prune --all -docker build -t d3vnull0/mwa-demo:latest -f Dockerfile . +docker build -t mwatelescope/mwa-demo:latest -f Dockerfile . +``` + +### Hybrid + +If you have some software dependencies installed locally, you can use Docker to run the rest. Just comment out the packages you don't need in `demo/00_software.sh` and source it in your shell. + +```bash +source demo/00_software.sh ``` ## ASVO account @@ -244,16 +263,17 @@ docker images. ### Running the demo -Below is a walkthrough of the demo. Ensure everything is run from the root of the repository -(don't `cd` into the `demo` directory). +Below is a walkthrough of the demo. Ensure that: + +- (if using [Docker](#docker)) you are in a Docker shell, not your host system. +- (if using the [hybrid](#hybrid) setup) you have sourced `demo/00_software.sh` in your host shell. +- everything is run from the root of the repository + (don't `cd` into the `demo` directory). +- you don't `source` the scripts; they are executable (`chmod +x`) and should be run directly. ```bash -# DEMO: open a bash shell -# DEMO: change directory into the root of this repository. 
-# set up the software environment to use Docker for any binaries not on your system -source demo/00_software.sh -# check that everything is working (and pull Docker images) -demo/00_test.sh +# check that everything is working +demo/00_test.sh # don't source me! # query the MWA TAP server with ADQL using the pyvo library clear; demo/01_tap.sh # display giant-squid commands to download observations @@ -285,8 +305,8 @@ export obsid=1341914000 export metafits=${outdir}/${obsid}/raw/${obsid}.metafits export prep_uvfits="${outdir}/${obsid}/prep/birli_${obsid}.uvfits" export cal_ms="${outdir}/${obsid}/cal/hyp_cal_${obsid}.ms" -eval $python ${SCRIPT_BASE}/04_ssins.py $prep_uvfits -eval $python ${SCRIPT_BASE}/04_ssins.py $cal_ms +python ${SCRIPT_BASE}/04_ssins.py $prep_uvfits +python ${SCRIPT_BASE}/04_ssins.py $cal_ms # combine them all into a single image obsid="combined" cal_ms=$(ls -1d ${outdir}/*/cal/hyp_cal_*.ms ) demo/07_img.sh @@ -304,16 +324,16 @@ multiple platforms using `docker buildx`. ```bash # quick start: pull the images from dockerhub. -docker pull d3vnull0/mwa-demo:latest # on macos or linux arm64 (Apple M series), add --platform linux/arm64 +docker pull mwatelescope/mwa-demo:latest # on macos or linux arm64 (Apple M series), add --platform linux/arm64 # if you have any issues, you can override the image with a fresh build on your local machine -# docker rmi d3vnull0/mwa-demo:latest -docker build -t d3vnull0/mwa-demo:latest -f Dockerfile . +# docker rmi mwatelescope/mwa-demo:latest +docker build -t mwatelescope/mwa-demo:latest -f Dockerfile . # If you still encounter issues on macOS arm64 (Apple Silicon, M series), # the same image is also available via Docker x86_64 emulation. Make sure to update # your Docker Desktop to the latest version, as this features is relatively new. 
-docker pull --platform linux/amd64 d3vnull0/mwa-demo:latest +docker pull --platform linux/amd64 mwatelescope/mwa-demo:latest ``` Here's how to customize and build the image for multiple platforms and push to dockerhub @@ -348,5 +368,11 @@ docker buildx build \ --push \ . -# DEV: docker buildx build --platform linux/amd64,linux/arm64 -t d3vnull0/mwa-demo:latest -f Dockerfile --push . +# DEV: docker buildx build --platform linux/amd64,linux/arm64 -t mwatelescope/mwa-demo:latest -f Dockerfile --push . +``` + +If you add extra raw files, you can add their checksums with + +```bash +md5sum demo/data/*/raw/1*fits | tee demo_data.md5sum ``` diff --git a/demo/00_software.sh b/demo/00_software.sh index 5a45769..14d20b7 100644 --- a/demo/00_software.sh +++ b/demo/00_software.sh @@ -1,7 +1,7 @@ #!/bin/bash export docker_base="docker run --rm -it" -export docker_img=${docker_img:="d3vnull0/mwa-demo:latest"} +export docker_img=${docker_img:="mwatelescope/mwa-demo:latest"} # silly hacks for Windows / Git Bash if [[ -n "$OS" && "$OS" == "Windows_NT" ]]; then diff --git a/demo/00_test.sh b/demo/00_test.sh index 311471d..0c75a36 100755 --- a/demo/00_test.sh +++ b/demo/00_test.sh @@ -1,11 +1,19 @@ #!/bin/bash # test the software dependencies via environment variables +# is this script being sourced? 
https://stackoverflow.com/a/28776166/565019 +if ( + [[ -n $ZSH_VERSION && $ZSH_EVAL_CONTEXT =~ :file$ ]] || + [[ -n $BASH_VERSION ]] && (return 0 2>/dev/null) +); then + echo "this script is not intended for sourcing, run it with 'bash demo/00_test.sh'" + return 1 +fi + # ### # # ENV # # ### # # see: 00_env.sh - if [ -n "$ZSH_VERSION" ]; then ME="${0:A}"; else ME=$(realpath ${BASH_SOURCE:0}); fi export SCRIPT_BASE=${SCRIPT_BASE:-$(dirname $ME)} source $SCRIPT_BASE/00_env.sh @@ -17,35 +25,47 @@ source $SCRIPT_BASE/00_env.sh # we do this after the giant-squid check to avoid showing the api key in the logs set -eu +# #### # +# DATA # +# #### # +# DEMO: check raw files are present +if command -v md5sum &>/dev/null; then + echo "validating raw files with md5sum" + md5sum -c demo_data.md5sum +else + echo "md5sum not found. couldn't validate raw files." + exit 1 +fi + # #### # # BINS # # #### # # DEMO: check software is installed -if ! eval $giant_squid --version; then +if ! giant-squid --version; then echo "giant-squid not found. https://github.com/MWATelescope/giant-squid?tab=readme-ov-file#installation " - return 1 + exit 1 fi -if ! eval $wsclean --version; then +if ! wsclean --version; then echo "wsclean not found. https://wsclean.readthedocs.io/en/latest/installation.html " - return 1 + exit 1 fi -if ! eval $hyperdrive --version; then +if ! hyperdrive --version; then echo "hyperdrive not found. https://mwatelescope.github.io/mwa_hyperdrive/installation/intro.html " - return 1 + exit 1 fi -if ! eval $jq --version; then +if ! jq --version; then echo "jq not found. https://jqlang.github.io/jq/download/ " - return 1 + exit 1 fi -if ! eval $python --version; then +if ! python --version; then echo "python not found. https://www.python.org/downloads/ " - return 1 + exit 1 fi ## ensure outdir exists if [[ ! -d "$outdir" ]]; then echo "outdir=$outdir does not exist. 
try mkdir -p $outdir" - return 1 + exit 1 fi # ####### # @@ -57,7 +77,7 @@ if [[ $srclist =~ srclist_puma && ! -f "$srclist" ]]; then curl -L -o $srclist "https://github.com/JLBLine/srclists/raw/master/${srclist##*/}" fi # DEMO: verify srclist -eval $hyperdrive srclist-verify $srclist +hyperdrive srclist-verify $srclist # #### # # BEAM # @@ -68,7 +88,7 @@ if [[ ! -f "$MWA_BEAM_FILE" ]]; then curl -L -o $MWA_BEAM_FILE "http://ws.mwatelescope.org/static/${MWA_BEAM_FILE##*/}" fi # DEMO: verify beam -eval $hyperdrive beam fee --output /dev/null +hyperdrive beam fee --output /dev/null # ######## # # OPTIONAL # @@ -77,10 +97,10 @@ echo "recommended software, not on the critical path:" set +eux ## verify ASVO API Key -eval $giant_squid list >/dev/null +giant-squid list >/dev/null # DEMO: check wsclean features -eval $wsclean --version | tee .wsclean_version +wsclean --version | tee .wsclean_version while IFS='|' read -r feature details; do if ! grep -q "${feature} is available" .wsclean_version; then echo "warning: wsclean $feature not found. recompile wsclean after installing $feature" @@ -93,14 +113,14 @@ EoF rm .wsclean_version # DEMO: check python version -if ! eval $python -c $'"assert __import__(\'sys\').version_info>=(3,8)"' >/dev/null; then +if ! python -c "assert __import__('sys').version_info>=(3,8)" >/dev/null; then echo "warning: python version 3.8+ not found https://www.python.org/downloads/ " - eval $python --version + python --version fi # DEMO: check python packages while IFS='|' read -r package details; do - if ! eval $python -m pip show $package >/dev/null; then + if ! python -m pip show $package >/dev/null; then echo "recommended: python package $package not found." echo details: $details fi @@ -112,7 +132,7 @@ mwa_qa| git clone https://github.com/d3v-null/mwa_qa.git ; pip install mwa_qa EoF # DEMO: is the MWA TAP server accessible? -if ! 
eval $python -c $'"assert len(__import__(\'pyvo\').dal.TAPService(\'http://vo.mwatelescope.org/mwa_asvo/tap\').tables)"'; then +if ! python -c "assert len(__import__('pyvo').dal.TAPService('http://vo.mwatelescope.org/mwa_asvo/tap').tables)"; then echo "warning: MWA TAP inaccessible. https://wiki.mwatelescope.org/display/MP/MWA+TAP+Service " fi diff --git a/demo/01_tap.sh b/demo/01_tap.sh index efeadea..bc6ce41 100755 --- a/demo/01_tap.sh +++ b/demo/01_tap.sh @@ -17,4 +17,4 @@ export details_csv=${details_csv:-"${outdir}/details.csv"} # ### # # query the MWA TAP server with ADQL using the pyvo library # details: https://mwatelescope.atlassian.net/wiki/spaces/MP/pages/24970532/MWA+ASVO+VO+Services -eval $python $SCRIPT_BASE/01_tap.py $obsids_csv $details_csv +python $SCRIPT_BASE/01_tap.py $obsids_csv $details_csv diff --git a/demo/02_download.sh b/demo/02_download.sh index 877b82b..824a8a2 100755 --- a/demo/02_download.sh +++ b/demo/02_download.sh @@ -31,19 +31,19 @@ fi echo "you should run one of the following commands manually:" echo " -> request raw visibilities" -echo $giant_squid submit-vis $obsids_csv +echo "giant-squid submit-vis $obsids_csv" echo " -> (or) submit preprocessed visibility conversion jobs (uvfits or ms)" -echo $giant_squid submit-conv $obsids_csv -p output=uvfits,avg_freq_res=40,avg_time_res=2,flag_edge_width=80 +echo "giant-squid submit-conv $obsids_csv -p output=uvfits,avg_freq_res=40,avg_time_res=2,flag_edge_width=80" echo " -> get human-readable list of jobs that are queued or processing" -echo $giant_squid list $obsids_csv +echo "giant-squid list $obsids_csv" echo " -> (advanced) get a machine readable list of jobs that are queued or processing" -echo $giant_squid list $obsids_csv --states queued,processing --json >$outdir/jobs.json -echo $jq -r $'\'[.[]|.jobId]|join(" ")\'' $outdir/jobs.json +echo "giant-squid list $obsids_csv --states queued,processing --json >$outdir/jobs.json" +echo "jq -r '[.[]|.jobId]|join(\" \")' 
$outdir/jobs.json" echo " -> wait until a job is ready" -echo $giant_squid wait $obsids_csv +echo "giant-squid wait $obsids_csv" echo " -> download a job" -echo $giant_squid download jobid +echo "giant-squid download jobid" diff --git a/demo/03_mwalib.sh b/demo/03_mwalib.sh index 8a5c050..572a0c6 100755 --- a/demo/03_mwalib.sh +++ b/demo/03_mwalib.sh @@ -33,7 +33,7 @@ if [[ ! -f "$metafits" ]]; then fi # get channel and antenna info -eval $python ${SCRIPT_BASE}/03_mwalib.py $metafits +python ${SCRIPT_BASE}/03_mwalib.py $metafits # DEMO: antenna info diff --git a/demo/04_ssins.sh b/demo/04_ssins.sh index be68efe..1d5938e 100755 --- a/demo/04_ssins.sh +++ b/demo/04_ssins.sh @@ -34,6 +34,6 @@ fi # DEMO: use SSINS (sky-subtracted incoherent noise spectra) to identify RFI # - top plots are baseline-averaged auto amplitudes, differenced in time # - bottom plots are z-score: (subtract mean of each frequency, divide by std dev) -eval $python "${SCRIPT_BASE}/04_ssins.py" "$metafits" "$raw_glob" +python "${SCRIPT_BASE}/04_ssins.py" "$metafits" $raw_glob # DEMO: SSINS can also be used to generate RFI flag files, but this out of scope diff --git a/demo/05_prep.sh b/demo/05_prep.sh index e15eabd..1c688af 100755 --- a/demo/05_prep.sh +++ b/demo/05_prep.sh @@ -46,7 +46,7 @@ export prepqa="${prep_uvfits%%.uvfits}_qa.json" set -eu if [[ ! -f $prep_uvfits ]]; then - eval $birli ${birli_args:-} \ + birli ${birli_args:-} \ -m "${metafits}" \ $([[ -n "${edgewidth_khz:-}" ]] && echo "--flag-edge-width ${edgewidth_khz}") \ $([[ -n "${freqres_khz:-}" ]] && echo "--avg-freq-res ${freqres_khz}") \ @@ -62,16 +62,16 @@ fi # details: https://github.com/d3v-null/mwa_qa (my fork of https://github.com/Chuneeta/mwa_qa/ ) if [[ ! 
-f "$prepqa" ]]; then - eval $run_prepqa $prep_uvfits $metafits --out $prepqa + run_prepvisqa.py $prep_uvfits $metafits --out $prepqa fi # DEMO: extract bad antennas from prepqa json with jq # - both of the provided observations pass QA, so no bad antennas are reported -prep_bad_ants=$(eval $jq -r $'\'.BAD_ANTS|join(" ")\'' $prepqa) +prep_bad_ants=$(jq -r $'.BAD_ANTS|join(" ")' $prepqa) # DEMO: plot the prep qa results # - RMS plot: RMS of all autocorrelation values for each antenna # - zscore: -eval $plot_prepqa $prepqa --save --out ${prep_uvfits%%.uvfits} +plot_prepvisqa.py $prepqa --save --out ${prep_uvfits%%.uvfits} echo $obsid $prep_bad_ants diff --git a/demo/06_cal.sh b/demo/06_cal.sh index 31912e9..f7148f5 100755 --- a/demo/06_cal.sh +++ b/demo/06_cal.sh @@ -59,7 +59,7 @@ fi set -eu if [[ ! -f "$hyp_soln" ]]; then - eval $hyperdrive di-calibrate ${dical_args:-} \ + hyperdrive di-calibrate ${dical_args:-} \ --data "$metafits" "$prep_uvfits" \ --source-list "$srclist" \ --outputs "$hyp_soln" \ @@ -68,7 +68,7 @@ fi # plot solutions file if [[ ! -f "${hyp_soln%%.fits}_phases.png" ]]; then - eval $hyperdrive solutions-plot \ + hyperdrive solutions-plot \ -m "$metafits" \ --no-ref-tile \ --max-amp 1.5 \ @@ -84,14 +84,14 @@ fi export calqa="${hyp_soln%%.fits}_qa.json" if [[ ! 
-f "$calqa" ]]; then - eval $run_calqa --pol X --out "$calqa" "$hyp_soln" "$metafits" + run_calqa.py --pol X --out "$calqa" "$hyp_soln" "$metafits" fi # plot the cal qa results -eval $plot_calqa "$calqa" --save --out "${hyp_soln%%.fits}" +plot_calqa.py "$calqa" --save --out "${hyp_soln%%.fits}" # extract bad antennas from calqa json with jq -cal_bad_ants=$(eval $jq -r $'\'.BAD_ANTS|join(" ")\'' "$calqa") +cal_bad_ants=$(jq -r $'.BAD_ANTS|join(" ")' "$calqa") export cal_bad_ants echo "deliberately disabling cal bad ants for the first round :)" @@ -100,7 +100,7 @@ export cal_bad_ants="" # apply calibration solutions to preprocessed visibilities # details: https://mwatelescope.github.io/mwa_hyperdrive/user/solutions_apply/intro.html if [[ ! -d "$cal_ms" ]]; then - eval $hyperdrive apply ${apply_args:-} \ + hyperdrive apply ${apply_args:-} \ --data "$metafits" "$prep_uvfits" \ --solutions "$hyp_soln" \ --outputs "$cal_ms" \ diff --git a/demo/07_img.sh b/demo/07_img.sh index b6c281c..f3db503 100755 --- a/demo/07_img.sh +++ b/demo/07_img.sh @@ -36,7 +36,7 @@ fi export imgname="${outdir}/${obsid}/img/wsclean_hyp_${obsid}" if [ ! -f "${imgname}-image.fits" ]; then - eval $wsclean \ + wsclean \ -name "${imgname}" \ -size 2048 2048 \ -scale 20asec \ diff --git a/demo_data.md5sum b/demo_data.md5sum new file mode 100644 index 0000000..df3260f --- /dev/null +++ b/demo_data.md5sum @@ -0,0 +1,2 @@ +ab0a3040c6adfc482ed1485fa080c18f demo/data/1121334536/raw/1121334536_20150719094841_gpubox20_00.fits +dabf81a21ab53a585e0afab67636fc9f demo/data/1341914000/raw/1341914000_20220715095302_ch137_000.fits