use github-built docker image, remove evals, checksum demo data
d3v-null committed Aug 20, 2024
1 parent 2ccaae4 commit 333bff2
Showing 11 changed files with 112 additions and 64 deletions.
70 changes: 48 additions & 22 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
# MWA Demo

This is demonstation of a simple data processing pipeline for Murchison Widefield Array (MWA) data,
from downloading raw data to creating an image.
Demonstration pipeline for Murchison Widefield Array (MWA) data

## Flow

@@ -103,7 +102,7 @@ running Docker withing WSL. It may be necessary to change

## Setup

Clone this repository to a macine that meets the [system requirements](#system-requirements).
Clone this repository to a machine that meets the [system requirements](#system-requirements).

```bash
git clone https://github.com/MWATelescope/mwa-demo.git
@@ -152,7 +151,7 @@ The demo has been tested on Windows 11 with Docker Desktop 4.33.1 on a Git Bash
For optimal performance, you should compile the following software dependencies directly on your
machine.

Advanced users can provide additional compiler flags during the build process to optimize for their specific CPU microarchitecture. e.g. `-march=native` for C/C++, or `-C target-cpu=native` for Rust.
Advanced users can provide additional compiler flags during the build process to optimize for their specific CPU micro-architecture. e.g. `-march=native` for C/C++, or `-C target-cpu=native` for Rust.

The steps in the `Dockerfile` may be a useful guide.

@@ -194,19 +193,39 @@ Linux users should also ensure they have permissions to run docker without root:
quick start: pull the images from dockerhub.

```bash
docker pull d3vnull0/mwa-demo:latest
docker pull mwatelescope/mwa-demo:latest
```

When [running the demo](#running-the-demo), you should run the commands in an interactive Docker shell.

```bash
docker run -it --rm -v ${PWD}:${PWD} -w ${PWD} -e MWA_ASVO_API_KEY=$MWA_ASVO_API_KEY mwatelescope/mwa-demo:latest
```
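
The `docker run` command above forwards `MWA_ASVO_API_KEY` from the host, so an unset key only shows up later as a failure inside the container. A minimal sketch of a guard that checks the key first; the function name `require_api_key` is hypothetical and not part of this repository. It prints the command rather than running it, and uses the `-e NAME` form (no value), which Docker fills from the host environment, so the key itself is never echoed.

```bash
# hypothetical guard: fail early if the ASVO API key is missing
require_api_key() {
  if [ -z "${MWA_ASVO_API_KEY:-}" ]; then
    echo "MWA_ASVO_API_KEY is not set; see the ASVO account section" >&2
    return 1
  fi
}

# print the command instead of running it, so nothing is started by accident
if require_api_key; then
  echo "docker run -it --rm -v ${PWD}:${PWD} -w ${PWD} -e MWA_ASVO_API_KEY mwatelescope/mwa-demo:latest"
fi
```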

#### Docker Troubleshooting

macOS users: if you see this error: `WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested`, you should pull the image for the correct platform.

```bash
docker pull --platform linux/arm64 mwatelescope/mwa-demo:latest
```
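
A minimal sketch of choosing the `--platform` value from the host architecture, assuming `uname -m` reports `arm64` or `aarch64` on Apple Silicon and ARM Linux, and `x86_64` on Intel/AMD machines. The function name `docker_platform_for` is hypothetical and not part of this repository.

```bash
# hypothetical helper: map a machine architecture string to a Docker platform
docker_platform_for() {
  case "$1" in
    arm64 | aarch64) echo "linux/arm64" ;;
    x86_64 | amd64) echo "linux/amd64" ;;
    *)
      echo "unsupported architecture: $1" >&2
      return 1
      ;;
  esac
}

# print the pull command for this host instead of running it
if platform=$(docker_platform_for "$(uname -m)"); then
  echo "docker pull --platform $platform mwatelescope/mwa-demo:latest"
fi
```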

If you have any issues, you should delete all traces of the image that was pulled and build the image locally. (this may take a while)

```bash
# first remove the image that was pulled from dockerhub
docker rmi d3vnull0/mwa-demo:latest
docker rmi mwatelescope/mwa-demo:latest
docker builder prune --all
docker buildx prune --all
docker build -t d3vnull0/mwa-demo:latest -f Dockerfile .
docker build -t mwatelescope/mwa-demo:latest -f Dockerfile .
```

### Hybrid

If you have some software dependencies installed locally, you can use Docker to run the rest. Just comment out the packages you don't need in `demo/00_software.sh` and source it in your shell.

```bash
source demo/00_software.sh
```
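
One way to picture the hybrid setup is to prefer a binary that is already on your `PATH` and otherwise fall back to running it inside the container. The following is only a sketch of that idea, not the contents of `demo/00_software.sh`. The function name `resolve_cmd` is hypothetical, the `docker run` flags are copied from the interactive command in the Docker section, and it assumes the image has no entrypoint that intercepts the command.

```bash
docker_img=${docker_img:-"mwatelescope/mwa-demo:latest"}

# hypothetical helper: print the command line that would run a tool
resolve_cmd() {
  local tool=$1
  if command -v "$tool" >/dev/null 2>&1; then
    # found locally, run it directly
    echo "$tool"
  else
    # not found, run it in the container with the current directory mounted
    echo "docker run --rm -it -v ${PWD}:${PWD} -w ${PWD} $docker_img $tool"
  fi
}

resolve_cmd wsclean
```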

## ASVO account
@@ -244,16 +263,17 @@ docker images.

### Running the demo

Below is a walkthrough of the demo. Ensure everything is run from the root of the repository
(don't `cd` into the `demo` directory).
Below is a walkthrough of the demo. Ensure that:

- (if using [Docker](#docker)) you are in a Docker shell, not your host system.
- (if [hybrid](#hybrid)), you have sourced `demo/00_software.sh` in your host shell.
- everything is run from the root of the repository
(don't `cd` into the `demo` directory).
- you don't `source` the scripts, they are `chmod +x` and should be run directly.
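
On the last point, a script can tell whether it is being sourced or executed. A small bash-only sketch of the idea, assuming the `return`-in-a-subshell trick from the Stack Overflow answer referenced in `demo/00_test.sh`; the temporary script written here is illustrative only.

```bash
# write a tiny script that reports how it was invoked (bash only)
demo_script=$(mktemp)
cat >"$demo_script" <<'EOF'
# `return` at the top level of a file only succeeds when the file is sourced
if (return 0 2>/dev/null); then
  echo "sourced"
else
  echo "executed"
fi
EOF

ran=$(bash "$demo_script")             # run directly
src=$(bash -c "source '$demo_script'") # sourced into a shell
echo "$ran $src"                       # prints: executed sourced
rm -f "$demo_script"
```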

```bash
# DEMO: open a bash shell
# DEMO: change directory into the root of this repository.
# set up the software environment to use Docker for any binaries not on your system
source demo/00_software.sh
# check that everything is working (and pull Docker images)
demo/00_test.sh
# check that everything is working
demo/00_test.sh # don't source me!
# query the MWA TAP server with ADQL using the pyvo library
clear; demo/01_tap.sh
# display giant-squid commands to download observations
@@ -285,8 +305,8 @@ export obsid=1341914000
export metafits=${outdir}/${obsid}/raw/${obsid}.metafits
export prep_uvfits="${outdir}/${obsid}/prep/birli_${obsid}.uvfits"
export cal_ms="${outdir}/${obsid}/cal/hyp_cal_${obsid}.ms"
eval $python ${SCRIPT_BASE}/04_ssins.py $prep_uvfits
eval $python ${SCRIPT_BASE}/04_ssins.py $cal_ms
python ${SCRIPT_BASE}/04_ssins.py $prep_uvfits
python ${SCRIPT_BASE}/04_ssins.py $cal_ms

# combine them all into a single image
obsid="combined" cal_ms=$(ls -1d ${outdir}/*/cal/hyp_cal_*.ms ) demo/07_img.sh
@@ -304,16 +324,16 @@ multiple platforms using `docker buildx`.

```bash
# quick start: pull the images from dockerhub.
docker pull d3vnull0/mwa-demo:latest # on macos or linux arm64 (Apple M series), add --platform linux/arm64
docker pull mwatelescope/mwa-demo:latest # on macos or linux arm64 (Apple M series), add --platform linux/arm64

# if you have any issues, you can override the image with a fresh build on your local machine
# docker rmi d3vnull0/mwa-demo:latest
docker build -t d3vnull0/mwa-demo:latest -f Dockerfile .
# docker rmi mwatelescope/mwa-demo:latest
docker build -t mwatelescope/mwa-demo:latest -f Dockerfile .

# If you still encounter issues on macOS arm64 (Apple Silicon, M series),
# the same image is also available via Docker x86_64 emulation. Make sure to update
# your Docker Desktop to the latest version, as this features is relatively new.
docker pull --platform linux/amd64 d3vnull0/mwa-demo:latest
docker pull --platform linux/amd64 mwatelescope/mwa-demo:latest
```

Here's how to customize and build the image for multiple platforms and push to dockerhub
@@ -348,5 +368,11 @@ docker buildx build \
--push \
.

# DEV: docker buildx build --platform linux/amd64,linux/arm64 -t d3vnull0/mwa-demo:latest -f Dockerfile --push .
# DEV: docker buildx build --platform linux/amd64,linux/arm64 -t mwatelescope/mwa-demo:latest -f Dockerfile --push .
```

If you add extra raw files, you can add their checksums with

```bash
md5sum demo/data/*/raw/1*fits | tee demo_data.md5sum
```
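
The checksum file written by that command is what `md5sum -c` reads when `demo/00_test.sh` validates the raw files. A self-contained sketch of the round trip on a throwaway file, assuming GNU coreutils `md5sum` is installed; the temporary directory and the file name `1341914000_demo.fits` are illustrative stand-ins, not real demo data.

```bash
workdir=$(mktemp -d)
printf 'fake raw data\n' >"$workdir/1341914000_demo.fits"

# record checksums with paths relative to workdir
(cd "$workdir" && md5sum 1*fits | tee demo_data.md5sum)

# verification passes while the file is intact
if (cd "$workdir" && md5sum -c demo_data.md5sum); then verify_ok=yes; else verify_ok=no; fi

# and fails once the file changes
printf 'corrupted\n' >>"$workdir/1341914000_demo.fits"
if (cd "$workdir" && md5sum -c demo_data.md5sum >/dev/null 2>&1); then verify_bad=no; else verify_bad=yes; fi

echo "intact: $verify_ok, corruption detected: $verify_bad"
rm -rf "$workdir"
```

Because the recorded paths are relative, `md5sum -c` has to be run from the same directory the checksums were generated in.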
2 changes: 1 addition & 1 deletion demo/00_software.sh
@@ -1,7 +1,7 @@
#!/bin/bash

export docker_base="docker run --rm -it"
export docker_img=${docker_img:="d3vnull0/mwa-demo:latest"}
export docker_img=${docker_img:="mwatelescope/mwa-demo:latest"}

# silly hacks for Windows / Git Bash
if [[ -n "$OS" && "$OS" == "Windows_NT" ]]; then
60 changes: 40 additions & 20 deletions demo/00_test.sh
@@ -1,11 +1,19 @@
#!/bin/bash
# test the software dependencies via environment variables

# is this script being sourced? https://stackoverflow.com/a/28776166/565019
if (
[[ -n $ZSH_VERSION && $ZSH_EVAL_CONTEXT =~ :file$ ]] ||
[[ -n $BASH_VERSION ]] && (return 0 2>/dev/null)
); then
echo "this script is not intended for sourcing, run it with 'bash demo/00_test.sh'"
return 1
fi

# ### #
# ENV #
# ### #
# see: 00_env.sh

if [ -n "$ZSH_VERSION" ]; then ME="${0:A}"; else ME=$(realpath ${BASH_SOURCE:0}); fi
export SCRIPT_BASE=${SCRIPT_BASE:-$(dirname $ME)}
source $SCRIPT_BASE/00_env.sh
@@ -17,35 +25,47 @@ source $SCRIPT_BASE/00_env.sh
# we do this after the giant-squid check to avoid showing the api key in the logs
set -eu

# #### #
# DATA #
# #### #
# DEMO: check raw files are present
if command -v md5sum &>/dev/null; then
echo "validating raw files with md5sum"
md5sum -c demo_data.md5sum
else
echo "md5sum or md5 not found. couldn't validate raw files."
exit 1
fi

# #### #
# BINS #
# #### #
# DEMO: check software is installed
if ! eval $giant_squid --version; then
if ! giant-squid --version; then
echo "giant-squid not found. https://github.com/MWATelescope/giant-squid?tab=readme-ov-file#installation "
return 1
exit 1
fi
if ! eval $wsclean --version; then
if ! wsclean --version; then
echo "wsclean not found. https://wsclean.readthedocs.io/en/latest/installation.html "
return 1
exit 1
fi
if ! eval $hyperdrive --version; then
if ! hyperdrive --version; then
echo "hyperdrive not found. https://mwatelescope.github.io/mwa_hyperdrive/installation/intro.html "
return 1
exit 1
fi
if ! eval $jq --version; then
if ! jq --version; then
echo "jq not found. https://jqlang.github.io/jq/download/ "
return 1
exit 1
fi
if ! eval $python --version; then
if ! python --version; then
echo "python not found. https://www.python.org/downloads/ "
return 1
exit 1
fi

## ensure outdir exists
if [[ ! -d "$outdir" ]]; then
echo "outdir=$outdir does not exist. try mkdir -p $outdir"
return 1
exit 1
fi

# ####### #
@@ -57,7 +77,7 @@ if [[ $srclist =~ srclist_puma && ! -f "$srclist" ]]; then
curl -L -o $srclist "https://github.com/JLBLine/srclists/raw/master/${srclist##*/}"
fi
# DEMO: verify srclist
eval $hyperdrive srclist-verify $srclist
hyperdrive srclist-verify $srclist

# #### #
# BEAM #
@@ -68,7 +88,7 @@ if [[ ! -f "$MWA_BEAM_FILE" ]]; then
curl -L -o $MWA_BEAM_FILE "http://ws.mwatelescope.org/static/${MWA_BEAM_FILE##*/}"
fi
# DEMO: verify beam
eval $hyperdrive beam fee --output /dev/null
hyperdrive beam fee --output /dev/null

# ######## #
# OPTIONAL #
@@ -77,10 +97,10 @@ echo "recommended software, not on the critical path:"
set +eux

## verify ASVO API Key
eval $giant_squid list >/dev/null
giant-squid list >/dev/null

# DEMO: check wsclean features
eval $wsclean --version | tee .wsclean_version
wsclean --version | tee .wsclean_version
while IFS='|' read -r feature details; do
if ! grep -q "${feature} is available" .wsclean_version; then
echo "warning: wsclean $feature not found. recompile wsclean after installing $feature"
@@ -93,14 +113,14 @@ EoF
rm .wsclean_version

# DEMO: check python version
if ! eval $python -c $'"assert __import__(\'sys\').version_info>=(3,8)"' >/dev/null; then
if ! python -c $'"assert __import__(\'sys\').version_info>=(3,8)"' >/dev/null; then
echo "warning: python version 3.8+ not found https://www.python.org/downloads/ "
eval $python --version
python --version
fi

# DEMO: check python packages
while IFS='|' read -r package details; do
if ! eval $python -m pip show $package >/dev/null; then
if ! eval "$python -m pip show $package >/dev/null"; then
echo "recommended: python package $package not found."
echo details: $details
fi
@@ -112,7 +132,7 @@ mwa_qa| git clone https://github.com/d3v-null/mwa_qa.git ; pip install mwa_qa
EoF

# DEMO: is the MWA TAP server accessible?
if ! eval $python -c $'"assert len(__import__(\'pyvo\').dal.TAPService(\'http://vo.mwatelescope.org/mwa_asvo/tap\').tables)"'; then
if ! python -c $'"assert len(__import__(\'pyvo\').dal.TAPService(\'http://vo.mwatelescope.org/mwa_asvo/tap\').tables)"'; then
echo "warning: MWA TAP inaccessible. https://wiki.mwatelescope.org/display/MP/MWA+TAP+Service "
fi

2 changes: 1 addition & 1 deletion demo/01_tap.sh
@@ -17,4 +17,4 @@ export details_csv=${details_csv:-"${outdir}/details.csv"}
# ### #
# query the MWA TAP server with ADQL using the pyvo library
# details: https://mwatelescope.atlassian.net/wiki/spaces/MP/pages/24970532/MWA+ASVO+VO+Services
eval $python $SCRIPT_BASE/01_tap.py $obsids_csv $details_csv
python $SCRIPT_BASE/01_tap.py $obsids_csv $details_csv
14 changes: 7 additions & 7 deletions demo/02_download.sh
@@ -31,19 +31,19 @@ fi

echo "you should run one of the following commands manually:"
echo " -> request raw visibilities"
echo $giant_squid submit-vis $obsids_csv
echo "giant_squid submit-vis $obsids_csv"
echo " -> (or) submit preprocessed visibility conversion jobs (uvfits or ms)"
echo $giant_squid submit-conv $obsids_csv -p output=uvfits,avg_freq_res=40,avg_time_res=2,flag_edge_width=80
echo "giant_squid submit-conv $obsids_csv -p output=uvfits,avg_freq_res=40,avg_time_res=2,flag_edge_width=80"

echo " -> get human-readable list of jobs that are queued or processing"
echo $giant_squid list $obsids_csv
echo "giant_squid list $obsids_csv"

echo " -> (advanced) get a machine readable list of jobs that are queued or processing"
echo $giant_squid list $obsids_csv --states queued,processing --json >$outdir/jobs.json
echo $jq -r $'\'[.[]|.jobId]|join(" ")\'' $outdir/jobs.json
echo "giant_squid list $obsids_csv --states queued,processing --json >$outdir/jobs.json"
echo "jq -r '[.[]|.jobId]|join(\" \")' $outdir/jobs.json"

echo " -> wait until a job is ready"
echo $giant_squid wait $obsids_csv
echo "giant_squid wait $obsids_csv"

echo " -> download a job"
echo $giant_squid download jobid
echo "giant_squid download jobid"
2 changes: 1 addition & 1 deletion demo/03_mwalib.sh
@@ -33,7 +33,7 @@ if [[ ! -f "$metafits" ]]; then
fi

# get channel and antenna info
eval $python ${SCRIPT_BASE}/03_mwalib.py $metafits
python ${SCRIPT_BASE}/03_mwalib.py $metafits

# DEMO: antenna info

2 changes: 1 addition & 1 deletion demo/04_ssins.sh
@@ -34,6 +34,6 @@ fi
# DEMO: use SSINS (sky-subtracted incoherent noise spectra) to identify RFI
# - top plots are baseline-averaged auto amplitudes, differenced in time
# - bottom plots are z-score: (subtract mean of each frequency, divide by std dev)
eval $python "${SCRIPT_BASE}/04_ssins.py" "$metafits" "$raw_glob"
python "${SCRIPT_BASE}/04_ssins.py" "$metafits" $raw_glob

# DEMO: SSINS can also be used to generate RFI flag files, but this out of scope
8 changes: 4 additions & 4 deletions demo/05_prep.sh
@@ -46,7 +46,7 @@ export prepqa="${prep_uvfits%%.uvfits}_qa.json"

set -eu
if [[ ! -f $prep_uvfits ]]; then
eval $birli ${birli_args:-} \
birli ${birli_args:-} \
-m "${metafits}" \
$([[ -n "${edgewidth_khz:-}" ]] && echo "--flag-edge-width ${edgewidth_khz}") \
$([[ -n "${freqres_khz:-}" ]] && echo "--avg-freq-res ${freqres_khz}") \
@@ -62,16 +62,16 @@ fi
# details: https://github.com/d3v-null/mwa_qa (my fork of https://github.com/Chuneeta/mwa_qa/ )

if [[ ! -f "$prepqa" ]]; then
eval $run_prepqa $prep_uvfits $metafits --out $prepqa
run_prepvisqa.py $prep_uvfits $metafits --out $prepqa
fi

# DEMO: extract bad antennas from prepqa json with jq
# - both of the provided observations pass QA, so no bad antennas are reported
prep_bad_ants=$(eval $jq -r $'\'.BAD_ANTS|join(" ")\'' $prepqa)
prep_bad_ants=$(jq -r $'.BAD_ANTS|join(" ")' $prepqa)

# DEMO: plot the prep qa results
# - RMS plot: RMS of all autocorrelation values for each antenna
# - zscore:
eval $plot_prepqa $prepqa --save --out ${prep_uvfits%%.uvfits}
plot_prepvisqa.py $prepqa --save --out ${prep_uvfits%%.uvfits}

echo $obsid $prep_bad_ants