From 5c46fb2e6d02bbf355b51c303514654dd118ffa7 Mon Sep 17 00:00:00 2001
From: fraser-combe
Date: Tue, 19 Nov 2024 10:37:49 -0600
Subject: [PATCH] Update code contribution guidelines

---
 docs/contributing/code_contribution.md | 212 ++++++++++++++++++-------
 1 file changed, 151 insertions(+), 61 deletions(-)

diff --git a/docs/contributing/code_contribution.md b/docs/contributing/code_contribution.md
index cb7ba5727..3483bf1c2 100644
--- a/docs/contributing/code_contribution.md
+++ b/docs/contributing/code_contribution.md
@@ -8,8 +8,10 @@ Style guide inspired by Scott Frazer’s [WDL Best Practices Style Guide](http
 
 ## General Guidelines
 
-- Put tasks and workflows in separate files in the appropriate folders.
-- Always add a description as metadata
+***Modularity and Metadata***
+
+- **Best Practice:** Place tasks and workflows in separate files to maintain modularity and clarity.
+- **Add a `meta` block** to every task and workflow to provide a brief description of its purpose.
 
     ```bash
     meta {
@@ -17,113 +19,145 @@ Style guide inspired by Scott Frazer’s [WDL Best Practices Style Guide](http
     }
     ```
 
-- Ensure that the docker container is locked to a given version, not `latest`
+***Docker Containers***
+
+- Use a specific Docker container version instead of 'latest' to ensure reproducibility and prevent unexpected changes in container behavior.
 
     ```bash
     String docker = "quay.io/docker_image:version"
     ```
 
 - Preferentially use containers [`Google's Artifact Registry`](https://console.cloud.google.com/artifacts/docker/general-theiagen/us) rather than those from [`quay.io`](http://quay.io) or [`dockerhub`](https://hub.docker.com/)
-- Use 2-space indents (no tabs)
+
+***Indentation and Whitespace***
+
+- Use 2-space indentation for all blocks.
Avoid using tabs to ensure uniform formatting across editors:
 
     ```bash
     # perform action
-    if [ this ]; then
-      action1(variable)
+    if [ condition ]; then
+      perform_action(variable)
     fi
     ```
 
-- Do not use line break for opening braces
+- Use a single space when defining variables (`this = that` *not* `this= that` (unless a bash variable where `this=that` is required))
+
+***Bracket and Spacing Conventions***
+
+- Avoid line breaks for opening braces. Keep them on the same line as the declaration. i.e `input {` instead of `input\n{`
+
+    ```bash
+    # Correct
+    input {
+      String input_variable
+    }
+
+    # Incorrect
+    input
+    {
+      String input_variable
+    }
+    ```
+
 - Use single space when defining input/output variables & runtime attributes (`output {` instead of `output{`)
-- Use single-line breaks between non-intended constructs
-- Enclose task commands with triple angle brackets (`<<< ... >>>`)
-- Consistently use white space with variables (`this = that` *not* `this= that` (unless a bash variable where `this=that` is required))
+- Separate non-indented constructs (like input and output sections) with a single-line break for readability.
+
+***Command Block Syntax***
+
+- Enclose command blocks in triple angle brackets (<<< ... >>>) for consistency and easier handling of multi-line scripts. It also avoids issues with unescaped special characters in the command block:
+
+    ```bash
+    command <<<
+      tool --input ~{input} --output ~{output}
+    >>>
+    ```
 
 ## Task Blocks
 
-The task should contain the following sections. Include _single_ spaces between input, command, output, and runtime closing and opening curly brackets.
+A WDL task block defines a discrete, reusable step in a workflow. To ensure readability and consistency, follow these conventions when writing task blocks. Include single spaces between the input, command, output, and runtime sections and their enclosing curly brackets.
 ```bash
-input {
+task example_task {
+  input {
 
-}
-command <<<
-
->>>
-output {
+  }
+  command <<<
+
+  >>>
+  output {
 
-}
-runtime {
+  }
+  runtime {
 
+  }
 }
 ```
 
 ??? toggle "`input` block"
     - The following conventions are used to expose docker, CPU, memory, and disk size
 
-        ```bash
-        input {
-          String docker = "..."
-          Int cpu = x
-          Int memory = y
-          Int disk_size = z
-        }
-        ```
-
-    - If additional arguments should be allowed to be passed to the task, this input should follow the convention below:
-
+        ```bash
+        input {
+          String docker = "quay.io/example:1.0.0" # Docker container for the task
+          Int cpu = 4 # Number of CPUs
+          Int memory = 16 # Memory in GB
+          Int disk_size = 100 # Disk space in GB
+        }
+        ```
+
+    - If the task accepts additional optional parameters, include an args field.
 
         ```bash
         input {
           String args = ""
         }
         ```
 
     - Input and output lists should not be formatted to have the equal sign aligned, but instead use a single space before and after the `=`
 
         ```bash
-        output1_x = string1
-        output2_that_does_y = string2
+        correct_output = "output_file"
+        long_variable_name = "long_file_name"
         ```
 
-    - Ensure the docker container is exposed as an input and as an output string
+    - Expose Docker as an input and runtime variable:
 
         ```bash
         input {
-          String docker = ""
+          String docker = "quay.io/example:1.0.0"
         }
         ...
         output {
-          String XX_docker = docker
+          String used_docker = docker
         }
         runtime {
-          docker: docker
+          docker: docker
         }
         ```
 
 ??? toggle "`command` block"
     - Ensure use of line breaks between different sections of code to improve readability
-
+
         ```bash
-        # if this, perform action 1
-        if [ this ]; then
+        # Perform task step 1
+        if [ condition ]; then
           action1(variable)
         fi
 
-        # if that, perform action 2
-        if [ that ]; then
+        # Perform task step 2
+        if [ another_condition ]; then
           action2(variable)
         fi
         ```
 
     - Split command calls into multiple lines if they have user input variables and/or if the length of the command is very long to avoid text wrapping and/or side-scrolling, e.g.
-        - Use indentation as appropriate
+        - Use backslashes for continuation and indentation to clarify structure:
 
         ```bash
           tool \
+            --input ~{input_file} \
+            --output ~{output_file} \
             --option1 ~{option1} \
-            --option2 ~{option2} \
             ...
-            --option999 ~{option999}
+            --optionN ~{optionN}
         ```
 
     - Add comments that
@@ -137,41 +171,97 @@ runtime {
         ```
 
 ??? toggle "`output` block"
-    - File types should be clearly stated in the output name variables
+    - The output block specifies the files or variables produced by the task. Follow these conventions:
 
-        ```bash
-        output1_csv = file1.csv
-        output2_tsv = file2.tsv
-        ```
+        ```bash
+        output {
+          File result_csv = "output.csv" # CSV file generated
+          File result_log = "log.txt" # Log file
+          String
+        }
+        ```
 
     - Ensure the docker container is exposed as an output string, e.g.
 
         ```bash
         input {
-          String docker
+          String docker = "us-docker.pkg.dev/general-theiagen/tool:version"
         }
         ...
         output {
-          String XX_docker = docker
+          String XX_docker = docker
         }
         runtime {
-          docker: docker
+          docker: docker
         }
         ```
 
 ??? toggle "`runtime` block"
-    - Always use a docker container
+    - The runtime block defines the compute resources and environment for the task.
+    - Always specify a Docker:
+
+        ```bash
+        runtime {
+          docker: docker
+          cpu: cpu
+          memory: memory
+          disk: disk_size
+        }
+        ```
 
 ## Workflow Blocks
 
-The workflow/sub-workflow file should contain:
+A WDL workflow block orchestrates the execution of tasks and subworkflows. It defines the inputs, calls tasks or subworkflows, and specifies the final outputs. To ensure readability and consistency, follow these conventions when writing workflow blocks:
+
+### General Guidelines
+
+- Include a block of `import` statements (sorted in alphabetical order).
+    - When a workflow imports a task, ensure it is imported under a unique name to avoid conflicts.
+
+```bash
+import "../tasks/task_task1.wdl" as task1_task
+import "../tasks/task_task2.wdl" as task2_task
+```
+
+- A `workflow` block with:
+
+- An `input` section:
+
+    ```bash
+    input {
+      String input
+      String task1_docker = "us-docker.pkg.dev/general-theiagen/tool:version"
+      String? task1_optional_argument
+    }
+    ```
+
+- `call` sections for specified tasks:
+
+    ```bash
+    call task1_task.task1 {
+      input:
+        input = input,
+        docker = task1_docker
+    }
+    ```
+
+- An `output` section:
+
+    - Define all workflow outputs in this section.
+    - Use descriptive names for each output variable.
+
+    ```bash
+    output {
+      # Task 1 outputs
+      File task1_out_csv = task1.output_csv
+      String task1_version = task1.version
+
+      # Subworkflow outputs
+      File subworkflow_out_tsv = subworkflow.task3_out_tsv
+      String subworkflow_version = subworkflow.task3_version
+    }
+    ```
 
-- a block of `import` statements (alphabetical order),
-    - When a workflow imports a task, make sure that it is imported under a different name than the task it is calling
-- a `workflow` block with
-    - an `input` section
-    - `call` sections for specified tasks
-    - an `output` section
 
 Example formatting is shown below.