Optional inputs for DSL2 #1694

@illusional

New feature

Pinging @rsuchecki @pditommaso (I couldn't find an issue with this, I hope it's okay that I open a new one).

Based on a short conversation on Gitter, there's interest (a lot of it from me) in more direct support for optional inputs. This seems in line with the DSL2 goal of producing reusable tool modules / interfaces.

Other workflow specifications have the concept of tool wrappers, which aim to be "write once, use in all of your workflows". This means the tool wrapper contains most (if not all) of the available configuration options, from which the command line is dynamically constructed. This allows the community to build and contribute high-quality tool wrappers, for example the Common Workflow Library (CWLibrary#fastqc) and BioWDL (BioWDL#fastqc), which other users can then use directly or upload to stores like Dockstore or the Galaxy toolshed.

Projects like aCLImatise aim to generate tool wrappers automatically, as writing them is usually one of the most time-consuming parts of building workflows.

The DSL2 makes good strides towards this, and a stronger concept for optional inputs would take this further.

Relevant discussion:

Command line construction sidenote

I think it would be a bad idea to create a new syntax for building or interpolating command lines; instead, tool developers could use the Groovy environment to build the string for each command option.
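To make the sidenote concrete, here is a minimal sketch in plain Groovy (not Nextflow-specific) of building a command line from optional parts. The helper names `opt` and `buildFastqcCommand` and the file names are hypothetical and only serve the illustration; the `-c` flag is the one from the fastqc usage in this issue.

```groovy
// Hypothetical helper: render one optional flag, or nothing when the value is absent.
String opt(String flag, Object value) {
    // null, empty strings and empty collections are all falsy in Groovy
    value ? "${flag} ${value}" : ''
}

// Hypothetical helper: assemble the fastqc command from optional and required parts.
String buildFastqcCommand(Map options, List reads) {
    def parts = ['fastqc']
    parts << opt('-c', options.contaminant)
    parts.addAll(reads.collect { it.toString() })
    // drop the empty strings left behind by unset options
    parts.findAll { it }.join(' ')
}

println buildFastqcCommand([:], ['a.fastq', 'b.fastq'])
println buildFastqcCommand([contaminant: 'contam.txt'], ['a.fastq'])
```

The point is only that ordinary Groovy truthiness and string interpolation are enough here, so no dedicated command-line templating syntax would be needed.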

Usage scenario

Consider fastqc (eg: nf-core module definition), which might have the (simplified) command structure:

fastqc \
    [-c contaminant file] \
    [ ... other config options ] \
    seqfile1 .. seqfileN

I could build a process definition to encapsulate these ways to optionally configure the tool.

This process definition is purely hypothetical; it is just one way I could think of to do it.

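process FASTQC {
    input:
        // Optional[...] is hypothetical syntax, not something Nextflow supports today
        tuple val(name),
        Optional[path(contaminant)],
        path(reads)

    output:
        path("*.zip"), emit: zip

    script:
    // only emit the flag when a contaminant file was actually provided
    def contaminant_script = (contaminant != null) ? "--contaminant ${contaminant}" : ""
    // reads may hold several files; pass them all as positional arguments
    def reads_script = reads.join(' ')
    """
    fastqc \
        ${contaminant_script} \
        ${reads_script}
    """
}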

But usage of imported modules in DSL2 in a workflow requires positional arguments, so you would have something like:

include { FASTQC as fastqc } from './tools/fastqc'

workflow {
    fastqc(params.name, null, params.reads)
}

Suggested implementation

As @rsuchecki noted on Gitter:

Things are very flexible for val inputs, but understandably get more complex when files/paths are involved as they need to be staged. Tuples are nice and keep things organised but are still an extension of the same idea of positional inputs.

I'd hope to avoid the use of positional arguments, because the call site gives no context for each value: in fastqc(params.name, null, params.reads), nothing tells the reader what the null stands for.
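As a point of comparison, plain Groovy already supports named arguments through a `Map` parameter. The sketch below is ordinary Groovy, not existing or proposed Nextflow process-invocation syntax, and the `fastqc` method here is a hypothetical stand-in for a tool wrapper:

```groovy
// Hypothetical stand-in for a tool wrapper; a Map parameter collects named arguments.
String fastqc(Map args) {
    assert args.reads : 'reads is required'
    // the optional input is simply left out of the call when it is not needed
    def contaminant = args.contaminant ? "-c ${args.contaminant} " : ''
    "fastqc ${contaminant}${args.reads.join(' ')}"
}

// Each value is labelled at the call site, and no null placeholder is needed.
println fastqc(reads: ['a.fastq', 'b.fastq'])
println fastqc(contaminant: 'contam.txt', reads: ['a.fastq'])
```

Something along these lines for module calls would let an optional input be omitted instead of being passed as a positional null.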
