New feature
Pinging @rsuchecki @pditommaso (I couldn't find an existing issue for this, so I hope it's okay that I open a new one).
Based on a short conversation on Gitter (1 - primary | 2), there's interest (a lot of it from me) in having more direct support for optional inputs. This seems in line with the goal of DSL2 to produce reusable tool modules / interfaces.
Other workflow specifications have the concept of tool wrappers, which aim to be "write once, use in all of your workflows". This means the tool wrapper contains most (if not all) of the available configuration options, from which the command line is dynamically constructed. This allows the community to build and contribute high-quality tool wrappers, for example: Common Workflow Library (CWLibrary#fastqc) and BioWDL (BioWDL#fastqc), with the tools available for other users to use, or to upload to stores like Dockstore or the Galaxy toolshed.
Projects like aCLImatise aim to generate tool wrappers automatically, as writing them is usually one of the most time-consuming parts of building workflows.
The DSL2 makes good strides towards this, and a stronger concept for optional inputs would take this further.
Relevant discussion:
Command line construction sidenote
I think it would be a bad idea to create a new syntax for building or interpolating command lines. Instead, tool developers could use the Groovy environment to build the string for each command option.
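As a minimal sketch of that idea in plain Groovy, each option can be rendered to a string fragment that is empty when the value is absent. The helper names `optionalFlag` and `buildCommand` and the file names are hypothetical and not part of Nextflow:

```groovy
// Hypothetical helper: render "flag value" only when a value is present.
String optionalFlag(String flag, Object value) {
    // Only null counts as "absent"; 0 or false would still be rendered.
    return value == null ? '' : "${flag} ${value}".toString()
}

// Hypothetical helper: drop empty fragments and join the rest into one command line.
String buildCommand(String tool, List<String> fragments) {
    return ([tool] + fragments).findAll { it }.join(' ')
}

def withFile = buildCommand('fastqc', [optionalFlag('-c', 'contaminants.txt'), 'a.fastq', 'b.fastq'])
def withoutFile = buildCommand('fastqc', [optionalFlag('-c', null), 'a.fastq', 'b.fastq'])
```

Because the absent option collapses to an empty string and is filtered out, the script template never has to branch on which options were supplied.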
Usage scenario
Consider fastqc (e.g. the nf-core module definition), which might have the following (simplified) command structure:
fastqc \
    [-c contaminant file] \
    [ ... other config options ] \
    seqfile1 .. seqfileN
I could build a process definition that encapsulates these optional ways of configuring the tool.
The process definition below is purely hypothetical; it is just one way I could think of to do it.
process FASTQC {
    input:
    // Hypothetical syntax: Optional[...] marks an input that may be absent
    tuple val(name),
        Optional[path(contaminant)],
        path(reads)

    output:
    path("*.zip"), emit: zip

    script:
    // Emit the flag only when a contaminant file was supplied
    contaminant_script = (contaminant != null) ? "--contaminants ${contaminant}" : ""
    reads_script = reads.join(' ')
    """
    fastqc \
        ${contaminant_script} \
        ${reads_script}
    """
}
However, calling an imported module in a DSL2 workflow requires positional arguments, so the absent input still has to be passed explicitly and you would end up with something like:
include { FASTQC as fastqc } from './tools/fastqc'

workflow {
    fastqc(params.name, null, params.reads)
}
Suggest implementation
As @rsuchecki noted on Gitter:
Things are very flexible for val inputs, but understandably get more complex when files/paths are involved as they need to be staged. Tuples are nice and keep things organised but are still an extension of the same idea of positional inputs.
I'd hope to avoid positional arguments, because a bare value at the call site (such as the null above) gives no context about which input it refers to.
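To illustrate the difference, plain Groovy methods can already take named arguments through a leading Map parameter. The sketch below is only an analogy for what a named-input call could look like: `fastqcCommand` and the file names are hypothetical, and this is not how Nextflow process calls work today.

```groovy
// Hypothetical function, not a Nextflow API: Groovy collects the
// `key: value` pairs at the call site into the Map parameter.
String fastqcCommand(Map args) {
    def reads = args.reads as List
    // An omitted key simply yields null, so no placeholder argument is needed.
    def contaminant = args.contaminant
    def parts = ['fastqc']
    if (contaminant != null) {
        parts << "-c ${contaminant}".toString()
    }
    parts.addAll(reads)
    return parts.join(' ')
}

// Each value is labelled at the call site; the optional one is just left out.
def named = fastqcCommand(reads: ['a.fastq', 'b.fastq'])
def namedWithFile = fastqcCommand(contaminant: 'contaminants.txt', reads: ['a.fastq'])
```

With named inputs, the order of arguments stops mattering and a reader can tell what each value is for without opening the module definition.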