Influenza
Omnifluss can perform genome reconstruction of Influenza viruses using Illumina short-read NGS data.
Usage
Basic usage of omnifluss to reconstruct virus genomes:
nextflow run rki-mf1/omnifluss \
-profile singularity \
--input samplesheet.csv \
--outdir results
This command launches a basic omnifluss run with samples from the sample sheet, tasks executed within Singularity containers, and results stored in an output folder called results. We configured and optimised many settings and parameters to reconstruct Influenza virus genomes from Illumina paired-end (PE) short-read data. These configurations can be added to the basic omnifluss run via an additional profile:
nextflow run rki-mf1/omnifluss \
-profile singularity,INV_illumina \
--input samplesheet.csv \
--outdir results
See the Output chapter for the documentation of omnifluss' outputs.
Remember that the commands above use your last cached version of omnifluss (see Updating the pipeline).
If you would like to run omnifluss at a specific release version, use the -r option of Nextflow:
nextflow run rki-mf1/omnifluss \
-profile singularity,INV_illumina \
--input samplesheet.csv \
--outdir results \
-r v0.2.0
Further, you can resume an interrupted or failed pipeline run via -resume (see resume).
Updating the pipeline
When you run the commands above, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. On subsequent runs, Nextflow uses the cached version by default if available, even if the pipeline has been updated on the developers' side. To make sure that you are running the latest version of omnifluss, you can manually update the cached version of the pipeline via:
nextflow pull rki-mf1/omnifluss
Again, you can add -r for a specific version:
nextflow pull -r v0.2.0 rki-mf1/omnifluss
Reproducibility
To ensure that a specific version of the pipeline is used when running it, you can specify the pipeline version.
If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code afterwards.
You can visit the release page at rki-mf1/omnifluss releases and find the latest pipeline version at the top of the website.
When running the pipeline, specify the version with -r (one hyphen), e.g. -r v0.2.1.
You can switch to another version at any time by changing the version tag after the -r flag.
The version of the release will be written to the Nextflow log for reproducibility.
Inputs
Sample sheet
The samples to be analysed are provided to omnifluss via a sample sheet (.csv) using the --input parameter.
It specifies the raw read sequence data files (.fastq) used by omnifluss.
For instance:
sample,fastq_1,fastq_2
INV_ILL_NB1,/path/to/experiment_NB1_R1.fastq.gz,/path/to/experiment_NB1_R2.fastq.gz
INV_ILL_NB2,/path/to/experiment_NB2_R1.fastq.gz,/path/to/experiment_NB2_R2.fastq.gz
INV_ILL_NB3,/path/to/experiment_NB3_R1.fastq.gz,/path/to/experiment_NB3_R2.fastq.gz
which corresponds to the following table:
sample | fastq_1 | fastq_2 |
---|---|---|
INV_ILL_NB1 | /path/to/experiment_NB1_R1.fastq.gz | /path/to/experiment_NB1_R2.fastq.gz |
INV_ILL_NB2 | /path/to/experiment_NB2_R1.fastq.gz | /path/to/experiment_NB2_R2.fastq.gz |
INV_ILL_NB3 | /path/to/experiment_NB3_R1.fastq.gz | /path/to/experiment_NB3_R2.fastq.gz |
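For many samples, writing the sample sheet by hand gets tedious. The following is a minimal sketch of how it could be generated, not something shipped with omnifluss: the function name make_samplesheet is hypothetical, and it assumes that all reads sit in one directory and follow the naming scheme <sample>_R1.fastq.gz / <sample>_R2.fastq.gz.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of omnifluss): build a sample sheet from a
# directory of paired FASTQ files named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz.
make_samplesheet() {
    local dir="$1" r1 r2 sample
    echo "sample,fastq_1,fastq_2"
    for r1 in "$dir"/*_R1.fastq.gz; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$r1" ] || continue
        # Derive the expected path of the R2 mate from the R1 path.
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "warning: no R2 mate for $r1, skipping" >&2
            continue
        fi
        # Use the file name without the _R1.fastq.gz suffix as the sample name.
        sample="$(basename "$r1" _R1.fastq.gz)"
        echo "$sample,$r1,$r2"
    done
}
```

Call it with an absolute directory so that the sheet contains full paths, for example make_samplesheet /path/to/fastq_dir > samplesheet.csv.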
The argument parser auto-detects the sample and paired-end information provided by the sample sheet. The sample sheet can have as many columns as desired; however, only the first three columns are required, and they have to match the definition table below.
Column | Description |
---|---|
sample | Custom sample name. This entry might be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (_). |
fastq_1 | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
fastq_2 | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
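Before launching a run, it can help to check a sample sheet against the column definitions above. The sketch below is one way to do that and is not part of omnifluss: the function name check_samplesheet is hypothetical, it assumes plain comma-separated values without quoted fields, and it only checks the header, the sample name, and the file extensions (not whether the files exist).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of omnifluss): sanity-check a sample sheet.
# Assumes plain comma-separated values without quoted fields.
check_samplesheet() {
    awk -F',' '
        BEGIN { bad = 0 }
        NR == 1 {
            # The first three columns must be named as in the definition table.
            if ($1 != "sample" || $2 != "fastq_1" || $3 != "fastq_2") {
                print "line 1: header must start with sample,fastq_1,fastq_2"
                bad = 1
            }
            next
        }
        /^[[:space:]]*$/ { next }   # skip empty lines
        {
            if ($1 == "") { print "line " NR ": empty sample name"; bad = 1 }
            # Both read files must be gzipped FASTQ files.
            for (i = 2; i <= 3; i++) {
                if ($i !~ /\.(fastq|fq)\.gz$/) {
                    print "line " NR ": column " i " must end in .fastq.gz or .fq.gz"
                    bad = 1
                }
            }
        }
        END { exit bad }            # non-zero exit status if any check failed
    ' "$1"
}
```

For example, check_samplesheet samplesheet.csv && echo "sample sheet looks OK" prints the confirmation only if no problem was reported.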
Segment database
\<WIP>
Kraken database
\<WIP>
Adapter file
You can specify a plain FASTA file for adapter clipping, e.g. for the Illumina Nextera Transposase adapters:
>Illumina Nextera Transposase adapter fwd
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG
>Illumina Nextera Transposase adapter rev
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG
Parameters
Note: The documentation of pipeline parameters is generated automatically from the pipeline schema. Options that are part of Nextflow itself use a single hyphen, whereas pipeline parameters use a double hyphen.
Besides the crucial parameters explained in Inputs, various parameters can be fine-tuned throughout the workflow.
You can find the full list of parameters via nextflow run rki-mf1/omnifluss -r <release-tag> --help.
To keep the omnifluss run command short and to save typing when launching repeated runs, you can provide pipeline parameters in JSON or YAML format via -params-file <file>.
Warning: Do not use -c <file> to specify pipeline parameters, as this will result in errors! Custom config files specified with -c must only be used for tuning process resource specifications or module arguments (args).
For instance, the basic use case of omnifluss above can be specified with a params file in YAML format:
nextflow run rki-mf1/omnifluss -profile singularity -params-file params.yaml
with:
input: 'samplesheet.csv'
outdir: 'results'
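Since -params-file also accepts JSON, the same two parameters could be written as a JSON file instead. This is only a sketch: the file name params.json is an arbitrary choice, and the run command is shown as a comment because it needs a working Nextflow installation.

```shell
# Write the parameters of the basic run to a JSON file (file name is arbitrary).
cat > params.json <<'EOF'
{
    "input": "samplesheet.csv",
    "outdir": "results"
}
EOF

# Then pass it to Nextflow in place of params.yaml:
# nextflow run rki-mf1/omnifluss -profile singularity -params-file params.json
```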
resume
Specify -resume when restarting a pipeline. Nextflow will use cached results from any pipeline steps where the inputs are the same, continuing from where it got to previously. For input to be considered the same, not only the names must be identical but the files' contents as well. For more info about this parameter, see this blog post.
You can also supply a run name to resume a specific run: -resume [run-name].
Use the nextflow log command to show previous run names.
Output
After a successful run, the pipeline creates the following files and folders in your working directory:
work # Directory containing the Nextflow working files
<outdir> # Results in specified location (defined with --outdir)
.nextflow.log # Log file from Nextflow
\<WIP>