## DADA2

Before we begin, there's one thing we need to download: the formatted SILVA database that DADA2 uses to assign taxonomy.

We also need to tell RStudio which version of R to use. Thankfully, the IT department has already set up a specific version of R with DADA2 and most of the necessary dependencies installed, which saves us a lot of time. Run the following commands to move to the data directory, point RStudio at the correct version of R, download the database, and then start RStudio:


    $ cd /bioinf/home/your_username/marmic_NGS2020/day_1/demultiplexed
    $ export RSTUDIO_WHICH_R=/opt/software/R/R-4.0.3/bin/R
    $ wget https://zenodo.org/record/3986799/files/silva_nr99_v138_train_set.fa.gz
    $ rstudio

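Before launching RStudio, it can be worth confirming that the R binary is where we expect it and that the database download completed. The following is a minimal sketch, not part of the original workflow: the helper names `check_r_binary` and `check_gzip_file` are hypothetical, and the paths are simply the ones used in the commands above.

```shell
#!/usr/bin/env bash
# Sketch only: check_r_binary and check_gzip_file are hypothetical helpers,
# not part of DADA2, R, or RStudio.

# Succeeds if the path is executable and is not a directory.
check_r_binary() {
    [ -x "$1" ] && [ ! -d "$1" ]
}

# Succeeds if the file is non-empty and passes gzip's built-in integrity test.
check_gzip_file() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Paths taken from the commands in this section.
r_bin="${RSTUDIO_WHICH_R:-/opt/software/R/R-4.0.3/bin/R}"
silva_db="silva_nr99_v138_train_set.fa.gz"

if check_r_binary "$r_bin"; then
    echo "R binary found: $r_bin"
else
    echo "R binary missing or not executable: $r_bin"
fi

if check_gzip_file "$silva_db"; then
    echo "Database archive looks intact: $silva_db"
else
    echo "Database archive missing or corrupt: $silva_db"
fi
```

A truncated download usually fails `gzip -t`, in which case re-running the `wget` command is the simplest fix.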
### 3.1 Processing our data with DADA2

Now to get going with DADA2. DADA2 is a large software package with an integrated pipeline that takes raw reads (minus primers and barcodes) as input. It also needs a lot of dependencies, so to avoid wasting several hours installing it, you'll be using a version of R that already has everything installed for this practical.

To get started, we need R running with its working directory set to the location of our 16S rRNA gene sequence data. Start R:

    $ /home/tfrancis/miniconda3/bin/R

And if you're not already there, go to the data:


    > setwd("/bioinf/home/your_username/marmic_NGS2021/day_1/demultiplexed")

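If `setwd()` succeeds but later steps find no reads, the directory may not contain what you expect. As a minimal sketch, run from a separate terminal rather than the R console, the following counts the FASTQ files in a directory. The helper name `count_fastq` is hypothetical, and the list of file extensions is an assumption about how the demultiplexed reads are named.

```shell
#!/usr/bin/env bash
# Sketch only: count_fastq is a hypothetical helper, and the list of
# extensions is an assumption about how the read files are named.

# Print the number of FASTQ-like regular files directly inside the given
# directory (subdirectories and symlinked files are not counted).
count_fastq() {
    find "$1" -maxdepth 1 -type f \
        \( -name '*.fastq' -o -name '*.fastq.gz' \
           -o -name '*.fq' -o -name '*.fq.gz' \) | wc -l | tr -d ' '
}

# Run from the directory you pass to setwd(), or give its path instead of ".".
n=$(count_fastq .)
echo "Found $n FASTQ file(s) in $(pwd)"
```

A count of zero suggests either the wrong directory or a different file extension than the ones assumed here.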
Then load DADA2 and some necessary dependencies for producing plots:


    > library(dada2)

The first steps of the process just prepare the files and file names. Look at the commands and the objects they produce, and figure out what each one is doing.


    > path <- "."

...