... | ... | @@ -8,7 +8,7 @@ Before we begin, there's one thing we need to download, which is the formatted s |
|
|
### 3.1 Processing our data with DADA2
|
|
|
Now to get going with DADA2. DADA2 is a very large software with an entire integrated pipeline that takes raw reads (minus primers and barcodes) as input. It also needs lots of dependencies to do all of its things. So in order to avoid wasting several hours installing it, I have a version of R with everything installed that you'll be using for this practical.
|
|
|
|
|
|
$ /home/tfrancis/R-3.6.2/bin/R
|
|
|
$ /home/tfrancis/miniconda3/bin/R
|
|
|
|
|
|
and if you're not already there, go to the data:
|
|
|
|
... | ... | @@ -30,18 +30,11 @@ What are fnFs and fnRs? Think about why we might want to make lists of files. (R |
|
|
|
|
|
This one's a bit funny syntactically. Use the `?` to try and figure out what sapply does. The oddest bit is that single square bracket `[`. In R, the square bracket is actually a function much like `+` and `-`! Here it's the subset function.
|
|
|
|
|
|
Now we have our file names, we can look at the quality values of the reads. Like I said at the beginning, installation of DADA2 wasn't easy, and so in this R installation we don't get plots appearing on screen, we have to write them to file and look at them separately. I know. So we're going to create a pdf of the plot by first opening a pdf file with the `pdf()` command, then run the `plotQualityProfile()` function, then switch off the plotting device.
|
|
|
Now we have our file names, we can look at the quality values of the reads.
|
|
|
|
|
|
> pdf("QualityPlot.pdf")
|
|
|
> plotQualityProfile(fnFs[1:2])
|
|
|
> dev.off()
|
|
|
|
|
|
To view the plot, open a new terminal **but don't close R, since we still need it to finish the analysis** and run:
|
|
|
|
|
|
# Don't forget to add the path to this file!
|
|
|
$ atril QualityPlot.pdf
|
|
|
|
|
|
Modify the stuff inside the `plotQualityProfile()` function above (re-running the pdf part too) to look at the other samples, and also to look at the reverse reads. How are the scores different between read 1 and read 2?
|
|
|
Modify the stuff inside the `plotQualityProfile()` function above to look at the other samples, and also to look at the reverse reads. How are the scores different between read 1 and read 2?
|
|
|
|
|
|
Next we want to make file paths for the filtering and trimming step that will be coming up.
|
|
|
|
... | ... | @@ -66,9 +59,7 @@ Now we need to learn the error rates of our data. From the DADA2 documentation: |
|
|
|
|
|
Now we have these errors estimated, we can plot them to see if they're reasonable (try both forward and reverse reads):
|
|
|
|
|
|
> pdf("ErrorPlot.pdf")
|
|
|
> plotErrors(errF, nominalQ=TRUE)
|
|
|
> dev.off()
|
|
|
|
|
|
We saw this in the lecture, but now the estimated errors (black dots and line) might look a bit different. The main reason for this is that our sequences only have seven quality scores associated with them (40, 38, 32, 27, 22, 13, 0), rather than the full 43. This shouldn't make a big difference to our end results though, because 16S amplicon data is generally pretty robust.
|
|
|
|
... | ... | @@ -124,7 +115,7 @@ Start by getting the full processed dada data: |
|
|
|
|
|
Then R
|
|
|
|
|
|
$ /home/tfrancis/R-3.6.2/bin/R
|
|
|
$ /home/tfrancis/miniconda3/bin/R
|
|
|
|
|
|
load the data:
|
|
|
|
... | ... | @@ -167,16 +158,10 @@ then do the plotting :) |
|
|
theme(axis.text.x=element_text(angle=45, hjust=1))
|
|
|
}
|
|
|
|
|
|
So the bit above creates the function called `plotTaxon()`, which we can then use to generate the plots (and view them like we did before).
|
|
|
So the bit above creates the function called `plotTaxon()`, which we can then use to generate the plots.
|
|
|
|
|
|
> pdf("Marmic-amplicon-plot.pdf")
|
|
|
> plotTaxon(seqtab.nochim, taxa, "Bacteroidota", 0.01, "Genus")
|
|
|
> dev.off()
|
|
|
|
|
|
Once you have your plot, you can open a new terminal (don't quit R because we will want to create different plots for different taxa - just change the file name for the pdf) and run:
|
|
|
|
|
|
$ atril Marmic-amplicon-plot.pdf
|
|
|
|
|
|
to view it. Happy plotting!
|
|
|
Happy plotting!
|
|
|
|
|
|
(Hint - try Deltaproteobacteria and Campilobacterota and see what inferences you can make about your samples) |
|
|
(Hint - try Deltaproteobacteria and Campilobacterota, or try changing the taxonomic level you view to see what inferences you can make about your samples) |