... | @@ -30,11 +30,18 @@ What are fnFs and fnRs? Think about why we might want to make lists of files. (R |
... | @@ -30,11 +30,18 @@ What are fnFs and fnRs? Think about why we might want to make lists of files. (R |
|
|
|
|
|
This one's a bit funny syntactically. Use the `?` to try and figure out what sapply does. The oddest bit is that single square bracket `[`. In R, the square bracket is actually a function much like `+` and `-`! Here it's the subset function.
|
|
This one's a bit funny syntactically. Use the `?` to try and figure out what sapply does. The oddest bit is that single square bracket `[`. In R, the square bracket is actually a function much like `+` and `-`! Here it's the subset function.
|
|
|
|
|
|
Now we have our file names, we can look at the quality values of the reads:
|
|
Now we have our file names, we can look at the quality values of the reads. Like I said at the beginning, installation of DADA2 wasn't easy, and so in this R installation we don't get plots appearing on screen, we have to write them to file and look at them separately. I know. So we're going to create a pdf of the plot by first opening a pdf file with the `pdf()` command, then run the `plotQualityProfile()` function, then switch off the plotting device.
|
|
|
|
|
|
|
|
> pdf("QualityPlot.pdf")
|
|
> plotQualityProfile(fnFs[1:2])
|
|
> plotQualityProfile(fnFs[1:2])
|
|
|
|
> dev.off()
|
|
|
|
|
|
|
|
To view the plot, open a new terminal _but don't close R, since we still need it to finish the analysis_ and run:
|
|
|
|
|
|
Modify the above to look at the other samples, and also to look at the reverse reads too. How are the scores different between read 1 and read 2?
|
|
# Don't forget to add the path to this file!
|
|
|
|
$ atril QualityPlot.pdf
|
|
|
|
|
|
|
|
Modify the stuff inside the `plotQualityProfile()` function above (re-running the pdf part too) to look at the other samples, and also to look at the reverse reads. How are the scores different between read 1 and read 2?
|
|
|
|
|
|
Next we want to make file paths for the filtering and trimming step that will be coming up.
|
|
Next we want to make file paths for the filtering and trimming step that will be coming up.
|
|
|
|
|
... | @@ -59,7 +66,9 @@ Now we need to learn the error rates of our data. From the DADA2 documentation: |
... | @@ -59,7 +66,9 @@ Now we need to learn the error rates of our data. From the DADA2 documentation: |
|
|
|
|
|
Now we have these errors estimated, we can plot them to see if they're reasonable (try both forward and reverse reads):
|
|
Now we have these errors estimated, we can plot them to see if they're reasonable (try both forward and reverse reads):
|
|
|
|
|
|
|
|
> pdf("ErrorPlot.pdf")
|
|
> plotErrors(errF, nominalQ=TRUE)
|
|
> plotErrors(errF, nominalQ=TRUE)
|
|
|
|
> dev.off()
|
|
|
|
|
|
We saw this in the lecture, but now the estimated errors (black dots and line) might look a bit different. The main reason for this is that our sequences only have seven quality scores associated with them (40, 38, 32, 27, 22, 13, 0), rather than the full 43. This shouldn't make a big difference to our end results though, because 16S amplicon data is generally pretty robust.
|
|
We saw this in the lecture, but now the estimated errors (black dots and line) might look a bit different. The main reason for this is that our sequences only have seven quality scores associated with them (40, 38, 32, 27, 22, 13, 0), rather than the full 43. This shouldn't make a big difference to our end results though, because 16S amplicon data is generally pretty robust.
|
|
|
|
|
... | @@ -105,7 +114,6 @@ Now lets save our results, so that we can use them again later without having to |
... | @@ -105,7 +114,6 @@ Now lets save our results, so that we can use them again later without having to |
|
Now that we have these outputs, we need to make them useful! One way to do that is to use the R package phyloseq, which will integrate all kinds of fun things. You can check out the phyloseq/DADA2 interaction on the DADA2 website: https://benjjneb.github.io/dada2/tutorial.html
|
|
Now that we have these outputs, we need to make them useful! One way to do that is to use the R package phyloseq, which will integrate all kinds of fun things. You can check out the phyloseq/DADA2 interaction on the DADA2 website: https://benjjneb.github.io/dada2/tutorial.html
|
|
|
|
|
|
For now though, we're just going to use a (relatively) simple plotting function to inspect the sequence diversity in our samples. If you get bored you can try and puzzle through what the plot function is doing and start changing things. But if you're not bothered, you still get some nice plots :)
|
|
For now though, we're just going to use a (relatively) simple plotting function to inspect the sequence diversity in our samples. If you get bored you can try and puzzle through what the plot function is doing and start changing things. But if you're not bothered, you still get some nice plots :)
|
|
Like I said at the beginning, installation of DADA2 wasn't easy, and so in this R installation we don't get plots appearing on screen, we have to write them to file and look at them separately. I know.
|
|
|
|
|
|
|
|
# If you didn't make it to the plotting stage on Monday, start from here on Tuesday
|
|
# If you didn't make it to the plotting stage on Monday, start from here on Tuesday
|
|
We're all going to start today by making some plots. Sorry if the full dada processing took too long and you didn't get to run it all yourself. If you didn't, all you missed was running some R code and inspecting big unwieldy tables.
|
|
We're all going to start today by making some plots. Sorry if the full dada processing took too long and you didn't get to run it all yourself. If you didn't, all you missed was running some R code and inspecting big unwieldy tables.
|
... | @@ -159,10 +167,7 @@ then do the plotting :) |
... | @@ -159,10 +167,7 @@ then do the plotting :) |
|
theme(axis.text.x=element_text(angle=45, hjust=1))
|
|
theme(axis.text.x=element_text(angle=45, hjust=1))
|
|
}
|
|
}
|
|
|
|
|
|
|
|
So the bit above creates the function called `plotTaxon()`, which we can then use to generate the plots (and view them like we did before).
|
|
|
|
|
|
|
|
|
|
So the bit above creates the function called `plotTaxon()`, which we can then use to generate the plots. We're going to create a pdf of the plot by first opening a pdf file with the `pdf()` command, then run the `plotTaxon()` function, then switch off the plotting device.
|
|
|
|
|
|
|
|
> pdf("Marmic-amplicon-plot.pdf")
|
|
> pdf("Marmic-amplicon-plot.pdf")
|
|
> plotTaxon(seqtab.nochim, taxa, "Bacteroidota", 0.01, "Genus")
|
|
> plotTaxon(seqtab.nochim, taxa, "Bacteroidota", 0.01, "Genus")
|
... | | ... | |