@@ -4,7 +4,7 @@ Now that we have a couple of bins, let's do some basic statistics and compare: g

If you were not able to get bins, check the following folders in the result directory:

```plaintext
$ day_4/01.Illumina/03.binning
$ day_4/02.PacBio/03.binning
```

@@ -27,7 +27,7 @@ $ checkm lineage_wf -f checkm_MaxBin.txt --tab_table -x fasta -t 10 --pplacer_th

See the output file generated by CheckM:

```plaintext
$ day_4/03.checkM/out_checkM_marmic2021-allbins.tab
```
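
A minimal sketch of how the CheckM table could be screened for well-supported bins. It assumes the file is the tab-separated table written by `--tab_table` and that it contains `Bin Id`, `Completeness` and `Contamination` columns. The 90 % / 5 % cut-offs and the two example rows are illustrative assumptions, not values from this course. To use it on the real output, replace the in-memory example with `open(...)` on the file above.

```python
import csv
import io


def read_checkm_table(handle):
    """Yield (bin_id, completeness, contamination) from a tab-separated CheckM table."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield row["Bin Id"], float(row["Completeness"]), float(row["Contamination"])


def high_quality(rows, min_completeness=90.0, max_contamination=5.0):
    """Keep bin IDs that pass both (assumed) quality cut-offs."""
    return [
        bin_id
        for bin_id, completeness, contamination in rows
        if completeness >= min_completeness and contamination <= max_contamination
    ]


# Hypothetical two-bin table, for illustration only.
example = (
    "Bin Id\tCompleteness\tContamination\n"
    "bin_A\t95.2\t1.1\n"
    "bin_B\t60.0\t12.5\n"
)
good = high_quality(read_checkm_table(io.StringIO(example)))
print(good)
```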

@@ -142,12 +142,6 @@ Now it's tree time. Click 'Tree' from the top line of menus, then 'Tree admin'.

To check relatedness among genomes/MAGs, we will run pairwise average nucleotide identity (ANI) and average amino acid identity (AAI) comparisons. ANI values, for instance, are a good measure of whether two or more (draft) genomes belong to the same species. In case things didn't work for you, we have preselected a group of MAGs to work with in this section. We have also added the predicted protein sequences for each MAG, but feel free to do the protein prediction yourself.

Compare AAI values among MAGs by running the aai.rb script:

```plaintext
$ software/aai.rb -1 {Predicted protein sequences for genome 1} -2 {Predicted protein sequences for genome 2}
```

For ANI calculations, we can run the ani.rb script directly on the genomic sequences (i.e., there is no need to predict genes):

```plaintext
$ software/ani.rb -1 {Genome 1} -2 {Genome 2}
```

Feel free to explore other options available in the ani.rb and aai.rb scripts.
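
With more than two bins, every pair needs its own ani.rb call. The sketch below only builds the argument lists, using the `-1` and `-2` options shown above. The file names are hypothetical, and the `subprocess` call is left as a comment so nothing is executed by accident.

```python
from itertools import combinations


def ani_commands(genomes, script="software/ani.rb"):
    """Return one ani.rb argument list per unordered pair of genome files."""
    return [[script, "-1", a, "-2", b] for a, b in combinations(genomes, 2)]


# Hypothetical file names, for illustration only.
bins = ["illumina_bin1.fasta", "pacbio_bin1.fasta", "pacbio_bin2.fasta"]
commands = ani_commands(bins)

for cmd in commands:
    print(" ".join(cmd))
    # To actually run a comparison: subprocess.run(cmd, check=True)
```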

What can you say about the possible level of novelty of these bins? What is the level of relatedness between the bins and the references detected by CheckM? Make sure to make meaningful comparisons. For instance, how do bins that GTDB-Tk assigns to the same species compare between Illumina and PacBio in terms of ANI? In terms of AAI? How would you explain these results?
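
When answering these questions, it can help to label each pair as "same species" or not. The sketch below assumes the commonly used rule of thumb of roughly 95 % ANI as a species boundary. That threshold, the bin names and the ANI values are all illustrative assumptions, not results from this course.

```python
# Rule-of-thumb species boundary (an assumption; adjust as needed).
SPECIES_ANI_THRESHOLD = 95.0


def same_species(ani_percent, threshold=SPECIES_ANI_THRESHOLD):
    """Return True when an ANI value meets the assumed species threshold."""
    return ani_percent >= threshold


# Hypothetical pairwise ANI values (in percent), for illustration only.
pairs = {
    ("illumina_bin_1", "pacbio_bin_1"): 99.3,
    ("illumina_bin_1", "pacbio_bin_2"): 82.0,
}
calls = {pair: same_species(ani) for pair, ani in pairs.items()}

for (bin_a, bin_b), is_same in calls.items():
    label = "same species" if is_same else "different species"
    print(f"{bin_a} vs {bin_b}: {label}")
```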

# Activity for advanced students

Predict protein sequences for the selected bins using Prodigal, then compare AAI values among MAGs by running the aai.rb script:

```
$ software/aai.rb -1 {Predicted protein sequences for genome 1} -2 {Predicted protein sequences for genome 2}
```
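
The two steps can be chained as in the sketch below, which only assembles the command lines. It assumes Prodigal is on the `PATH` and uses its `-i` (input genome), `-a` (protein translations), `-o` (gene coordinates) and `-f` (output format) options. The bin file names and the derived output names are hypothetical.

```python
from pathlib import Path


def prodigal_command(genome_fasta):
    """Build a Prodigal call and return it with the protein FASTA it would write."""
    genome = Path(genome_fasta)
    proteins = genome.with_suffix(".faa")  # predicted protein sequences
    genes = genome.with_suffix(".gff")     # gene coordinates
    cmd = ["prodigal", "-i", str(genome), "-a", str(proteins),
           "-o", str(genes), "-f", "gff"]
    return cmd, str(proteins)


def aai_command(proteins_1, proteins_2, script="software/aai.rb"):
    """Build the aai.rb call for two predicted-protein files."""
    return [script, "-1", proteins_1, "-2", proteins_2]


# Hypothetical bin names, for illustration only.
cmd_1, faa_1 = prodigal_command("bin_1.fasta")
cmd_2, faa_2 = prodigal_command("bin_2.fasta")
aai_cmd = aai_command(faa_1, faa_2)

for cmd in (cmd_1, cmd_2, aai_cmd):
    print(" ".join(cmd))
    # To actually run a step: subprocess.run(cmd, check=True)
```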