# Bin statistics
Now that we have a couple of bins, let's compute some basic statistics and compare them: genome length, N50, and number of contigs. How do bins compare between technologies? Do you see any big differences in these metrics? For instance, pick a couple of Illumina and PacBio bins and check whether the statistics differ noticeably between the two technologies.

If you were not able to get bins, check the following folders in the result directory:
```
$ day_4/01.Illumina/03.binning
$ day_4/02.PacBio/03.binning
```
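The three metrics asked for above (number of contigs, N50, genome length) can be computed directly from a bin's FASTA file. The following is a minimal sketch in plain Python, not the course's official method: the function names `read_contig_lengths`, `n50`, and `bin_stats` are hypothetical, and it assumes each bin is an uncompressed FASTA file. N50 is taken here as the length of the contig at which the running total, summed from longest to shortest, first reaches half of the total bin length. Pass one or more bin FASTA files on the command line to print one tab-separated row per bin.

```python
import sys
from pathlib import Path


def read_contig_lengths(fasta_path):
    """Return the length of every sequence in an uncompressed FASTA file."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Contig length at which the cumulative sum (longest first) reaches half the total."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0  # empty bin


def bin_stats(fasta_path):
    """Summarise one bin: number of contigs, N50, and total genome length."""
    lengths = read_contig_lengths(fasta_path)
    return {
        "contigs": len(lengths),
        "n50": n50(lengths),
        "genome_length": sum(lengths),
    }


if __name__ == "__main__":
    # Hypothetical usage: python bin_stats.py path/to/bin1.fasta path/to/bin2.fasta
    for path in sys.argv[1:]:
        stats = bin_stats(path)
        print(Path(path).name, stats["contigs"], stats["n50"],
              stats["genome_length"], sep="\t")
```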
| ILMN or PACB bin | Number of contigs | N50 | Genome length |
|------------------|-------------------|-----|---------------|
| ILMN | | | |
|
... | ... | @@ -18,7 +25,13 @@ $ checkm lineage_wf -f checkm_MaxBin.txt --tab_table -x fasta -t 10 --pplacer_th |
|
|
|
|
|
**As mentioned above, we have already run checkM for all bins.** You can find the bins and the checkM results in today's folder. From this point forward, we will refer to this last group of bins.

See the output file generated by checkM:

```
$ day_4/03.checkM/out_checkM_marmic2021-allbins.tab
```
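If you prefer to inspect this table programmatically rather than by eye, the sketch below loads it with Python's standard `csv` module. It assumes the file is tab-separated with a header row containing the `Bin Id`, `Completeness`, and `Contamination` columns; the helper names and the default thresholds (90 and 5) are illustrative assumptions, not values prescribed by this tutorial.

```python
import csv


def load_checkm_table(path):
    """Read a tab-separated checkM summary into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def filter_bins(rows, min_completeness=90.0, max_contamination=5.0):
    """Return the ids of bins that pass both (illustrative) quality thresholds."""
    return [
        row["Bin Id"]
        for row in rows
        if float(row["Completeness"]) >= min_completeness
        and float(row["Contamination"]) <= max_contamination
    ]


# Hypothetical usage, with the path given above:
# rows = load_checkm_table("day_4/03.checkM/out_checkM_marmic2021-allbins.tab")
# print(filter_bins(rows))
```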
The file should look something like this:
```plaintext
Bin Id Marker lineage # genomes # markers # marker sets 0 1 2 3 4 5+ Completeness Contamination Strain heterogeneity
... | ... | |