... | @@ -78,7 +78,7 @@ Open a new text file with: |
... | @@ -78,7 +78,7 @@ Open a new text file with: |
|
|
|
|
|
$ nano slurm-submit.sh
|
|
$ nano slurm-submit.sh
|
|
|
|
|
|
Then copy and paste the following into it:
|
|
Then copy and paste the following into it. Change the path so it points to your MAGs, and change the `-x` option if your MAGs have `.fasta` rather than `.fa` extensions:
|
|
|
|
|
|
```
|
|
```
|
|
#!/bin/bash
|
|
#!/bin/bash
|
... | @@ -93,18 +93,13 @@ Then copy and paste the following into it: |
... | @@ -93,18 +93,13 @@ Then copy and paste the following into it: |
|
#SBATCH --array=1 # Array range
|
|
#SBATCH --array=1 # Array range
|
|
#SBATCH --partition=CLUSTER # Partition
|
|
#SBATCH --partition=CLUSTER # Partition
|
|
|
|
|
|
|
|
gtdbtk classify_wf --cpus 16 --genome_dir /path/to/MAGs/ --out_dir gtdbtk_classify -x fa
|
|
|
|
gtdbtk de_novo_wf --cpus 16 --genome_dir /path/to/MAGs/ --out_dir gtdbtk_denovo -x fa --bac120_ms --outgroup_taxon p__Deinococcota
|
|
```
|
|
```
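If you are unsure which extension your MAGs use, a small helper along these lines could tell you what to pass to `-x`. This is only a sketch: `count_ext` and `detect_mag_extension` are hypothetical names, not part of GTDB-Tk, and the helper only distinguishes `.fa` from `.fasta` files at the top level of the directory.

```shell
#!/bin/bash
# Hypothetical helpers (not part of GTDB-Tk): guess the value for gtdbtk's -x option.

count_ext() {
    # Count regular files directly inside directory $1 whose names end in .$2
    find "$1" -maxdepth 1 -type f -name "*.$2" | wc -l | tr -d ' '
}

detect_mag_extension() {
    # Print "fasta" if .fasta files outnumber .fa files in $1, otherwise "fa"
    local dir="$1"
    local n_fa
    local n_fasta
    n_fa=$(count_ext "$dir" fa)
    n_fasta=$(count_ext "$dir" fasta)
    if [ "$n_fasta" -gt "$n_fa" ]; then
        echo "fasta"
    else
        echo "fa"
    fi
}

# Example (path is a placeholder): detect_mag_extension /path/to/MAGs/
```

You could run this on the login node before editing the script, then put the printed value after `-x` in the `gtdbtk` command.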
|
|
|
|
|
|
GTDB-Tk because it requires too many compute resources! For this section, we are just going to explore the results for a collection of MAGs that we selected previously. Installation is not hard, but it needs a lot of free disk space to run. Alternatively, you could run it using KBase. However, feel free to install it using conda (these are the instructions from https://github.com/Ecogenomics/GTDBTk).<br>
|
|
Now you can run `squeue` and see if your job is running!
|
|
|
|
|
|
1. Create a new conda environment: `conda create -n gtdbtk`
|
|
|
|
2. Activate the environment: `conda activate gtdbtk`
|
|
|
|
3. Install GTDB-Tk: `conda install -c bioconda gtdbtk`
|
|
|
|
4. Download the reference package either manually or by running `download-db.sh`.
|
|
|
|
5. Set the `GTDBTK_DATA_PATH` environment variable in `{gtdbtk environment path}/etc/conda/activate.d/gtdbtk.sh` to the reference package location.<br>
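Step 5 can be sketched as a small shell function that writes the activation hook for you. This is an illustration under assumptions: `write_gtdbtk_hook` is a hypothetical name, the reference package path is whatever directory you unpacked it into, and the function overwrites any existing `gtdbtk.sh` hook in that environment.

```shell
#!/bin/bash
# Hypothetical helper: write the conda activation hook that exports GTDBTK_DATA_PATH.
# $1 = conda environment prefix (for the active environment, "$CONDA_PREFIX")
# $2 = directory containing the unpacked reference package
write_gtdbtk_hook() {
    local env_prefix="$1"
    local data_path="$2"
    local hook_dir="$env_prefix/etc/conda/activate.d"
    # Create the hook directory if the environment does not have one yet
    mkdir -p "$hook_dir"
    # Overwrite the hook with a single export line (path is quoted in case it has spaces)
    printf 'export GTDBTK_DATA_PATH="%s"\n' "$data_path" > "$hook_dir/gtdbtk.sh"
}

# Example (reference path is a placeholder):
# write_gtdbtk_hook "$CONDA_PREFIX" /path/to/reference/package
```

After writing the hook, deactivate and reactivate the environment so the variable is set, then check it with `echo $GTDBTK_DATA_PATH`.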
|
|
|
|
|
|
|
|
**Explore the results of GTDB-Tk in today's folder.** There's an arb tree you can view: first log in to an arb server with `$ ssh arb-X` (replace the X with a number from 1 to 3), then run `arb` from the command line. How do these results compare to the CheckM estimates?
|
|
It's possible to load the tree output into arb, but it's a bit of a faff. We prepared an example of what it looks like for you to explore. To view the arb tree, first log in to an arb server with `$ ssh arb-X` (replace the X with a number from 1 to 3), then run `arb` from the command line.
|
|
|
|
|
|
## ANI and AAI
|
|
## ANI and AAI
|
|
|
|
|