... | ... | @@ -106,7 +106,17 @@ Now we need to submit the script to Slurm with: |
|
|
|
|
|
Now you can run `squeue` to check whether your job is running! You'll also see that a new file has been created for the output that would otherwise print to the screen. Check this file to make sure you don't have any errors. GTDB-tk takes a long time to run, especially the `de_novo_wf` (after all, it has to compare your MAGs to a database of over 20,000 other genomes; check out the run time we've reserved in the Slurm script!). So carry on with the steps below and take a look at the GTDB-tk outputs later today (`classify_wf` should finish today) or tomorrow.
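As a rough sketch of that check: the snippet below lists only your own jobs and scans an output file for lines that look like errors. The `check_log` helper is hypothetical (not part of Slurm or GTDB-tk), and the search pattern is just an assumption about what an error message might contain, so a clean result is not a guarantee that everything worked.

```shell
# Hypothetical helper: print any suspicious lines in a job output file.
# Returns 1 if something that looks like an error is found, 0 otherwise.
check_log() {
    # -i: ignore case, -n: show line numbers, -E: extended regular expression
    if grep -inE 'error|traceback|exception' "$1"; then
        return 1
    fi
    echo "no obvious errors in $1"
}

# List only your own jobs in the queue (skipped on machines without Slurm).
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER"
fi
```

You would then call it on the output file named in your Slurm script, for example `check_log my_job_output.txt` (a placeholder name, use whatever your script writes to).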
|
|
|
|
|
|
|
|
|
It's also possible to load the phylogenetic tree output from GTDB-tk into ARB, but it's a bit of a faff. We prepared an example of what it looks like for you to explore; it's in the `marmic_NGS2021/results/day_4/` directory. To view the ARB database in there, first log in to an ARB server with `ssh arb-X` (put a number from 1 to 3 in place of the X), then run `arb` from the command line and open the `MarMic-gtdbtk-example.arb` database.
|
|
|
|
|
|
#(
|
|
|
If you _really_ want to view your own tree (you have to be a bit masochistic but whatever, you do you), you'll need to click 'create and import', then choose the file `gtdbtk.bac120.msa.fasta` in your `gtdbtk_denovo` directory, set the type in the dropdown menu to protein, and choose 'fasta_wgap.ift' from the list on the right. When prompted, select 'Generate unique species IDs', then 'None (only acc)'. Now you have a database. In the top row of icons you'll see a padlock, and below it a dropdown list of numbers. Change that from '0' to '6'. Then, in the top row of the menu, click 'Species', then 'Search and Query'. In the new window, click the 'Search' button. Then, in the 'More functions' tab, click 'Set Protection of Fields of Listed Species'. This gives you yet another window. Here, click on 'name' in the right panel, then '0 temporary' in the list on the left, then click the 'Assign protection to field of listed' button. Nothing obvious will happen, but trust me, that's how it's meant to be. Go ahead and close this window and the 'Search and Query' window. Now click 'File' in the top row, then 'Export', then 'Export fields (to calc-sheet using NDS)'. Make sure the Column output at the bottom is 'TAB separated', then hit 'Save', then 'close'.
|
|
|
|
|
|
_Now we need to open a new terminal, but don't close ARB!_ In your new terminal, go to the location of the `export.nds` file we just created (most likely the `gtdbtk_denovo` directory). Then run the following: `sed -i 's/ /\t/' export.nds`. This replaces the first space on each line with a tab, editing the file in place. Now return to your ARB window.
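To see what that `sed` command does before letting it loose on the real file, here is a small sketch on a made-up line. The sample text is an assumption for illustration only (your `export.nds` will contain different entries), and using `\t` for a tab in the replacement relies on GNU sed, which is what you'd normally find on a Linux server.

```shell
# Made-up sample line standing in for one row of export.nds.
sample='GCA_000001 Escherichia coli'

# Replace only the FIRST space on the line with a tab (there is no trailing "g" flag),
# so the ID becomes column 1 and the rest of the line stays together as column 2.
converted=$(printf '%s\n' "$sample" | sed 's/ /\t/')

# Count the tab-separated columns in the converted line.
ncols=$(printf '%s\n' "$converted" | awk -F'\t' '{ print NF }')
echo "columns: $ncols"
```

If you'd like to preview the change on your own file without modifying it, leave out `-i` and pipe the result through `head`, e.g. `sed 's/ /\t/' export.nds | head`.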
|
|
|
|
|
|
Click 'File' again, then 'Import', then 'Import fields from calc-sheet'. For the top line, click 'Browse', change the file extension from 'csv' to 'nds', then select the `export.nds` file we just modified. Then select 'close'. Now set the options so that they read: write content of column '2' to field 'name' of species for which field 'name' matches content of column '1'. Then hit 'GO'. It should tell you it imported some 30,000ish entries.
|
|
|
|
|
|
Now it's tree time. Click 'Tree' in the top line of menus, then 'Tree admin'. Then click 'Import'; the only file available should be `gtdbtk.bac120.decorated.tree`. Click that, then the 'Load' button. Finally, in the top row of icons, you'll see three buttons that look a bit like tiny trees. Click the second of those three to view your tree! Was all that pain truly worth it though?
|
|
|
#)
|
|
|
|
|
|
## ANI and AAI
|
|
|
|
... | ... | |