---
title: "Getting started with Quartet"
author: "Martin R. Smith"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_document:
  bookdown::pdf_document2:
    includes:
      in_header: ../inst/preamble.tex
    number_sections: false
bibliography: ../inst/REFERENCES.bib
csl: ../inst/apa-old-doi-prefix.csl
vignette: >
  %\VignetteIndexEntry{Getting started with Quartet}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

This document should contain all you need to get started measuring tree
distances with 'Quartet'. If you get stuck, please
[let me know](https://github.com/ms609/Quartet/issues/new?title=Suggestion:+)
so I can improve this documentation.

## Loading trees

Instructions for loading phylogenetic trees into R can be found in a
[separate vignette](https://ms609.github.io/TreeTools/articles/load-trees.html).
For these examples, we'll enter two simple trees by hand:

```{r set-up}
tree1 <- ape::read.tree(text = '(A, ((B, (C, (D, E))), ((F, G), (H, I))));')
tree2 <- ape::read.tree(text = '(A, ((B, (C, (D, (H, I)))), ((F, G), E)));')
```

## Calculating distances

We can calculate distances between pairs of trees using the 'Quartet' package.

First we'll install the package.
We can either install the stable version from the CRAN repository:

```r
install.packages('Quartet')
```

or the development version, from GitHub -- which will contain the latest
features but may not be as extensively tested:

```r
devtools::install_github('ms609/Quartet')
```

Then we'll load the package into R's working environment:

```{r load-package, message=FALSE}
library('Quartet')
```

Now the package's functions are available within R.
Let's proceed to calculate some tree distances.

### Pairs of trees

Calculating the distance between two trees is a two-stage process.
For a quartet distance, we first have to calculate the status of each quartet:

```{r quartet-status}
statuses <- QuartetStatus(tree1, tree2)
```

Then we convert these counts into a distance metric (or similarity measure)
that suits our needs -- perhaps the Quartet Divergence:

```{r measure-distance}
QuartetDivergence(statuses, similarity = FALSE)
```

We can calculate all similarity metrics at once using:

```{r all-metrics}
SimilarityMetrics(statuses, similarity = TRUE)
```

It can be instructive to visualize how each split in the tree is contributing
to the quartet similarity:

```{r visualize-quartets}
VisualizeQuartets(tree1, tree2)
```

Rather than using quartets, we might want to use partitions as the basis of
our comparison:

```{r partitions}
SimilarityMetrics(SplitStatus(tree1, tree2))
```

### Multiple comparisons

If you have more than two trees to compare, you can send a list of trees
(class: `list` or `multiPhylo`) to the distance comparison function.

You can calculate the similarity between one tree and a forest of other trees:

```{r multi-trees}
library('TreeTools', quietly = TRUE, warn.conflicts = FALSE)
oneTree <- CollapseNode(as.phylo(0, 11), 14)
twoTrees <- structure(list(bal = BalancedTree(11), pec = PectinateTree(11)),
                      class = 'multiPhylo')

status <- SharedQuartetStatus(twoTrees, cf = oneTree)
QuartetDivergence(status)
```

Or between the first tree in a forest and every tree in that forest
(itself included):

```{r one-to-many}
forest <- as.phylo(0:5, 11)
names(forest) <- letters[1:6]
status <- SharedQuartetStatus(forest)
QuartetDivergence(status)
```

Or between each pair of trees in a forest:

```{r many-to-many}
status <- ManyToManyQuartetAgreement(forest)
QuartetDivergence(status, similarity = FALSE)
```

Or between one list of trees and a second:

```{r pairwise}
status <- TwoListQuartetAgreement(forest[1:4], forest[5:6])
QuartetDivergence(status, similarity = FALSE)
```

## Trees with different tip labels

'Quartet' can compare trees of different sizes or with
non-identical sets of taxa.
Quartets pertaining to a leaf that does not occur in one of the trees are
treated as unresolved.

```{r different-tips, fig.height = 1.5, fig.width = 4, fig.align = "center"}
treeAG <- PectinateTree(letters[1:7])
treeBI <- PectinateTree(letters[2:9])
treeEJ <- PectinateTree(letters[5:10])
par(mfrow = c(1, 3), mar = rep(0.3, 4), cex = 1)
plot(treeAG); plot(treeBI); plot(treeEJ)

QuartetState(letters[1:4], treeAG) # 3: C is closest to D
QuartetState(letters[1:4], treeBI) # 0: unresolved in this tree

# Calculate status for all leaves observed in trees: here, A..I
QuartetStatus(treeAG, treeBI, nTip = TRUE)

# Calculate status for specified number of leaves
# Here, we have ten taxa A..J, but J does not occur in either of these trees
QuartetStatus(treeAG, treeBI, nTip = 10)

# Compare a list of trees with different numbers of leaves to a reference
QuartetStatus(c(treeAG, treeBI, treeEJ), cf = treeAG, nTip = TRUE)

# Compare all pairs of trees in a list.
# "u" shows how many possible quartets are unresolved in both trees
ManyToManyQuartetAgreement(c(treeAG, treeBI, treeEJ), nTip = TRUE)[, , "u"]
```

## Other calculations

To calculate how many quartets are unique to a certain tree (akin to the
partitionwise equivalent `ape::prop.clades`), use:

```{r in-one-only}
interestingTree <- as.phylo(42, 7)
referenceTrees <- list(BalancedTree(7), PectinateTree(7))
status <- CompareQuartetsMulti(interestingTree, referenceTrees)
```

`status['x_only']` = `r status['x_only']` quartets are resolved in a certain
way in `interestingTree`, but not resolved that way in any of the
`referenceTrees`.

## What next?

You may wish to:

- [Read more](https://ms609.github.io/Quartet/articles/Quartet-Distance.html)
  about Quartet distances

- Review [alternative distance measures](https://ms609.github.io/TreeDist/index.html)
  and corresponding
  [functions](https://ms609.github.io/TreeDist/reference/index.html#section-tree-distance-measures)

- [Interpret or contextualize](https://ms609.github.io/TreeDist/articles/using-distances.html)
  tree distance metrics