Phylogeny is cool!

PhyloSynth

The PhyloSynth project is under way! Some primary data, code, and results will be shared here. Our goal is to reconstruct a larger-scale plant Tree of Life for all seed plants (Spermatophyta), using the methods described in Smith and Brown (2018) and the ideas described in Eiserhardt et al. (2018; see below), and integrating the phylogenetic backbone from the Plant and Fungal Trees of Life Project (PAFTOL) with the robust taxonomic database of the World Checklist of Selected Plant Families (WCSP). We endeavor to push the boundaries of knowledge of the Tree of Life, keeping this tree portable and dynamically updated, and making knowledge of the plant tree of life available to the scientific community and for public education.

Pipeline Schema from Eiserhardt et al. (2018)

Some key features:

  • Flexible
    Easy for other pipelines to integrate

  • Dynamically updated
    Establish a schedule for running this pipeline at regular intervals so that it produces up-to-date trees. For this, we need to decide on an initial frequency for generating trees. This frequency can later be adjusted based on download statistics and user feedback.

  • Portable for different audiences
    Establish one or more outlet(s) for PhyloSynth trees. This needs to take into consideration where different audiences would be looking for trees, and ensure (for scientific audiences) that there is a citable paper.

  • High quality

    • Build a module that maps NCBI taxonomy to a widely accepted botanical taxonomy. In the first instance this should be the WCSP/"names backbone" at Kew, but we need to consider that other lists are in circulation.
    • Build a module that filters NCBI data automatically according to certain rules. This could be a simple decision tree based on metadata, or a more complex machine learning approach.
    • Build a module that evaluates resulting trees automatically using a set of statistics. This could include, among other things, monophyly statistics for higher ranks from the taxonomy used (genera and families in the case of WCSP).
    • Establish a procedure for manual quality control by taxon experts. This would need to include a procedure for storing decisions/annotations and avoiding duplication of effort.
    • Establish a procedure for user feedback. This would need to include a procedure for storing decisions/annotations and avoiding duplication of effort.
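
As an illustration of the automatic tree evaluation mentioned above, here is a minimal sketch of a genus-level monophyly statistic. It is not the PhyloSynth implementation. It assumes a rooted tree given as nested Python tuples, tip labels of the form `Genus_species`, and the hypothetical helper names `clade_tip_sets`, `genus_monophyly`, and `monophyly_fraction`. The toy tree is made-up test data, not a phylogenetic result.

```python
from collections import defaultdict


def clade_tip_sets(tree):
    """Collect the tip set of every subtree (single tips included).

    `tree` is a nested tuple of tip labels; this representation is an
    assumption made for illustration only.
    """
    clades = set()

    def walk(node):
        if isinstance(node, str):
            tips = frozenset([node])
        else:
            # Union of the tip sets of all child subtrees.
            tips = frozenset().union(*(walk(child) for child in node))
        clades.add(tips)
        return tips

    walk(tree)
    return clades


def genus_monophyly(tree):
    """Map each genus to True if its tips form exactly one clade."""
    clades = clade_tip_sets(tree)
    by_genus = defaultdict(set)
    for tips in clades:
        if len(tips) == 1:
            (tip,) = tips
            # Assumed label convention: "Genus_species".
            by_genus[tip.split("_")[0]].add(tip)
    return {
        genus: frozenset(members) in clades
        for genus, members in by_genus.items()
    }


def monophyly_fraction(tree):
    """Fraction of genera that are monophyletic in the rooted tree."""
    result = genus_monophyly(tree)
    return sum(result.values()) / len(result)


# Toy data (hypothetical placements, for demonstration only).
toy_tree = (
    ("Quercus_alba", "Quercus_rubra"),
    (("Fagus_sylvatica", "Castanea_sativa"), "Fagus_grandifolia"),
)

if __name__ == "__main__":
    for genus, ok in sorted(genus_monophyly(toy_tree).items()):
        print(genus, "monophyletic" if ok else "not monophyletic")
    print(round(monophyly_fraction(toy_tree), 3))
```

On the toy tree, Quercus and Castanea are reported as monophyletic and Fagus is not. Genera represented by a single tip count as trivially monophyletic in this sketch, so a real statistic would probably report them separately. The same check applies to families once tips are mapped to families through the chosen taxonomy.
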
Miao Sun
Postdoc Fellow