adding extinct taxa to the synthetic tree
Versions of the tree <= 10.4 have omitted extinct taxa. This is basically a hack to deal with the fact that many of these taxa are very crudely placed in the taxonomic hierarchy and not included in any of our curated phylogenetic studies. Including them causes phenomena such as the Fungi node in the tree becoming un-browse-able because of having too many children (ideally, we'd just have the browser view filter these taxa out, but we have not implemented that yet).
MTH has (in early Feb, 2019) implemented a workaround (not yet deployed or thoroughly tested) that entails:
- processing OTT to add the `incertae_sedis` flag to every extinct taxon (as well as adding the `extinct` flag to non-tips whose descendants are all flagged as `extinct` or `extinct_inherited`).
- pointing the propinquity configuration to the modified OTT
- removing `extinct` and `extinct_inherited` from the propinquity config's `cleaning_flags` property. Those flags should remain in `additional_regrafting_flags` so that extinct taxa that are not in any input phylogeny are not added to the supertree; this avoids the huge number of Fungi children.
- rebuilding the tree
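
The flag change in the propinquity config can be sketched as follows. This is a minimal illustration, not the actual propinquity tooling: the `[taxonomy]` section name and the non-extinct flag values are assumptions, and only `cleaning_flags`, `additional_regrafting_flags`, `extinct`, and `extinct_inherited` come from the steps above.

```python
import configparser

# Hypothetical excerpt of a propinquity config. The [taxonomy] section name
# and the "environmental,viral" values are assumptions for illustration.
EXAMPLE_CONFIG = """
[taxonomy]
cleaning_flags = environmental,viral,extinct,extinct_inherited
additional_regrafting_flags = extinct_inherited,extinct
"""

EXTINCT_FLAGS = {"extinct", "extinct_inherited"}


def drop_extinct_cleaning_flags(config_text, section="taxonomy"):
    """Return a parser whose cleaning_flags no longer list the extinct flags."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    flags = [f.strip() for f in parser[section]["cleaning_flags"].split(",")]
    kept = [f for f in flags if f and f not in EXTINCT_FLAGS]
    parser[section]["cleaning_flags"] = ",".join(kept)
    # additional_regrafting_flags is deliberately left untouched, so extinct
    # taxa absent from every input phylogeny are still not regrafted.
    return parser


updated = drop_extinct_cleaning_flags(EXAMPLE_CONFIG)
print(updated["taxonomy"]["cleaning_flags"])
print(updated["taxonomy"]["additional_regrafting_flags"])
```

Editing the config by hand achieves the same thing; the point is only that the two properties are changed independently.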
This seems to add 300 leaves (see below) because we don't have many trees with fossil taxa.
We could change the logic in propinquity/otcetera tools to treat extinct taxa as incertae sedis, but that feels like a bit of a hack because the concept of "extinct" is certainly distinct from "incertae sedis". Thus, I'd prefer for the treatment of fossil taxa as incertae sedis to be "shallow" in the software architecture. Hopefully, we could make the taxonomy richer such that those fossil taxa which are incertae sedis are labelled as such. To that end, I've tried to attack the problem by pre-processing the taxonomy.
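
To make the pre-processing idea concrete, here is a rough sketch on a toy taxonomy. It is not the otcetera implementation: the data structures and the function name are hypothetical, and it assumes that a taxon counts as extinct if it carries either `extinct` or `extinct_inherited`.

```python
from collections import defaultdict

EXTINCT = {"extinct", "extinct_inherited"}


def mark_extinct_as_incertae_sedis(parents, flags):
    """Return a new {taxon_id: set_of_flags} mapping.

    parents: {taxon_id: parent_id, or None for the root}
    flags:   {taxon_id: set of flag strings}
    """
    children = defaultdict(list)
    root = None
    for node, parent in parents.items():
        if parent is None:
            root = node
        else:
            children[parent].append(node)

    new_flags = {node: set(flags.get(node, ())) for node in parents}

    # Build a pre-order listing, then walk it backwards so that every
    # taxon is processed after all of its descendants.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    for node in reversed(order):
        kids = children[node]
        # A non-tip whose children are all extinct becomes extinct itself.
        if kids and all(new_flags[k] & EXTINCT for k in kids):
            new_flags[node].add("extinct")
        # Every extinct taxon is additionally treated as incertae sedis.
        if new_flags[node] & EXTINCT:
            new_flags[node].add("incertae_sedis")
    return new_flags


# Toy taxonomy with made-up names: clade A is entirely extinct, B is not.
toy_parents = {"root": None, "A": "root", "B": "root",
               "a1": "A", "a2": "A", "b1": "B", "b2": "B"}
toy_flags = {"a1": {"extinct"}, "a2": {"extinct"}, "b1": {"extinct"}}
result = mark_extinct_as_incertae_sedis(toy_parents, toy_flags)
print(sorted(result["A"]), sorted(result["B"]))
```

Because the extra flag is written into the taxonomy itself, the downstream tools need no notion of "extinct implies incertae sedis", which is the "shallow" treatment argued for above.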
- propinquity `fossil_taxa` branch on mtholder fork: https://github.com/mtholder/propinquity/tree/fossil_taxa
- otcetera `extinct-to-incert-sed` branch on mtholder fork: https://github.com/mtholder/otcetera/tree/extinct-to-incert-sed
- peyotl `python3` branch on mtholder fork: https://github.com/mtholder/peyotl/tree/python3

Hopefully all of the code involved would work on Python2.7 too, but I have not tested that.

- build otcetera
- create virtualenv using python 3.6 (or greater probably works)
- run: `otc-taxonomy-parser ott3.0draft6 -E --write-taxonomy=ott3.0.6-extinct-mod 2>err-extinction-mod-3.0.6.txt` in the parent of the OTT dir (here the parent of `ott3.0draft6`) to create the modified OTT taxonomy flags
- point to `ott3.0.6-extinct-mod` using `ott = %(home)s/ott/ott3.0.6-extinct-mod` in `~/.opentree`
- run propinquity, and don't forget to cross your fingers!
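
The `%(home)s` in the `ott` setting is INI-style interpolation. The sketch below shows how such a value expands, assuming the file is parsed the way Python's `configparser` does it; the `[opentree]` section name and the `home` value are assumptions made for illustration, and only the `ott` line is taken from the step above.

```python
import configparser

# Hypothetical contents of ~/.opentree. The section name and the value of
# `home` are assumptions; only the `ott` line comes from the steps above.
EXAMPLE = """
[opentree]
home = /home/user/opentree
ott = %(home)s/ott/ott3.0.6-extinct-mod
"""

parser = configparser.ConfigParser()  # default interpolation expands %(name)s
parser.read_string(EXAMPLE)
ott_dir = parser.get("opentree", "ott")
print(ott_dir)  # /home/user/opentree/ott/ott3.0.6-extinct-mod
```

So the modified taxonomy directory needs to sit under `ott/` inside whatever `home` is set to.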
Spot checks indicate that we added the 300 tips shown below. Unfortunately, we end up with a weird result for Homo sapiens as a species because (for reasons that are not clear to me) H. sapiens sapiens is `hidden` (so the tree just ends up containing Denisovans and Neanderthals as subsp of Homo). https://tree.opentreeoflife.org/taxonomy/browse?id=770315
I suspect that the reason behind hiding the subspecies is that, when extinct taxa were pruned, humans would just be a monotypic taxon, so we should probably not show the subspecies name in that context.
This issue can probably be dealt with simply by removing the `hidden` flag from H. sapiens sapiens whenever we build with extinct taxa included (or by having a general mechanism for hiding the sole subspecies of a monotypic species).
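
One possible way to drop the flag during pre-processing is sketched below. It assumes taxonomy rows are separated by tab, pipe, tab with a comma-separated flags field in the seventh column; the IDs in the example row are placeholders rather than real OTT IDs, and `remove_flag` is a hypothetical helper, not part of any existing tool.

```python
SEP = "\t|\t"  # assumed field separator of the taxonomy file


def remove_flag(line, taxon_id, flag="hidden", flags_column=6):
    """Return `line` with `flag` removed if the row belongs to `taxon_id`."""
    fields = line.rstrip("\n").split(SEP)
    if fields[0] != taxon_id:
        return line  # rows for other taxa pass through unchanged
    flags = [f for f in fields[flags_column].split(",") if f and f != flag]
    fields[flags_column] = ",".join(flags)
    return SEP.join(fields) + "\n"


# Placeholder row: "999" and "100" are made-up IDs, not real OTT IDs.
row = SEP.join(["999", "100", "Homo sapiens sapiens", "subspecies",
                "", "", "hidden,infraspecific", ""]) + "\n"
other = SEP.join(["998", "100", "Some other taxon", "subspecies",
                  "", "", "hidden", ""]) + "\n"

fixed = remove_flag(row, "999")
print(fixed.split(SEP)[6])  # infraspecific
```

A general mechanism for monotypic species would instead need to count the non-pruned children of each species before deciding what to hide.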
The new tips:
6523, 20881, 45812, 45818, 81069, 84218, 102587, 106258, 124432, 196162, 200067,
208456, 211375, 220186, 222067, 271376, 303038, 306515, 365642, 370488, 370492,
370493, 372585, 374222, 437198, 447620, 447653, 459222, 465032, 466809, 469451,
534480, 558503, 564710, 567111, 576651, 576657, 587772, 588438, 607972, 621571,
623176, 625192, 645879, 653155, 707061, 727203, 754373, 816657, 816660, 816665,
816669, 840265, 869089, 933436, 937214, 964061, 964908, 964911, 982349, 1001940,
1009608, 1021848, 1036062, 1066976, 1083365, 3600100, 3600110, 3600120, 3600124,
3600125, 3600127, 3600128, 3600129, 3600131, 3600825, 3607245, 3607484, 3607521,
3607522, 3607676, 3607796, 3610308, 3610315, 3612189, 3612191, 3612195, 3612196,
3612203, 3612205, 3612207, 3612210, 3612259, 3612262, 3612266, 3612406, 3612408,
3612420, 3612428, 3612433, 3612436, 3612500, 3612501, 3612502, 3612503, 3612507,
3612509, 3612510, 3612516, 3612519, 3612521, 3612524, 3612525, 3612529, 3612533,
3612535, 3612536, 3612538, 3612539, 3612541, 3612543, 3612544, 3612547, 3612550,
3612554, 3612558, 3612559, 3612561, 3612562, 3612564, 3612567, 3612569, 3612571,
3612574, 3612579, 3612580, 3612584, 3612586, 3612587, 3612588, 3612589, 3612591,
3612592, 3612594, 3612595, 3612596, 3612597, 3612599, 3612600, 3612601, 3612603,
3612605, 3612606, 3612608, 3612609, 3612610, 3612611, 3612612, 3612613, 3612614,
3612615, 3612616, 3612617, 3612618, 3612619, 3612620, 3612621, 3612624, 3612625,
3612626, 3612628, 3612629, 3612631, 3612632, 3612633, 3612634, 3612635, 3614203,
3614207, 3615450, 3615459, 3615461, 3616017, 3616019, 3616020, 3617145, 3636488,
3636492, 3636495, 3676862, 3676865, 3677021, 4117716, 4117718, 4117748, 4117981,
4117983, 4117984, 4117986, 4117987, 4117988, 4117990, 4117994, 4117996, 4118000,
4118004, 4118005, 4118007, 4118010, 4118012, 4118013, 4118749, 4118794, 4119380,
4119411, 4119429, 4119560, 4119733, 4124528, 4124686, 4125739, 4125746, 4125764,
4125794, 4125820, 4125835, 4126044, 4126058, 4126060, 4126066, 4126085, 4130813,
4130815, 4130817, 4130828, 4130831, 4130832, 4130833, 4130835, 4130836, 4130848,
4941006, 4941266, 4941433, 4941594, 4941696, 4941850, 4941925, 4941926, 4941927,
4941929, 4941930, 4942030, 4942032, 4942359, 4942380, 4942409, 4942412, 4942414,
4942417, 4942432, 4942441, 4942444, 4942547, 4942565, 4942579, 4942613, 4943497,
4944931, 4945957, 4946043, 4949707, 5093185, 5668910, 5773930, 5782046, 5800006,
5833975, 5839494, 5839497, 5925662, 5936119, 5936581, 6140364, 6142887, 6145836,
6145840, 6145853, 6145860, 6145868, 6145876, 6145894, 6145898, 6145900, 6145904,
6146017, 6146167, 6151092, 6157964, 6157997
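
A list like the one above can be produced by comparing the tip sets of two builds. The following is a rough sketch, not the script that generated the list: it assumes Newick input whose tip labels end in `ott<ID>` and contain no quoted labels or whitespace, and the two toy trees are invented for illustration.

```python
import re

# A tip label directly follows "(" or ","; internal-node labels follow ")".
TIP_LABEL = re.compile(r"[(,]\s*([^(),:;\s]+)")
OTT_ID = re.compile(r"ott(\d+)$")


def tip_ott_ids(newick):
    """Collect the numeric OTT IDs found at the end of tip labels."""
    ids = set()
    for label in TIP_LABEL.findall(newick):
        match = OTT_ID.search(label)
        if match:
            ids.add(int(match.group(1)))
    return ids


def new_tips(old_newick, new_newick):
    """Return the sorted IDs of tips present only in the newer tree."""
    return sorted(tip_ott_ids(new_newick) - tip_ott_ids(old_newick))


# Invented toy trees; the second one gains tips 5 and 7.
old_tree = "((ott1,ott2)ott10,ott3)ott20;"
new_tree = "((ott1,ott2,ott5)ott10,(ott3,ott7)mrcaott3ott7)ott20;"
added = new_tips(old_tree, new_tree)
print(added)  # [5, 7]
```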