Phylogenetic Model Selection via Machine Learning
Date
2024
Authors
Dong, Yanghe
Abstract
Phylogenetic inference, which reconstructs evolutionary trees from DNA or amino acid sequences, is crucial for understanding the evolutionary histories of species on Earth. Model selection is a fundamental step in this process, determining the best-fit model for the data. However, classic maximum likelihood-based methods for model selection are computationally intensive. This study introduces a machine learning-based framework for amino acid model selection, consisting of three components: protFinder for selecting the best-fit substitution model, RHASFinder for identifying the appropriate rate heterogeneity model, and protFFinder for determining the use of empirical pre-estimated frequencies. Our framework is an order of magnitude faster than the widely used ModelFinder, while maintaining comparable accuracy.
Description
Deposited by the author 27.10.24
Keywords
phylogenetics, amino acid, model selection, rate heterogeneity, neural network
Type
Thesis (Masters)