KNAW Repository

Cophenetic correlation analysis as a strategy to select phylogenetically informative proteins: an example from the fungal kingdom

Kuramae, E.E. and Robert, V. and Echavarri-Erasun, C. and Boekhout, T. (2007) Cophenetic correlation analysis as a strategy to select phylogenetically informative proteins: an example from the fungal kingdom. BMC Evolutionary Biology, 7, 134. ISSN 1471-2148.

PDF - Published Version
Available under License Creative Commons Attribution.

Official URL: http://dx.doi.org/10.1186/1471-2148-7-134

Abstract

Background

The construction of robust and well-resolved phylogenetic trees is important for our understanding of many, if not all, biological processes, including speciation and the origin of higher taxa, genome evolution, metabolic diversification, multicellularity, origin of lifestyles, pathogenicity and so on. Many older phylogenies were not well supported because the single gene or few genes used in the reconstructions carried insufficient phylogenetic signal. Importantly, single gene phylogenies were not always found to be congruent. The phylogenetic signal may, therefore, be increased by enlarging the number of genes included in phylogenetic studies. Unfortunately, concatenation of many genes does not take into consideration the evolutionary history of each individual gene. Here, we describe an approach to select informative phylogenetic proteins to be used in the Tree of Life (TOL) and barcoding projects by comparing the cophenetic correlation coefficients (CCC) among individual protein distance matrices, using the fungi as an example. The method demonstrated that the quality and number of concatenated proteins are important for a reliable estimation of TOL. Approximately 40–45 concatenated proteins seem to be needed to resolve fungal TOL.

Results

In total, 4852 orthologous proteins (KOGs) were assigned among 33 fungal genomes from the Asco- and Basidiomycota, and 70 of these represented single copy proteins. The individual protein distance matrices based on 531 concatenated proteins that have been used for phylogeny reconstruction before [14] were compared with one another in order to select those with the highest CCC, which was then used as a reference. This reference distance matrix was compared with those of the 70 single copy proteins s

Conclusion

This study provides candidate protein sequences to be considered as phylogenetic markers in different branches of fungal TOL. The selection procedure described here will be useful to select informative protein sequences to resolve branches of TOL that contain few or no species with completely sequenced genomes. The robust phylogenetic trees resulting from this method may contribute to our understanding of organismal diversification processes. The method proposed can be extended easily to other b
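
The comparison of protein distance matrices described in the abstract can be illustrated with a minimal sketch. It assumes that the correlation between two distance matrices is taken as the Pearson correlation of their corresponding pairwise distances, which is one common way to compute a cophenetic-style correlation; the abstract does not state the exact procedure used in the paper. The function names, protein labels and toy matrices below are hypothetical and serve only as an illustration.

```python
import math
from itertools import combinations


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square, symmetric distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def matrix_correlation(a, b):
    """Pearson correlation between corresponding pairwise distances of two matrices.

    Assumption: this stands in for the CCC comparison named in the abstract.
    """
    x, y = upper_triangle(a), upper_triangle(b)
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("matrices must cover the same taxa (at least 3)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("distances must not be constant")
    return cov / math.sqrt(var_x * var_y)


def rank_proteins(reference, candidates):
    """Sort candidate proteins by correlation with the reference matrix, best first."""
    scores = {name: matrix_correlation(reference, m) for name, m in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data for four taxa (hypothetical values, not taken from the paper).
reference = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
candidates = {
    # Same distances as the reference, uniformly scaled by 2.
    "protein_a": [[2 * d for d in row] for row in reference],
    # Distances that disagree with the reference.
    "protein_b": [
        [0, 6, 2, 1],
        [6, 0, 4, 1],
        [2, 4, 0, 5],
        [1, 1, 5, 0],
    ],
}

ranking = rank_proteins(reference, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch, proteins whose distance matrices correlate most strongly with the reference would be the candidates for concatenation; any cut-off value for selecting them would be a further assumption.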

Item Type: Article
Institutes: Nederlands Instituut voor Ecologie (NIOO); Centraalbureau voor Schimmelcultures (CBS)
ID Code: 4667
Deposited On: 16 Sep 2009 02:00
Last Modified: 31 Mar 2014 10:27
