
Software / Algorithms / Databases


The GEneSTATION database integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and accelerate the translation of discoveries from model organisms to humans. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g., gene age, population genetic and molecular evolutionary statistics), organismal (e.g., tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g., protein interactions), as well as links to many general and pregnancy disease-specific databases.

Kim, M., B. A. Cooper, R. Venkat, J. B. Phillips, H. R. Eidem, J. Hirbo, S. Nutakki, S. M. Williams, L. J. Muglia, J. A. Capra, K. Petren, P. Abbot, A. Rokas, & K. L. McGary (2016). GEneSTATION 1.0: a synthetic resource of diverse evolutionary and functional genomic data for studying the evolution of pregnancy-associated tissues and phenotypes. Nucleic Acids Res. 44 (Database issue): D908-D916. Article


iWGS: in silico Whole Genome Sequencer and Analyzer

iWGS (in silico Whole Genome Sequencer and Analyzer) is an automated pipeline that guides the choice of sequencing strategy and assembly protocol. It integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control, de novo assembly, and assembly evaluation and validation. The last three steps can also be applied to real data. iWGS gives users considerable flexibility in testing the range of experimental designs available for genome sequencing projects, and it supports all major sequencing technologies and popular assembly tools.

Zhou, X., D. Peris, C. T. Hittinger, & A. Rokas (2015). in silico Whole Genome Sequencer & Analyzer (iWGS): a computational pipeline to guide the design and analysis of de novo genome sequencing studies. G3: in press. Article

download iWGS

Internode Certainty and Related Measures

Internode Certainty and related measures use information theory to quantify the degree of conflict or incongruence among all nontrivial bipartitions present in a set of trees. They can be calculated from any type of data that contains nontrivial bipartitions, from bootstrap replicate trees to gene trees to individual characters. Given a set of phylogenetic trees, the Internode Certainty of a given internode reflects the degree of incongruence specific to that internode. More recently, calculation of these measures has been extended to partial trees, that is, trees that contain only a subset of the taxa. All measures are implemented and freely available in the latest versions of the widely used program RAxML.
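As a rough illustration of the information-theoretic idea, the sketch below computes an Internode Certainty value from the support of the two most prevalent conflicting bipartitions at an internode. It assumes the two-bipartition form described in Salichos & Rokas (2013), IC = 1 + P(X1) log2 P(X1) + P(X2) log2 P(X2), where P(Xi) is each bipartition's share of their combined support. The function name `internode_certainty` is hypothetical, and this is not RAxML's implementation, which also handles sign conventions and the related measures.

```python
import math


def internode_certainty(freq_a: float, freq_b: float) -> float:
    """Sketch of Internode Certainty for one internode.

    freq_a, freq_b: support (e.g. number of trees) for the two most
    prevalent conflicting bipartitions. Assumes the two-bipartition
    form of the measure; hypothetical helper, not RAxML code.
    """
    total = freq_a + freq_b
    if freq_a < 0 or freq_b < 0 or total <= 0:
        raise ValueError("supports must be non-negative and not both zero")

    # Start from log2(2) = 1 and add p * log2(p) for each bipartition,
    # i.e. 1 minus the Shannon entropy of the two relative frequencies.
    ic = 1.0
    for freq in (freq_a, freq_b):
        p = freq / total
        if p > 0:  # the p * log2(p) term is taken as 0 when p == 0
            ic += p * math.log2(p)
    return ic


if __name__ == "__main__":
    print(internode_certainty(100, 0))   # no conflict
    print(internode_certainty(75, 25))   # moderate conflict
    print(internode_certainty(50, 50))   # maximal conflict
```

Under these assumptions the value is 1 when only one bipartition is supported and falls to 0 when the two conflicting bipartitions are equally supported.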

Kobert, K., L. Salichos, A. Rokas, & A. Stamatakis (2016). Computing the Internode Certainty and related measures from partial gene trees. Mol. Biol. Evol.: in press. Preprint on BioRxiv server

Salichos, L., A. Stamatakis, & A. Rokas (2014). Novel information theory-based measures for quantifying incongruence among phylogenetic trees. Mol. Biol. Evol. 31: 1261-1271. Article

Salichos, L., & A. Rokas (2013). Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 497: 327-331. Article

download RAxML