Software I'm Developing:

The Bio++ libraries

Bio++: A C++ library for sequence analysis, phylogenetics, molecular evolution and population genetics: see the Bio++ Website. This project, initially started in the GPIA lab in Montpellier, has grown and now involves several developers from different labs, making it a large collaborative effort.

The goal of Bio++ is to provide "bricks" for building efficient software for molecular evolution (including phylogenetics, population genetics, genomics, etc.). We therefore work on both the design of the code (object orientation, ontology) and its implementation (efficient algorithms and data structures).

Most of the following programs use these libraries, starting with the Bio++ Program Suite, which contains several utility programs, notably for working with maximum likelihood methods in phylogenetics (see BppSuite's website on GNA!).


PhySamp is a package dedicated to phylogenetic sampling, that is, the filtering of sequence alignments and their corresponding phylogenetic trees. So far it contains one program, bppAlnOptim, which optimizes the size of an alignment while minimizing the amount of missing data.


MafFilter is a program for designing advanced filtering and processing pipelines for genome alignments.


CoMap is a package dedicated to coevolution analysis by Cosubstitution Mapping. It contains the comap program, which implements the coevolution detection methods I published, and the MICA program (Mutual Information Coevolution Analysis), which implements several MI-based methods from the literature (see my review on coevolution methods in Briefings in Bioinformatics, 2012). CoMap needs the Bio++ libraries, and can be found here. Executables and packages for Linux distributions are available on the Bio++ forge, together with the Bio++ libraries.
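To illustrate the quantity that MI-based methods build on, here is a minimal sketch of the plain mutual information between two alignment columns. It is not the MICA or CoMap implementation and does not use the Bio++ API: the function name, the toy alignment and the choice to treat every character (gaps included) as an ordinary state are assumptions made for this example, and the corrections applied by published methods are left out.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j are equal-length strings giving the character observed
    at each of the two sites, sequence by sequence, in the same order.
    (Hypothetical helper written for this example only.)
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_i = Counter(col_i)                # marginal counts at site i
    count_j = Counter(col_j)                # marginal counts at site j
    count_ij = Counter(zip(col_i, col_j))   # joint counts of observed pairs

    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), expressed with raw counts
        mi += (n_ab / n) * log2(n_ab * n / (count_i[a] * count_j[b]))
    return mi


# Toy alignment (made up): sites 0 and 1 vary together, site 2 is constant,
# and site 3 varies independently of site 0.
alignment = ["ACGA", "ACGT", "TGGA", "TGGT"]
columns = ["".join(seq[k] for seq in alignment) for k in range(4)]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(0, k, mutual_information(columns[0], columns[k]))
```

On this toy input only the pair of sites 0 and 1 receives a non-zero score. Raw MI ignores the phylogeny relating the sequences, which is one reason the methods in the literature add corrections on top of it.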


ConTest is another program that uses substitution mapping procedures, here to detect positions in a molecule that are under biochemical constraint (see Dutheil 2007). See ConTest's website on GNA! for more information and source code. Packages and executables are available on the Bio++ forge.


The CoalHMM software implements the CoalHMM model described in Hobolth et al 2007, Dutheil et al 2009, and Mailund et al 2011. There is no official stable release yet, but the code compiles against the latest stable Bio++ release. See CoalHMM's website on GNA!


The TestNH package contains programs for testing and fitting non-homogeneous models of sequence evolution. Its website can be found here. The testnh program is described in Dutheil and Boussau (2008), the mapnh program is described in Romiguier et al (2012), and the partnh program in Dutheil et al (2012).


These pieces of software are no longer maintained, because better software has been developed since. They are, however, open source and still distributed in case someone finds them useful:

  • Baobab: a phylogenetic tree editor written in Java. See this link. The Bio++ project contains a tree editor which, while still at an early version, tends to offer a much better user experience.
  • B3: Bibliography Base for Biologists. See B3's Project description page. This project was started a long time ago, before software like JabRef, Zotero or Mendeley existed. There is no longer any need for it now that those programs are available. B3, however, was in my opinion a good approach, as it used only standard formats for data storage and formatting, namely XML and XSLT. I still think this was a good strategy...