Compare and contrast parsimony, maximum likelihood, UPGMA and neighbor-joining methods.

Maximum parsimony: –

  • This is a character-based method.
  • The topology that requires the smallest number of substitutions is taken to be the best tree.
  • Insertions and deletions (gaps) can be accommodated, typically by treating a gap either as missing data or as an additional character state.
  • Given a set of aligned sequences, the method in principle evaluates every possible tree topology and selects the one that requires the fewest substitutions; this is called the most parsimonious tree. For more than a handful of sequences an exhaustive search is impractical, so heuristic searches are used.
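  • Only “informative sites” help to choose between trees: a site is parsimony-informative when at least two different character states each occur in at least two sequences.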
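
The counting step described above can be sketched in Python. This is a minimal illustration, not a full parsimony program: it assumes Fitch's algorithm for counting substitutions on a fixed tree, represents trees as nested tuples, and uses a made-up four-sequence alignment. The names fitch_score, tree_length and is_informative are hypothetical helpers introduced for this example.

```python
def fitch_score(tree, states):
    """Return (possible ancestral states, substitutions) for one site.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping leaf name -> character at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra substitution.
        return common, left_cost + right_cost
    # The children disagree: one substitution is needed at this node.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Total number of substitutions the tree requires over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_score(tree, site)[1]
    return total


def is_informative(column):
    """True if at least two states each occur in at least two sequences."""
    counts = {}
    for ch in column:
        counts[ch] = counts.get(ch, 0) + 1
    return sum(1 for c in counts.values() if c >= 2) >= 2


# Hypothetical toy alignment (illustration only).
alignment = {"A": "AAGCT", "B": "AAGTT", "C": "GCGCA", "D": "GCGTA"}

# The three possible unrooted topologies for four sequences.
topologies = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

# The most parsimonious tree is the one with the smallest length.
best = min(topologies, key=lambda t: tree_length(t, alignment))
for t in topologies:
    print(t, tree_length(t, alignment))  # lengths are 5, 7 and 8
print("most parsimonious:", best)
```

With only four sequences all three topologies can be scored exhaustively; the third column of the toy alignment is constant and therefore not informative.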

Maximum likelihood: –

  • This approach was first proposed for phylogenetics by Edwards and Cavalli-Sforza in 1964; Felsenstein developed a practical algorithm for DNA sequences in 1981.
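  • It is statistically well founded: it uses an explicit model of sequence evolution and compares trees by their likelihood.
  • The maximum likelihood tree is the one with the highest probability of producing the observed sequences under the chosen substitution model.
  • It is among the most computationally intensive methods, because branch lengths and model parameters must be optimized for every candidate topology.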
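
The idea of "the value that makes the observed sequences most probable" can be illustrated on the smallest possible case: two aligned sequences and a single parameter, the evolutionary distance between them. The sketch below assumes the Jukes-Cantor (JC69) substitution model and a simple grid search; the function names and the two sequences are hypothetical and chosen only for illustration. A real maximum likelihood program optimizes many branch lengths over many topologies.

```python
import math


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of distance d (substitutions per site) under JC69."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * e)  # one specific identical pair, e.g. (A, A)
    p_diff = 0.25 * (0.25 - 0.25 * e)  # one specific differing pair, e.g. (A, G)
    return (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)


def ml_distance(seq1, seq2, step=0.001, max_d=3.0):
    """Grid-search the distance that maximizes the JC69 likelihood."""
    n = len(seq1)
    k = sum(a != b for a, b in zip(seq1, seq2))
    # Start at `step`, not 0, so that log(p_diff) is always defined.
    grid = [step * i for i in range(1, int(max_d / step) + 1)]
    return max(grid, key=lambda d: jc69_log_likelihood(d, n, k))


# Hypothetical sequences differing at 2 of 10 sites.
seq1 = "ACGTACGTAC"
seq2 = "ACGTACGTTT"

d_hat = ml_distance(seq1, seq2)
# Closed-form JC69 estimate for comparison: d = -3/4 * ln(1 - 4p/3)
p = 2 / 10
d_closed_form = -0.75 * math.log(1 - 4 * p / 3)
print(d_hat, d_closed_form)
```

The grid search should land within one grid step of the closed-form estimate, which is a quick check that the likelihood function is written correctly.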

UPGMA and neighbor-joining are distance-based methods.


  • UPGMA stands for “Unweighted pair group method with arithmetic mean”.
  • It is a simple agglomerative (hierarchical) clustering method that always produces a rooted tree.
  • It assumes a molecular clock, that is, an equal rate of evolution in all lineages.
  • It relies on a pairwise distance matrix. By contrast, maximum parsimony and maximum likelihood work directly on the aligned characters and do not require a distance matrix.
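
A minimal Python sketch of the clustering procedure follows. It assumes a symmetric distance matrix given as a list of lists; the function name upgma, the nested-tuple tree format and the five-taxon matrix are choices made for this illustration only.

```python
def upgma(names, matrix):
    """Return a rooted tree as nested tuples (left, right, node_height)."""
    # clusters: id -> (subtree, number of leaves, height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # Distances between clusters, keyed by (smaller id, larger id).
    d = {(i, j): float(matrix[i][j])
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(d, key=d.get)  # closest pair of clusters
        tree_i, n_i, _ = clusters.pop(i)
        tree_j, n_j, _ = clusters.pop(j)
        # Molecular clock: the new node sits halfway between the two clusters.
        height = d[(i, j)] / 2.0
        # Distance to every remaining cluster: mean weighted by cluster size,
        # which equals the arithmetic mean over all pairs of leaves.
        new_d = {}
        for k in clusters:
            d_ik = d[(min(i, k), max(i, k))]
            d_jk = d[(min(j, k), max(j, k))]
            new_d[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new_d)
        clusters[next_id] = ((tree_i, tree_j, height), n_i + n_j, height)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical distance matrix for five taxa (illustration only).
names = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 17, 21, 31, 23],
    [17, 0, 30, 34, 21],
    [21, 30, 0, 28, 39],
    [31, 34, 28, 0, 43],
    [23, 21, 39, 43, 0],
]

tree = upgma(names, matrix)
print(tree)
# (('e', ('a', 'b', 8.5), 11.0), ('c', 'd', 14.0), 16.5)
```

Every leaf ends up at the same distance from the root (16.5 here), which is the molecular clock assumption made visible.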


Neighbor-joining method: –

  • This method was proposed by Saitou and Nei in 1987.
  • Like UPGMA, it relies on distance data and a distance matrix.
  • It produces an unrooted tree, which distinguishes it from UPGMA.
  • It does not assume a molecular clock and allows a different rate of evolution on each lineage.
  • The method works by identifying “neighbors”: pairs of OTUs (operational taxonomic units) that are connected through a single interior node.
  • At each step, the pair of OTUs chosen as neighbors is the one whose joining gives the smallest total branch length for the tree. The two neighbors are connected by a new node, which replaces them in the distance matrix and is treated as a single OTU from then on.
  • Distances from the new node to the remaining OTUs are recalculated, and the search for the next pair of neighbors is repeated. The next pair may or may not include the newly created node.
  • This process continues until all lineages are joined into a single unrooted tree.
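
The joining steps above can be sketched in Python. This is a minimal version under stated assumptions: it uses the commonly used Q criterion, Q(a, b) = (n - 2) d(a, b) - r(a) - r(b), where r is the row sum of the distance matrix, to pick each pair of neighbors. The function name neighbor_joining, the internal node labels (node0, node1, ...) and the five-taxon matrix are hypothetical and serve only as an illustration.

```python
def neighbor_joining(names, matrix):
    """Return the unrooted tree as a list of (node, node, branch_length) edges."""
    nodes = list(names)
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    edges = []
    new_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}  # row sums
        # Choose the pair with the smallest Q value as neighbors.
        best = None
        for x in range(n):
            for y in range(x + 1, n):
                a, b = nodes[x], nodes[y]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"node{new_id}"
        new_id += 1
        # Branch lengths may differ: no molecular clock is assumed.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(a, u, len_a), (b, u, len_b)]
        # The new node replaces a and b in the distance matrix.
        d[u] = {u: 0.0}
        for k in nodes:
            if k not in (a, b):
                d[u][k] = d[k][u] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))  # connect the last two nodes
    return edges


# Hypothetical additive distance matrix for five taxa (illustration only).
names = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

for edge in neighbor_joining(names, matrix):
    print(edge)
```

In this example the first pair joined is a and b, with unequal branch lengths of 2 and 3; UPGMA would have forced both branches to the same length.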