The polymerase (P) and surface (S) genes of hepatitis B virus (HBV) form the longest gene overlap in animal viruses. Gene overlaps originate by overprinting, in which a novel frame is superimposed on a pre-existing ancestral frame. Identifying which frame is ancestral and which is de novo (the genealogy of the overlap) is an appealing topic. However, the P/S overlap of HBV poses an intriguing paradox, because both genes are indispensable for virus survival. Thus, the hypothesis of a primordial virus lacking either the surface protein or the polymerase makes no biological sense. To determine the genealogy of the overlap, I compared the codon usage of the overlapping frames P and S with that of the non-overlapping region. I found that the overlap of human HBV had two patterns of codon usage: one localized in the 5' one-third of the overlap, and the other in the 3' two-thirds. By extending the analysis to non-human HBVs, I found that this feature occurred in all hepadnaviruses. Under the assumption that the ancestral frame has a codon usage significantly closer to that of the non-overlapping region than the de novo frame does, I could predict the ancestral frames in the 5' and 3' regions of the overlap. They were, respectively, frame S and frame P. These results suggest that the spacer domain of the polymerase and the S domain of the surface protein originated de novo by overprinting. They support a modular-evolution hypothesis for the origin of the overlap.
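The comparison described above can be illustrated with a minimal sketch. It assumes a plain Euclidean distance between codon-frequency vectors as the measure of closeness; the abstract does not state which statistic or significance test was used, so the distance measure, the function names, and the toy sequences here are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 64 codons over the DNA alphabet.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(seq, frame=0):
    """Relative frequency of each codon when `seq` is read from offset `frame`."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Ignore codons containing ambiguous bases such as N.
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CODONS}


def usage_distance(freq_a, freq_b):
    """Euclidean distance between two codon-frequency vectors (an assumed metric)."""
    return sqrt(sum((freq_a[c] - freq_b[c]) ** 2 for c in CODONS))


def predict_ancestral(overlap_seq, reference_seq, frames):
    """Return the frame whose codon usage is closest to the reference region.

    `frames` maps a frame label to its reading offset (0, 1 or 2) within
    `overlap_seq`; `reference_seq` stands for the non-overlapping region,
    read in frame 0. No significance testing is performed here.
    """
    ref = codon_frequencies(reference_seq, 0)
    distances = {
        name: usage_distance(codon_frequencies(overlap_seq, offset), ref)
        for name, offset in frames.items()
    }
    return min(distances, key=distances.get), distances


if __name__ == "__main__":
    # Toy sequences, not real HBV data.
    reference = "GCTGCTGCTGCT"
    overlap = "AGCTGCTGCTGC"
    best, dists = predict_ancestral(overlap, reference, {"A": 0, "B": 1})
    print(best, dists)
```

In practice the 5' and 3' portions of the overlap would be passed separately as `overlap_seq`, since the abstract reports a different codon-usage pattern in each, and the distance comparison would need to be paired with a test of statistical significance.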
|Appears in categories:||1.1 Journal article|