The evolution of the hemagglutinin amino acid sequences of Influenza A virus is studied by a method based on an informational metric, originally introduced by Rohlin for partitions in abstract probability spaces. This metric does not require any prior functional or syntactic knowledge about the sequences, and it is sensitive to correlated variations in the arrangement of the characters. Its efficiency is improved by algorithmic tools designed to enhance the detection of novelty and to reduce the noise from useless mutations. We focus on USA data from 1993/94 to 2010/11 for A/H3N2 and on USA data from 2006/07 to 2010/11 for A/H1N1. We show that clustering the distance matrix gives strong evidence of a structure of domains in the sequence space, acting as weak attractors for the evolution, in very good agreement with the epidemiological history of the virus. The structure proves very robust with respect to variations of the clustering parameters, and it remains highly coherent when the observation window is restricted. The results suggest an efficient strategy for vaccine forecasting, based on the presence of "precursors" (or "buds") populating the most recent attractor.
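To make the metric concrete, the following is a minimal sketch of the Rohlin distance between two partitions, d(α, β) = H(α|β) + H(β|α) = 2H(α∨β) − H(α) − H(β), where H is the Shannon entropy and α∨β is the common refinement. The way a sequence is turned into a partition here (maximal runs of identical consecutive characters, with uniform weight on positions) is an assumption made for illustration, as are the function names `partition_labels`, `entropy`, and `rohlin_distance`. The sketch does not include the noise-reduction tools mentioned above.

```python
from collections import Counter
from math import log


def partition_labels(seq):
    """Label each position by the run of equal consecutive characters it
    belongs to. This sequence-to-partition mapping is an assumption."""
    labels = []
    block = 0
    for i, ch in enumerate(seq):
        if i > 0 and ch != seq[i - 1]:
            block += 1  # a character change starts a new atom
        labels.append(block)
    return labels


def entropy(labels):
    """Shannon entropy (natural log) of a partition given as per-position
    labels, with uniform measure on positions."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def rohlin_distance(a, b):
    """Rohlin distance H(a|b) + H(b|a) = 2 H(a v b) - H(a) - H(b)."""
    if len(a) != len(b):
        raise ValueError("partitions must cover the same set of positions")
    joint = list(zip(a, b))  # atoms of the common refinement a v b
    return 2 * entropy(joint) - entropy(a) - entropy(b)


if __name__ == "__main__":
    # Toy strings, not real hemagglutinin data.
    s1, s2 = "AAKKLLMM", "AAKKKLMM"
    print(rohlin_distance(partition_labels(s1), partition_labels(s2)))
```

Under this assumed mapping, the distance depends only on where the character changes fall, not on which characters appear, so two sequences with identical run boundaries are at distance zero.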
|Ministerial classification:||Journal article|
|Appears in categories:||1.1 Journal article|