At a glance
On Tuesdays, CDC provides estimates of SARS-CoV-2 variant proportions for 4-week periods. These proportions are calculated in two ways: empiric estimates (based on observed genomic data) and Nowcast estimates (model-based projections).
Monitoring variant proportions
SARS-CoV-2, the virus that causes COVID-19, constantly accumulates mutations in its genetic sequence. These mutations can result in new variants with different traits. New variants of SARS-CoV-2 are expected to continue to emerge. Some variants will emerge and disappear, while others will emerge, continue to spread, and may replace previous variants.
It is important to track the emergence of variants so that those of interest can be further characterized in the lab to determine if they evade immunity and how they spread.
SARS-CoV-2 is classified using its genetic sequence. To identify and track SARS-CoV-2 variants, CDC uses genomic surveillance. The National SARS-CoV-2 Strain Surveillance program receives SARS-CoV-2 specimens for genetic sequencing at CDC. These are combined with surveillance sequences contributed by other labs and entered into public databases. SARS-CoV-2 genetic sequences are analyzed and classified based on how they are related to one another and presented as proportions of variants detected.
Scientists use viruses' genetic sequences (genomic data), combined with observable characteristics (phenotypic data), to determine whether COVID-19 tests, treatments, and vaccines that are authorized or approved for use in the United States will work against emerging variants.
Types of variant proportion data
Empiric Estimates
Empiric estimates are variant proportions that are based on observed genomic data. These estimates are not available for the most recent periods because of the time it takes to generate the sequencing data, including sample collection, specimen treatment, shipping, analysis, and uploading into public databases.
Lineages with estimates that are less than 1% of circulating variants are combined with their parent lineage. When the observed proportion estimate of a lineage crosses the 1% threshold, and that lineage has substitutions in the spike protein that could affect vaccine efficacy, transmission, or severity, it may be separated from its parent lineage and displayed on its own in the variant proportions data.
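The roll-up of low-proportion lineages into their parents can be illustrated with a minimal Python sketch. This is not CDC's implementation: the lineage names, the `parent_of` mapping, and the `roll_up` function are hypothetical, and the sketch applies only the 1% threshold, ignoring the spike-substitution criterion.

```python
def ancestors(lineage, parent_of):
    """Return the chain of ancestors of `lineage`, nearest first."""
    chain = []
    while lineage in parent_of:
        lineage = parent_of[lineage]
        chain.append(lineage)
    return chain


def roll_up(proportions, parent_of, threshold=0.01):
    """Fold every lineage below `threshold` into its parent lineage.

    `proportions` maps lineage name -> estimated share (0 to 1).
    `parent_of` maps lineage name -> parent name (assumed acyclic).
    """
    # Include ancestors that have no estimate of their own.
    names = set(proportions)
    for lineage in proportions:
        names.update(ancestors(lineage, parent_of))
    rolled = {name: proportions.get(name, 0.0) for name in names}

    # Process the deepest lineages first so that a child's share reaches
    # its parent before the parent itself is compared to the threshold.
    deepest_first = sorted(
        names, key=lambda n: len(ancestors(n, parent_of)), reverse=True
    )
    for lineage in deepest_first:
        parent = parent_of.get(lineage)
        if parent is not None and rolled[lineage] < threshold:
            rolled[parent] += rolled.pop(lineage)

    return {name: share for name, share in rolled.items() if share > 0}


# Hypothetical lineage names and shares, for illustration only.
example_parent_of = {"X.1": "X", "X.1.1": "X.1", "X.2": "X"}
example_proportions = {
    "X": 0.50, "X.1": 0.30, "X.1.1": 0.004, "X.2": 0.006, "Y": 0.19,
}
rolled = roll_up(example_proportions, example_parent_of)
```

In this example, `X.1.1` and `X.2` fall below 1% and are folded into `X.1` and `X` respectively, so only `X`, `X.1`, and `Y` remain and the shares still sum to 1.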
Nowcast Estimates
Nowcast estimates are model-based projections of variant proportions. They are provided for the most recent period when the "Nowcast on" option is selected below. CDC uses Nowcast to forecast variant proportions before the weighted empiric estimates are available for a given period.
Projections for an emerging lineage with a high growth rate may have a higher degree of uncertainty (wider prediction interval) when it is just beginning to spread and still has low weighted estimates. Projections may also be uncertain during times of delayed reporting (e.g., around holidays). CDC frequently evaluates Nowcast to improve performance.
Using CDC’s Variant Proportions Dashboard
Instructions: Data in the chart and table show the estimated variant proportions for the most common variants in the specific timeframe. To see the proportions and confidence intervals/prediction intervals for all the common variants in the specific timeframe, hover over a bar (timeframe) in the chart.
Nowcasting: The default setting for the chart is to display CDC's Nowcast estimates. Estimates of variant proportions for more recent periods may change as more data are reported.
Empiric Proportions: To provide more representative estimates of circulating variant proportions, CDC uses a subset of the sequence data available. For example, sequences generated from outbreak investigations (such as in a long-term care facility) or from airport surveillance may not represent what is circulating in those communities or nationally. To be included in CDC's analysis, sequences must either be generated at CDC from SARS-CoV-2 specimens submitted for surveillance, or be submitted to a public database (such as GenBank) and tagged as baseline surveillance by a state, local, academic, healthcare, or commercial laboratory submitter.
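The inclusion rule described above can be sketched in Python as a filter followed by a simple count. The record fields and the tag string below are assumptions made for illustration and do not reflect the actual schema of CDC's data or of any public database. The proportions computed here are also unweighted, whereas CDC's empiric estimates are weighted.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class SequenceRecord:
    # All field names are hypothetical.
    accession: str
    lineage: str
    generated_at_cdc: bool
    submitted_for_surveillance: bool
    public_db_tag: Optional[str]  # e.g. "baseline surveillance"


def is_eligible(rec: SequenceRecord) -> bool:
    """Apply the two inclusion routes described in the text."""
    from_cdc = rec.generated_at_cdc and rec.submitted_for_surveillance
    tagged_baseline = rec.public_db_tag == "baseline surveillance"
    return from_cdc or tagged_baseline


def unweighted_proportions(records):
    """Share of each lineage among eligible records (no survey weights)."""
    counts = Counter(rec.lineage for rec in records if is_eligible(rec))
    total = sum(counts.values())
    return {lineage: n / total for lineage, n in counts.items()} if total else {}


# Made-up records, for illustration only.
records = [
    SequenceRecord("A1", "X", True, True, None),
    SequenceRecord("A2", "X", False, False, "baseline surveillance"),
    SequenceRecord("A3", "Y", False, False, "outbreak investigation"),
    SequenceRecord("A4", "Y", False, False, "baseline surveillance"),
    SequenceRecord("A5", "X", True, False, None),
    SequenceRecord("A6", "X", False, False, "baseline surveillance"),
]
proportions = unweighted_proportions(records)
```

Here records `A3` (an outbreak investigation) and `A5` (generated at CDC but not submitted for surveillance) are excluded, leaving three `X` sequences and one `Y` sequence.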
Pango Lineages
The Pango nomenclature is used by researchers and public health agencies worldwide to track the transmission and spread of SARS-CoV-2, including variants of concern.
The diagram below shows how the Pango lineages are related to each other.
Find a full list of the current Pango lineages.
CDC monitors SARS-CoV-2 viruses from every lineage, but the lineages that are below 1% prevalence or do not have critical differences in the spike protein are included with their parent lineage on this diagram.
Published SARS-CoV-2 sequences
CDC collaborates with state and local public health laboratories and partners, such as the Association of Public Health Laboratories, to increase the number of specimens that are sequenced as part of the National SARS-CoV-2 Strain Surveillance (NS3) program. These collaborative sequencing efforts provide a better understanding of the SARS-CoV-2 variants in the US.