Genome Sequencing Of SARS-CoV-2 Variants

In News

The Government of India (GoI) has notified that 5% of COVID-positive cases are to be genome sequenced and that a record of the results is to be maintained.

About

  • The work began in January with the sequencing of samples from people who had a history of travel from the United Kingdom, along with a proportion of positive samples from the community.
  • The institutes involved in the Indian SARS-CoV-2 Genomics Consortium (INSACOG) have expertise in genome sequencing.
  • A new variant, which was found in the UK, is defined by multiple mutations in the Spike region, as well as mutations in other genomic regions. 
  • As per the Department of Biotechnology (DBT), such mutations are rapidly increasing the number of variants of the virus, and the new variants are more transmissible than previous ones.

Findings of DBT

  • The DBT has identified “foreign” variants, namely B.1.1.7 (first identified in the United Kingdom) and B.1.351 (first found in South Africa), along with a small number of P.2 variants (from Brazil).
  • Various labs have verified the presence in India of the ‘double mutant’ variant B.1.617, so called primarily because of two mutations, E484Q and L452R, on the spike protein.
  • B.1.617 was marked as an international ‘variant of concern’ but there is no evidence yet to show that the variant is associated with increased disease severity. 
  • INSACOG labs also found that the B.1.1.7 variant, marked by increased infectivity, is distinctly more prevalent in several northern and central Indian States.

Genome Sequencing

  • A genome is the complete set of genetic instructions present in an organism's DNA. Sequencing means determining the order in which the four nucleotide bases, i.e., adenine (A), cytosine (C), guanine (G), and thymine (T), occur in it.
  • The human genome is made up of over 3 billion of these genetic letters. The whole genome can’t be sequenced all at once because available methods of DNA sequencing can only handle short stretches of DNA at a time.
  • While human genomes are made of DNA (deoxyribonucleic acid), a virus genome can be made of either DNA or RNA (ribonucleic acid). The coronavirus genome is made of RNA. Every organism has a unique genome sequence.
  • Genome sequencing is a technique that reads and interprets genetic information found within DNA or RNA.

Approaches for Genome Sequencing

  • There are two approaches to the task of cutting up the genome and putting it back together again. 
  • The “clone-by-clone” approach involves first breaking the genome up into relatively large chunks, called clones, about 150,000 base pairs (bp) long. Scientists use genome mapping techniques to figure out where in the genome each clone belongs. 
  • Next, they cut each clone into smaller, overlapping pieces of the right size for sequencing, about 500 bp each. Finally, they sequence the pieces and use the overlaps to reconstruct the sequence of the whole clone.
  • The “whole-genome shotgun” method involves breaking the genome up into small pieces, sequencing the pieces, and reassembling the pieces into the full genome sequence.
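The idea of using overlaps to put sequenced pieces back together can be illustrated with a toy example. The sketch below is only a simplified illustration, not how production assemblers work: it assumes error-free reads from a single strand, and the names `overlap` and `greedy_assemble` as well as the sample reads are made up for this example.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the two reads with the largest overlap (toy assembler)."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; stop merging
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return max(reads, key=len)  # longest assembled piece


# Three overlapping pieces of a made-up 14-base sequence, in shuffled order.
pieces = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(pieces))  # ATGCGTACGTTAGC
```

Real genomes contain repeats and sequencing errors, and reads come from both strands, which is why actual assembly relies on mapping information (clone-by-clone) or far more sophisticated algorithms (whole-genome shotgun).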

Image Courtesy: Thinglinks

Significance of Genome Sequencing

  • Understanding the Virus: The purpose of genome sequencing is to understand the role of certain mutations in increasing the virus’s infectivity. Some mutations explain immune escape, i.e., the virus’s ability to evade antibodies, which has consequences for vaccines.
  • Studying Efficacy: It helps in studying whether the vaccines developed so far are effective against such mutant strains of the virus and whether they can prevent reinfection and transmission.
  • Tracing Mutations: Sequencing of the genomes of viral strains is important from a “know-thy-enemy” point of view as it becomes easier to trace the mutations. Scientists can find mutations much more easily and quickly.
  • Developing Vaccines: Knowledge generated through this research assists in developing diagnostics, potential therapeutics and vaccines, both now and for potential diseases in the future.
  • Vital Information: Important information and findings can be derived from the genome sequencing of samples from those who tested positive for COVID.

Challenges in Genome Sequencing in India

  • Very High Target: The aim was to sequence at least 5% of the samples, the minimum required to keep track of the virus variants. So far only around 1% has been sequenced, primarily because the reagents and tools needed to scale up the process are in short supply.
  • Low Capacity: The ten laboratories together have a capacity to sequence about 30,000 samples a month, or 1,000 a day, which is about one-sixth of what is needed to meet the target.
  • Fund Crunch: Funding has been delayed repeatedly. INSACOG asked for Rs 100 crore, but no funding arrived until March, when it received Rs 70 crore.
  • Sample Collection: The healthcare system is already overstretched, and sequencing adds one more task: samples and RNA preparations must be regularly sorted, packaged and shipped in a cold chain to sequencing centres, along with extensive metadata that makes the sequence information useful.
  • Dependence on Imports: The process of genome sequencing slowed down due to the Atma Nirbhar scheme, which banned imports of goods worth less than Rs 200 crore to promote local procurement. Even after an exemption was granted, some special plastics inadvertently remained within the import ban, affecting the process.
  • International aspect: The poor progress in genome sequencing also affects India’s image abroad, as all countries are required to upload data into a common global repository, called the ‘Global Initiative on Sharing all Influenza data’, or GISAID.
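The capacity figures in the ‘Low Capacity’ point above can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes a 30-day month and reads the "six times" gap literally, and the function name `sequencing_gap` is invented for this example.

```python
def sequencing_gap(monthly_capacity: int, shortfall_factor: int,
                   days_per_month: int = 30) -> dict[str, float]:
    """Derive daily capacity and the implied requirement from the stated figures."""
    required_monthly = monthly_capacity * shortfall_factor
    return {
        "daily_capacity": monthly_capacity / days_per_month,
        "required_monthly": required_monthly,
        "required_daily": required_monthly / days_per_month,
        "monthly_shortfall": required_monthly - monthly_capacity,
    }


# Figures from the text: 30,000 samples a month, one-sixth of what is needed.
gap = sequencing_gap(monthly_capacity=30_000, shortfall_factor=6)
for name, value in gap.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions the labs can handle 1,000 samples a day, against an implied need of about 6,000 a day (180,000 a month).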

Way Forward

  • The number of laboratories should be increased so that the research can proceed at the required speed. The Union Health Minister has announced the opening of 17 more laboratories for this purpose.
  • The data collected from genome sequencing of the virus will further aid in studying linkages between the variants and epidemiological waves (super-spreader events, outbreaks) of the virus.

Comparison of DNA and RNA

Both DNA and RNA are used to store genetic information, but there are clear differences between them. The following table summarizes the key points:

| Comparison | DNA | RNA |
| --- | --- | --- |
| Function | Long-term storage of genetic information; transmission of genetic information to make other cells and new organisms. | Used to transfer the genetic code from the nucleus to the ribosomes to make proteins. RNA is used to transmit genetic information in some organisms. |
| Structural Features | B-form double helix. DNA is a double-stranded molecule consisting of a long chain of nucleotides. | A-form helix. RNA usually is a single-strand helix consisting of shorter chains of nucleotides. |
| Composition of Bases and Sugars | Deoxyribose sugar; phosphate backbone; adenine, guanine, cytosine, thymine bases. | Ribose sugar; phosphate backbone; adenine, guanine, cytosine, uracil bases. |
| Propagation | DNA is self-replicating. | RNA is synthesized from DNA on an as-needed basis. |
| Base Pairing | AT (adenine-thymine); GC (guanine-cytosine). | AU (adenine-uracil); GC (guanine-cytosine). |
| Ultraviolet Damage | DNA is susceptible to UV damage. | Compared with DNA, RNA is relatively resistant to UV damage. |
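The base-pairing rules in the comparison can be expressed in a few lines of code. This is only an illustrative sketch: the names `DNA_PAIRS`, `RNA_PAIRS`, `complement` and `transcribe` are invented here, and the sequences are made-up examples rather than real viral data.

```python
# Pairing partners for each base: A-T and G-C in DNA, A-U and G-C in RNA.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(seq: str, pairs: dict[str, str]) -> str:
    """Return the base-by-base pairing partner of `seq` (same left-to-right order)."""
    return "".join(pairs[base] for base in seq.upper())


def transcribe(dna_coding_strand: str) -> str:
    """Write a DNA coding-strand sequence in RNA letters (thymine becomes uracil)."""
    return dna_coding_strand.upper().replace("T", "U")


print(complement("ATGC", DNA_PAIRS))  # TACG
print(complement("AUGC", RNA_PAIRS))  # UACG
print(transcribe("ATGC"))             # AUGC
```

A base outside the chosen alphabet (for example a T passed with `RNA_PAIRS`) raises a `KeyError`, which is a simple way to catch a DNA sequence being treated as RNA or vice versa.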

Indian SARS-CoV-2 Genomics Consortium (INSACOG)

  • It was established by the Ministry of Health and Family Welfare (MoHFW) in December 2020.
  • INSACOG is a consortium of 10 labs across the country tasked with scanning COVID-19 samples from swathes of patients and flagging the presence of variants that were known to have spiked transmission internationally.
  • It has also been tasked with checking whether certain combinations of mutations were becoming more widespread in India. 
  • Aim: To monitor the genomic variations in the SARS-CoV-2 on a regular basis through a multi-laboratory network.
    • Assist in developing potential vaccines in the future.
    • The NCDC will maintain a database of all samples of the new variants of public health significance. 
    • The data will be epidemiologically analysed, interpreted and shared with state/district for investigation, contact tracing and planning response strategies.

(Image Courtesy: http://dbtindia.gov.in/insacog )

Triple Mutant

  • The terms double or triple mutants are colloquial. 
  • ‘Double’ or ‘triple’ signifies the number of mutations in a variant that are relevant to immune escape.
  • B.1.617, initially termed a double mutant, has three new spike protein mutations, namely E484Q, L452R and P681R, on the background of the D614G lineage, which has been the dominant lineage since last year.
  • Technically, therefore, ‘double mutant’ and ‘triple mutant’ refer to the same variant.
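Mutation labels such as E484Q follow a compact convention: the one-letter code of the original amino acid, its position in the protein, and the one-letter code of the replacement. The sketch below shows one way to unpack such labels; the names `Substitution` and `parse_substitution` are invented for this illustration, and the lookup table covers only the amino acids that appear in the labels mentioned above.

```python
import re
from typing import NamedTuple

# Standard one-letter amino acid codes, limited to those used in this article.
AMINO_ACIDS = {
    "D": "aspartic acid", "E": "glutamic acid", "G": "glycine",
    "L": "leucine", "P": "proline", "Q": "glutamine", "R": "arginine",
}


class Substitution(NamedTuple):
    original: str     # amino acid in the reference sequence
    position: int     # position within the protein
    replacement: str  # amino acid found in the variant


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'E484Q' into (original, position, replacement)."""
    match = _LABEL.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


for label in ["E484Q", "L452R", "P681R", "D614G"]:
    sub = parse_substitution(label)
    print(f"{label}: {AMINO_ACIDS[sub.original]} at position {sub.position} "
          f"replaced by {AMINO_ACIDS[sub.replacement]}")
```

Read this way, E484Q means that the glutamic acid at position 484 of the spike protein has been replaced by glutamine.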

Global Initiative on Sharing all Influenza data (GISAID)

  • It was launched on the occasion of the Sixty-first World Health Assembly in May 2008.
  • It is a public platform for sharing genome sequences by countries.
  • In 2010, the Federal Republic of Germany became the official host of the GISAID platform.
  • It promotes the rapid sharing of data from all influenza viruses and the coronavirus causing COVID-19.
  • In 2013, the European Commission recognized GISAID as a research organization and partner in the PREDEMICS consortium, a project on the Preparedness, Prediction and the Prevention of Emerging Zoonotic Viruses with Pandemic Potential using multidisciplinary approaches.

Source: TH