Whole Genome Sequencing (WGS) vs. Whole Exome Sequencing (WES)

“Should I choose whole genome sequencing (WGS) or whole exome sequencing (WES) for my project?” On data quality alone, WGS is the clear winner: it allows you to interrogate single-nucleotide variants (SNVs), indels, structural variants (SVs) and copy number variants (CNVs) in both the ~1% of the genome that encodes protein sequences and the remaining ~99% of non-coding sequence. WES still costs a lot less than WGS, however, allowing researchers to increase sample number, an important factor for large population studies. WES does have its limitations, though.

Advantages of Whole Genome Sequencing

  • Allows examination of SNVs, indels, SVs and CNVs in coding and non-coding regions of the genome. WES omits regulatory regions such as promoters and enhancers.
  • WGS has more reliable sequence coverage. Differences in the hybridization efficiency of WES capture probes can result in regions of the genome with little or no coverage.
  • Coverage uniformity with WGS is superior to WES. Regions of the genome with low sequence complexity restrict the ability to design useful WES capture baits, resulting in off-target capture effects.
  • PCR amplification isn’t required during library preparation, reducing the potential for GC bias. WES frequently requires PCR amplification, as the bulk input amount needed for capture is generally ~1 µg of DNA.
  • Sequencing read length isn’t a limitation with WGS. Most target probes for exome-seq are designed to be less than 120 nt long, making it meaningless to sequence using a greater read length.
  • A lower average read depth is required to achieve the same breadth of coverage as WES.
  • WGS doesn’t suffer from reference bias. WES capture probes tend to preferentially enrich reference alleles at heterozygous sites, producing false negative SNV calls.
  • WGS is more universal. If you’re sequencing a species other than human, your choices for exome sequencing are pretty limited.
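
The reference-bias point above can be checked on your own data by looking at allele balance at heterozygous sites, where reference and alternate reads are expected in roughly equal numbers. The following is a minimal Python sketch, not a production tool: the function names and the read counts are hypothetical, and the test assumes a simple 50/50 binomial model with no correction for mapping bias or multiple testing.

```python
from math import comb

def ref_allele_fraction(sites):
    """Pooled fraction of reads supporting the reference allele.

    `sites` is a list of (ref_reads, alt_reads) tuples, one per
    heterozygous site.
    """
    ref_total = sum(ref for ref, _ in sites)
    alt_total = sum(alt for _, alt in sites)
    return ref_total / (ref_total + alt_total)

def het_balance_p_value(ref_reads, alt_reads):
    """Exact two-sided binomial p-value against a 50/50 allele ratio.

    Sums the probability of every outcome that is no more likely than
    the observed one. Integer arithmetic keeps it exact for p = 0.5.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads)
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2**n

# Hypothetical (ref, alt) read counts at four heterozygous sites.
example_sites = [(34, 26), (31, 29), (38, 22), (33, 27)]

print(f"pooled reference fraction: {ref_allele_fraction(example_sites):.3f}")
for ref, alt in example_sites:
    print(ref, alt, f"p = {het_balance_p_value(ref, alt):.3f}")
```

A pooled reference fraction that sits consistently above 0.5 across many heterozygous sites would be in line with the capture bias described above, whereas a single skewed site on its own says little.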

Advantages of Whole Exome Sequencing

  • WES is targeted to protein-coding regions, so reads represent less than 2% of the genome. This reduces the cost to sequence a targeted region at a high depth and reduces storage and analysis costs.
  • Reduced costs make it feasible to increase the number of samples to be sequenced, enabling large population-based comparisons.

Most functional, disease-related variants can be detected at a depth of 100-120X (1), which makes the cost case for exome sequencing. Today on Genohub, if you want to perform whole human genome sequencing at a depth of ~35X, the cost is roughly $1700/sample. If you were to request human exome-sequencing services with 100X coverage, using a 62 Mb target region, your cost would be $550/sample. Both of these prices include library preparation. So in terms of producing data, WES is still significantly cheaper than WGS. It’s important to note that these prices don’t include your data storage and analysis costs, which can also be quite a bit higher with whole genome sequencing.
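
To make the arithmetic behind these figures explicit, here is a small Python sketch that estimates the sequence yield each design implies and how many samples fit in a fixed budget. The prices, depths and 62 Mb target come from the paragraph above. The ~3,100 Mb genome size and the $50,000 budget are assumptions added for illustration, and the yield estimate ignores on-target rate, duplicates and unmapped reads.

```python
# Figures quoted in the text.
WGS_PRICE_USD = 1700      # per sample, ~35X, library prep included
WES_PRICE_USD = 550       # per sample, 100X, library prep included
WGS_DEPTH = 35
WES_DEPTH = 100
EXOME_TARGET_MB = 62

# Assumptions for illustration only (not from the text).
HUMAN_GENOME_MB = 3100    # approximate human genome size
BUDGET_USD = 50_000       # hypothetical project budget

def raw_yield_gb(target_mb, depth):
    """Approximate sequence needed (in Gb) to reach `depth` over `target_mb`."""
    return target_mb * depth / 1000

def samples_within_budget(budget_usd, price_per_sample_usd):
    """Whole number of samples that fit in the budget."""
    return budget_usd // price_per_sample_usd

wgs_gb = raw_yield_gb(HUMAN_GENOME_MB, WGS_DEPTH)
wes_gb = raw_yield_gb(EXOME_TARGET_MB, WES_DEPTH)

print(f"WGS: ~{wgs_gb:.1f} Gb per sample, "
      f"{samples_within_budget(BUDGET_USD, WGS_PRICE_USD)} samples in budget")
print(f"WES: ~{wes_gb:.1f} Gb per sample, "
      f"{samples_within_budget(BUDGET_USD, WES_PRICE_USD)} samples in budget")
print(f"WES / WGS price ratio: {WES_PRICE_USD / WGS_PRICE_USD:.2f}")
```

Under these assumptions, the same budget covers roughly three times as many exomes as genomes, which is the sample-size argument made above. The much smaller per-sample yield for WES is also why its storage and analysis costs tend to be lower.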

It’s also important to remember that depth isn’t everything. The better your uniformity of reads and breadth of coverage, the higher the likelihood you’ll actually find de novo mutations and call them. And that’s the main goal: if you can’t call SNPs or indels with high sensitivity and accuracy, then even the highest-depth sequencing runs are worthless.
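
A toy example shows why mean depth alone can mislead. The Python sketch below compares two made-up per-base depth profiles with the same mean depth. The function names, the 20X calling threshold and the "at least 0.2 x mean" uniformity cutoff are illustrative choices, not values taken from this article.

```python
def mean_depth(depths):
    """Average per-base depth."""
    return sum(depths) / len(depths)

def breadth_of_coverage(depths, min_depth):
    """Fraction of positions covered by at least `min_depth` reads."""
    return sum(1 for d in depths if d >= min_depth) / len(depths)

def uniformity(depths, fraction_of_mean=0.2):
    """Fraction of positions with depth >= fraction_of_mean * mean depth."""
    cutoff = fraction_of_mean * mean_depth(depths)
    return sum(1 for d in depths if d >= cutoff) / len(depths)

# Two hypothetical 10-base regions, both with a mean depth of 30X.
even_profile = [30] * 10
uneven_profile = [0, 0, 5, 10, 20, 30, 45, 60, 60, 70]

for name, depths in [("even", even_profile), ("uneven", uneven_profile)]:
    print(name,
          f"mean={mean_depth(depths):.0f}X",
          f"breadth>=20X={breadth_of_coverage(depths, 20):.2f}",
          f"uniformity={uniformity(depths):.2f}")
```

Both profiles report 30X mean depth, but only the even one has every base above the calling threshold. In the uneven profile, a variant at any of the poorly covered positions would likely be missed however high the average depth.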

To conclude, whole genome sequencing typically offers better uniformity and more balanced allele ratio calls. While greater exome-seq depth can partly compensate, mapped depth and variant detection in specific regions may never reach the quality of WGS due to probe design failures or protocol shortcomings. These are important considerations when examining tissues like primary tumors, where copy number changes and heterogeneity are confounding factors.