Celemics, Inc.

NGS Glossary Part 1: Sample Prep & Library Construction

NGS Key Terminology Guide Part 1: Sample Preparation and Library Construction

The foundation of a successful NGS experiment starts with high-quality sample preparation and well-constructed libraries. This post covers essential terms such as fragmentation, end repair, adapter ligation, and indexing, breaking down each step for greater clarity. Whether you’re working with double-stranded or single-stranded protocols, or using tagmentation and UMIs, this guide will help you understand the techniques and why they matter.

DNA input is the total amount of DNA required to initiate library preparation.
The required amount varies depending on the application and kit.
Accurate quantification of the input helps ensure successful library construction and consistent data yield.

Fragmentation is the process of breaking high-molecular-weight DNA or RNA into smaller, more uniform fragments suitable for sequencing.
It is a crucial step in library preparation because most sequencing platforms have an optimal fragment size range
(e.g., ~100–500 bp for Illumina 150 bp paired-end sequencing).
Common fragmentation methods include mechanical shearing (e.g., sonication), enzymatic cleavage, or chemical treatment.
The resulting fragment size can significantly influence sequencing efficiency and the resolution of downstream analyses,
such as genome assembly or variant detection.

End repair is a process that modifies the ends of fragmented DNA to produce blunt-ended, 5′-phosphorylated molecules.
This step makes the DNA fragments compatible with adapter ligation.
Typically, DNA polymerases fill in 5′ overhangs, exonuclease activity removes 3′ overhangs, and a polynucleotide kinase phosphorylates the 5′ ends.
Proper end repair increases ligation efficiency and consistency across the library.

A-tailing, also known as dA-tailing, adds a single adenine (A) nucleotide to the 3′ ends of blunt-ended DNA fragments.
This overhang is complementary to the thymine (T) overhang on sequencing adapters, which facilitates adapter ligation.
In addition, A-tailing helps prevent self-ligation of DNA fragments and reduces adapter dimer formation, thereby improving ligation specificity and overall library quality.
A-tailing is commonly used in Illumina-compatible library construction protocols.

Adapter ligation is the enzymatic addition of synthetic DNA adapters to the ends of DNA fragments.
These adapters include platform-specific sequences required for amplification and sequencing.
Efficient adapter ligation is vital for generating high-quality sequencing libraries.

Size selection is a process that isolates DNA fragments within a specific size range to ensure uniformity in sequencing.
It can be performed using gel electrophoresis, magnetic bead-based methods (e.g., AMPure XP, CeleMag Clean-up Bead),
or dedicated instruments (e.g., BluePippin).
Selecting the appropriate fragment size can improve sequencing efficiency and data quality.

Bead-based cleanup is a method that uses magnetic beads to purify DNA during library prep, removing enzymes, salts, and small fragments.
The beads are coated with functional groups (e.g., carboxyl groups) that selectively bind DNA under specific buffer conditions.
It is used after steps such as ligation or PCR so that only high-quality DNA is carried forward, and it offers scalability and reproducibility.

Single-stranded library preparation is a specialized method designed for highly fragmented or damaged DNA.
Unlike standard protocols that rely on double-stranded DNA for end repair and adapter ligation,
this technique attaches adapters directly to single-stranded DNA using single-strand ligation.
The resulting molecules are then converted into double-stranded form through PCR amplification.
This approach increases the recovery of usable sequence data from compromised samples.

Double-stranded library preparation is a standard method for constructing sequencing libraries from intact or high-quality DNA.
It involves fragmentation (mechanical or enzymatic), followed by end repair, adapter ligation, and PCR amplification.
This approach is specifically optimized for double-stranded DNA, allowing both strands to serve as templates throughout the process.
It is widely used in applications such as whole genome, exome, and targeted sequencing,
offering high efficiency and broad compatibility across diverse sample types.

Tagmentation is a library preparation method that simultaneously fragments DNA and adds sequencing adapters using a transposase enzyme.
This one-step process is employed in kits such as Nextera, enabling rapid library construction with minimal hands-on time.
By combining multiple steps into a single reaction, tagmentation reduces sample loss and increases workflow efficiency,
particularly in high-throughput or low-input applications.

Library amplification is a polymerase chain reaction (PCR) step performed after adapter ligation to selectively amplify DNA fragments that carry properly ligated adapters.
This step serves two main purposes: (1) to increase the total amount of library DNA and (2) to incorporate index sequences (barcodes) that enable sample multiplexing during sequencing.
Careful optimization of the number of PCR cycles is essential to avoid amplification bias,
overamplification, or the formation of chimeric reads, all of which can negatively affect sequencing quality.
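
As a rough illustration of why the cycle count matters, the sketch below models library yield as exponential growth with a fixed per-cycle efficiency. The model, the function names, and the default efficiency of 0.9 are illustrative assumptions, not values from any kit protocol; real reactions plateau, and efficiency drops in later cycles.

```python
import math


def pcr_yield(input_ng: float, cycles: int, efficiency: float = 0.9) -> float:
    """Idealized library mass after PCR: input * (1 + efficiency) ** cycles.

    `efficiency` is the assumed fraction of molecules copied per cycle
    (1.0 would be perfect doubling). Plateau effects are ignored.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return input_ng * (1.0 + efficiency) ** cycles


def min_cycles(input_ng: float, target_ng: float, efficiency: float = 0.9) -> int:
    """Smallest cycle count whose idealized yield reaches target_ng."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    if input_ng <= 0 or target_ng <= 0:
        raise ValueError("masses must be positive")
    if target_ng <= input_ng:
        return 0
    return math.ceil(math.log(target_ng / input_ng) / math.log(1.0 + efficiency))
```

Under perfect doubling, `min_cycles(10.0, 100.0, efficiency=1.0)` returns 4. Running only as many cycles as the target yield requires is one way to limit the bias and duplication described above.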

Dual indexing utilizes a combination of two indices (i5 and i7) for each library, enabling a larger number of samples to be multiplexed without index collision.
This combinatorial indexing approach increases sample capacity and reduces the risk of misassignment due to index overlap.
For improved demultiplexing accuracy and to mitigate index hopping, unique dual indices (UDIs) can be employed,
in which each i5 and each i7 sequence is assigned to only one library in the run.
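
The difference between combinatorial and unique dual indexing can be checked programmatically on a sample sheet. The sketch below assumes that "unique dual" means no i7 and no i5 sequence is shared between libraries; the function names and the short index sequences in the usage note are made up for illustration.

```python
from collections import Counter


def combinatorial_capacity(n_i7: int, n_i5: int) -> int:
    """Number of distinct (i7, i5) pairs available from two index sets."""
    return n_i7 * n_i5


def is_unique_dual(index_pairs: list[tuple[str, str]]) -> bool:
    """True if every i7 and every i5 sequence appears in exactly one pair."""
    i7_counts = Counter(i7 for i7, _ in index_pairs)
    i5_counts = Counter(i5 for _, i5 in index_pairs)
    return (all(count == 1 for count in i7_counts.values())
            and all(count == 1 for count in i5_counts.values()))
```

For instance, `is_unique_dual([("AAGG", "CCTT"), ("AAGG", "GGAA")])` is false because both libraries share the same i7, even though the two pairs are distinct. With unique dual indices, a read whose i7 and i5 do not belong to the same library can be recognized and discarded.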

Unique molecular identifiers (UMIs) help distinguish between original DNA fragments and PCR duplicates, improving accuracy in variant calling and
quantitative measurements in applications like cfDNA analysis or single-cell sequencing.
This is achieved by incorporating short sequence tags, either random or from a predefined set, into each original molecule prior to amplification.
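
A minimal sketch of how UMIs separate true molecules from PCR duplicates: reads that share both a mapping position and a UMI are treated as copies of one original molecule. The `Read` type and its fields are hypothetical, and matching here is exact; dedicated tools usually also tolerate sequencing errors within the UMI.

```python
from typing import NamedTuple


class Read(NamedTuple):
    chrom: str   # reference sequence the read maps to
    start: int   # mapping start position
    umi: str     # UMI sequence attached before amplification
    name: str    # read identifier


def collapse_by_umi(reads: list[Read]) -> list[Read]:
    """Keep the first read seen for each (chrom, start, umi) combination."""
    first_seen: dict[tuple[str, int, str], Read] = {}
    for read in reads:
        key = (read.chrom, read.start, read.umi)
        first_seen.setdefault(key, read)
    return list(first_seen.values())
```

Two reads at the same position with different UMIs are both kept, which is what allows distinct molecules that happen to share coordinates to be counted separately.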

Library quantification is the process of accurately measuring the concentration of the prepared sequencing library.
Common quantification methods include fluorometry (e.g., Qubit), electrophoresis-based systems (e.g., Bioanalyzer or TapeStation), and qPCR.
Electrophoresis-based instruments provide quantification data and also report the fragment size distribution.
Accurate quantification ensures appropriate input amounts for downstream applications such as hybridization-based target enrichment,
and is also critical for loading the optimal amount of library onto the sequencer.
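
Mass concentration and mean fragment size are often combined into a molar concentration. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function name is an assumption made for illustration.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # approximate average mass of one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/uL to nM.

    nM = (ng/uL * 1e6) / (660 g/mol per bp * mean fragment length in bp)
    """
    if conc_ng_per_ul < 0:
        raise ValueError("concentration must be non-negative")
    if mean_fragment_bp <= 0:
        raise ValueError("mean fragment length must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp)
```

A library at 10 ng/µL with a mean fragment size of 400 bp works out to roughly 37.9 nM. Because the result depends on fragment length, the size distribution from an electrophoresis-based instrument matters as much as the mass reading.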

Normalization adjusts the concentration of multiple libraries to the same molarity before pooling.
It is essential for multiplexed sequencing, ensuring that each library contributes equally to the sequencing output.
It helps prevent data loss from underrepresented samples.
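
The dilution arithmetic behind normalization follows C1 × V1 = C2 × V2. The sketch below is one way to compute it, with hypothetical function names and sample names; the target concentration and final volume are placeholders to be set according to the sequencing protocol in use.

```python
def dilution_volumes(stock_nm: float, target_nm: float, final_ul: float) -> tuple[float, float]:
    """Return (library_ul, diluent_ul) that bring a stock to target_nm in final_ul."""
    if stock_nm <= 0 or target_nm <= 0 or final_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_nm < target_nm:
        raise ValueError("stock is already more dilute than the target")
    library_ul = target_nm * final_ul / stock_nm
    return library_ul, final_ul - library_ul


def normalize_libraries(stocks_nm: dict[str, float], target_nm: float,
                        final_ul: float) -> dict[str, tuple[float, float]]:
    """Apply dilution_volumes to every library in a {name: stock nM} mapping."""
    return {name: dilution_volumes(conc, target_nm, final_ul)
            for name, conc in stocks_nm.items()}
```

For a 20 nM stock, a 4 nM target, and a 10 µL final volume, `dilution_volumes(20.0, 4.0, 10.0)` gives 2 µL of library and 8 µL of diluent. Equal volumes of the normalized libraries can then be pooled.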

Sample indexing (barcoding) is a technique that enables the simultaneous sequencing of multiple samples in a single sequencing run by tagging each library with a different index (barcode).
After sequencing, the reads are assigned back to their samples based on these barcodes, a step known as demultiplexing.
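
Separating reads by barcode can be sketched as a dictionary lookup. The example assumes each read arrives as a (barcode, sequence) pair and uses exact matching only; the function name and the "undetermined" bin label are illustrative choices, and production demultiplexers typically also allow a small number of barcode mismatches.

```python
from collections import defaultdict


def demultiplex(reads: list[tuple[str, str]],
                barcode_to_sample: dict[str, str]) -> dict[str, list[str]]:
    """Group read sequences by sample using exact barcode matches.

    Reads whose barcode is not in the sample sheet go to 'undetermined'.
    """
    bins: dict[str, list[str]] = defaultdict(list)
    for barcode, sequence in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        bins[sample].append(sequence)
    return dict(bins)
```

A large "undetermined" bin is often a first hint of a sample sheet error or a barcode quality problem.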

Multiplexing is the process of pooling multiple indexed libraries into a single reaction, which can be performed prior to hybridization-based target enrichment.
This approach enables simultaneous capture and sequencing of multiple samples, leading to reduced reagent usage, hands-on time, and per-sample costs.

Library quality control (QC) confirms the library's quantity and quality.
It consists of a series of checkpoints throughout the library prep process that assess DNA quality, DNA quantity, size distribution, and the presence of adapter dimers.
QC tools include spectrophotometers, fluorometers, and gel-based/capillary electrophoresis systems.
Proper QC helps prevent library prep failures and sequencing run failures.

Library complexity is a measure of the diversity of unique DNA fragments within a library.
High complexity indicates minimal duplication and good representation of the input sample.
Low complexity can result from over-amplification or low input amounts, reducing data usefulness.
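
One simple proxy for complexity is the duplication rate of the sequenced reads. The sketch below assumes each read has already been reduced to a hashable key that identifies its source fragment (for example, a position string); the function name and that key format are assumptions, and this is a summary statistic rather than an estimate of the total number of unique molecules in the library.

```python
from collections.abc import Iterable, Hashable


def duplication_rate(fragment_keys: Iterable[Hashable]) -> float:
    """Fraction of reads that are duplicates: 1 - unique / total."""
    keys = list(fragment_keys)
    if not keys:
        raise ValueError("no reads supplied")
    return 1.0 - len(set(keys)) / len(keys)
```

Four reads of which two share a key give a duplication rate of 0.25; a rate that climbs as sequencing depth increases suggests the library's unique fragments are being exhausted.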

Target enrichment is a selective sequencing strategy in which specific genomic regions of interest are isolated prior to sequencing.
It increases efficiency by focusing only on biologically relevant targets, allowing deeper coverage and reducing data volume for analysis.
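
The coverage gain can be made concrete with a back-of-the-envelope depth calculation. The formula below is the usual mean-depth approximation; the function name and the `on_target_fraction` parameter (the share of sequenced bases that fall in the targeted regions) are assumptions introduced for illustration, and all numbers in the usage note are hypothetical.

```python
def mean_depth(n_reads: int, read_length_bp: int, target_size_bp: int,
               on_target_fraction: float = 1.0) -> float:
    """Approximate mean depth: sequenced on-target bases / target size."""
    if target_size_bp <= 0:
        raise ValueError("target size must be positive")
    if not 0.0 <= on_target_fraction <= 1.0:
        raise ValueError("on_target_fraction must be in [0, 1]")
    return n_reads * read_length_bp * on_target_fraction / target_size_bp
```

With one million 150 bp reads, a 1.5 Mb panel at 50% on-target gives a mean depth of 50×, whereas the same reads spread over a genome of about 3.1 Gb would give roughly 0.05×.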

Hybridization capture is a method in which biotinylated DNA or RNA probes hybridize to complementary sequences in a sample.
The resulting hybrids are captured and isolated using streptavidin-coated beads.
This approach is well suited to detecting a wide range of mutations and works well with fragmented or degraded DNA.

Single-cell sequencing involves isolating and sequencing the genome or transcriptome of individual cells,
enabling detailed analysis of cellular heterogeneity within complex tissues or microbial communities.
