Next-generation sequencing (NGS) technology is widely used across many fields to generate large-scale genomic data cost-effectively. Because a single sequencing run produces a vast amount of data, index sequences are used to tag each sample so that reads can be assigned back to the sample they came from. This allows multiple samples to be sequenced simultaneously in a single run, enabling efficient processing. However, some sequencing platforms exhibit a phenomenon known as index hopping, in which the indices assigned to samples are swapped, so that data from different samples are mixed. For instance, Illumina’s sequencing platforms that use patterned flow cells may show index hopping rates of around 0.5-3%.
Index hopping can occur when residual adapters and index primer oligos remain in the sequencing library. If these free adapters or primers combine with library molecules and replace the original index sequences, the resulting molecules carry the wrong index and their reads are assigned to the wrong sample. Removing residual adapters during library preparation, and thereby increasing the purity of the NGS library, can minimize index hopping. Typical NGS library preparation includes several purification steps that remove excess adapters and primers. In PCR-free library preparation, however, where such purification steps are omitted, higher rates of index hopping have been reported.
Additionally, Unique Dual Index (UDI) can be used to mitigate the effects of index hopping. Dual indexing means placing an index on each end of the NGS library molecule. With UDI, each sample uses a unique sequence for both indices, so no index is shared between samples. Consequently, even if index hopping occurs and one index is swapped, the altered combination does not match any sample in the run, and the affected reads can be identified and removed to prevent contamination.
Figure. Overview of index hopping and prevention effect by UDI
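The filtering logic that UDI makes possible can be sketched in a few lines of Python. This is a minimal illustration, not how any particular demultiplexer is implemented: the sample names, index sequences, and the functions `build_lookup` and `classify_read` are all hypothetical, and only exact index matches are considered (production tools usually tolerate a small number of mismatches).

```python
from collections import Counter

# Hypothetical sample sheet: every sample has its own i7 AND its own i5 index,
# so no index sequence is shared between samples (the UDI property).
SAMPLE_SHEET = {
    "sample_A": ("ATTACTCG", "TATAGCCT"),
    "sample_B": ("TCCGGAGA", "ATAGAGGC"),
    "sample_C": ("CGCTCATT", "CCTATCCT"),
}


def build_lookup(sample_sheet):
    """Map each expected (i7, i5) pair to its sample and verify uniqueness."""
    pair_to_sample = {}
    known_i7, known_i5 = set(), set()
    for sample, (i7, i5) in sample_sheet.items():
        if i7 in known_i7 or i5 in known_i5:
            raise ValueError(f"{sample} reuses an index; design is not UDI")
        known_i7.add(i7)
        known_i5.add(i5)
        pair_to_sample[(i7, i5)] = sample
    return pair_to_sample, known_i7, known_i5


def classify_read(i7, i5, pair_to_sample, known_i7, known_i5):
    """Return the sample name, 'hopped', or 'undetermined' for one read."""
    sample = pair_to_sample.get((i7, i5))
    if sample is not None:
        return sample  # expected pair: keep the read
    if i7 in known_i7 and i5 in known_i5:
        # Both indices are valid, but they belong to different samples:
        # the signature of a hopped index, so the read is discarded.
        return "hopped"
    return "undetermined"  # at least one index is unknown (e.g. read error)


pair_to_sample, known_i7, known_i5 = build_lookup(SAMPLE_SHEET)

# Toy index reads as (i7, i5) pairs.
reads = [
    ("ATTACTCG", "TATAGCCT"),  # sample_A, intact
    ("ATTACTCG", "ATAGAGGC"),  # i7 of sample_A with i5 of sample_B
    ("TCCGGAGA", "ATAGAGGC"),  # sample_B, intact
    ("GGGGGGGG", "TATAGCCT"),  # unknown i7
]
counts = Counter(
    classify_read(i7, i5, pair_to_sample, known_i7, known_i5)
    for i7, i5 in reads
)
print(dict(counts))
```

If the design were not unique, for example if two samples shared the same i5 index, a hop of the i7 index between them would produce a pair that is still valid and the read would be silently assigned to the wrong sample. With unique indices on both ends, the mismatched pair is detectable.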
In germline variant analysis, which targets variants with high allele frequencies, the impact of index hopping on the results may be negligible. However, in analyses that depend on low-frequency signals, such as somatic variant analysis or pathogen detection, index hopping carries a high risk of reporting variants or pathogens that are not actually present in the sample. For such analyses, using UDI is crucial to keep the results accurate.
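A rough back-of-the-envelope estimate illustrates why low-frequency analyses are the ones at risk. The sketch below assumes a deliberately simplified model that is not taken from any published method: a fixed fraction of the reads assigned to a sample (`hop_fraction`) actually originate from other samples in the pool, those foreign reads carry a given variant at allele fraction `donor_vaf`, and the sample itself does not carry the variant at all. The function names and all numbers are illustrative assumptions.

```python
def apparent_vaf_from_hopping(hop_fraction, donor_vaf):
    """Apparent variant allele fraction in a sample that truly lacks the variant.

    Simplified model (an assumption): hop_fraction of the sample's reads are
    misassigned from other samples, and those reads carry the variant at
    allele fraction donor_vaf.
    """
    if not 0.0 <= hop_fraction <= 1.0:
        raise ValueError("hop_fraction must be between 0 and 1")
    if not 0.0 <= donor_vaf <= 1.0:
        raise ValueError("donor_vaf must be between 0 and 1")
    return hop_fraction * donor_vaf


def expected_variant_reads(depth, hop_fraction, donor_vaf):
    """Expected number of variant-supporting reads at a locus with this depth."""
    return depth * apparent_vaf_from_hopping(hop_fraction, donor_vaf)


# Illustrative numbers: 1% misassigned reads (within the 0.5-3% range quoted
# above) and a heterozygous germline variant (allele fraction 0.5) in the
# samples the reads came from.
vaf = apparent_vaf_from_hopping(hop_fraction=0.01, donor_vaf=0.5)
reads = expected_variant_reads(depth=1000, hop_fraction=0.01, donor_vaf=0.5)
print(f"apparent VAF: {vaf:.2%}, expected variant reads at 1000x: {reads:.1f}")
```

Under these assumptions the spurious signal is roughly 0.5% allele fraction, or about 5 supporting reads at 1000x depth. That is far below the 50% or 100% expected for a germline variant, but it falls in the range where low-frequency somatic variants or trace pathogen reads are called.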