Ion Torrent™ barcode design
We designed the Ion Torrent™ barcode sets to provide at least 1-error correction (minimum Hamming distance 3) in flow space for a large set of barcodes, and 2-error correction (minimum Hamming distance 5) for a usefully sized subset of those barcodes. To accomplish this, we take the ternary Hamming code on 13 characters and assign its codewords to flows 9-22 to generate flow sequences (flows 1-8 are used for the library key and are not considered here). These flow sequences have a minimum Hamming distance of 3 and are therefore 1-error correcting. The set of codewords is then reduced by two constraints: each codeword must correspond to a legitimate flow sequence, and each flow sequence must correspond to a base sequence that is 9 to 11 bases long. Finally, from the set that satisfies all of these constraints, a subset is chosen (by greedy aggregation) such that any pair of flow sequences has a Hamming distance of at least 5.
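The distance arithmetic and the greedy selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used to build the Ion Torrent™ sets: the function names and the toy flow vectors are made up for the example, and candidates are simply scanned in the order given.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length flow vectors differ."""
    if len(a) != len(b):
        raise ValueError("flow vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def correctable_errors(min_distance):
    """A code with minimum distance d can correct floor((d - 1) / 2) errors."""
    return (min_distance - 1) // 2


def greedy_subset(candidates, min_distance):
    """Keep a candidate only if it is far enough from everything kept so far."""
    chosen = []
    for cand in candidates:
        if all(hamming(cand, kept) >= min_distance for kept in chosen):
            chosen.append(cand)
    return chosen


def min_pairwise_distance(codes):
    """Smallest Hamming distance over all pairs in the set."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))


# Toy flow vectors (hypothetical, shorter than real barcode flow sequences).
toy = ["10101010", "10101011", "01010101", "11111111", "00000000"]
subset = greedy_subset(toy, min_distance=5)
```

The result of a greedy pass depends on the order in which candidates are visited, so a different ordering may yield a subset of a different size.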
To insulate these sequences from the target sequences, the ligation adapter CGAT is added. The ligation adapter performs two functions. First, the C in flow 22 provides a synchronized flow that both marks the end of the barcodes and ensures that, for barcodes whose flow sequence ends with "0", the downstream sequence does not overwrite those flows. Second, the adapter mitigates sequence-specific biases caused by the differing barcode sequences.
We provide a tool that classifies barcoded reads by converting each read to its flow-space representation and comparing it with the flow-space representations of the barcodes. By default, classification starts after the last flow of the key (G) and continues to the end of the barcode sequence as represented in flow space. Ion Torrent™ barcode sets are designed to be synchronous, so all barcodes in a set are classified using the same set of flows.
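A flow-space comparison of this kind can be sketched as follows. This is an illustration under stated assumptions, not the classifier shipped with the software: the 32-flow order and the TCAG key are assumed values, the two barcodes are invented for the example, and a plain count of differing flows stands in for whatever scoring the real tool applies. The window of flows 9-22 follows the description above.

```python
# Assumed values (not given in the text): the flow order and the library key.
FLOW_ORDER = "TACGTACGTCTGAGCATCGATCGATGTACAGC"
KEY = "TCAG"
ADAPTER = "CGAT"   # ligation adapter named in the text
KEY_FLOWS = 8      # flows 1-8 carry the key
LAST_FLOW = 22     # flow that holds the adapter's leading C


def seq_to_flows(seq, n_flows, flow_order=FLOW_ORDER):
    """Expand a base sequence into exactly n_flows homopolymer counts."""
    flows, i = [], 0
    for f in range(n_flows):
        nuc = flow_order[f % len(flow_order)]
        count = 0
        while i < len(seq) and seq[i] == nuc:
            count += 1
            i += 1
        flows.append(count)
    return flows


def barcode_window(seq):
    """Flows 9-22 of a sequence that starts with the key."""
    return seq_to_flows(seq, LAST_FLOW)[KEY_FLOWS:]


def classify(read_seq, barcodes, max_errors=1):
    """Return the name of the nearest barcode, or None if none is close enough."""
    read = barcode_window(read_seq)
    best_name, best_dist = None, None
    for name, bc in barcodes.items():
        ref = barcode_window(KEY + bc + ADAPTER)
        dist = sum(r != b for r, b in zip(read, ref))
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist is not None and best_dist <= max_errors:
        return best_name
    return None


# Hypothetical barcodes, chosen so that the adapter C falls in flow 22.
TOY_BARCODES = {"toy_A": "CTGGAATCAAT", "toy_B": "TCGACATCG"}

# A read carrying toy_A with one homopolymer undercall (GG read as G).
noisy_read = KEY + "CTGAATCAAT" + ADAPTER + "ACGTACGT"
```

With `max_errors=1`, `classify(noisy_read, TOY_BARCODES)` assigns the read to `toy_A`, because the undercall changes a single flow in the window. Raising `max_errors` to 2 would only be safe for a barcode set whose minimum flow-space distance is at least 5.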
For flow-space classification of custom barcodes, the barcodes should be designed to be compatible with the flow order, synchronized at a final flow, and well separated. However, the Torrent Suite™ Software attempts to classify any reasonable set of sequences that are separated in flow space. Many standard software packages also classify barcodes usefully in sequence space, and have been found to work well with Ion Torrent™ data.
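The three design criteria for custom barcodes can be checked with a short script before ordering oligos. The sketch below is one possible check, not a tool provided with the Torrent Suite™ Software: the flow order and key are assumed values, `check_barcode_set` and the toy barcodes are hypothetical, and "well separated" is interpreted here as a minimum flow-space Hamming distance of 3.

```python
from itertools import combinations

FLOW_ORDER = "TACGTACGTCTGAGCATCGATCGATGTACAGC"  # assumed flow order
KEY = "TCAG"                                      # assumed library key
ADAPTER = "CGAT"                                  # adapter named in the text


def seq_to_flows(seq, flow_order=FLOW_ORDER, max_flows=200):
    """Homopolymer counts per flow, up to the flow that consumes the last base."""
    flows, i = [], 0
    while i < len(seq) and len(flows) < max_flows:
        nuc = flow_order[len(flows) % len(flow_order)]
        count = 0
        while i < len(seq) and seq[i] == nuc:
            count += 1
            i += 1
        flows.append(count)
    if i < len(seq):
        raise ValueError("sequence cannot be expressed in this flow order")
    return flows


def check_barcode_set(barcodes, min_distance=3):
    """Report the end flow of each barcode and the separation of the set."""
    # Each vector ends at the flow where the adapter's first base is incorporated.
    vectors = {name: seq_to_flows(KEY + bc + ADAPTER[0])
               for name, bc in barcodes.items()}
    end_flows = {name: len(v) for name, v in vectors.items()}
    # Pad with zeros so unsynchronized barcodes can still be compared.
    width = max(end_flows.values())
    padded = [v + [0] * (width - len(v)) for v in vectors.values()]
    distances = [sum(x != y for x, y in zip(a, b))
                 for a, b in combinations(padded, 2)]
    closest = min(distances) if distances else None
    return {
        "end_flows": end_flows,
        "synchronized": len(set(end_flows.values())) == 1,
        "min_distance": closest,
        "well_separated": closest is None or closest >= min_distance,
    }


# Hypothetical barcode sets: the first is synchronized, the second is not.
good = check_barcode_set({"toy_A": "CTGGAATCAAT", "toy_B": "TCGACATCG"})
bad = check_barcode_set({"toy_A": "CTGGAATCAAT", "toy_C": "TCGAACCATT"})
```

A sequence containing a character that never appears in the flow order (for example `N`) raises `ValueError`, which serves as a crude compatibility check.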