Overview of BaseCaller functionality
In addition to creating a sequence of bases from the 1.wells file information, the BaseCaller module also performs read filtering and read trimming.
Notes on read filtering:
-
Filters out low-quality reads that were marked during signal processing.
-
Filters out reads that fail basecalling filters.
-
Filtered-out reads do not appear in the BAM file. The BaseCaller keeps counts of these reads, but there is no record of which specific reads were filtered out.
Notes on read trimming:
-
Removes certain bases from the read for quality reasons.
-
The read appears in the BAM file.
-
The removed bases do not appear in the BAM file.
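The difference between filtering and trimming can be illustrated with a minimal Python sketch. It is not the BaseCaller's implementation: the `Read` and `Output` classes, the `failed_filter` flag, and the `trim_right` field are hypothetical names introduced only for this example. A filtered read is dropped and only counted, while a trimmed read is kept with fewer bases.

```python
from dataclasses import dataclass, field


@dataclass
class Read:
    name: str
    bases: str
    failed_filter: bool = False  # hypothetical: read was marked for filtering
    trim_right: int = 0          # hypothetical: bases to remove from the 3' end


@dataclass
class Output:
    bam_records: list = field(default_factory=list)  # (name, bases) pairs
    filtered_count: int = 0                          # a count only, no read names


def process(reads):
    """Drop filtered reads (counting them) and shorten trimmed reads."""
    out = Output()
    for read in reads:
        if read.failed_filter:
            # Filtering: the read never reaches the output; only the counter changes.
            out.filtered_count += 1
            continue
        # Trimming: the read is kept, but the trimmed bases are not written.
        kept = read.bases[: len(read.bases) - read.trim_right]
        out.bam_records.append((read.name, kept))
    return out
```

In this sketch, a read with `failed_filter=True` leaves no trace beyond `filtered_count`, which mirrors the note that there is no record of the specific reads that were filtered out.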
These are the steps performed in the BaseCaller:
-
Remove low-quality reads that were marked during the signal processing step.
-
Trim 5' unique molecular tag (only done if --trim-barcodes=on).
-
Trim extra bases at the 5' end. Controlled by --extra-trim-left (default is 0, meaning no extra trimming).
-
Filter out reads that are too short. Controlled by --min-read-length and --trim-min-read-len.
-
Trim the P1 adapter (at the 3' end).
-
Perform quality trimming. Affected by --trim-qual-window-size and --trim-qual-cutoff.
Notes about quality trimming:
-
The purpose of quality trimming is to identify where quality begins to degrade toward the end of a read. The algorithm finds the point at which bases fall below a quality threshold and trims both those bases and a few bases before them.
-
The parameter --trim-qual-window-size sets the window size for quality trimming. The algorithm slides through the sequence of bases and, each time the window shifts, computes the mean Base QV value for all bases in the window.
-
If the mean Base QV value for all bases in the window falls below a threshold (set by the parameter --trim-qual-cutoff, default 16), then we trim all bases from the center of the window at that time to the 3' end.
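The sliding-window procedure described in these notes can be sketched in Python as follows. This is a minimal illustration, not the BaseCaller source code. The function name `quality_trim_length` is hypothetical, and the sketch makes three assumptions: the window advances one base at a time, reads shorter than the window are left untouched, and the bases removed are those from the window center onward (toward the 3' end, where quality typically degrades).

```python
def quality_trim_length(quals, window_size, cutoff=16.0):
    """Return how many bases to keep, counted from the 5' end.

    quals       -- per-base quality values (Base QV), in read order
    window_size -- stands in for --trim-qual-window-size
    cutoff      -- stands in for --trim-qual-cutoff (default 16 per the notes)
    """
    n = len(quals)
    if window_size <= 0 or n < window_size:
        # Assumption: reads shorter than one window are not quality trimmed.
        return n

    window_sum = sum(quals[:window_size])
    for start in range(n - window_size + 1):
        if start > 0:
            # Shift the window by one base: add the new base, drop the old one.
            window_sum += quals[start + window_size - 1] - quals[start - 1]
        if window_sum / window_size < cutoff:
            # Cut at the center of the first window whose mean is below the cutoff.
            return start + window_size // 2
    return n  # no window fell below the cutoff; keep the whole read


# Example: 10 high-quality bases followed by 10 low-quality bases.
quals = [30] * 10 + [5] * 10
keep = quality_trim_length(quals, window_size=4)
trimmed_quals = quals[:keep]
```

Because the cut is made at the window center rather than at the first low-quality base, the trim point depends on the window size: a larger window smooths over isolated low-quality bases but reacts later to a sustained drop in quality.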