By default, allelotype recalculates the allele supported by each read by summing all gaps across the entire read, including gaps in the flanking regions. This gives the most concordant results when compared with STR genotypes from traditional capillary electrophoresis and the most reliable genotypes for ~100bp reads.
There is also an option to calculate the STR allele supported by each read using only the gaps that allelotype determines to be within the boundary of the STR. You can turn this on using the --dont-include-flank option. While theoretically more accurate than the approach described above, the process of ascertaining the boundary of the STR is still error prone, so this option is usually not recommended. However, if you are using very long reads that may span more than a single STR, it is recommended to set this option.
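The difference between the two modes can be pictured with a small Python sketch. It is an illustration only, not allelotype's actual code: the `Gap` representation, the function name `allele_length_diff`, and the assumption that the STR boundary is already known are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (reference_position, signed_length),
# where a positive length is an insertion and a negative one a deletion.
Gap = Tuple[int, int]


def allele_length_diff(gaps: List[Gap], str_start: int, str_end: int,
                       include_flank: bool = True) -> int:
    """Return the read's length difference from the reference allele, in bp."""
    if include_flank:
        # Default behaviour described above: add up every gap in the read.
        return sum(length for _, length in gaps)
    # Flank-excluding mode: count only gaps inside the STR boundary.
    return sum(length for pos, length in gaps if str_start <= pos <= str_end)


# One deletion in the left flank, one insertion in the STR, one in the right flank.
gaps = [(95, -1), (120, 4), (160, 2)]
whole_read = allele_length_diff(gaps, str_start=100, str_end=150)
str_only = allele_length_diff(gaps, str_start=100, str_end=150, include_flank=False)
print(whole_read, str_only)
```

In this toy read the two modes disagree (5 bp versus 4 bp), which is the situation where the choice of mode matters.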
--min-het-freq FLOAT: Refuse to call a heterozygote unless the fraction of supporting reads for the minor allele meets this threshold. Setting this to 0.2 gives good results.
--unit: Require length differences to be a multiple of the repeat unit. Reads supporting a non-unit allele are much more likely to be due to errors.
--min-border INT: Filter reads that do not extend past both ends of the STR region by at least INT bp.
--min-bp-before-indel INT: Filter reads with an indel occurring less than INT bp from either end of the read.
--min-read-end-match INT: Filter reads whose alignments don't exactly match the reference for at least INT bp at both ends.
--maximal-end-match INT: Filter reads whose prefix/suffix matches to the reference are less than or equal to those obtained when shifting the read ends by distances within INT bp.
--max-diff-ref INT: Do not use reads that differ in length from the reference allele by more than INT bp. If you are specifically interested in big expansions, you must set this to a number significantly higher than the expansion length you are looking for.
--mapq INT: maximum allowed mapq score. This is calculated as the sum of qualities at base mismatches.
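As a rough illustration of the two calling options above, --unit and --min-het-freq, the following Python sketch applies both to a list of per-read length differences. It is not allelotype's implementation. The function names are hypothetical, and the choice to compute the minor-allele frequency only over reads supporting the two candidate alleles is an assumption.

```python
from collections import Counter
from typing import List


def keep_unit_alleles(read_alleles: List[int], unit_len: int) -> List[int]:
    """Keep only length differences that are a whole number of repeat units."""
    return [a for a in read_alleles if a % unit_len == 0]


def passes_min_het_freq(read_alleles: List[int], allele_a: int, allele_b: int,
                        min_het_freq: float) -> bool:
    """Check whether the minor of two candidate alleles has enough read support.

    Assumption: the frequency is taken over reads supporting either allele.
    """
    counts = Counter(read_alleles)
    a, b = counts[allele_a], counts[allele_b]
    total = a + b
    if total == 0:
        return False
    return min(a, b) / total >= min_het_freq


# Length differences (bp) for ten reads at a locus with a 4 bp repeat unit.
reads = [0, 0, 0, 0, 0, 0, 0, 0, 4, 3]
unit_reads = keep_unit_alleles(reads, unit_len=4)   # drops the +3 read
weak_het = passes_min_het_freq(unit_reads, 0, 4, min_het_freq=0.2)
strong_het = passes_min_het_freq([0, 0, 0, 4, 4], 0, 4, min_het_freq=0.2)
print(unit_reads, weak_het, strong_het)
```

Here a single read supporting the +4 allele out of nine falls below the 0.2 threshold, so the sketch would not call a heterozygote, while two out of five would pass.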
Using --min-border 5 --min-bp-before-indel 7 --maximal-end-match 15 --min-read-end-match 5 has given good results.
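To make two of these read filters concrete, here is a minimal Python sketch of what --min-border and --min-bp-before-indel check, using the values 5 and 7 from the settings above. It is a simplified picture under stated assumptions, not allelotype's code: the `AlignedRead` class, its fields, and the way distances to the read ends are measured are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlignedRead:
    """Hypothetical minimal view of an aligned read."""
    ref_start: int   # first reference position covered (inclusive)
    ref_end: int     # last reference position covered (inclusive)
    length: int      # read length in bp
    indel_offsets: List[int] = field(default_factory=list)  # 0-based offsets in the read


def passes_min_border(read: AlignedRead, str_start: int, str_end: int,
                      min_border: int) -> bool:
    """Read must extend past both ends of the STR by at least min_border bp."""
    return (str_start - read.ref_start >= min_border
            and read.ref_end - str_end >= min_border)


def passes_min_bp_before_indel(read: AlignedRead, min_bp: int) -> bool:
    """No indel may lie less than min_bp bases from either end of the read."""
    return all(off >= min_bp and read.length - 1 - off >= min_bp
               for off in read.indel_offsets)


def keep_read(read: AlignedRead, str_start: int, str_end: int) -> bool:
    # Values taken from the settings quoted above.
    return (passes_min_border(read, str_start, str_end, min_border=5)
            and passes_min_bp_before_indel(read, min_bp=7))


good = AlignedRead(ref_start=90, ref_end=189, length=100, indel_offsets=[30])
bad = AlignedRead(ref_start=98, ref_end=197, length=100, indel_offsets=[3])
print(keep_read(good, 100, 150), keep_read(bad, 100, 150))
```

The second read is rejected on both counts: it starts only 2 bp before the STR and carries an indel 3 bp from its start.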
--chrom option and merging the resulting VCF files afterwards. For example: