lobSTR: a short tandem repeat profiler for next generation sequencing data


Best practices for using lobSTR with BWA-MEM alignments


The original lobSTR pipeline involves running two steps: alignment with lobSTR's own STR-specific aligner, followed by genotyping with allelotype.

When lobSTR was developed, existing aligners like bwa aln and bowtie were very insensitive to large insertions or deletions relative to the reference genome. These aligners therefore missed a large fraction of the polymorphisms at STRs, and we needed to create an aligner specific to STR regions. However, the more recently developed BWA-MEM has proven to be quite sensitive to large indels, so it is now possible to use BAMs generated by BWA-MEM as input to allelotype. This has a number of advantages, but using external aligners also comes with caveats, which are explained below. We also provide recommended options for allelotype when using BWA-MEM.

Caveats when using external aligners

Aligners differ in their alignment algorithms, output formats, and scoring conventions. The following are caveats that we found when using other aligners:

Recommended allelotype options

Based on comparisons to capillary electrophoresis and on visual inspection of the alignments behind erroneous calls, we recommend the following options to allelotype:

allelotype \
  ... \
  --filter-mapq0 \
  --filter-clipped \
  --max-repeats-in-ends 3 \
  --min-read-end-match 10
where ... refers to required arguments to allelotype (see usage).
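
To avoid retyping these options for every sample, they can be wrapped in a small shell function. The following is a minimal sketch, not part of lobSTR itself: the function name run_allelotype and the ALLELOTYPE_BIN variable are hypothetical names introduced here for illustration. The required arguments to allelotype are deliberately not filled in; whatever is passed to the function is forwarded unchanged, and the four recommended options are appended after it.

```shell
# Hypothetical wrapper: forwards the caller's arguments to allelotype and
# appends the options recommended above for BWA-MEM alignments.
# ALLELOTYPE_BIN is an assumed override for the binary location; it
# defaults to "allelotype" found on PATH.
run_allelotype() {
  "${ALLELOTYPE_BIN:-allelotype}" "$@" \
    --filter-mapq0 \
    --filter-clipped \
    --max-repeats-in-ends 3 \
    --min-read-end-match 10
}
```

The function would be called as run_allelotype followed by the required arguments (see usage). Setting ALLELOTYPE_BIN=echo prints the assembled argument list instead of running it, which is a convenient way to check the command before starting a long job.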