New in version 4.0.6
Minor bug fixes and several new features:
- Long-awaited fix for the case when read group IDs are replicated across BAM files input to allelotype.
- allelotype now recognizes more than one naming scheme for BAM index files, including *.bam.bai.
- Added an option to allelotype that creates VCFs with no PL field, greatly reducing their size.
- Added an option to allelotype that controls how many possible alleles are considered when computing genotype likelihoods.
- Added VCF fields PQ and DPA.
- Bug fix for precision error leading to low quality scores for high coverage heterozygous sites.
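As a rough illustration of why omitting the PL field shrinks VCFs, and why limiting the number of alleles considered matters: in the VCF format, PL carries one value per possible genotype, so a diploid call over n candidate alleles has n(n+1)/2 entries. The sketch below only demonstrates that count. The allele list and function names are hypothetical, and this is not lobSTR code.

```python
from itertools import combinations_with_replacement


def diploid_genotypes(alleles):
    """Return every unordered pair of alleles, i.e. every diploid genotype."""
    return list(combinations_with_replacement(alleles, 2))


def num_pl_values(num_alleles):
    """Number of PL entries for a diploid call over num_alleles alleles."""
    return num_alleles * (num_alleles + 1) // 2


# Hypothetical allele lengths (in repeat units) observed at one STR locus.
alleles = [10, 11, 12, 13]
genotypes = diploid_genotypes(alleles)

# The enumerated genotypes and the closed-form count agree.
print(len(genotypes), num_pl_values(len(alleles)))
```

Because the count grows quadratically, highly polymorphic STR loci with many candidate alleles produce long PL lists for every sample at every locus.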
New in version 4.0.0
The major change for this version is the ability for allelotype to take in BAM files from external aligners, mainly from BWA-MEM. This allows STR genotyping across large panels of samples that already have existing alignments, for only a fraction of the computational cost of running the whole lobSTR pipeline.
We have also added:
- A VCF filtering script and more documentation of our Recommended practices for filtering STR calls.
- More details about comparing lobSTR calls to capillary electrophoresis calls (see Validation sets).
- The --output-bams debugging option for allelotype. This option outputs two BAM files: one containing all reads that were included for analysis, and another containing all reads that were filtered out, each with a tag explaining why it was not included.
New in version 3.0.3
This version had minor changes.
- Updated copyright dates
- Cleaned up python scripts
- Added check to allelotyper to test whether period is in range
- Scripts to install test reference and lobSTR hg19 reference
- Checks in lobstr_index.py to make sure required binaries are installed
New in version 3.0.2
This version has major algorithmic changes: we have removed the FFT period detection step and attempt to align STR-containing reads to all STRs at once, rather than only those with a specified motif. As a result, we have changed the index structure, and you will need to download a new index from the downloads page to use lobSTR 3. This new version significantly increases our alignment sensitivity.
lobSTR 3 usage remains largely the same. However, we have also made several significant improvements:
- greatly expanded documentation pages with best practice recommendations, including a tutorial on calling Y-STRs using lobSTR.
- removed blas and fftw library dependencies
- lobSTR should now compile and run cleanly on most modern Mac OSX computers.
- added many options for filtering potentially problematic reads before allelotyping.
New in version 2.0.8
This version should align substantially faster than previous versions (up to 5-10x). It also fixes a minor off-by-one error in how allele sequences are reported in the REF and ALT fields of the VCF file.
New in version 2.0.5
This version contains minor changes. Most notably we changed the way we package binaries, and the binaries should hopefully run smoothly on most systems.
New in version 2.0.4
This version consists mostly of bug fixes:
- Fixed a bug in coverage values reported in VCF files.
- Fixed problems parsing BAM tags in the allelotyper on BAMs that have been modified after running lobSTR.
- A number of configuration cleanups that should allow lobSTR to compile on most systems.
New in version 2.0.3
- Greatly expanded STR reference now includes mononucleotide repeats.
- Updated VCF format automatically generated by the allelotype step.
- The allelotype step can now process multiple samples at once.
- Quality metrics automatically reported without the need to run additional scripts.
- Updated stutter noise scoring scheme. Full documentation is on the documentation page.
- Allow for processing paired-end BAMs if some mate pairs are missing.
- Parameters for each run are now uploaded to Amazon S3. To turn this off, use --noweb.
- Redundant aligned.tab and genotypes.tab files are no longer generated. Only BAM and VCF files are output.
- Removed feature to report reads that partially span an STR
New in version 2.0.2
- Removed R dependency of allelotype step and improved stutter model.
- Added option to add a read group tag to alignments, plus set map quality in BAM files to 255 in order to allow compatibility with downstream analysis tools.
- Changed --sex option to --haploid, which allows users to determine which chromosomes should be treated as haploid or hemizygous. This allows for training on organisms for which the hemizygous chromosomes are not necessarily chrX and chrY.
- Added the option to exclude information on partially spanning reads from the allelotype output file.
New in version 2.0
Previous versions are available on GitHub. Contact mgymrek AT mit DOT edu for versions before v2.0.2.
- Support for paired end reads. lobSTR uses mate pair information to
increase confidence in STR alignments. lobSTR can still run in
- Paired-end read stitching: when possible, lobSTR will stitch
overlapping paired-end reads that in some cases can span longer STRs
than possible with a single read.
- Support for reading gzipped FASTA and FASTQ files.
- (beta) Improved mapping quality and allelotype confidence scores
- (beta) Alignment of partially spanning reads: lobSTR
reports alignments of reads that only partially span STR
regions. The allele length reported gives a lower bound on the true
- (beta) Conversion of allelotype output to VCF format
- (beta) Support for running on Amazon Web Services
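The paired-end read stitching mentioned in the list above can be pictured with a minimal sketch. It assumes exact matching, ignores base qualities, and simply looks for the longest overlap between the end of the first read and the start of the reverse-complemented mate. The function names and the `min_overlap` parameter are hypothetical; lobSTR's actual stitching procedure is not described here and may differ.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def stitch_pair(read1, read2, min_overlap=8):
    """Merge a read pair into one sequence, or return None if no overlap.

    read2 is reverse-complemented onto read1's strand, then the longest
    exact suffix/prefix overlap of at least min_overlap bases is used.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    for k in range(longest, min_overlap - 1, -1):
        if read1[-k:] == mate[:k]:
            return read1 + mate[k:]
    return None


# Toy fragment: left flank, a (GATA)3 repeat, then right flank.
fragment = "TTCGAACG" + "GATA" * 3 + "CCATGGTC" + "AGCTTAGC"
read1 = fragment[:26]                      # forward read from the left end
read2 = reverse_complement(fragment[6:])   # mate sequenced from the right end

stitched = stitch_pair(read1, read2)
print(stitched == fragment)
```

In this toy case the overlap includes unique flanking sequence, so the merge is unambiguous. When the overlap lies entirely within the repeat, several offsets match equally well and an exact-match heuristic like this one can report the wrong repeat count, so a real implementation needs a more careful scoring scheme.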
- October 5, 2013: (index v3.0.2) edited the Y-STR reference for markers with multiple forms to avoid these being called as multi-mappers:
  - removed chrY:16167356-16167402, other form of DYS413a/b
  - removed chrY:27883520-27883558, other form of DYS459a/b
  - removed chrY:20801586-20801745, other form of DYS385a/b
  - removed chrY:20557622-20557659, other form of YCAIIa/b
  - removed chrY:26874618-26874799, chrY:25240873-25241054, chrY:25471939-25472124, other form of DYS464a-d
  - annotated chrY:20440393-20440433 as DYS395S1a/b with ref allele 15
- September 23, 2013: changes were made to both indices. In both versions of the index, the following Y-STR and CODIS annotation corrections were made:
  - The motif for DYS590 was changed from "TTTG" to "GTTTT".
  - The start coordinate of PentaE was changed from 97374244 to 97374245.
  - The start coordinate of PentaD was changed from 45056091 to 45056086.
  - The coordinates of D5S818 were changed from chr5:123111246-123111289 to chr5:123111250-123111293.
  - The reference allele of D8S1179 was changed from 11 to 13.
  - The reference allele of D21S11 was changed from 31.75 to 29.
  Additionally, the extraneous file "last_reference.bed" was removed from the Version 2 index.
- September 10, 2013: uploaded initial version of the lobSTR v3 index lobSTR_v3_hg19_resource_bundle.tar.gz
Version 2 (2.0.3+)
Old lobSTR indices, compatible with version 2.0.0-2.0.2: