This page describes how to use lobSTR_filter_vcf.py to filter VCF files generated by lobSTR based on a number of metrics. It then describes our recommended filters for multi-sample calling. Note that this script only works with VCFs generated by lobSTR v4.0.0+.
If you followed the instructions on the install page, lobSTR_filter_vcf.py should be installed in $PREFIX/share/lobSTR/scripts/lobSTR_filter_vcf.py. This tool applies locus-level and call-level filters, which are annotated in the output VCF file.
where VCFFILE is a VCF file generated by lobSTR and OUTFILE is an output VCF file. By default, no filters are applied. A number of options allow for specific locus-level and call-level filters. All locus-level filters are of the form --loc-XX and all call-level filters are of the form --call-XX. For example:
will filter all loci with a reference length greater than 80 bp and all calls with less than 10x coverage.
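To make the difference between a locus-level and a call-level filter concrete, the thresholds in the example above can be sketched in Python. This is an illustration only, not code from lobSTR_filter_vcf.py: the function names and parameters are hypothetical, and the default values simply mirror the documented "no filtering" defaults.

```python
# Illustrative sketch only; this is NOT the implementation used by
# lobSTR_filter_vcf.py. Function and parameter names are hypothetical.


def locus_fails_max_ref_length(ref_length_bp, max_ref_length=10000):
    """Locus-level check: the whole locus fails if its reference
    length is greater than the allowed maximum."""
    return ref_length_bp > max_ref_length


def call_fails_min_coverage(coverage, min_coverage=0):
    """Call-level check: a single sample's call fails if its
    coverage is less than the required minimum."""
    return coverage < min_coverage


# Thresholds from the example: max reference length 80 bp, min coverage 10x.
print(locus_fails_max_ref_length(95, max_ref_length=80))  # 95 bp locus
print(call_fails_min_coverage(7, min_coverage=10))        # call with 7x coverage
print(call_fails_min_coverage(10, min_coverage=10))       # call with exactly 10x
```

A locus-level filter is evaluated once per locus, whereas a call-level filter is evaluated separately for each sample's call at that locus.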
The output file from this script is in standard VCF format. Loci that pass all filters have "PASS" in the FILTER column of the VCF. Non-passing loci have a comma-separated list of the filters they failed. If it is not already present, an "FT" field is added to the FORMAT field. This is a string field that will say:
||(REQUIRED) Input VCF file (generated by
||Min mean -log10(1-Q) cutoff to include a locus||float||0.0|
||Min mean coverage to include a locus.||int||0|
||Max reference length of a locus to include||int||10000|
||Min call rate to include a locus.||float||0.0|
||Max mean absolute difference in distance from read ends to include a call.||float||100.0|
||Min mean -log10(1-Q) cutoff to include a call||float||0.0|
||Min coverage to include a call||int||0|
||Ignore these samples when applying filters. File with one sample/line.||string||-|
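Once a VCF has been filtered, it can be useful to see how many loci passed and which filters removed the rest. The following is a minimal Python sketch that tallies the FILTER column, assuming failed filters are comma-separated as described above. The function name `tally_filters`, the example records, and the filter names `FilterA` and `FilterB` are all hypothetical placeholders; the real filter names depend on the options used.

```python
# Sketch: summarise the FILTER column of a filtered VCF.
# Assumes tab-separated VCF records with FILTER as the 7th column and
# failed filters listed as a comma-separated string.
from collections import Counter


def tally_filters(vcf_lines):
    """Count loci marked PASS and count each failed filter name."""
    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and header lines (starting with '#').
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # malformed record; ignore it in this sketch
        filter_field = fields[6]
        if filter_field == "PASS":
            counts["PASS"] += 1
        else:
            for name in filter_field.split(","):
                counts[name] += 1
    return counts


# Hypothetical records; "FilterA" and "FilterB" are placeholder names.
example = [
    "##fileformat=VCFv4.1",
    "\t".join(["chr1", "100", ".", "A", "AT", "50", "PASS", "."]),
    "\t".join(["chr1", "200", ".", "A", "AT", "50", "FilterA,FilterB", "."]),
    "\t".join(["chr2", "300", ".", "A", "AT", "50", "FilterA", "."]),
]
for name, n in sorted(tally_filters(example).items()):
    print(name, n)
```

To run this on a real file, pass an open file handle instead of the example list, for instance `tally_filters(open(path))` where `path` points to the OUTFILE written by the script.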