  • Dr Edin Hamzić

bcftools: A Short Introduction

Updated: Jan 10

  • ⚠️ This is a short blog post, more of a general overview of bcftools. For those unfamiliar with it, bcftools is a suite of tools for working with files in the Variant Call Format (VCF) and its binary counterpart, the Binary Variant Call Format (BCF).

  • ℹ️ VCF and BCF files are used to store genetic variation data. As one might expect, bcftools is widely used in genomics and bioinformatics for different purposes.

  • ℹ️ bcftools is a command-line tool. It runs on Linux and macOS, and on Windows typically through a Unix-like environment such as WSL.

👨🏼‍🏫 My BCFTOOLS Tutorials

Here is the list of bcftools tutorials that I have created so far:

🤖 What Does bcftools Do?

bcftools provides a range of capabilities for manipulating and analyzing VCF and BCF files, including, among other things:

  • 1️⃣ Converting between VCF and BCF formats

  • 2️⃣ Viewing and filtering variant data stored in VCF and BCF files

  • 3️⃣ Performing data manipulation operations like merging and intersecting variant sets

  • 4️⃣ Annotating variant data with additional information, such as gene names and predicted functional impact

  • 5️⃣ Performing population-level analyses, such as calculating allele frequencies and Hardy-Weinberg equilibrium
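
The first two capabilities above can be sketched on the command line roughly like this. It is a minimal sketch, not a recipe: the file names (demo.vcf, demo.bcf, roundtrip.vcf) and the two-variant VCF are made up for illustration, and the QUAL threshold of 30 is an arbitrary choice. The bcftools steps are skipped if bcftools is not on your PATH.

```shell
#!/bin/sh
# Write a tiny, made-up VCF (two samples, two variants) to work with.
# VCF columns must be tab-separated, hence printf with \t.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf '1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
  printf '1\t200\t.\tC\tCT\t10\tPASS\t.\tGT\t0/0\t0/1\n'
} > demo.vcf

if command -v bcftools >/dev/null 2>&1; then
  # 1) Convert VCF to BCF (-Ob = binary BCF output) and index the result.
  #    -f overwrites an existing index so the script can be re-run.
  bcftools view -Ob -o demo.bcf demo.vcf
  bcftools index -f demo.bcf

  # Convert back from BCF to plain-text VCF (-Ov).
  bcftools view -Ov -o roundtrip.vcf demo.bcf

  # 2) View records without the header (-H), keeping only those
  #    that pass the filter expression QUAL >= 30 (-i = include).
  bcftools view -H -i 'QUAL>=30' demo.bcf
else
  echo "bcftools not found; skipping the bcftools steps" >&2
fi
```

BCF is usually the better choice for large files that you read repeatedly, while plain VCF is handy when you want to inspect records by eye.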


🖥️ BCFTOOLS Commands

As mentioned above, bcftools is a suite of tools, or utilities, as the bcftools documentation calls them. Currently, there are 22 bcftools commands, and they fall into three main groups:

  • 1️⃣ indexing command

  • 2️⃣ VCF/BCF manipulation commands, and

  • 3️⃣ VCF/BCF analysis commands.


  • The indexing group contains only one command, bcftools index. To learn more about this command, check out my tutorial on bcftools index here.

  • The VCF/BCF manipulation group contains 11 commands, listed below:

    • bcftools annotate: annotate and edit VCF/BCF files

    • bcftools concat: concatenate VCF/BCF files from the same set of samples

    • bcftools convert: convert VCF/BCF files to different formats and back

    • bcftools isec: intersections of VCF/BCF files

    • bcftools merge: merge VCF/BCF files from non-overlapping sample sets

    • bcftools norm: left-align and normalize indels

    • bcftools plugin: user-defined plugins

    • bcftools query: transform VCF/BCF into user-defined formats

    • bcftools reheader: modify VCF/BCF header, change sample names

    • bcftools sort: sort VCF/BCF file

    • bcftools view: VCF/BCF conversion, view, subset, and filter VCF/BCF files

  • The VCF/BCF analysis group contains 10 commands, listed below:

    • bcftools call: SNP/indel calling

    • bcftools consensus: create a consensus sequence by applying VCF variants

    • bcftools cnv: HMM CNV calling

    • bcftools csq: call variation consequences

    • bcftools filter: filter VCF/BCF files using fixed thresholds

    • bcftools gtcheck: check sample concordance, detect sample swaps and contamination

    • bcftools mpileup: multi-way pileup producing genotype likelihoods

    • bcftools polysomy: detect number of chromosomal copies

    • bcftools roh: identify runs of autozygosity (HMM)

    • bcftools stats: produce VCF/BCF stats
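
To give a feel for how a few of the commands listed above are invoked, here is a small sketch using query, view, sort, and stats. The input file cmds_demo.vcf and its two records are made up for illustration, as are the output file names. The bcftools steps are skipped if bcftools is not installed.

```shell
#!/bin/sh
# A tiny, made-up VCF with one SNP and one insertion (tab-separated columns).
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf '1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
  printf '1\t200\t.\tC\tCT\t10\tPASS\t.\tGT\t0/0\t0/1\n'
} > cmds_demo.vcf

if command -v bcftools >/dev/null 2>&1; then
  # bcftools query: print selected fields in a user-defined format.
  bcftools query -f '%CHROM\t%POS\t%REF\t%ALT\n' cmds_demo.vcf

  # bcftools query -l: list the sample names stored in the file.
  bcftools query -l cmds_demo.vcf

  # bcftools view: subset to SNPs only (-v snps), without the header (-H).
  bcftools view -H -v snps cmds_demo.vcf

  # bcftools sort: sort records by position and write a new VCF (-Ov).
  bcftools sort -Ov -o cmds_demo.sorted.vcf cmds_demo.vcf

  # bcftools stats: write summary statistics to a text file.
  bcftools stats cmds_demo.vcf > cmds_demo.stats.txt
else
  echo "bcftools not found; skipping the bcftools steps" >&2
fi
```

Most commands follow the same pattern: bcftools, then the command name, then options, then the input file. Running a command without arguments prints its usage summary.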

This is the first blog post in the series, and I wanted it to be a short introduction. In each following post, I will present one of the commands above with practical examples. As we go, I will probably also expand this article with additional information and explanations.



