Table 1

Extracted information from three types of input files

| Data source | Data type | Extracted information |
| --- | --- | --- |
| Annotated variant files | Annotation | Chromosome, start position, end position, reference allele, alternative allele, allele frequency in the 1000 Genomes Project and the NHLBI-ESP6500 project, ClinVar, biological function predictions (such as SIFT, PolyPhen and CADD scores) and many others |
| VCF files | Variation | Sample family ID, individual ID, called variant genotypes, read depths and Phred quality scores |
| BAM files | Coverage (read depth) | Coverage of each site of every sequenced sample (∼3 billion sites per WGS sample) |
  • WGS, whole-genome sequencing.
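
To make the VCF row of the table concrete, the following is a minimal sketch of how per-sample genotypes, read depths and Phred-scaled genotype qualities could be read from one VCF data line. It is not the authors' implementation. The function names `parse_vcf_record` and `_to_int`, the sample IDs `S1` and `S2`, and the example record are all hypothetical, and the sketch assumes the standard VCF FORMAT keys `GT`, `DP` and `GQ` are present.

```python
# Illustrative sketch only: extract genotype, read depth and genotype quality
# per sample from a single VCF data line. Assumes FORMAT keys GT, DP and GQ.


def _to_int(value):
    """Convert a VCF numeric field to int; '.' or a missing key becomes None."""
    if value is None or value == ".":
        return None
    return int(value)


def parse_vcf_record(header_line, record_line):
    """Return site fields plus a per-sample dict of GT, DP and GQ."""
    columns = header_line.lstrip("#").rstrip("\n").split("\t")
    fields = record_line.rstrip("\n").split("\t")
    sample_ids = columns[9:]            # sample columns follow the FORMAT column
    format_keys = fields[8].split(":")  # e.g. ["GT", "DP", "GQ"]

    samples = {}
    for sample_id, raw in zip(sample_ids, fields[9:]):
        values = dict(zip(format_keys, raw.split(":")))
        samples[sample_id] = {
            "genotype": values.get("GT"),
            "read_depth": _to_int(values.get("DP")),
            "genotype_quality": _to_int(values.get("GQ")),
        }

    return {
        "chrom": fields[0],
        "pos": int(fields[1]),
        "ref": fields[3],
        "alt": fields[4].split(","),  # multi-allelic sites list several ALTs
        "samples": samples,
    }


# Hypothetical two-sample record; S2 has a missing call.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2"
record = "1\t12345\t.\tA\tG\t50\tPASS\t.\tGT:DP:GQ\t0/1:32:99\t./.:.:."
parsed = parse_vcf_record(header, record)
print(parsed["samples"]["S1"])
```

Missing values (`.`) are mapped to `None` rather than zero so that an uncalled site is not confused with a site that has zero coverage. For real data, an established VCF parser is likely a better choice than hand-splitting lines.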