wiki:BigComputeTemplates

Version 2 (modified by Barbera van Schaik, 13 years ago)

Script templates

Implemented components based on the Groningen pipeline

Template (grid component, workflow location)

Alignment, realignment, recalibration, stats

  • pe0--fastqc.ftl (FastqToFastQC, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/quality/Workflow/FastqToFastQC.gwendia)
  • pe00-bwa-align-pair1.ftl (BwaIllumina, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/BwaIllumina.gwendia)
  • pe01-bwa-align-pair2.ftl (BwaIllumina, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/BwaIllumina.gwendia)
  • pe02-bwa-sampe.ftl (BwaIllumina, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/BwaIllumina.gwendia)
  • pe03-sam-to-bam.ftl (BwaIllumina, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/BwaIllumina.gwendia)
  • pe04a-HsMetrics.ftl (CalculateHsMetrics, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/CalculateHsMetrics.gwendia)
  • pe04b-picardQC.ftl (PicardQC, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/PicardQC.gwendia)
  • pe04-sam-sort.ftl (SamSort, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/SamSort.gwendia)
  • pe05-mark-duplicates.ftl (MarkDuplicates, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/MarkDuplicates.gwendia)
  • pe06-realign.ftl (ReAlign, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/ReAlign.gwendia)
  • pe07-fixmates.ftl (FixMates, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/FixMates.gwendia)
  • pe08-covariates-before.ftl (GatkRecalibrateAllSteps, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/GatkRecalibrateAllSteps.gwendia)
  • pe09-recalibrate.ftl (GatkRecalibrateAllSteps, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/GatkRecalibrateAllSteps.gwendia)
  • pe10-sam-sort.ftl (GatkRecalibrateAllSteps, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/GatkRecalibrateAllSteps.gwendia)
  • pe11-covariates-after.ftl (GatkRecalibrateAllSteps, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/GatkRecalibrateAllSteps.gwendia)
  • pe12-analyze-covariates.ftl (GatkRecalibrateAllSteps, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/GatkRecalibrateAllSteps.gwendia)

Merge BAM files per sample and perform SNP and indel calling

  • vc00a-unified-genotyper.ftl to do
  • vc00b-variant-filtration.ftl to do
  • vc00c-variant-eval.ftl to do
  • vc00d-picardMetrics.ftl (PicardQC, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/PicardQC.gwendia)
  • vc00-merge.ftl to do
  • vc00.merge.ftl to do
  • vc01-coverage.ftl to do
  • vc01.unified_genotyper.ftl to do
  • vc02.picardQC.ftl (PicardQC, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/PicardQC.gwendia)
  • vc02-realigner-target-creator.ftl to do
  • vc03.coverage.ftl to do
  • vc03-realign.ftl (ReAlign, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/ReAlign.gwendia)
  • vc04-fixmates.ftl (FixMates, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/FixMates.gwendia)
  • vc05-indel-genotyper-v2.ftl to do
  • vc06-filter-indels.ftl to do
  • vc07-unified-genotyper.ftl to do
  • vc08-make-indel-mask.ftl to do
  • vc09-variant-filtration.ftl to do
  • vc10-variant-eval.ftl to do
  • vc11-name-sort-bam.ftl (SamSort, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/SamSort.gwendia)
  • Pindel (Pindel, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/pindel/Workflows/Pindel.gwendia)
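
The entries in the two lists above follow a regular pattern: a template name followed either by "(grid component, lfn://...)" or by "to do". As a minimal sketch, assuming the list is available as plain text lines, the mapping from grid component to templates could be extracted as below. The function name parse_entries and the regular expressions are hypothetical and not part of the pipeline itself.

```python
import re
from collections import defaultdict

# Hypothetical patterns for the two kinds of entries on this page.
# "template (Component, lfn://...)" is an implemented template;
# "template to do" has no grid component yet.
ENTRY = re.compile(
    r"^\s*(?:•\s*)?(?P<template>\S+)\s+"
    r"\((?P<component>\w+),\s*(?P<lfn>lfn://\S+)\)\s*$"
)
TODO = re.compile(r"^\s*(?:•\s*)?(?P<template>\S+)\s+to do\s*$")


def parse_entries(lines):
    """Return (implemented, todo).

    implemented maps a grid component name to the templates that use it;
    todo lists the templates still marked "to do".
    """
    implemented = defaultdict(list)
    todo = []
    for line in lines:
        match = ENTRY.match(line)
        if match:
            implemented[match.group("component")].append(match.group("template"))
            continue
        match = TODO.match(line)
        if match:
            todo.append(match.group("template"))
    return dict(implemented), todo


# Three entries copied from the lists above.
sample = [
    "pe06-realign.ftl (ReAlign, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/ReAlign.gwendia)",
    "vc03-realign.ftl (ReAlign, lfn://lfc.grid.sara.nl:5010/grid/vlemed/AMC-e-BioScience/Sequence_WF/gvnl/Workflows/ReAlign.gwendia)",
    "vc05-indel-genotyper-v2.ftl to do",
]
implemented, todo = parse_entries(sample)
```

Grouping by component makes it easy to see that one grid workflow (for example ReAlign) serves several templates, and which templates still lack a component.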

Other workflow components

The following other workflow components are available:

  • Splitting of fastq files
  • Building a BWA index on the genome sequence (base space and color space)
  • BWA for shotgun reads (base space and color space). Parameter sweeps are possible. Output is in BAM format
  • Merge BAM results
  • Samtools pileup
  • VarScan (pileup to SNP, indel and consensus (cns) calls)
  • Bam2coverage: creates a UCSC wiggle file to display the genome coverage (per 50 kbp)
  • Coverage-per-base: determines the coverage for every base in the genome and summarizes the results (coverage versus frequency)
  • ANNOVAR: a pipeline to annotate variants (gene, dbSNP, HapMap, 1000 Genomes, conservation, etc.). It currently works for hg18; support for other assemblies is in progress
  • FastQC
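
The two coverage components in the list above can be illustrated with a small sketch. It is not the code of the actual grid components. It assumes the per-base coverage of one chromosome is already available as a list of integers, and the function names are hypothetical. The first function summarizes coverage versus frequency, the second averages coverage per window (50 kbp by default), and the third writes the window means as a UCSC fixedStep wiggle block (without a track line).

```python
from collections import Counter


def coverage_histogram(depths):
    """Map each coverage value to the number of bases with that coverage."""
    return dict(sorted(Counter(depths).items()))


def window_means(depths, window=50_000):
    """Mean coverage per consecutive window; the last window may be shorter."""
    means = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


def to_wiggle(chrom, means, window=50_000):
    """Format window means as a fixedStep wiggle block (positions are 1-based).

    Assumption: every window is reported with the full span, so the last
    data line may nominally extend past the end of the chromosome.
    """
    header = f"fixedStep chrom={chrom} start=1 step={window} span={window}"
    return "\n".join([header] + [f"{mean:.2f}" for mean in means])


# Toy example with a window of 4 bases instead of 50 kbp.
depths = [0, 0, 1, 2, 2, 2]
histogram = coverage_histogram(depths)   # {0: 2, 1: 1, 2: 3}
means = window_means(depths, window=4)   # [0.75, 2.0]
wiggle = to_wiggle("chr1", means, window=4)
```

A real run would read the depths from a pileup or BAM file rather than from a list in memory; the sketch only shows the summarizing step.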