GCC BOSC 2018 has ended
The 2018 Galaxy Community Conference (GCC2018) and Bioinformatics Open Source Conference 2018 (BOSC2018) are meeting together in Portland, Oregon, United States, June 25-30, 2018.  There will be two days of training, a two+ day meeting, and four days of intense collaboration.  The meeting features joint & parallel sessions, shared keynotes, poster & demo sessions, birds-of-a-feather, and social events.  GCCBOSC is organized by Oregon Health & Science University and will be at Reed College.

Tuesday, June 26 • 3:30pm - 6:00pm
GATK4: What's new and how to run it




GATK4 is the brand new version of the Genome Analysis Toolkit (GATK), an open-source genomics software package focused on variant discovery. This workshop will highlight the key changes and updates in the new 4.0 version, which was released in January 2018 (see https://software.broadinstitute.org/gatk/gatk4).

The goal of the workshop is to equip participants with the essential know-how to get started with GATK4, whether or not they have used GATK before. We will walk participants through hands-on exercises aimed at developing familiarity with the GATK4 command line. We will also guide participants through several different ways of running GATK4 tools and pipelines on publicly available platforms, including but not limited to Galaxy. We hope this will help participants understand their options and choose the platform that best suits their needs and abilities.

The workshop will cover the following topics:

1. Introduction to GATK and the Best Practices:
  • Purpose, variant calling basics and standard data types

2. What's new in GATK4:
  • New syntax/invocations, performance improvements and tips & tricks for using GATK effectively
  • Expanded scope of analysis:
    • Scaling germline variant discovery with GenomicsDB
    • Calling somatic short variants with the new and improved Mutect2
    • Calling somatic copy number variants with GATK CNV

3. Options for running GATK:
  • Running tools individually on a laptop (with Docker)
  • Running Spark-capable tools individually on a Spark cluster (via Google Dataproc)
  • Running pipelines on a laptop using Cromwell
  • Running pipelines on Google Cloud using Cromwell + Pipelines API
  • Running pipelines on the FireCloud analysis portal through the web GUI
  • Running pipelines on FireCloud through the API + Python bindings
  • Running tools and pipelines on Galaxy
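
As a rough illustration of the first option above (running a single tool on a laptop with Docker), the following Python sketch assembles a GATK4-style command line and wraps it in a `docker run` call. It only builds and prints the command; nothing is executed. The `gatk ToolName -R ... -I ... -O ...` pattern reflects the general GATK4 invocation style, but the file names, the `/data` mount point, and the helper functions `gatk_command` and `in_docker` are hypothetical, and the image name `broadinstitute/gatk` is an assumption that should be checked against the current GATK documentation.

```python
import shlex


def gatk_command(tool, arguments, java_options=None):
    """Build a GATK4-style command line as a list of arguments.

    tool         -- tool name, e.g. "HaplotypeCaller"
    arguments    -- mapping of flag to value, e.g. {"-R": "reference.fasta"}
    java_options -- optional JVM options string, e.g. "-Xmx4g"
    """
    command = ["gatk"]
    if java_options:
        command += ["--java-options", java_options]
    command.append(tool)
    for flag, value in arguments.items():
        command += [flag, value]
    return command


def in_docker(command, image="broadinstitute/gatk", data_dir="/path/to/data"):
    """Wrap a command so it would run in a container with data_dir mounted at /data.

    The image name and paths are illustrative assumptions.
    """
    return ["docker", "run", "--rm", "-v", f"{data_dir}:/data", image] + command


if __name__ == "__main__":
    # Hypothetical input and output files under the mounted /data directory.
    call_variants = gatk_command(
        "HaplotypeCaller",
        {
            "-R": "/data/reference.fasta",
            "-I": "/data/sample.bam",
            "-O": "/data/sample.vcf.gz",
        },
        java_options="-Xmx4g",
    )
    # Print the full command for inspection rather than running it.
    print(shlex.join(in_docker(call_variants)))
```

Keeping the command as a list of arguments avoids shell quoting problems if it is later passed to `subprocess.run`; `shlex.join` is used here only to display it.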

Free credits will be provided for running on Google Cloud. For more information on FireCloud, see https://software.broadinstitute.org/firecloud.

Prerequisites:

  • Basic familiarity with the terms and concepts of genetics and genomics, including a high-level understanding of high-throughput sequencing technologies and the file formats used in genomic analysis.
  • Basic familiarity with the command line environment and usage of command line tools.
  • NO familiarity is expected with Spark or cloud computing concepts.


Geraldine Van der Auwera

Director of Outreach and Communications, Broad Institute Data Sciences Platform
I direct outreach and communication efforts for the software and services developed by the Data Sciences Platform at the Broad Institute, which include GATK, the Broad's open source toolkit for variant discovery analysis; the Cromwell/WDL workflow management system; and Terra.bio...

Kate Noblett

Broad Institute

Tuesday June 26, 2018 3:30pm - 6:00pm PDT
PAB 332 Performing Arts Building, Reed Campus