Laboratory and Bioinformatics Staff

The Secondary Genomics Finding Service (SGFS) will annotate your existing research exome/genome sequence data for the presence of possible actionable secondary variants in a list of genes that the SGFS will develop, curate, and update. The current list is based on the genes and variant types specified by the most recent ACMG clinical ES/GS secondary findings report (Kalia et al., 2017). The SGFS will re-evaluate and update this list at least annually and will notify currently participating PIs and all IRP IRBs of any changes.

Instructions

  1. Please review the guidance document in the resources section below. 
  2. Complete the application form and send via email to SGFS staff.
  3. After confirmation of an accepted application, complete the attached Data Submission Form and send to SGFS staff prior to submitting data.

Contact Julie Sapp for more information.

Sequence Data Submission

Prior to data submission, please submit a completed and initialed SGFS Data Submission Form to Julie Sapp. Please ensure that your data do not include any participant identifiers. Files must be submitted in variant call format (.vcf) and should be restricted to the coding/splice regions of the genes included in the ACMG list of genes for return of secondary variants. See below for additional details:

VCF (Variant Call Format) file requirements

  1. VCF file
    1. If there is only one sample, a single VCF file for the patient is acceptable.
    2. If there is more than one sample, please send a single multi-sample VCF containing all samples to be analyzed. Please do not send individual VCF files.
  2. Bioinformatics pipeline
    1. We strongly recommend generating the VCF file with the GATK best practices pipeline.
    2. For other pipelines, please apply that pipeline's recommended QC filters to exclude low-quality variants.
  3. Reference genome
    1. GRCh37, hg19, or another variant of the GRCh37/hg19 human reference genome (e.g. b37) is acceptable.
    2. Please note that hg38 is not accepted and will not be processed. If the VCF file is in hg38 coordinates, 1) lift over the coordinates to GRCh37/hg19, then 2) proceed to the “Pre-processing a VCF file” section.
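
Before submitting, it may help to check the sample count and the likely reference build from the VCF header. The following is a minimal Python sketch, not an SGFS tool. The function names are hypothetical, and the approach assumes that the header carries `##contig` lines with a `length` field for chromosome 1, which not every pipeline writes. The two chromosome 1 lengths are those of the GRCh37 and GRCh38 assemblies.

```python
# Chromosome 1 length differs between assemblies, so it works as a fingerprint.
CHR1_LENGTHS = {249250621: "GRCh37/hg19", 248956422: "GRCh38/hg38"}


def parse_contig(line):
    """Turn '##contig=<ID=1,length=249250621>' into {'ID': '1', 'length': '249250621'}."""
    body = line.strip()[len("##contig=<"):-1]
    return dict(field.split("=", 1) for field in body.split(",") if "=" in field)


def guess_build(header_lines):
    """Return a build label, or None when the header gives no usable hint."""
    for line in header_lines:
        if not line.startswith("##contig=<"):
            continue
        contig = parse_contig(line)
        if contig.get("ID") in ("1", "chr1") and contig.get("length", "").isdigit():
            return CHR1_LENGTHS.get(int(contig["length"]))
    return None


def count_samples(header_lines):
    """Count sample columns on the #CHROM line (the first 9 columns are fixed)."""
    for line in header_lines:
        if line.startswith("#CHROM"):
            return max(len(line.rstrip("\n").split("\t")) - 9, 0)
    return 0


if __name__ == "__main__":
    header = [
        "##fileformat=VCFv4.2\n",
        "##contig=<ID=1,length=249250621>\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
    ]
    print(guess_build(header), count_samples(header))
```

A result of `None` only means the header was uninformative; in that case the build has to be confirmed from the pipeline's own records.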

Pre-processing a VCF file

  1. The VCF file must be restricted to the ACMG 59 genes. Coordinates are provided as a BED file, ACMG_59_isplice2_esplice2.bed. Please check that the chromosome notations are compatible (e.g. “chr1” vs “1”) when using the BED file to subset the VCF file.
  2. Filter the VCF file to include only high-quality (“PASS”) variants. If the GATK pipeline was not used, please keep only the variants that the method used designates as high quality.
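
The two pre-processing steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the SGFS procedure: the function names are hypothetical, the input is an uncompressed VCF read line by line, chromosome names are harmonized by stripping a leading “chr”, and a record is kept only when its POS falls inside a BED interval (a deletion that starts upstream of an interval and extends into it would be missed).

```python
def norm_chrom(name):
    """Strip a leading 'chr' so that 'chr1' and '1' compare equal."""
    return name[3:] if name.lower().startswith("chr") else name


def load_bed(bed_lines):
    """Read BED intervals into {chrom: [(start, end), ...]} (0-based, end-exclusive)."""
    regions = {}
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(norm_chrom(chrom), []).append((int(start), int(end)))
    return regions


def filter_vcf(vcf_lines, regions):
    """Yield header lines plus PASS records whose POS lies inside a BED interval."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        chrom, pos, filt = norm_chrom(fields[0]), int(fields[1]), fields[6]
        if filt != "PASS":
            continue
        # VCF POS is 1-based; BED start is 0-based and BED end is exclusive.
        if any(start < pos <= end for start, end in regions.get(chrom, [])):
            yield line


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python subset_vcf.py regions.bed input.vcf > output.vcf
    with open(sys.argv[1]) as bed, open(sys.argv[2]) as vcf:
        sys.stdout.writelines(filter_vcf(vcf, load_bed(bed)))
```

In practice, established tools such as bcftools are typically used for region subsetting and filtering; the sketch is meant only to make the coordinate and notation checks explicit.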

Mode of transferring files

  1. For NIH investigators, please use NIH Secure Email to send the VCF file to henoke.shiferaw@nih.gov.
  2. For non-NIH investigators, Globus is available to any investigator sending files to NIH (no license required). A Globus endpoint to which the file can be transferred will be shared.

Last updated: March 12, 2020