dbVar FAQs

This page will be periodically updated to include frequently asked questions (FAQs).
  1. What is 'dbVar'?
  2. How does dbVar differ from the Database of Genomic Variants (DGV)?
  3. What is ‘structural variation’?
  4. What types of structural variation data does dbVar accept?
  5. Does it matter how I detected the structural variation?
  6. Can I submit structural variation data from any organism?
  7. Can I submit structural variation data from human clinical or cancer studies?
  8. Does dbVar distinguish between pathogenic and benign variants?
  9. Does dbVar accept genotype data?
  10. Will I get a unique dbVar ID to use in publications?
  11. What is the difference between the dbVar accessions?
  12. Why is the data in dbVar sometimes different from that provided in the publication?
  13. Why do some dbVar variants have >1 location?
  14. What's the difference between an nsv and an nssv?
  15. Can I download the whole database?
  16. How do I know if a given variant(s) in dbVar is real (or of high quality)?
  17. How do I find data from a given region or gene?
  18. Is there a way to integrate dbVar data with dbSNP data?
  19. Can I include any clinical information in my data submission?
  20. My submission contains sensitive private clinical information. How does dbVar guarantee the information will remain private?
  21. What is the smallest size structural variant dbVar accepts?
  22. My study has identified novel insertion sequences and I don't have a genomic coordinate for these. How can I submit these to dbVar?
  23. The number of "Other Calls in this Sample" in the Variant Call Information table of the Variant Page is different from the number of variants that are returned when I do a dbVar search for the same study and sample ID. What's going on?
  24. How does dbVar place data submitted on one assembly (e.g., NCBI36) on other assemblies (e.g., GRCh37)?
  25. How do I submit to dbVar?

  1. What is 'dbVar'?

    dbVar is the NCBI database of genomic structural variation. For information on how to navigate dbVar see the dbVar Help page.

  2. How does dbVar differ from the Database of Genomic Variants (DGV)?

    DGV has been a useful resource for the human genetics community, collecting and curating human structural variation data. DGV is now working with DGVa to extend its service, and we are working with DGVa to exchange data. Additionally, DGV contains data only from healthy control human samples, while dbVar accepts data from all species and includes clinical data.

  3. What is ‘structural variation’?

    Structural variation (SV) is generally defined as a region of DNA involved in an inversion, a balanced translocation, or a genomic imbalance (insertion or deletion). Genomic imbalances are commonly referred to as copy number variants (CNVs). For more information see the Overview of Structural Variation page.

  4. What types of structural variation data does dbVar accept?

    dbVar is a structural variation database designed to store data on variant DNA ≥ 1 bp in size. Practically speaking, we recommend submitting variation data that is >50 bp to dbVar and variation data that is <50 bp to dbSNP. We can accept diverse types of events, including inversions, insertions and translocations. Additionally, both germline and somatic variants are accepted.

  5. Does it matter how I detected the structural variation?

    No. dbVar accepts submissions from whole genome comparative studies such as computational sequence analysis (including paired-end mapping and SNP genotyping data analysis) and microarray experiments (including BAC, cDNA and oligo array Comparative Genomic Hybridization (aCGH), Representational Oligonucleotide Microarray Analysis (ROMA) and SNP). We also accept locus- and gene-specific submissions from quantitative studies such as qPCR, Multiplex Ligation-dependent Probe Amplification (MLPA) and expression analysis. If your method is not listed or the submission forms are inadequate for your submission, please contact dbVar.

  6. Can I submit structural variation data from any organism?

    Yes, dbVar accepts data from all organisms. If the fields are inadequate or missing for submission of your organism's data, please contact dbVar.

  7. Can I submit structural variation data from human clinical or cancer studies?

    Yes, dbVar can hold clinical data. However, there are patient privacy requirements to which we must adhere. If the patients involved in the study have not consented to having their genomes fully available in a public sequence archive, we must remove identifying information from the record. In practice, this means making sure that two variants are not associated as originating from the same patient. The pathway for this is to submit data to dbGaP. dbGaP will hold identifying data that will be available to approved investigators. dbGaP will anonymize the data and submit variant summary data to dbVar.

  8. Does dbVar distinguish between pathogenic and benign variants?

    Yes, dbVar will clearly mark and allow searching for variants that are known to be pathogenic, providing links to OMIM when available.

  9. Does dbVar accept genotype data?

    Yes, dbVar can accept genotype data although the data model is still under development. Please email dbVar as we would like to work with you on how best to represent your data.

  10. Will I get a unique dbVar ID to use in publications?

    Yes, dbVar will provide a unique accession number for each study, each submitted variant region and each supporting level variant.

  11. What is the difference between the dbVar accessions?

    dbVar collaborates with DGVa at EBI to accession genomic structural variants. Accessions prefixed with an 'n' have been processed by NCBI (dbVar); accessions prefixed with an 'e' have been processed by EBI (DGVa). Both NCBI and EBI provide three levels of accessions:

    1. (n|e)std: the study id - this identifies a submitted study
    2. (n|e)sv: the structural variant id - this identifies the submitted region of variation
    3. (n|e)ssv: the supporting structural variant id - this identifies the supporting regions of variation (often sample-specific) that were used to call the submitted region of variation

  12. Why is the data in dbVar sometimes different from that provided in the publication?

    When a study is submitted to dbVar after publication, our quality control checks during loading may uncover errors. In these cases submitters will be contacted and the errors corrected. In other cases a submitter may detect errors before submitting or may decide to edit their data. Often this will be documented in the study record.

  13. Why do some dbVar variants have >1 location?

    There are two reasons:

    When a submitter gives us data on an assembly obtained from UCSC, we translate this into native assembly coordinates and map sequences (chromosomes and unplaced/unlocalized sequences) to their accession.versions. UCSC concatenates unplaced/unlocalized sequences into pseudo-scaffold objects they call chr*_random. In some cases, submitters have provided data that cross gaps on the 'chr*_random' sequences, meaning that the feature actually maps to two different, unrelated sequences.

    More than one location may also be provided if the variant is the result of a transposition event. In this case, coordinates from both the donor and recipient sites are provided, to retain as much information about the variant event as possible.

  14. What's the difference between an nsv and an nssv?

    nsv and nssv are accession prefixes for variant regions and variant calls (or instances), respectively. Typically, one or more variant instances (nssv – variant calls based directly on experimental evidence) are merged into a variant region (nsv – a pair of start-stop coordinates reflecting the submitters’ assertion of the region of the genome that is affected by the variant instances). The ‘n’ preceding sv or ssv indicates that the variants were submitted to NCBI (dbVar). esv and essv represent the same variant entities, but those that were submitted to EBI (DGVa).

    Please see Overview of Structural Variation for more information.
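
    The prefix scheme described above can be read mechanically. The following is a minimal Python sketch, not an official dbVar tool: the function name describe_accession and the assumption that every accession is a prefix followed only by digits are illustrative choices, not something this FAQ specifies.

```python
import re

# Assumed accession shape: 'n' or 'e', then 'std', 'sv' or 'ssv', then digits.
# The digits-only suffix is an assumption made for this illustration.
ACCESSION_RE = re.compile(r"^(n|e)(std|ssv|sv)(\d+)$")

ARCHIVES = {"n": "NCBI (dbVar)", "e": "EBI (DGVa)"}
LEVELS = {"std": "study", "sv": "variant region", "ssv": "variant call"}


def describe_accession(accession):
    """Split an accession such as 'nsv123' into archive, level and number."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not a recognised accession: {accession!r}")
    archive, level, number = match.groups()
    return {
        "archive": ARCHIVES[archive],
        "level": LEVELS[level],
        "number": int(number),
    }


if __name__ == "__main__":
    # The accession numbers below are made up for the example.
    for acc in ("nstd1", "nsv123", "essv45"):
        print(acc, describe_accession(acc))
```

    The alternation lists 'ssv' before 'sv' only for readability; the two prefixes cannot be confused because the character after the first letter differs between 'nsv…' followed by a digit and 'nssv…'.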

  15. Can I download the whole database?

    All data are available on our FTP site, on a per-study basis. Additionally, we have used the NCBI Remap Service to project submitted data onto current assemblies: NCBI36 (hg18) and GRCh37 (hg19) for human, and the current reference for all other organisms. By Nov. 2011 you will be able to download from all studies on a per-assembly basis.

  16. How do I know if a given variant(s) in dbVar is real (or of high quality)?

    dbVar is an archive. We report variants as they are submitted to us, usually in association with a peer-reviewed publication. The veracity of the data is the responsibility of the submitting investigators. dbVar accepts (but does not require) information concerning any validation methods and results the submitter may have used to confirm variant calls. These validation data are presented as an integral part of the study data. To be listed as “validated,” a variant must have been confirmed by at least one additional independent method. If you have concerns regarding a particular variant or data set, we recommend you contact the submitter for supporting information.

  17. How do I find data from a given region or gene?

    We are currently working on this functionality. You can perform searches using gene names; the results that are returned will include all variants that overlap the gene, and the studies with which the variants are associated. We anticipate being able to offer "search by location" by early 2012.

  18. Is there a way to integrate dbVar data with dbSNP data?

    At present, integration has to be done manually by the user. NCBI is working on a Variation Portal that will facilitate the viewing and downloading of integrated SNP and SV data.

  19. Can I include any clinical information in my data submission?

    As long as your submission does not contain personally identifiable information, we encourage the inclusion of clinical information in dbVar submissions. Usually this takes the form of established clinical vocabularies like HPO or MeSH. In addition, we encourage the reporting of clinical significance associated with a particular variant, when known.

  20. My submission contains sensitive private clinical information. How does dbVar guarantee the information will remain private?

    Unless the participants in a given study have consented to have their genetic information published in an online public database, we cannot accept any information that can be used to trace back to an individual. If such consent has not been given, or if you are concerned that your study contains personally identifiable information, we recommend you first submit your data to dbGaP.

  21. What is the smallest size structural variant dbVar accepts?

    There are no size restrictions on structural variation data. We recommend that variants smaller than 50 bp be submitted to dbSNP, but we will accept variants as small as a single base pair as long as the variant is an insertion or deletion, not a single nucleotide change.
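
    As a rough illustration of the size guidance above, here is a small Python sketch. The function name submission_guidance and its return format are hypothetical, and treating a variant of exactly 50 bp as a dbVar submission is an assumption, since the text only says "smaller than 50 bp".

```python
def submission_guidance(size_bp, is_single_nucleotide_change=False):
    """Return which archive the FAQ guidance points to for a variant.

    Assumption: a variant of exactly 50 bp is treated as a dbVar
    submission; the FAQ only states that variants smaller than 50 bp
    are recommended for dbSNP.
    """
    if size_bp < 1:
        raise ValueError("variant size must be at least 1 bp")
    if is_single_nucleotide_change:
        # Single nucleotide changes are not accepted by dbVar.
        return {"accepted_by_dbvar": False, "recommended": "dbSNP"}
    if size_bp < 50:
        # Small insertions or deletions are accepted, but dbSNP is recommended.
        return {"accepted_by_dbvar": True, "recommended": "dbSNP"}
    return {"accepted_by_dbvar": True, "recommended": "dbVar"}


if __name__ == "__main__":
    print(submission_guidance(1, is_single_nucleotide_change=True))
    print(submission_guidance(12))
    print(submission_guidance(5000))
```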

  22. My study has identified novel insertion sequences and I don't have a genomic coordinate for these. How can I submit these to dbVar?

    If you have novel insertion sequence data, please submit it first as a WGS Project. This will give all of your novel insertion sequences unique identifiers that can then be tracked. You can then submit your data to dbVar and these novel insertion sequences can reference the sequence identifiers obtained from the WGS submission. With stable sequence identifiers, we may be able to map the sequence to updated assemblies and obtain a chromosome context for this sequence.

  23. The number of "Other Calls in this Sample" in the Variant Call Information table of the Variant Page is different from the number of variants that are returned when I do a dbVar search for the same study and sample ID. What's going on?

    The number in the table on the Variant Page indicates the number of Variant Calls (SSVs) in the sample. A search for the study and sample ID returns the number of Variant Regions (SVs) after similar calls have been merged into regions.
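
    The difference between the two counts can be shown with a toy example. The Python sketch below uses made-up accessions and a hypothetical helper, count_calls_and_regions; it assumes each variant call is paired with the variant region it was merged into, which is a simplification for illustration.

```python
def count_calls_and_regions(calls):
    """Count variant calls (SSVs) and the distinct regions (SVs) they merge into.

    `calls` is a list of (call_accession, region_accession) pairs for one sample.
    """
    n_calls = len(calls)
    n_regions = len({region for _call, region in calls})
    return n_calls, n_regions


if __name__ == "__main__":
    # Made-up accessions: two similar calls were merged into the same region.
    sample_calls = [
        ("nssv1", "nsv1"),
        ("nssv2", "nsv1"),
        ("nssv3", "nsv2"),
    ]
    n_calls, n_regions = count_calls_and_regions(sample_calls)
    print(f"{n_calls} variant calls, {n_regions} variant regions")
```

    In this example the Variant Page would correspond to the first number (3 calls) and the search result to the second (2 regions).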

  24. How does dbVar place data submitted on one assembly (e.g., NCBI36) on other assemblies (e.g., GRCh37)?

    dbVar uses NCBI's Remap tool to map variants between assemblies. All variants reported in human-based studies are automatically remapped to both NCBI36 and GRCh37. More information on using the tool and a list of supported assemblies can be found on the Remap website.

  25. How do I submit to dbVar?

    Submitting data to dbVar is not difficult. Complete one of the templates we provide (Excel, Tab-delimited, or XML) and email it to dbvar@ncbi.nlm.nih.gov. Please see the dbVar Submission Information page for more information.

Last updated: Mon, 2012-06-25 16:15