GenBank Submission Types

Standard

GenBank accepts mRNA or genomic sequence data directly determined by the submitter. The submission must include information about the source organism, along with annotation provided by the submitter. More details about adding annotation, as well as sample files, can be found in the GenBank Submissions Handbook. If you have any questions about the best method for submitting your data, please contact our user services group at info@ncbi.nlm.nih.gov.

The following types of data are not accepted by GenBank:

  • Noncontiguous sequences
  • Primer sequences
  • Protein sequences with no underlying nucleotide submission
  • Sequences containing a mix of genomic and mRNA sequence
  • Sequences without a physical counterpart (consensus sequences)
  • Sequences shorter than 200 nucleotides
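
Of the criteria above, the minimum length is the easiest to check locally before preparing a submission. The following is a minimal sketch in Python, not an official NCBI validation tool: it assumes the sequences are held in FASTA-formatted text, and the names parse_fasta, too_short, and MIN_LENGTH are hypothetical, chosen only for this illustration. It does not test any of the other criteria in the list.

```python
MIN_LENGTH = 200  # minimum length in nucleotides, taken from the list above


def parse_fasta(text):
    """Return a list of (identifier, sequence) pairs from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new definition line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            fields = line[1:].split()
            header = fields[0] if fields else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def too_short(records, min_length=MIN_LENGTH):
    """Return identifiers of records whose sequence is shorter than min_length."""
    return [seq_id for seq_id, seq in records if len(seq) < min_length]


if __name__ == "__main__":
    # Two made-up records: one of 240 nucleotides, one of 40.
    example = ">seq1\n" + "ACGT" * 60 + "\n>seq2\n" + "ACGT" * 10 + "\n"
    print(too_short(parse_fasta(example)))
```

A record of exactly 200 nucleotides passes this check, since the criterion excludes only sequences with length less than 200.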

Raw sequence reads from next-generation sequencing platforms should be submitted to the Sequence Read Archive (SRA).

Sequence data not directly obtained by the submitter may be acceptable for the Third Party Annotation database.

EST, STS, GSS

Batches of ESTs (expressed sequence tags), STSs (sequence tagged sites), and GSSs (genome survey sequences) can be submitted via special streamlined procedures.

High-Throughput Genomic Sequences (HTGs)

Submissions of clone-based High-Throughput Genomic sequences (the clones are usually cosmids or BACs) can be generated using tbl2asn or Sequin. The HTGs page provides detailed submission instructions for genome centers.

Complete Microbial Genomes

The Bacterial Genome Submission Guidelines page provides a detailed guide to help bacterial genome submitters prepare their submissions using Sequin or tbl2asn.

Whole Genome Shotgun (WGS) Sequences

Genomic contig sequences built from overlapping reads, and assemblies from ongoing Whole Genome Shotgun (WGS) sequencing projects, can be submitted with or without annotation. Submissions should be updated as sequencing progresses and new assemblies are computed. Detailed submission instructions can be found in the WGS submission guide.

Transcriptome Shotgun Assembly (TSA) Sequences

Transcript contig sequences that were computationally assembled from overlapping reads can be submitted to TSA, provided the primary data have been submitted to dbEST, the Sequence Read Archive (SRA), or the Trace Archive. Detailed submission instructions can be found in the TSA submission guide.

Third Party Annotation (TPA)

The Third Party Annotation (TPA) database accepts third-party annotation of genomic sequences or of computationally derived or assembled sequences. TPA submissions must include sequence data that is already represented in GenBank, and the analysis upon which the annotations are based must appear in a peer-reviewed scientific journal. Detailed requirements and submission instructions can be found in the TPA submission guide.

Last updated: 2012-08-27T09:41:07-04:00