Search across databases
Welcome to the Entrez cross-database search page
PubMed: biomedical literature citations and abstracts
PubMed Central: free, full text journal articles
Site Search: NCBI web and FTP sites
Books: online books
OMIM: online Mendelian Inheritance in Man

Nucleotide: core subset of nucleotide sequence records
EST: Expressed Sequence Tag records
GSS: Genome Survey Sequence records
Protein: sequence database
Genome: whole genome sequences
Structure: three-dimensional macromolecular structures
Taxonomy: organisms in GenBank
SNP: short genetic variations
dbVar: genomic structural variation
Gene: gene-centered information
SRA: Sequence Read Archive
BioSystems: pathways and systems of interacting molecules
HomoloGene: eukaryotic homology groups
Probe: sequence-specific reagents
BioProject: aggregated biological research project data
dbGaP: genotype and phenotype
UniGene: gene-oriented clusters of transcript sequences
CDD: conserved protein domain database
Clone: integrated data for clone resources
UniSTS: markers and mapping data
PopSet: population study data sets
GEO Profiles: expression and molecular abundance profiles
GEO DataSets: experimental sets of GEO data
Epigenomics: epigenetic maps and data sets
PubChem BioAssay: bioactivity screens of chemical substances
PubChem Compound: unique small molecule chemical structures
PubChem Substance: deposited chemical substance records
Protein Clusters: a collection of related protein sequences
OMIA: online Mendelian Inheritance in Animals
BioSample: biological material descriptions

NLM Catalog: catalog of books, journals, and audiovisuals in the NLM collections
MeSH: detailed information about NLM's controlled vocabulary



PubMed, a service of the National Library of Medicine, provides access to over 12 million MEDLINE citations dating back to the mid-1960s, as well as citations from additional life science journals. PubMed includes links to many sites providing full text articles and other related resources.

PubMed Central

PubMed Central (PMC) is the U.S. National Library of Medicine's digital archive of life sciences journal literature.
Access to the full text of articles in PMC is free, except where a journal requires a subscription for access to recent articles.

Site Search

Detailed search of the NCBI web and FTP sites.


In collaboration with authors and publishers, the National Center for Biotechnology Information (NCBI) is adapting biomedical Books for the web.


Online Mendelian Inheritance in Man (OMIM) is a catalog of human genes and genetic disorders, with links to literature references, sequence records, maps, and related databases. It is based on the book, Mendelian Inheritance in Man. The online version is updated daily. The OMIM FAQs provide additional information about the book and the online database.


Online Mendelian Inheritance in Animals (OMIA) is a database of genes, inherited disorders and traits in animal species (other than human and mouse) authored by Professor Frank Nicholas of the University of Sydney, Australia, with help from many people over the years. The database contains textual information and references, as well as links to other relevant records.


The Nucleotide database contains records for all Entrez Nucleotide sequences that are not found within the Expressed Sequence Tag (EST) or Genome Survey Sequence (GSS) divisions of GenBank. These include sequences from all remaining divisions of GenBank, NCBI Reference Sequences (RefSeqs), Whole Genome Shotgun (WGS) sequences, Third Party Annotation (TPA) sequences, and sequences imported from the Entrez Structure database.


The EST database contains all records found within the Expressed Sequence Tag (EST) division of GenBank. EST records contain first-pass single-read cDNA sequences and include no annotated biological features.


The GSS database contains all records found within the Genome Survey Sequence (GSS) division of GenBank. GSS records contain first-pass single-read genomic sequences and rarely include annotated biological features.


The Protein entries in the Entrez search and retrieval system have been compiled from a variety of sources, including SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq.


The Epigenomics database enables users to explore and visualize richly annotated epigenomics datasets. It provides a unique interface to search and navigate epigenomic data in the context of biological sample information, as well as tools to select, download and view multiple sets of epigenomic data as tracks on genome browsers.


The whole Genomes of over 1000 viruses and over 100 microbes can be found in Entrez Genome. The genomes represent both completely sequenced organisms and those for which sequencing is in progress. All three main domains of life - bacteria, archaea, and eukaryota - are represented, as well as many viruses and organelles.


Structure: The Molecular Modeling Database (MMDB) contains 3-D macromolecular structures, including proteins and polynucleotides. MMDB contains over 20,000 structures and is linked to the rest of the NCBI databases, including sequences, bibliographic citations, taxonomic classifications, and sequence and structure neighbors.


The NCBI Taxonomy database contains the names of all organisms that are represented in the genetic databases with at least one nucleotide or protein sequence.


SNP is a database of short genetic variations, including, but not limited to, single nucleotide polymorphisms.


The dbVar database has been developed to archive information associated with large scale genomic variation, including large insertions, deletions, translocations and inversions. In addition to archiving variation discovery, dbVar also stores associations of defined variants with phenotype information.


Gene organizes information about the characteristics and defining sequences of genes from species in Genome, RefSeq, and other model organisms.


The Sequence Read Archive (SRA) stores raw sequence data from sequencing instruments.


BioSystems

The BioSystems database contains records that group together molecules that interact in biological systems. One type of biosystem is a biological pathway, which can consist of interacting genes, proteins, and small molecules. Another type of biosystem is a disease, which can involve components such as genes, biomarkers, and drugs.


HomoloGene is an automated system for detecting homologs among the annotated genes of several completely sequenced eukaryotic genomes.

PubChem Compound

PubChem Compound contains chemical structure information drawn from a variety of public sources. Compounds may be searched by chemical properties and are pre-clustered into identity and similarity groups by structure comparison. Whenever possible, compounds are linked via PubChem Substance to information on their biological activities. Available links include PubMed citations, protein 3D structures and links to biological screening results available in PubChem BioAssay.

PubChem Substance

PubChem Substance contains descriptions of chemical samples, from a variety of public sources, and links to information on their biological activities. The description includes links to PubChem Compound in cases where the chemical structures of compounds in the sample are known. Links providing information on biological activity include links to PubMed citations, protein 3D structures, and to biological screening results available in PubChem BioAssay.


The database of genotype and phenotype (dbGaP) stores phenotype and genotype data, as well as the associations between them. Studies generating data for dbGaP will include genome-wide association studies, medical sequencing, and molecular diagnostic assays. Summaries of phenotype and genotype data as well as study documents and association analyses (when available) will be found on the public site. Authorized access may be required for downloading coded individual-level phenotypes, genotypes, and pedigrees.


UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.


CDD currently contains domains derived from two popular collections, SMART and Pfam, plus contributions from colleagues at NCBI, such as COG. The source databases also provide descriptions and links to citations. Since conserved domains correspond to compact structural units, CDs contain links to 3D structures via Cn3D whenever possible.


Clone is a database that integrates information about clones and libraries, including sequence data, map positions and distributor information. It replaces the former NCBI Clone Registry.


UniSTS is an NCBI resource that reports information about markers, or Sequence Tagged Sites (STS). UniSTS integrates marker and mapping data from public resources including GenBank, RHdb, GDB, various human maps (Genethon genetic map, Marshfield genetic map, Whitehead RH map, Whitehead YAC map, Stanford RH map, NHGRI chr 7 physical map, WashU chrX physical map), and various mouse maps (Whitehead RH map, Whitehead YAC map, Jackson Laboratory's MGD map).


A PopSet is a set of DNA sequences that have been collected to analyse the evolutionary relatedness of a population. The population could originate from different members of the same species, or from organisms from different species. PopSets are submitted to GenBank via Sequin, often as a sequence alignment.

GEO Profiles

Individual gene expression and molecular abundance profiles assembled from the Gene Expression Omnibus (GEO) repository. Entrez GEO Profiles queries annotation and pre-computed profile characteristics, allowing identification of specific genes and molecular abundance profiles of interest.

GEO DataSets

Comparable experimental sample sets assembled from the Gene Expression Omnibus (GEO) repository. Entrez GDS queries all GEO DataSet annotation, allowing identification of experiments of interest.

PubChem BioAssay

PubChem BioAssay contains the results of biological activity screening from a variety of public sources. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to that screening procedure. PubChem BioAssay results are linked to PubChem Substance, and in turn to PubChem Compound, whenever chemical structures are known. Screening results may be browsed via a web interface and also downloaded for further cheminformatics analysis.


The Probe Database is a public registry of sequence-specific reagents designed for use in a wide variety of biomedical research applications, together with information on reagent availability, experimental protocols, probe effectiveness, and computed sequence similarities.

Protein Clusters

Protein Clusters is a collection of related protein sequences (clusters). Currently it consists of Reference Sequence proteins encoded by complete prokaryotic and chloroplast genomes and plasmids. This database contains both curated and non-curated clusters.

NLM Catalog

The NLM Catalog provides access to the National Library of Medicine's bibliographic data for journals, books, audiovisuals, computer software, electronic resources, and other materials. Links to the Library's holdings in LocatorPlus, NLM's online public access catalog, are also provided.


MeSH is NLM's controlled vocabulary used for indexing articles in PubMed. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Use the MeSH database to build a PubMed search strategy.
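
As a rough illustration of building a PubMed search strategy from MeSH headings, the sketch below joins headings with the [MeSH Terms] field tag and assembles a request URL for the NCBI E-utilities ESearch service. The function names mesh_query and esearch_url and the example headings are hypothetical, chosen only for this illustration. The code builds the URL but does not send any request.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def mesh_query(headings, operator="AND"):
    """Join MeSH headings into one PubMed query using the [MeSH Terms] field tag.

    Hypothetical helper for illustration; not part of any NCBI library.
    """
    tagged = [f'"{heading}"[MeSH Terms]' for heading in headings]
    return f" {operator} ".join(tagged)


def esearch_url(query, retmax=20):
    """Build (but do not send) an ESearch request URL for the PubMed database."""
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Example headings are assumptions; look up real ones in the MeSH database.
    query = mesh_query(["Asthma", "Child"])
    print(query)
    print(esearch_url(query))
```

The same query string can be pasted into the PubMed search box directly. Combining headings with OR instead of AND broadens the search rather than narrowing it.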


BioProject aggregates information about and links to data generated by a single biological research project of an organization or consortium.