Homo sapiens genome data and search tips | Revised October 5, 2011 |
The Map Viewer help document describes how to use the Map Viewer software. This page describes the data available for the human genome, and the search tips specific to that organism. You can also return to the Homo sapiens genome view search page. Separate documents provide information about the process used to assemble and annotate the human genome sequence, release notes for each build of the genome, and statistics for the current build. The Map Viewer home page allows you to search the genome data of any organism represented in MapViewer.
|
Scope of Data |
Integrated Data from Various Sources |
The Entrez Map Viewer integrates human sequence and map data from a variety of sources. The types of maps include sequence, cytogenetic, genetic linkage, radiation hybrid, and YAC contig. The next section, on Available Maps, provides additional detail about each source. The maps are integrated with each other as described in the Show Connections section of the general Map Viewer help document. The sequence data include both finished and draft high throughput genomic sequences (HTGs), as described below. Separate documents provide an introduction to the information infrastructure developed at NCBI to integrate the various types of data generated by the Human Genome Project, the process used to assemble and annotate the human genome sequence, release notes for each build of the genome, and statistics for the current build. |
Finished and draft sequence data |
The sequence data include both finished and draft high throughput genomic sequences (HTGs). On the Component map, finished (phase 3) HTG sequences are shown in blue, and draft HTG sequences (phase 1 and 2) are shown in orange. Definitions of the various phases are provided on the HTGs home page. Multiple algorithms are still being tested to assemble contigs from a combination of draft and finished genomic sequence, and to identify the genes, markers, SNPs and other features on that sequence. Therefore it is possible that sequence and/or features present in one version may not appear in the next. A separate document provides more detail about NCBI's Contig Assembly and Annotation Process. |
Frequency of Updates to Map Viewer Data |
Currently, the Map Viewer data files are updated with each full re-annotation run, which is approximately once per year. The update frequency will increase in the future, as development continues. The data are also available on NCBI's FTP site. |
Available Maps |
The maps for human include: |
Map Name | Description |
Sequence Maps |
Assembly Assembly Region |
The Assembly and Assembly Region maps allow users to visualize all of the sequence data available for a given region of the genome, and separate the data by assembly. Full chromosome assemblies are shown on the Assembly map, and region assemblies are shown on the Assembly Region map. Data are currently available from the following assemblies:
The assembly map also acts as a filter through which all of the other sequence maps are viewed, allowing you to see the annotations that have been placed on the sequence data from each assembly. When viewing the Assembly map, vertical lines indicate regions of the genome where sequence data from other assemblies is available. Blue lines represent the current (selected) assembly, and orange lines represent other assemblies. The GRC primary assembly is not represented. The alternate loci and patches are displayed as assembly regions instead of assembly units (e.g. MHC rather than ALT_REF_LOCI_1-7). By default, the other sequence maps that are available for an organism display the features that have been annotated on the reference assembly. The Maps&Options dialog box allows you to change the assembly being displayed in the Assembly map and any other sequence maps. Instructions on how to do this are provided in the section on Customizing The Display / Maps & Options Dialog Box / Select One or More Assemblies to Display. Example: human Assembly map for chromosome 6, displayed beside the Genes_Sequence map. The Reference assembly is shown as a vertical blue line on the Assembly map and spans the length of the chromosome. The Celera and HuRef assemblies are shown as vertical orange lines. The ALT_REF_LOCI_1-7 assemblies are also shown as vertical orange lines and represent a curated sequence region for alternate haplotypes in the Major Histocompatibility Complex. The Genes_Sequence map, by default, shows the genes that have been mapped onto the reference assembly. To see the genes that have been annotated on the other assemblies, follow the instructions provided in the section on Select One or More Assemblies to Display. |
Clone | Alignment of BAC end sequences to the assembled genomic sequence. BAC ends are aligned to the genome and the best placement is selected, with the requirement that at least 50% of the BAC end aligns to the genome with >90% identity. If a BAC end sequence has two or more best placements on the genome, each location is used for clone placement. Clones shown in blue have an unambiguous best placement, whereas clones shown in black have multiple possible placements. Clones shown in green have discordant end alignments, and clones shown in orange have multiple placements with discordant end alignments.
When the Clone map is displayed as the Master map, the verbose display provides the clone name (linked to the Clone Registry database) and the BAC end sequence accessions (linked to dbGSS). |
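The color rules for the Clone map can be summarized as a small decision function. This is a minimal sketch, not NCBI's implementation: the function names and the way placements and end concordance are passed in are assumptions made for illustration. Only the thresholds and the four colors come from the description above.

```python
MIN_ALIGNED_FRACTION = 0.50  # at least 50% of the BAC end must align
MIN_IDENTITY = 0.90          # identity must be greater than 90%


def end_alignment_passes(aligned_fraction: float, identity: float) -> bool:
    """Return True if a BAC end alignment meets the stated thresholds."""
    return aligned_fraction >= MIN_ALIGNED_FRACTION and identity > MIN_IDENTITY


def clone_color(n_placements: int, ends_concordant: bool) -> str:
    """Map placement count and end concordance to the display color.

    n_placements and ends_concordant are hypothetical inputs; the text
    does not say how the pipeline represents them internally.
    """
    if n_placements < 1:
        raise ValueError("a displayed clone needs at least one placement")
    multiple = n_placements > 1
    if multiple and not ends_concordant:
        return "orange"  # multiple placements with discordant ends
    if not ends_concordant:
        return "green"   # discordant end alignments
    if multiple:
        return "black"   # multiple possible placements
    return "blue"        # unambiguous best placement


print(clone_color(1, True))
print(clone_color(3, False))
```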
Component | Components of the human genome assembly.
Shows the placement of individual GenBank sequence entries that were used to generate the genomic contigs. This represents a tiling path for the human genome sequence, based on the relationship of overlapping clones. It is assembled using the method described by the Genome Reference Consortium. Finished (phase 3 HTG) GenBank sequence records are shown in blue. Draft GenBank sequence records (phase 1 and 2 HTG) are shown in orange. The High-Throughput Genomic Sequences (HTG) page provides additional information about the various phases. Note: This map was called "GenBank" map through build 26 of the human genome data. The map name was changed to "Component" in build 27, December 2001. At that time, the "GenBank DNA" map was also added, described below. |
Contig | Shows the chromosomal placement of contigs that have been assembled at NCBI using finished and draft high-throughput genomic (HTG) sequence data. Any individual contig can be assembled from finished sequence (phase 3 HTG), draft sequence (phase 1 and 2 HTG), or a mixture of both. Contig regions made from finished sequence are shown in blue, while regions made from draft sequence are shown in orange. The data are assembled using the method described by the Genome Reference Consortium. The High-Throughput Genomic Sequences page provides additional information about the various phases of HTG sequence data used in the assembly. Note: The Component map shows the individual GenBank records used in assembling the contigs. |
CpG Island | Shows regions of high G + C content on the assembled genome sequence. CpG islands are identified using the algorithm and "relaxed" cutoffs of Takai and Jones, 2002. The resulting islands are then converted into a histogram that reflects the frequency of islands in non-overlapping 1, 10, or 100 kb intervals (depending on the viewing magnification), and displayed in a log(10) scale.
|
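The binning step behind the CpG Island histogram can be sketched as follows. This is an illustration only: assigning each island to a bin by its start coordinate, and leaving empty bins undrawn, are assumptions; the text states only that islands are counted in non-overlapping 1, 10, or 100 kb intervals and shown on a log(10) scale.

```python
import math
from collections import Counter


def island_histogram(island_starts, bin_size_bp):
    """Count islands per non-overlapping bin of bin_size_bp bases.

    Assumption: an island belongs to the bin containing its start.
    Returns {bin_index: count}, sorted by bin index.
    """
    counts = Counter(start // bin_size_bp for start in island_starts)
    return dict(sorted(counts.items()))


def log10_height(count):
    """Bar height on a log(10) scale; empty bins get no bar (assumption)."""
    return math.log10(count) if count > 0 else None


starts = [100, 900, 1500, 25000]       # made-up island start positions
hist = island_histogram(starts, 1000)  # 1 kb bins
print(hist)
print({b: log10_height(c) for b, c in hist.items()})
```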
Ensembl Genes (assembly specific) | Map of annotated genes provided by Ensembl, based on the latest Ensembl release available at the time of the build. This map is only available for the reference assembly. |
Ensembl Transcripts (assembly specific) | Map of annotated transcripts provided by Ensembl, based on the latest Ensembl release available at the time of the build. This map is only available for the reference assembly. |
FISH Clone (seq) | Localization of FISH mapped clones.
Clones placed onto the genomic sequence by using their clone insert sequences either in finished or draft form are shown as thick lines. Clones placed based on best placement of the BAC-end sequences (from GSS division of GenBank) are shown as thin lines, as described for the Clone map. When a clone insert sequence was used for contig assembly and if it spans a large region (e.g., > 1Mb), the clone was also marked to span the same region.
When the FISH Clone (seq) map is displayed as the Master map, the verbose display provides the clone name (linked to the Clone Registry database), the BAC end sequence accessions (linked to dbGSS), and the cytogenetic location. Clones can also be viewed based on cytogenetic positions on the FISH Clones map. |
GenBank DNA | Shows the placement of human genomic DNA sequences from GenBank that were not used in the assembly of contigs. The line indicates the maximal extent of the alignment, and does not reflect gaps in the alignment. The clone alignments are color coded according to the type of sequence: blue for HTGS phase 3 or finished clones; orange for HTGS phase 1 or 2 clones; green for whole genome shotgun (WGS) contigs; and black for other sequences. |
Gene | Genes that have been annotated on the genomic contigs. This includes known and putative genes placed as a result of alignments of mRNAs to the contigs, and gene predictions. If multiple models exist for a single gene, corresponding to splicing variants, the Gene_Sequence map presents a flattened view of all the exons that can be spliced together in various ways. For example, if one splice variant uses exons 1, 3, 4, and another splice variant uses exons 2, 3, 4, the Gene_Sequence map shows exons 1, 2, 3, 4. (In comparison, the RefSeq Transcript map shows what combinations of exons are valid based on mRNA sequences from RefSeq and GenBank.) Genes shown on the left of the grey line are transcribed in the - orientation (from bottom up), and those on the right in the + orientation (from top down). When Gene_Sequence is selected as the Master map, the verbose display (detailed labeling, shown by default) includes arrows to the right of each gene name that indicate its direction of transcription, as well as links to:
Additional information about these links is also provided below, under view/download sequence data from a chromosome region. Gene models are shown in different colors, depending on the quality of the alignment of defining RNA RefSeqs to reference genome, and the maintenance of the coding sequence based on the placement. |
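The "flattened view" of exons on the Gene_Sequence map amounts to taking the union of the exons used by each splice variant. The sketch below reproduces the example from the description (variants using exons 1, 3, 4 and 2, 3, 4); the function name and the use of exon numbers rather than coordinates are simplifications for illustration.

```python
def flatten_exons(splice_variants):
    """Return every exon used by at least one splice variant, in order."""
    seen = set()
    for variant in splice_variants:
        seen.update(variant)
    return sorted(seen)


variants = [[1, 3, 4], [2, 3, 4]]
print(flatten_exons(variants))  # [1, 2, 3, 4]
```

The RefSeq Transcript map, by contrast, would keep the two variants as separate lists rather than merging them.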
|
Additional Notes: In general, a gene model is shown in blue if there is a clean alignment between a RefSeq or GenBank mRNA sequence and the genomic sequence, and if there is an exact match between the protein product that was annotated in the mRNA sequence record and the conceptual translation of the genomic sequence gene model. A gene model is shown in brown if there is some discrepancy between the mRNA sequence and the gene model, either in the alignment of the two and/or in their protein products. Examples of the former can include gaps, or the alignment of an mRNA to two or more genomic regions. Examples of the latter can include differences between the amino acid sequence given in an mRNA sequence record and the conceptual translation of the corresponding gene model, or premature termination of a coding region in the genomic sequence. Both of those can be caused by base pair mismatches between the mRNA and genomic sequence. Models with Interim GeneIDs (evidence code I) may be paralogs, genes not yet curated, duplications because of assembly errors, or pseudogenes. The genome assembly and annotation pipeline assigns interim IDs when there is no unambiguous solution to what they should be. Interim GeneIDs for protein-coding genes are associated with RefSeq XM_* accessions (model mRNAs), although supporting alignments may (or may not) include RefSeq NM_* accessions (known mRNAs). The RefSeq web site contains more information about RefSeq and RefSeq accessions. |
Model Transcripts | Models generated by Gnomon.
Gnomon uses a combination of homology searching (protein and transcript alignments) and ab initio modeling to predict both complete and partial coding sequences. Please note that this process may not accurately represent alternatively spliced transcripts. The labels on the map are linked to the protein record of the highest scoring match to the model's predicted protein. Gnomon models are also included in the Gene and RefSeq Transcript maps, in regions where known RefSeq transcripts have not yet been identified.
Models are color coded based on their level of support:
|
NCI Clone | A subset of clones shown on the FISH clone (seq) map that are from the NCI. |
Phenotype | Shows the placement of loci associated with phenotypes on the assembled human genome sequence. Phenotypes include those described in Online Mendelian Inheritance in Man (OMIM), and quantitative trait loci (QTLs). OMIM - While the OMIM resource itself shows the location of phenotypes (when known) in cytogenetic coordinates, the phenotype map shows the location in sequence coordinates. Thus it is now easier, when querying by a disease name, to know if it has been placed on a sequence map at all. If the phenotype is associated with a known gene, the placement of the gene is determined by aligning its sequence data to the human genome sequence. QTLs - If the phenotype is placed by linkage or association to mapped markers, the phenotype is placed by the position of that marker or markers. The data are represented as single points along the chromosome, as each QTL is currently associated with the marker that gave the highest LOD score. At present, there is no step to extend the range defined by the markers to reflect the level of confidence in any boundary marker. |
RefSeq Transcripts | Diagrams of the RefSeq RNAs that are mapped on the genomic contigs. Known RefSeq transcripts have accession prefixes beginning with NM_ or NR_, and model RefSeq transcripts have accession prefixes beginning with XM_ or XR_. The Transcript map and Gene_Sequence map are built in the same way, using the same types of evidence, described above. However, the Gene_Sequence map shows a view of all the exons in a gene, while the Transcript map shows the combinations of exons (i.e., splice variants) that are valid, based on mRNA sequences. |
Repeats | Position of repetitive elements, calculated using RepeatMasker v3.2.6 using these flags:
|
RNA Maps | The RNA maps show mRNAs from a given organism aligned to the assembled human genomic sequence that has been repeat-masked and dusted. Each alignment is the single best placement for that sequence in the current build of the human genome. It can be queried by sequence accession. The RNA maps include:
The display for RNA maps differs from the Hs UniGene map in that what are displayed here are the alignments [thicker lines] and putative introns [thinner lines] of mRNAs best placed at that position. Green lines indicate ESTs; blue indicates cDNAs. In contrast, the "UniGene" map is a summary of probable splicing events, with connections to UniGene for the clusters that contain those sequences. |
STS | Placement of STSs from a variety of sources onto the genomic data using Electronic-PCR (e-PCR). The markers are from RHdb, GDB, GeneMap'99 (gene-based markers), Stanford G3 RH map (both gene and non-gene markers), TNG map, Whitehead RH map and YAC maps (both gene and non-gene markers), Genethon genetic map, Marshfield genetic map, and several chromosome-specific maps, such as the NHGRI map for chromosome 7 and the Washington University map of chromosome X. |
TCAG Genes (assembly-specific) | Map of annotated genes provided by The Center for Applied Genomics (TCAG) at the Hospital for Sick Children on their assembly of chromosome 7. |
TCAG Transcripts (assembly-specific) | Map of annotated transcripts provided by The Center for Applied Genomics (TCAG) at the Hospital for Sick Children on their assembly of chromosome 7. |
Homo sapiens UniGene Clusters | The UniGene map shows human mRNA and EST sequences aligned to the assembled human genomic sequence that has been repeat-masked and dusted. Only ESTs supplied with orientation are used. Each alignment is the single best placement for that sequence in the current build of the human genome. The display of the UniGene map varies according to the span of sequence being displayed. For large spans of sequence (greater than 10 million bases), the Map Viewer displays histograms that show the density of ESTs and mRNAs aligned to a region, the UniGene clusters to which they belong, and the number of sequences from each UniGene cluster. For smaller spans of sequence (i.e., higher resolutions, showing less than 10 million bases), the Map Viewer displays the above information plus blue lines that indicate exon/intron structure:
Alignments are grouped by common structure. If two or more transcripts share at least one intron/exon splice junction, the alignments of those transcripts are merged into a single model. If two or more transcripts do not share any intron/exon splice junction, they are shown as separate models. The UniGene map displays differ from those labeled as Xx_RNA in that what is labeled here is a summary of probable splicing events. The 'RNA' maps (not to be confused with the RefSeq 'RNA' map) show the mRNAs best placed at that position. |
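The grouping rule for UniGene alignments (merge transcripts that share at least one splice junction) can be sketched with a union-find structure. This is an illustration under stated assumptions: junctions are represented as hypothetical (donor, acceptor) coordinate pairs, and merging is treated as transitive, so that A and C end up in one model if each shares a junction with B. The text does not say whether the actual display applies the rule transitively.

```python
def group_by_shared_junction(transcripts):
    """Group transcript names that share at least one splice junction.

    transcripts: dict mapping a transcript name to a set of junctions,
    each junction a (donor, acceptor) tuple (hypothetical representation).
    Returns a sorted list of sorted name lists, one per merged model.
    """
    parent = {name: name for name in transcripts}

    def find(x):
        # Follow parent links to the group representative, compressing paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_owner = {}  # junction -> first transcript seen with it
    for name, junctions in transcripts.items():
        for junction in junctions:
            if junction in first_owner:
                parent[find(name)] = find(first_owner[junction])
            else:
                first_owner[junction] = name

    groups = {}
    for name in transcripts:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


example = {
    "a": {(100, 200), (300, 400)},
    "b": {(300, 400), (500, 600)},
    "c": {(700, 800)},
    "d": {(500, 600)},
}
print(group_by_shared_junction(example))  # [['a', 'b', 'd'], ['c']]
```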
Variation | Alignment of genetic variation data from dbSNP onto the genomic sequence. more... |
Cytogenetic Maps |
FISH Clones | BAC clones that were mapped to cytogenetic bands using fluorescent in situ hybridization (FISH). When viewing FISHClone as the master map, the source of the FISH data is indicated in parentheses. These clones have also been aligned to the genomic sequence data on the FISH Clone (seq) map. |
Genes_Cytogenetic | Cytogenetic locations of genes as reported in Entrez Gene, which includes map locations from OMIM, the Human Gene Nomenclature Committee, and the other valued collaborators. |
Ideogram | Ideogram of the G-banding pattern at the 850 band resolution. |
Mitelman Breakpoint | Genome-wide map of chromosomal breakpoints, based on the Mitelman Database of Chromosome Aberrations in Cancer, by Drs. Mitelman, Mertens, and Johansson, http://cgap.nci.nih.gov/Chromosomes/Mitelman. |
Morbid | Cytogenetic map locations of disease genes described in OMIM. |
NCI FISH Clone | A subset of clones shown on the FISH Clones map that are from the NCI. |
Note: the genes on all cytogenetic maps are ordered based on cytogenetic band. At present, order within a band is not being calculated. |
Genetic Linkage Maps |
deCODE | deCODE high resolution genetic map, from deCODE genetics, Iceland. The map has a total length of 2161.71 cM and is described by Kong, A., et al. in "A high-resolution recombination map of the human genome," Nat Genet., 2002 Jul;31(3):225-6. more... |
Genethon | Microsatellite map, described by Dib, C., et al. in "A comprehensive genetic map of the human genome based on 5,264 microsatellites," Nature, 1996 Mar 14;380(6570):152-4. more... |
Marshfield | Comprehensive human linkage map incorporating >8000 polymorphic markers. Total sex-averaged genetic distance is 3500 cM. (Broman et al., Comprehensive human genetic maps: Individual and sex-specific variation in recombination. American Journal of Human Genetics, 1998 63:861-869) more... |
Radiation Hybrid Maps |
GeneMap99-G3 | 7,061 STS markers mapped onto the G3 RH panel by the International Radiation Hybrid Consortium (Schuler GD, et al., Science, October 25, 1996, and Deloukas, et al., Science, October 23, 1998). Scale = cR10000. Total number of centiRays across the genome = 125,853. Resolution = 42 cR10000 per megabase. The GeneMap'99 home page provides additional details about the project. |
GeneMap99-GB4 | 45,758 STS markers mapped onto the GB4 RH panel by the International Radiation Hybrid Consortium (Schuler GD, et al., Science, October 25, 1996, and Deloukas, et al., Science, October 23, 1998). Scale = cR3000. Total number of centiRays across the genome = 11,524. Resolution = 3.84 cR3000 per megabase. The GeneMap'99 home page provides additional details about the project. |
NCBI RH | NCBI Integrated Radiation Hybrid Map contains 23,723 markers from both the G3 and GB4 RH panels of GeneMap'99. Those markers were mapped with respect to 1084 framework markers (a subset of markers common to the G3 and GB4 panels). All markers from both panels were interpolated onto the GB4 scale. The article by R. Agarwala et al. provides detail about the integration strategy, as well as the methods used to evaluate the quality of the integrated map. |
Stanford G3 | Includes 11,458 STS markers (both gene-based and non-gene-based) mapped onto the G3 RH panel. more... Scale = cR10000. Total number of centiRays across the genome = 124,349. Resolution = 41.5 cR10000 per megabase. A subset of the markers from this map were used in the GeneMap99-G3 map. |
Stanford TNG | The TNG map includes over 37,000 markers. more... Scale = cR50000. Resolution = 1 cR50000 is approximately 2 kbp. On average, there is one ordered STS per 94 kbp. |
Whitehead-RH | Includes 6,193 STS markers mapped onto the GB4 RH panel. more... Scale = cR3000. Total number of centiRays across the genome = 11,042. Resolution = 3.7 cR3000 per megabase. |
Note: the RH maps described above are static and will not be updated with additional markers. |
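As a quick consistency check on the radiation hybrid figures above, the quoted resolutions look like the total centiRay length divided by a genome length of roughly 3,000 megabases. That relationship is inferred here from the numbers themselves; the descriptions do not state how the resolutions were derived. A minimal sketch:

```python
# (total centiRays across the genome, quoted centiRays per megabase),
# copied from the map descriptions above.
rh_maps = {
    "GeneMap99-G3": (125_853, 42.0),
    "GeneMap99-GB4": (11_524, 3.84),
    "Stanford G3": (124_349, 41.5),
    "Whitehead-RH": (11_042, 3.7),
}


def implied_genome_mb(total_cr, cr_per_mb):
    """Genome length in Mb implied by a total length and a resolution."""
    return total_cr / cr_per_mb


for name, (total_cr, cr_per_mb) in rh_maps.items():
    print(f"{name}: about {implied_genome_mb(total_cr, cr_per_mb):.0f} Mb")
```

Note that the G3 and GB4 panels use different units (cR10000 and cR3000), so centiRay values are comparable only between maps built on the same panel.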
Other Maps |
Whitehead-YAC | STS content map of 10,850 STS markers placed onto 16,494 YACs with an average intermarker distance of 276 kilobases. The scale shown on the ruler for this map indicates the ordinal from the top of the chromosome. For example, a unit of 30 represents the 30th marker from the top of the chromosome. more... |
Note: In addition to the maps listed above, NCBI offers some additional mapping information resources. For example, a comparative Human/Mouse Homology Map is not displayed in the Entrez Map Viewer, but is available for your use. The NCBI Site Map lists a number of resources for various organisms in the Genomes and Maps section. |
Types of objects and maps on which they can be found |
Clones |
Components of Sequence Assembly |
|
CpG Islands |
|
Expression Data |
|
GenBank Accessions |
|
Genes |
|
Phenotypes |
|
Polymorphisms |
|
STSs |
|
Legend |
Verbose Mode |
By default, the master map at the right side of the display is shown in verbose mode, which provides descriptive information (as available) for each object on the master map. |
Orientation |
Object Location | Symbol | Meaning |
Plus strand | Genes shown to the right of the grey line are transcribed in the + orientation (from top down); contigs with a + orientation are read from top down | |
Minus strand | Genes shown to the left of the grey line are transcribed in the - orientation (from bottom up); contigs with a - orientation are read from bottom up | |
Unknown | ? | The orientation of the map element is unknown. |
Links to Related Resources |
Each map element displayed in your search results will be associated with a number of links (when available) that lead to additional information. The links include: |
Linked Text | Link Action | Description |
Map element | Map View | The results of a search list the map elements that contain your search term. Those elements can be present in one or more maps. Following the link for a particular map element leads to a graphical view of the chromosomal region that contains the element. |
OMIM | Online Mendelian Inheritance in Man | Links to the corresponding entry in Online Mendelian Inheritance in Man, a continuously updated catalog of human genes and genetic disorders. |
sv | Sequence Viewer | Graphically shows the position of the map element within the sequence region. The display includes a graphic depiction of the coding region (CDS), RNA, and gene features that have been annotated on that sequence region. A 2 Kb section of sequence is shown below that, with corresponding graphic annotations of the features. The left and right arrows at either end of the sequence data allow you to move upstream and downstream. |
pr | Protein | Links to the corresponding protein sequence record in the Entrez Protein database. |
dl | Download Sequence | Opens a form that allows you to download a region of a chromosome. The form has two parts: (1) the top part allows you to enter chromosome coordinates in text boxes, and (2) the bottom part displays the NT_* contigs (or portions of them) that are found in that chromosome region. Note that part 1 shows the position (base span) of the region on the chromosome, and part 2 shows the position of the region on the contig. The "strand" column for each contig shows whether that contig is on the plus or minus strand of the chromosome. Therefore, if a contig is on the minus strand, increasing the value of the 3' chromosome coordinate will decrease the value of the 5' contig coordinate. The options to "Display, Save to Disk, and View Evidence" allow you to view the individual contigs in the region (or portions of them, depending on the chromosome region specified). By default, the dl link beside each gene displays the chromosome and contig coordinates for the span of that gene. To view/save additional sequence data upstream and downstream of the gene, simply adjust the chromosome coordinates and press the "Change Region" button. Note that the contig coordinates will also change. |
ev | Evidence Viewer | Graphical display of the biological evidence supporting a particular gene model. It displays all RefSeq models, GenBank mRNAs, annotated known or potential transcripts, and ESTs that align to the genomic sequence region of interest. (more...) |
hm | HomoloGene | A resource of curated and calculated orthologs for genes as represented by UniGene or by annotation of genomic sequences. (more about HomoloGene...) |
STS Maps Legend |
Colored dots indicate uniqueness of STS positions |
Polymorphism Column |
The polymorphism column indicates whether the marker has been used to detect a polymorphism, with Y for yes and N for no. |
Detailed Marker Information |
To see detailed mapping information about a marker, follow the link for that marker to its UniSTS record. |
Constructing queries |
Searchable Terms |
Text terms |
The viewer supports searching on any text term that may describe an element on any map. These include:
|
Truncation |
Search terms can also be truncated at the right end only, using an asterisk (*) as a wild card to represent zero to many characters. See the truncation section of the general Map Viewer Help document for more details. |
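Right-end truncation behaves like a prefix match. The sketch below illustrates the rule as described (an asterisk at the right end stands for zero or more characters); the function name is hypothetical, and case-insensitive comparison is an assumption rather than documented behavior.

```python
def matches_truncated(pattern: str, term: str) -> bool:
    """Return True if term matches pattern, honoring right-end truncation.

    A trailing '*' matches zero to many characters. An asterisk anywhere
    else is rejected, since only right-end truncation is described.
    """
    stem = pattern[:-1] if pattern.endswith("*") else pattern
    if "*" in stem:
        raise ValueError("only right-end truncation is supported")
    if pattern.endswith("*"):
        return term.lower().startswith(stem.lower())
    return term.lower() == pattern.lower()


print(matches_truncated("BRCA*", "BRCA1"))  # True
print(matches_truncated("BRCA*", "BRCA"))   # True (zero extra characters)
print(matches_truncated("BRCA*", "TP53"))   # False
```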
Map Positions |
As noted in the Search By Position section of the Entrez Map Viewer general help document, there are three main ways to search by map position from the Map View of a chromosome:
|
Allowable Values |
For human, the following types of map positions can be entered in the Region text boxes noted in option 1:
It is not necessary to enter a value in both Region text boxes. If you enter a value (e.g., 9q21) only in the upper box, the Map Viewer will display the region of the chromosome starting from that point and ending at that q telomere. If you enter the value only in the lower box, the Map Viewer will display the region of the chromosome starting at the p telomere and ending at that value. |
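The behavior of the two Region text boxes can be modeled as a function that fills in the missing end with the appropriate telomere. This is a minimal sketch: the function name is hypothetical, and the labels "pter" and "qter" are used only as placeholders for the p and q telomeres, not as values the Map Viewer is known to accept.

```python
from typing import Optional, Tuple

P_TELOMERE = "pter"  # placeholder label for the p telomere (assumption)
Q_TELOMERE = "qter"  # placeholder label for the q telomere (assumption)


def resolve_region(upper: Optional[str], lower: Optional[str]) -> Tuple[str, str]:
    """Return the (start, end) of the displayed region.

    Upper box only: from that value to the q telomere.
    Lower box only: from the p telomere to that value.
    """
    upper = (upper or "").strip()
    lower = (lower or "").strip()
    if not upper and not lower:
        raise ValueError("enter a value in at least one Region box")
    return (upper or P_TELOMERE, lower or Q_TELOMERE)


print(resolve_region("9q21", None))  # ('9q21', 'qter')
print(resolve_region("", "9q21"))    # ('pter', '9q21')
```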
View/Download Chromosome Region |
You can view or download map and sequence data for chromosome regions from the graphic displays, as explained below, or by FTP. |
Map Data Map data for a chromosome region or a complete chromosome can be viewed/downloaded by using the Data as Table View option. It is accessible from the blue sidebar of a chromosome display. The Table View shows tab delimited output for the chromosomal region that was shown in the graphic display, and for each map that was shown in the display. If only sequence maps are displayed, the Table View gives the additional option of viewing/downloading the data for the complete set of sequence maps, even if only a subset was shown in the graphic display. |
Sequence Data Sequence data can be downloaded for a chromosome region of interest by following either of the following links in the graphic display of sequence maps:
When the Gene_Sequence map is the master map, the links column also includes the following links that allow you to view and/or download sequence data in a selected region. (Additional links, OMIM and hm, are also shown for map elements on the Gene_Sequence map. However, they are not listed in the table below because the following links focus on viewing/downloading sequence data. In contrast, the OMIM and hm links point to Online Mendelian Inheritance in Man and HomoloGene, respectively.) |
sv | sequence viewer | Graphically shows the position of the map element within the sequence region, including the coding region (CDS), RNA, and gene features that have been annotated on that region. A 2 Kb section of sequence is shown below that, with corresponding graphic annotations of the features. The left and right arrows at either end of the sequence data allow you to move upstream and downstream. |
pr | protein | links to the corresponding protein sequence record in the Entrez Protein database. |
dl | download sequence | Opens a form that allows you to download a region of a chromosome. The form has two parts: (1) the top part allows you to enter chromosome coordinates in text boxes, and (2) the bottom part displays the NT_* contigs (or portions of them) that are found in that chromosome region. Note that part 1 shows the position (base span) of the region on the chromosome, and part 2 shows the position of the region on the contig. The "strand" column for each contig shows whether that contig is on the plus or minus strand of the chromosome. Therefore, if a contig is on the minus strand, increasing the value of the 3' chromosome coordinate will decrease the value of the 5' contig coordinate. The options to "Display, Save to Disk, and View Evidence" allow you to view the individual contigs in the region (or portions of them, depending on the chromosome region specified). By default, the dl link beside each gene displays the chromosome and contig coordinates for the span of that gene. To view/save additional sequence data upstream and downstream of the gene, simply adjust the chromosome coordinates and press the "Change Region" button. Note that the contig coordinates will also change. |
ev | evidence viewer | Displays the biological evidence supporting a particular gene model. It displays all RefSeq models, GenBank mRNAs, annotated known or potential transcripts, and ESTs that align to the genomic sequence region of interest. (more...) |
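The relationship between chromosome and contig coordinates on the download form can be illustrated with a small conversion function. This is a sketch under stated assumptions, not the form's actual logic: coordinates are treated as 1-based and inclusive, the contig is assumed to be placed on the chromosome without internal gaps, and all names and numbers are hypothetical.

```python
def chrom_to_contig(region_start, region_stop,
                    contig_chr_start, contig_chr_stop, strand):
    """Convert a chromosome span to the matching span on one contig.

    contig_chr_start/contig_chr_stop give where the contig sits on the
    chromosome; strand is '+' or '-'. Returns (contig_from, contig_to).
    """
    if not (contig_chr_start <= region_start <= region_stop <= contig_chr_stop):
        raise ValueError("region must lie within the contig placement")
    if strand == "+":
        return (region_start - contig_chr_start + 1,
                region_stop - contig_chr_start + 1)
    if strand == "-":
        # On the minus strand the contig runs opposite to the chromosome,
        # so the right-hand chromosome end maps to the low contig coordinate.
        return (contig_chr_stop - region_stop + 1,
                contig_chr_stop - region_start + 1)
    raise ValueError("strand must be '+' or '-'")


# A contig placed at chromosome positions 1001..2000 (made-up numbers).
print(chrom_to_contig(1100, 1200, 1001, 2000, "+"))  # (100, 200)
print(chrom_to_contig(1100, 1200, 1001, 2000, "-"))  # (801, 901)
print(chrom_to_contig(1100, 1300, 1001, 2000, "-"))  # (701, 901)
```

The last two calls show the effect described above: on a minus-strand contig, increasing the right-hand chromosome coordinate from 1200 to 1300 lowers the first contig coordinate from 801 to 701.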
Query options |
Boolean Operators |
If multiple terms are entered, they will automatically be combined with a Boolean AND, as mentioned in the Text Terms section above. Adjacency searches are not supported at present. For example, a query entered as cell adhesion will be processed as cell AND adhesion and will retrieve records with descriptions that contain cell matrix adhesion as well as cell adhesion.
You can choose to use any Boolean operators (AND, OR, NOT) in your query. Boolean operators must be written in upper case.
The general syntax for a Boolean Query is: The available search fields and their corresponding abbreviations (qualifiers) are listed below. By default, Boolean operators are processed from left to right. The order in which Entrez processes a search statement can be changed by enclosing individual concepts in parentheses. The terms inside the parentheses are processed first as a unit and then incorporated into the overall strategy. Additional details about Boolean Operators are provided in the Entrez Help document. |
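The processing rules described above (implicit AND between terms, left-to-right evaluation, parentheses evaluated first as a unit) can be demonstrated on a toy index. This is a sketch only: the records, the word-level index, the nested-list representation of parentheses, and the treatment of NOT as set difference are all assumptions made for illustration, not a description of how Entrez is implemented.

```python
def build_index(records):
    """Map each lowercase word to the set of record ids containing it."""
    index = {}
    for record_id, text in records.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(record_id)
    return index


def implicit_and(query):
    """Turn 'cell adhesion' into ['cell', 'AND', 'adhesion']."""
    words = query.split()
    if not words:
        raise ValueError("empty query")
    expr = [words[0]]
    for word in words[1:]:
        expr += ["AND", word]
    return expr


def evaluate(expr, index):
    """Evaluate a term, or a list [operand, OP, operand, ...] left to right.

    A nested list plays the role of a parenthesized group.
    """
    if isinstance(expr, str):
        return set(index.get(expr.lower(), set()))
    result = evaluate(expr[0], index)
    for op, operand in zip(expr[1::2], expr[2::2]):
        rhs = evaluate(operand, index)
        if op == "AND":
            result &= rhs
        elif op == "OR":
            result |= rhs
        elif op == "NOT":
            result -= rhs
        else:
            raise ValueError("operators must be upper-case AND, OR, or NOT")
    return result


records = {1: "cell adhesion molecule", 2: "cell matrix adhesion", 3: "cell cycle"}
index = build_index(records)

print(sorted(evaluate(implicit_and("cell adhesion"), index)))                  # [1, 2]
print(sorted(evaluate(["cycle", "OR", "matrix", "AND", "adhesion"], index)))   # [2]
print(sorted(evaluate(["cycle", "OR", ["matrix", "AND", "adhesion"]], index))) # [2, 3]
```

The first query retrieves both "cell adhesion molecule" and "cell matrix adhesion", mirroring the cell adhesion example above; the last two show how parentheses change the result of an otherwise identical query.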
Search fields |
If desired, you can restrict the search for a term to a particular field by placing the field qualifier in square brackets [] after the term. It is not necessary to include a space between the search term and the field specifier. If no field qualifier is used, the system will search all fields. For example, a search for cancer will retrieve records which contain that term in any field. A search for cancer[dis] will only retrieve records which contain that term in the disease field. "Disease" refers to diseases on the OMIM Morbid Map and the Mitelman Breakpoint Map. Terms can be combined with Boolean operators, as described above. The Advanced Search page (see example) also provides the ability to restrict your search to specific fields, and to limit retrieval to mapped objects that have desired properties. |
Search field | Description | Qualifier |
---|---|---|
accession | the nucleotide accession of a GenBank component or the nucleotide or protein accessions for RefSeqs | [accession], [acc], [accn] |
chromosome | the chromosome number | [chr] |
disease | disease or Mitelman breakpoint name | [dis] |
id | the integer identifier for a particular type of object; useful in combination with type | [id] |
map name | the name of the map (The general Map Viewer Help document provides a list of map names. Use the character string in the "URL value" column.) | [map_name], [map] |
MIM number | the MIM number for a phenotype or gene from Online Mendelian Inheritance in Man | [mim] |
properties | various attributes associated with a mapped object; additional details in the Properties section below | [prop] |
symbol | the gene symbol or other short name; includes clone names, marker names, and alternate symbols (also referred to as aliases or synonyms; see Text Terms section above for example) | [sym] |
title | gene, disease, or Mitelman breakpoint names; includes symbols | [title], [ti], [titl] |
type | type of mapped object; most useful in combination with id. Options are: clone, component, contig, gs_tran, gene, mim, mitel, snp, sts, tag, transcript, unigene | [obj_type] |
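As a minimal sketch of how field qualifiers attach to terms, the snippet below builds query strings from the qualifiers listed in the table. The helper names `qualify` and `combine` are hypothetical and exist only for illustration; the code produces query text and does not contact Map Viewer.

```python
# Qualifiers copied from the search field table above.
KNOWN_QUALIFIERS = {
    "accession", "acc", "accn", "chr", "dis", "id", "map_name", "map",
    "mim", "prop", "sym", "title", "ti", "titl", "obj_type",
}


def qualify(term: str, qualifier: str) -> str:
    """Return term[qualifier]; no space is needed before the bracket."""
    if qualifier not in KNOWN_QUALIFIERS:
        raise ValueError(f"unknown field qualifier: {qualifier}")
    return f"{term}[{qualifier}]"


def combine(operator: str, *parts: str) -> str:
    """Join qualified terms with one upper-case Boolean operator."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError("Boolean operators must be AND, OR, or NOT in upper case")
    return f" {operator} ".join(parts)


# Restrict "cancer" to the disease field and limit the search to chromosome 17.
query = combine("AND", qualify("cancer", "dis"), qualify("17", "chr"))
print(query)
```

The chromosome number 17 is an arbitrary example value; any term and qualifier pair from the table can be combined the same way.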
Properties: | limiting retrieval to mapped objects that have certain attributes |
The Advanced Search page (see example) allows you to limit retrieval to mapped elements that have certain attributes, or properties, listed below. The properties indented under has_snp apply only to mapped markers associated with reported genetic variations. "Disease" refers to diseases on the OMIM Morbid Map and the Mitelman Breakpoint Map. |
Property | Description |
---|---|
disease_known | mapped object associated with a known disease; data in this category are currently available only for genes |
on_seq | mapped object that is present on one of the sequence maps |
in_clone | mapped object that falls within the boundaries of a FISH mapped clone |
in_gene | Any part of the marker position on the sequence map is within a 2 kb interval 5' of the most 5' feature of the gene (CDS, mRNA, gene), OR the marker position is within a 500 base interval 3' of the most 3' feature of the gene. Both strands of sequence are examined for gene features, so a marker can potentially be a variation on multiple genes at a single location. |
has_NM | mapped object connected with a RefSeq mRNA (which has an accession number in the format NM_123456) |
has_STS | mapped object that is not an STS but is connected to an STS (e.g., a gene or clone that contains known STSs) |
has_snp | mapped object that contains a reported single nucleotide polymorphism (SNP), insertion, deletion, or other small variation [or mapped marker that has a link (see show connections) to a marker on the Variation map] [or mapped marker that is associated with a gene that has...] |
Variation only -- the properties below apply only to objects on the Variation map | |
Any part of marker position overlaps with mRNA location (or overlaps with UTR/intron and mRNA feature is missing), BUT marker position is not within the coding region of the transcript. | |
Any part of the marker position overlaps with a coding sequence (CDS) region (or overlaps with exon region in the unlikely case an exon is annotated but CDS is missing). | |
A variation object that is connected with genotype information. | |
A variation object for which submitter links are available. | |
heterozygosity of 80-90% | |
heterozygosity of >90% |
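The interval test in the in_gene description can be sketched as a simple overlap check. This is an illustration under stated assumptions, not the annotation pipeline's code: coordinates are assumed to be 1-based and inclusive, the function name `in_gene_flank` is hypothetical, and only the two flanking intervals named in the description are tested (positions inside the gene itself are not considered here).

```python
def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def in_gene_flank(marker_start, marker_end, gene_start, gene_end,
                  strand="+", upstream=2000, downstream=500):
    """Check the 2 kb 5' flank and the 500 base 3' flank of a gene.

    gene_start/gene_end are the outermost annotated features of the gene
    (assumed inputs); strand decides which side counts as 5'.
    """
    if strand == "+":
        five_prime = (gene_start - upstream, gene_start - 1)
        three_prime = (gene_end + 1, gene_end + downstream)
    elif strand == "-":
        five_prime = (gene_end + 1, gene_end + upstream)
        three_prime = (gene_start - downstream, gene_start - 1)
    else:
        raise ValueError("strand must be '+' or '-'")
    return (overlaps(marker_start, marker_end, *five_prime)
            or overlaps(marker_start, marker_end, *three_prime))


# A marker 1 kb upstream of a plus-strand gene spanning 10000-20000.
print(in_gene_flank(9000, 9000, 10000, 20000, "+"))
```

Because both strands are examined, the same marker would be tested once per gene at that location, which is how one marker can be associated with more than one gene.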
Advanced Search Page |
The Advanced Search page (see example) allows you to use a number of query options by simply checking boxes or radio buttons that represent various search fields, properties, and object types. It also allows you to limit your query to one or more chromosomes. The Advanced Search page is accessible from the header region of the genome view page (described in the general Map Viewer Help document). |
Search Tips |
Show Linked Entries: finding associated objects on other maps |
The human genome search page provides an option to "Show linked entries" under the text box. If that option is not checked, the search system will only retrieve map elements that contain your search term in their descriptions. If that option is checked, the search system will retrieve those elements, plus associated map elements that do not necessarily contain the search string. Examples are below.
Note: Do not use the "Show linked entries" option if you anticipate that your search will retrieve a large number of map elements, because the search results will be extremely long.
Making connections between disease phenotypes on the Morbid map and STSs
To find STSs associated with a specific disease phenotype:
To find a disease phenotype associated with specific STSs:
To see a list of all the disease phenotypes from OMIM that have links to associated STSs:
How links are made between disease phenotypes on the Morbid map and STSs
For known genes, links are established automatically by using e-PCR to compare the data in UniSTS against the mRNAs for those genes. For disease phenotypes with no known genes, links are based on published references. If an article about a disease cites an STS, a link to that STS is provided in Map Viewer through the "Show linked entries" function. |
FTP Data |
FTP Map Data |
The Map Viewer data are available in the ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/mapview/ directory of NCBI's FTP site. Map data can also be downloaded for a complete chromosome or chromosome region by using the Data as Table View option, which is accessible from the blue sidebar of a chromosome display. More information about this option is provided in View/Download Chromosome Region, above. |
FTP Sequence Data |
The ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/ directory of the NCBI FTP site contains one folder for each chromosome, which includes genomic contigs (NT_* records) built from finished and unfinished sequence data. The contigs are available in various formats, described below. The contig assembly and annotation process is described in a separate document.
In addition, sequence data can be downloaded from the graphic display of sequence maps, as explained in View/Download Chromosome Region, above. |
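For scripted access, the FTP directories named above could be listed with Python's standard `ftplib`. This is a sketch only: the helper names are hypothetical, anonymous login is assumed, and the directory layout may have changed since this page was last revised, so the listing call is defined but not run here.

```python
from ftplib import FTP
from urllib.parse import urlparse

# Directory URLs quoted from the text above.
MAPVIEW_URL = "ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/mapview/"
SEQUENCE_URL = "ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/"


def split_ftp_url(url: str):
    """Split an ftp:// URL into (host, path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_directory(url: str, timeout: int = 30):
    """Return the names in an FTP directory (assumes anonymous login works)."""
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()        # anonymous login
        ftp.cwd(path)
        return ftp.nlst()  # plain list of file and folder names


print(split_ftp_url(MAPVIEW_URL))
```

Calling `list_directory(MAPVIEW_URL)` would attempt the network connection; the per-chromosome folders under the sequence directory could be listed the same way.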
Constructing URLs to link to Map Viewer |
If you would like to create WWW links to the Map Viewer, the instructions for constructing URLs are given in the general Map Viewer Help document. You can construct URLs that either perform a search or display a specific mapped object or chromosomal region. |
Questions or Comments? Write to the NCBI Service Desk |