
Unified Medical Language System® (UMLS®)

The CORE Problem List Subset of SNOMED CT®

Current Data Files
(expect a new version of the subset for each new release of SNOMED CT and the UMLS Metathesaurus)

CORE Problem List Subset Version | Derived from SNOMED CT version | Derived from UMLS Metathesaurus version
SNOMEDCT_CORE_SUBSET_201208 | July 2012 International Release | 2012AA

Past Data Files from:

2012

CORE Problem List Subset Version | Derived from SNOMED CT version | Derived from UMLS Metathesaurus version
SNOMEDCT_CORE_SUBSET_201205 | January 2012 International Release | 2012AA
SNOMEDCT_CORE_SUBSET_201202 | January 2012 International Release | 2011AB

2011

CORE Problem List Subset Version | Derived from SNOMED CT version | Derived from UMLS Metathesaurus version
SNOMEDCT_CORE_SUBSET_201111 | July 2011 International Release | 2011AB
SNOMEDCT_CORE_SUBSET_201108 | July 2011 International Release | 2011AA
SNOMEDCT_CORE_SUBSET_201105 | January 2011 International Release | 2011AA
SNOMEDCT_CORE_SUBSET_201102 | January 2011 International Release | 2010AB

2010

CORE Problem List Subset Version | Derived from SNOMED CT version | Derived from UMLS Metathesaurus version
SNOMEDCT_CORE_SUBSET_201011 | July 2010 International Release | 2010AB
SNOMEDCT_CORE_SUBSET_201008 | July 2010 International Release | 2010AA
SNOMEDCT_CORE_SUBSET_201005 | January 2010 International Release | 2010AA
SNOMEDCT_CORE_SUBSET_201002 | January 2010 International Release | 2009AB

2009

CORE Problem List Subset Version | Derived from SNOMED CT version | Derived from UMLS Metathesaurus version
SNOMEDCT_CORE_SUBSET_200911 | July 2009 International Release | 2009AB
SNOMEDCT_CORE_SUBSET_200908 | July 2009 International Release | 2009AA
SNOMEDCT_CORE_SUBSET_200907 | January 2009 International Release | 2009AA

Introduction

The CORE Problem List Subset of SNOMED CT® is an output of the UMLS CORE Project (CORE stands for Clinical Observations Recording and Encoding). The purpose of the UMLS CORE Project is to define a UMLS subset that is most useful for documentation and encoding of clinical information at a summary level, such as a problem list, discharge diagnosis or reason for encounter. A key aspect of the Project is the collation and analysis of datasets collected from health care institutions that use controlled vocabularies for data entry. These datasets contain the list of controlled terms and their actual frequency of usage in clinical databases.

The original subset was based on datasets submitted by 7 institutions: Beth Israel Deaconess Medical Center, Intermountain Healthcare, Kaiser Permanente, Mayo Clinic, Nebraska University Medical Center, Regenstrief Institute and Hong Kong Hospital Authority. These institutions are large-scale, mixed inpatient-outpatient facilities that cover most major medical specialties (including Internal Medicine, General Surgery, Pediatrics, Obstetrics, Gynecology, Psychiatry and Orthopedics). From the 201208 version onwards, problem list data from the Veterans Administration have also been incorporated. The 16,874 most frequently used terms, which cover 95% of usage volume in each institution, are mapped to UMLS concepts using lexical matching supplemented by manual review.
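
The 95% cutoff described above can be illustrated with a small sketch. This is not the Project's actual procedure; the function name, the tie-breaking rule and the sample counts are assumptions made purely for illustration. It selects terms in descending order of usage until their cumulative usage reaches the requested share of one institution's total.

```python
def terms_covering(usage_counts, coverage=0.95):
    """Return the most frequently used terms whose cumulative usage
    reaches `coverage` (a fraction of the total usage volume)."""
    total = sum(usage_counts.values())
    if total == 0:
        return []
    selected = []
    running = 0
    # Highest counts first; ties are broken alphabetically (an arbitrary choice).
    ranked = sorted(usage_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for term, count in ranked:
        if running >= coverage * total:
            break
        selected.append(term)
        running += count
    return selected


# Made-up usage counts for a single hypothetical institution.
example = {
    "Hypertension": 500,
    "Diabetes mellitus": 300,
    "Asthma": 160,
    "Gout": 30,
    "Rare syndrome": 10,
}
print(terms_covering(example))
```

In this made-up example the first three terms account for 960 of 1,000 uses, so the remaining two fall outside the cutoff.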

Through the UMLS, mappings from the local terms to SNOMED CT concepts are identified. This constitutes the CORE Problem List Subset of SNOMED CT. Unmapped local terms that are considered useful for the problem list are submitted to the International Health Terminology Standards Development Organisation (IHTSDO). If accepted, the new SNOMED CT concepts are added to the CORE Subset. The 201108 release of the CORE Subset covers 92% of the 16,874 local terms. As SNOMED CT is the designated U.S. standard terminology for diagnosis and problem lists, and its use is one of the requirements of the ‘Meaningful Use’ criteria of the Electronic Health Record, we believe that identifying a frequently used subset of SNOMED CT concepts will be useful to users who want to implement SNOMED CT in their clinical systems.

Purpose and use of subset

The main purpose of the SNOMED CT CORE subset is to facilitate the use of SNOMED CT as the primary coding terminology for problem lists or other summary level clinical documentation. The use of a common list of SNOMED CT concepts will maximize data interoperability among institutions. Local problem list vocabularies often need to expand to satisfy specific user needs. Users should first check whether SNOMED CT already contains the concepts they need to meet local requirements. The UMLS Terminology Services (UTS) include a SNOMED CT browser, available through the Applications menu of the UTS, that may be used for this purpose. When adding new concepts that are not covered by SNOMED CT, users are encouraged to follow the rules of post-coordination in SNOMED CT where possible. For instance, if a new concept ‘Left kidney stone’ is needed, it can be created by adding the qualifier concept ‘7771000 Left’ as a laterality attribute to the CORE concept ‘95570007 Kidney stone’. In this way, the link to the CORE concepts is maintained and divergence of problem list vocabularies can be minimized. Institutions that are using their own problem list vocabularies are encouraged to map them to SNOMED CT with a focus on the CORE concepts to facilitate data interoperability.
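
As a rough sketch of the ‘Left kidney stone’ example, the snippet below writes the post-coordinated concept as a focus concept refined by one attribute-value pair, in the style of SNOMED CT compositional grammar. The helper name is hypothetical, and the attribute identifier 272741003 for Laterality is an assumption that should be checked against the SNOMED CT release in use. The code only formats a string; it does not check that the refinement is permitted by the SNOMED CT concept model.

```python
# Assumed identifier for the Laterality attribute; verify against your release.
LATERALITY_ATTRIBUTE = ("272741003", "Laterality")


def refine(focus, attribute, value):
    """Format a post-coordinated expression: a focus concept refined by
    a single attribute-value pair. Each argument is an (id, term) tuple."""
    def ref(concept):
        code, term = concept
        return f"{code} |{term}|"

    return f"{ref(focus)} : {ref(attribute)} = {ref(value)}"


left_kidney_stone = refine(
    ("95570007", "Kidney stone"),  # CORE concept from the text
    LATERALITY_ATTRIBUTE,
    ("7771000", "Left"),           # qualifier concept from the text
)
print(left_kidney_stone)
# 95570007 |Kidney stone| : 272741003 |Laterality| = 7771000 |Left|
```

Because the CORE concept identifier stays in the expression as the focus concept, data recorded this way can still be grouped under the CORE concept.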

Choice of SNOMED CT concepts

To find the most appropriate SNOMED CT concepts for each problem list term, the following guidelines are used:

  • Only current SNOMED CT concepts are included (concept status = 0)
  • Concepts with names prefaced with a symbol in square brackets e.g. ‘307724003 [D] Left upper quadrant pain (situation)’ are excluded. These are legacy concepts inherited from the National Health Service Clinical Terms, version 3 (CTV3) and are not recommended for use in clinical records
  • Concepts belonging to the Non-Human Subset are excluded
  • Most concepts are chosen from the following 4 hierarchies: Clinical finding, Procedure, Situation with explicit context, and Events
  • Within the Clinical finding hierarchy, when two very similar concepts exist e.g. ‘12441001 Epistaxis (disorder)’ and ‘249366005 Bleeding from nose (finding)’, the disorder concept is favored
  • Procedures are included in the CORE subset because they occur in some problem lists. However, some institutions keep a separate procedures list (not included in the generation of the CORE Subset). Some terms (< 100) are intended to indicate the past occurrence of a procedure e.g. ‘Personal history of cardiac catheterization’. These are mapped to the corresponding SNOMED CT concepts when such exist (typically in the Situation with explicit context hierarchy). When no such SNOMED CT concepts exist, they are mapped to the procedure itself. If users think it is necessary, the precise meaning can still be expressed by post-coordinating the procedure as the associated procedure attribute to the concept ‘416940007 Past history of procedure (situation)’, or by using special flags in their information model
  • Terms that indicate the presence of a medical device e.g. ‘Heart Valve Prosthesis’ are mapped to the procedure of introducing the device ‘307279007 Prosthetic replacement of heart valve (procedure)’ when no more appropriate SNOMED CT concepts exist
  • For a small number of terms (< 50), a concept outside the above 4 hierarchies is chosen because there are no existing SNOMED CT concepts more suitable to represent the term. Some examples are ‘128856005 Thymoma (morphologic abnormality)’ and ‘105463007 Donor for liver transplant (person)’. The required concepts are submitted to the IHTSDO. Most of the original concepts have been replaced by new SNOMED CT concepts in the appropriate hierarchies.
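
Some of the guidelines above lend themselves to a simple automated pre-filter, sketched below. This is an illustration only, not the method used to build the subset. It assumes that the semantic tag at the end of the fully-specified name is a usable proxy for the hierarchy, and the set of tags is an assumption. It does not cover the Non-Human Subset exclusion or the preference for disorder over finding concepts, and a concept it rejects (such as a morphologic abnormality) may still be included after manual review.

```python
import re

# Assumed semantic tags for the four preferred hierarchies.
PREFERRED_TAGS = {"disorder", "finding", "procedure", "situation", "event"}

_TAG = re.compile(r"\(([^()]+)\)\s*$")          # trailing "(tag)"
_BRACKET_PREFIX = re.compile(r"^\[[^\]]+\]")    # leading "[D]", "[X]", ...


def semantic_tag(fsn):
    """Return the semantic tag at the end of a fully-specified name, or None."""
    match = _TAG.search(fsn)
    return match.group(1) if match else None


def is_candidate(fsn, concept_status):
    """Apply three of the guidelines: current concept (status 0), no
    bracketed-symbol prefix, and a tag from the preferred hierarchies."""
    if concept_status != 0:
        return False
    if _BRACKET_PREFIX.match(fsn.strip()):
        return False
    return semantic_tag(fsn) in PREFERRED_TAGS


print(is_candidate("Epistaxis (disorder)", 0))                      # True
print(is_candidate("[D] Left upper quadrant pain (situation)", 0))  # False
print(is_candidate("Thymoma (morphologic abnormality)", 0))         # False
```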

Mapping to ICD-9-CM and ICD-10-CM

We recognize that problem list data are sometimes used to generate ICD-9-CM codes for reimbursement and other purposes. We have created a rule-based draft map for about 5,000 SNOMED CT concepts. However, this draft map has not been updated since its release, and some codes may have become obsolete. A simple equivalence map from SNOMED CT to ICD-9-CM is available with each international release of SNOMED CT.

In the migration to SNOMED CT as the primary clinical terminology for patient problems (diseases and conditions), it is desirable that the legacy ICD-9-CM data be translated to SNOMED CT. We have published an ICD-9-CM to SNOMED CT map to facilitate this translation.

In preparation for the transition to ICD-10-CM, we have created a map from SNOMED CT to ICD-10-CM which can be found here.

Additional resources from the UMLS

For each SNOMED CT concept in the subset, the corresponding UMLS CUI is listed. Through the CUI, users can access resources available in the UMLS, e.g. additional synonyms (beyond those present in SNOMED CT itself), text definitions for many terms, and corresponding codes in other terminologies.

File description

The SNOMED CT CORE subset data file has the following fields:

  • SNOMED_CID – conceptId of the SNOMED CT concept
  • SNOMED_FSN – SNOMED CT fully-specified name
  • SNOMED_CONCEPT_STATUS – concept status of the SNOMED CT concept
  • UMLS_CUI – the corresponding UMLS concept identifier; if the concept is not yet in the UMLS, this will be NA (not available)
  • OCCURRENCE – number of institutions having this concept on their problem list (from 1 to 8); not populated for concepts retired from Subset
  • USAGE – the average usage percentage among all institutions (i.e. sum of individual usage percentages divided by 8); not populated for concepts retired from Subset
  • FIRST_IN_SUBSET – the version of Subset first containing this concept
  • IS_RETIRED_FROM_SUBSET – in future, some concepts will be marked retired if they are retired by IHTSDO or no longer considered to be useful e.g. when there are more appropriate SNOMED CT concepts
  • LAST_IN_SUBSET – the version of Subset last containing this concept; only populated for concepts retired from Subset
  • REPLACED_BY_SNOMED_CID – SNOMED CT concept to replace concept retired from Subset; only populated for concepts retired from Subset
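
The USAGE calculation can be written out as a short worked example. The function name and the sample percentages are made up. The sketch also assumes that an institution whose problem list does not contain the concept contributes zero to the sum, which is one reading of "divided by 8".

```python
def average_usage(usage_percentages, institutions=8):
    """Average usage across all institutions: the sum of the individual
    usage percentages divided by the total number of institutions.
    Institutions without the concept are assumed to contribute 0."""
    if len(usage_percentages) > institutions:
        raise ValueError("more usage values than institutions")
    return sum(usage_percentages) / institutions


# Hypothetical concept found on three of eight problem lists.
percentages = [2.0, 1.0, 1.0]
occurrence = len(percentages)        # corresponds to OCCURRENCE
usage = average_usage(percentages)   # corresponds to USAGE
print(occurrence, usage)             # 3 0.5
```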

A sample database load script can be found here.
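
For users who prefer to read the file directly, the following is a minimal sketch and not the official load script. It assumes a pipe-delimited text file with a header row that names the fields listed above; the delimiter, the sample rows and the concept identifiers in them are placeholders, so check them against the actual data file. Retired concepts are detected by a populated LAST_IN_SUBSET field, so the sketch does not depend on the exact values used in IS_RETIRED_FROM_SUBSET.

```python
import csv
import io

# Placeholder rows for illustration only; not real subset data.
SAMPLE = """\
SNOMED_CID|SNOMED_FSN|SNOMED_CONCEPT_STATUS|UMLS_CUI|OCCURRENCE|USAGE|FIRST_IN_SUBSET|IS_RETIRED_FROM_SUBSET|LAST_IN_SUBSET|REPLACED_BY_SNOMED_CID
1001|Example disorder A (disorder)|0|C0000001|8|1.25|200907|False||
1002|Example finding B (finding)|0|NA|2|0.01|201208|False||
1003|Example disorder C (disorder)|1|C0000003|||200907|True|201108|1001
"""


def read_subset(handle, delimiter="|"):
    """Read subset rows into dictionaries keyed by the header names."""
    reader = csv.DictReader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        # "NA" marks a concept that is not yet in the UMLS.
        row["UMLS_CUI"] = None if row["UMLS_CUI"] == "NA" else row["UMLS_CUI"]
        # LAST_IN_SUBSET is only populated for concepts retired from the Subset.
        row["retired"] = bool(row["LAST_IN_SUBSET"])
        rows.append(row)
    return rows


rows = read_subset(io.StringIO(SAMPLE))
active = [r["SNOMED_CID"] for r in rows if not r["retired"]]
replacements = {
    r["SNOMED_CID"]: r["REPLACED_BY_SNOMED_CID"] for r in rows if r["retired"]
}
print(active)        # ['1001', '1002']
print(replacements)  # {'1003': '1001'}
```

To read a real file, pass an open file object in place of the `io.StringIO` wrapper.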

The file identifies what we hope is a useful subset of SNOMED CT, but does not include all the information likely to be required to make effective use of SNOMED CT in an application, such as synonyms that may include more clinician-friendly terms than the SNOMED CT fully-specified name. The identifiers in the file can be used to extract more complete information for these concepts from either the UMLS release files or the SNOMED CT native format files.

Update and maintenance

A new version of the subset will be published for each new release of SNOMED CT and the UMLS. Newly retired SNOMED CT concepts will be flagged and additional concepts may be added if appropriate.

SNOMED CT license requirement

SNOMED CT is owned by the International Health Terminology Standards Development Organisation (IHTSDO), of which NLM is the US Member. Use of SNOMED CT is subject to the IHTSDO Affiliate license provisions (incorporated in the License Agreement for Use of the UMLS® Metathesaurus® as Appendix 2) and is free in IHTSDO Member countries (http://www.ihtsdo.org/members) including the United States, in low income countries (http://www.ihtsdo.org/news/article/view/ihtsdo-announces-free-use-of-snomed-ct-in-low-income-countries/), and for approved research projects in any country.

Feedback and suggestions

We welcome any questions, comments or suggestions that would improve the quality, accuracy and usability of the subset. Please send feedback to Dr. Kin Wah Fung, Lister Hill National Center for Biomedical Communications, National Library of Medicine (email: kwfung@nlm.nih.gov).

Acknowledgements

We thank the institutions that supplied the datasets used to define the subset, and SNOMED Terminology Solutions of the College of American Pathologists for reviewing part of the subset and giving us valuable suggestions.

Lister Hill National Center for Biomedical Communications
U.S. National Library of Medicine