
PSI Pilot Phase Fact Sheet

Piloting high-throughput structure determination

Crystal structure of a protein with unknown function from Pseudomonas aeruginosa, a disease-causing bacterium.

The Protein Structure Initiative (PSI) is a national effort to assemble a large collection of protein structures in a high-throughput operation. To be successful, the PSI must reduce the cost and time required to produce proteins for study and to determine their three-dimensional structures. The long-range goal is to make the atomic-level structures of most proteins easily obtainable from their corresponding DNA sequences. Knowledge gained could help researchers better understand the function of proteins, learn how altered structures can contribute to disease and identify new targets for drug development.

PSI Pilot Phase Facts at a Glance

Goal: To develop new approaches and tools needed to streamline and automate the steps of protein structure determination, and to incorporate those methods into high-throughput pipelines that use DNA sequence information to generate three-dimensional protein structure models

Project period: September 2000 to June 2005

Total funding: $270 million (largely from the National Institute of General Medical Sciences, with additional support from the National Institute of Allergy and Infectious Diseases)

Number of Centers: 9

Solved protein structures: More than 1,100

Unique structures solved (structures sharing less than 30 percent of their sequence with other known proteins): More than 700

Selected Technical Advances

  • The “Sesame” Laboratory Information Management System allows users to enter, process, view and extract relevant data from any location using a series of Web-based applications.
  • Auto-induction protocols allow automatic induction of bacterial protein production. These protocols produce 10 times more protein per volume of culture than traditional methods.
  • Systems based on fusions of a target protein with a fluorescent tag have been developed to evaluate whether the target has folded properly when expressed in cells or in vitro and to determine whether the target is present in soluble form. These methods can be used to engineer proteins to improve their folding and solubility.
  • A fully integrated robotic crystallization system can set up a 96-well plate every 2 minutes, giving it a maximum throughput of 2,880 different crystallization experiments per hour.
  • Storage and crystal imaging units can quickly image 96 wells of a crystallization plate.
  • Small-volume crystallization chips, now used widely by crystallographers, are used to screen conditions efficiently and speed crystal growth.
  • Incorporation of a wheat germ cell-free expression system holds the promise of increasing the production of proteins from higher organisms.
  • Automated software for X-ray crystallographic structure determination can carry out fully automated determination of three-dimensional protein structures from X-ray diffraction data.
  • Automatic crystal mounting and crystal screening robots use computational processes to automatically screen crystals for quality or for contiguous collection of multiple data sets.
  • Linking various pieces of lab equipment, a bar code writer and a personal digital assistant over a wireless computer network allows for inexpensive, small-scale automation of a lab environment and can replace the old-fashioned laboratory notebook.
  • Automated NMR data analysis is a fully integrated data analysis platform that pulls together the complete process of protein NMR structure determination and analysis, as well as archiving raw NMR data and intermediate results.
  • Automated post-structure functional analysis software is used to search a three-dimensional structure against databases of three-dimensional structural templates and identify functionally important motifs. A Web interface for the software has been designed, and it can be used to obtain a summary of a protein’s most likely function.
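
The throughput quoted in the list above for the robotic crystallization system follows from simple arithmetic: one 96-well plate every 2 minutes is 30 plates per hour, or 2,880 wells per hour. The short sketch below reproduces that figure. It assumes one crystallization experiment per well, and the function name is hypothetical, chosen only for illustration; it is not part of any PSI software.

```python
# Hypothetical helper for illustration only; not part of any PSI software.
# Assumes each well of a plate holds exactly one crystallization experiment.

def experiments_per_hour(wells_per_plate: int, minutes_per_plate: float) -> float:
    """Return the maximum number of crystallization experiments set up per hour."""
    plates_per_hour = 60 / minutes_per_plate  # e.g. 60 / 2 = 30 plates per hour
    return wells_per_plate * plates_per_hour


# Figures from the fact sheet: a 96-well plate every 2 minutes.
print(experiments_per_hour(96, 2))  # 2880.0
```

The same function can be used to see how the ceiling changes with a different plate format or setup time, under the same one-experiment-per-well assumption.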

Pilot Centers


  • Berkeley Structural Genomics Center focused on two bacterial species with extremely small genomes to study proteins essential for independent life.
    Principal investigator: Sung-Hou Kim, Lawrence Berkeley National Laboratory
  • Center for Eukaryotic Structural Genomics, based in Wisconsin, focused on protein production, characterization and structure determination from Arabidopsis thaliana, a plant that is frequently used in laboratory research and that has many genes in common with humans and animals.
    Principal investigator: John Markley, University of Wisconsin, Madison
  • Joint Center for Structural Genomics, based in California, focused on novel structures from thermophilic microorganisms and on human proteins thought to be involved in cell signaling.
    Principal investigator: Ian Wilson, The Scripps Research Institute
  • Midwest Center for Structural Genomics, based in Illinois, selected bacterial targets related to disease and proteins from all three kingdoms of life. The emphasis was on previously unknown folds and on proteins from disease-causing organisms.
    Principal investigator: Andrzej Joachimiak, Argonne National Laboratory
  • New York Structural Genomics Research Consortium solved protein structures for disease-related proteins from eukaryotes and bacteria.
    Principal investigator: Stephen K. Burley, Structural GenomiX, Inc.
  • Northeast Structural Genomics Consortium, based in New Jersey, focused on target proteins from various model organisms, including the fruit fly, yeast and roundworm. It used both X-ray crystallography and NMR spectroscopy.
    Principal investigator: Gaetano Montelione, Rutgers University
  • The Southeast Collaboratory for Structural Genomics, based in Georgia, determined structures from the prokaryotic model organism Pyrococcus furiosus and the eukaryotic model organism C. elegans, as well as some human proteins.
    Principal investigator: Bi-Cheng Wang, University of Georgia
  • Structural Genomics of Pathogenic Protozoa Consortium, based in Washington, solved protein structures from organisms known as protozoans, many species of which cause deadly diseases such as sleeping sickness, malaria and Chagas' disease.
    Principal investigator: Wim G. J. Hol, University of Washington
  • TB Structural Genomics Consortium, based in New Mexico, analyzed protein structures from Mycobacterium tuberculosis.
    Principal investigator: Thomas Terwilliger, Los Alamos National Laboratory


NIGMS supports basic biomedical research that is the foundation for advances in the diagnosis, treatment and prevention of disease. NIGMS is part of the National Institutes of Health, U.S. Department of Health and Human Services. To learn more about NIGMS, visit

Content created July 2005

This page last reviewed on November 26, 2012