The MCSG consortium has been organized around seven highly integrated core-technology units. All the data and materials generated by individual cores are tracked by MCSG databases as each target moves down the pipeline from core to core.

 


 

The primary objective of the Midwest Center for Structural Genomics (MCSG) is to rapidly determine the structures of large numbers of strategically selected proteins in order to elucidate protein fold space.

Our targets include: proteins from large families with no structural representative, biomedically important and high-value proteins from pathogens and higher eukaryotes, and community selected targets. The long-term goal is to provide structural coverage of major protein superfamilies with sufficient granularity to allow 3D homology modeling of all proteins using only computational methods. The ultimate goal is to build a foundation for 21st century structural biology where the structures of virtually all proteins will be found in the Protein Data Bank (PDB) or derived by computational methods. To achieve this ultimate goal, the MCSG continues to determine protein crystal structures while implementing and refining rapid, highly integrated, and cost-effective methods for such determination by X-ray crystallography using high-performance beamlines at third-generation synchrotron X-ray sources. We also continue to develop and maintain modern laboratory information management systems and databases, and other related technologies that are vital to the success of the primary mission.

The MCSG consortium has been organized around seven highly integrated core-technology units described below. All the data and materials generated by individual cores are tracked by MCSG databases as each target moves down the pipeline from core to core.


 

  | Target Selection |

The Bioinformatics Core has key expertise and resources in genome sequence analysis, target selection, structure and fold analysis, computational approaches to assigning protein function and homology modeling. This core performs two important tasks: (1) target selection and (2) analysis of structures and folds. Argonne provides bioinformatics support in specialized databases and analysis of some pathogenic organisms. This Bioinformatics Core interacts closely with the Database Core in data dissemination between members of the consortium and the public. The product of the Bioinformatics Core is a prioritized list of gene targets for the Gene Cloning and Protein Expression Core.


 
  | Gene Cloning and Protein Expression |

The Gene Cloning and Protein Expression Core is responsible for the generation of well-characterized protein expression systems for structural analysis. This core performs two important tasks: (1) production and characterization of expression clones, (2) development of expression constructs and strategies for high-value targets. This core designs PCR primers for gene cloning, selects vectors and strains for cloning and expression, tests protein expression and solubility. To streamline the process, targets for cloning are grouped by genomes. High-value targets are exchanged between sites to apply site-specific approaches. The core distributes clones to the Purification and Crystallization Core.
The Eukaryotic and Viral Proteins Expression Core focuses entirely on high-value targets that cannot be expressed using bacterial and cell-free expression systems. This core interacts closely with the Argonne component of the Gene Cloning and Protein Expression Core.

 
  | Protein Production |

The Purification Core is responsible for the efficient generation of proteins for structural analysis. The Core is responsible for the efficient generation of proteins and crystals for structural analysis. This core performs two key tasks: (1) protein purification and characterization and (2) methods development for the purification of high-value proteins.

 
  | Crystal Production |

The Crystallization Core is responsible for the efficient generation of crystals for structural analysis. This core performs two key tasks: (1) crystallization and (2) methods development for the crystallization of high-value proteins. The core also distributes crystals to the Data Collection and Analysis Core.

 
  | Structure Determination and Refinement |

The Data Collection and Analysis Core is responsible for HTP crystal analysis and data collection. This unit interacts very closely with the Purification and Crystallization, and the Structure Determination Cores. The crystals may be shipped from several locations to the beamlines to be tested for diffraction. This core is responsible for selecting the best crystals and data collection strategies. The main products of the core are verified diffraction data sets. The core works closely with beamline staff and coordinates data collection.
The Structure Determination Core is responsible for determination, refinement, validation, and deposition of protein structures. This core interacts very closely with Data Collection and Analysis, Bioinformatics, and the Databases and LIMS Cores. Data is distributed among crystallographers for automated structure determination, refinement, and validation. The main products are completed structures.

 
  | Model Validation & Fold Function Analysis |

MCSG tries to automate the process of fold recognition, assignment and model validation, automate deposition of structures to databases, and speed up data release. Development a database to monitor and evaluate all steps of the process from gene to structure is essential. The database creates a self-training system to aid decision making process and improve efficiency.

 

  | Dissemination of Data |

The Databases and LIMS Core is responsible for collecting, tracking, and analyzing experimental data from collaborating institutions and disseminating data to the public on behalf of the entire MCSG project. The main products of this core are databases, information analysis, and reports. The PDB deposit is treated as a report from the database. The core also links all MCSG computational resources, interacts with outside databases, and transmits data to the PSI Network Knowledge Base. The Bioinformatics Core provides important contributions to the Databases and LIMS Core .