CCD Analyzer

Chemical Component Dictionary (CCD) BinaryCIF Data Analyzer

This module provides efficient parsing and lookup functionality for CCD BinaryCIF files, with automatic download capabilities and in-memory data structures optimized for fast atom and bond lookups by residue and atom IDs.

class hbat.ccd.ccd_analyzer.CCDDataManager(ccd_folder: str | None = None)

Bases: object

Manages Chemical Component Dictionary data with efficient lookup capabilities.

This class handles automatic download of CCD BinaryCIF files and provides optimized in-memory data structures for fast lookups of atoms and bonds by component ID and atom ID.
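The internal layout of these lookup structures is not described here. As a rough illustration only, one common way to get fast lookups by component ID and atom ID is to build two dictionary indexes over the parsed rows. The field names (`comp_id`, `atom_id`, `type_symbol`) and the sample rows below are assumptions made for the sketch, not the class's documented storage format.

```python
from collections import defaultdict

# Hypothetical parsed rows; real CCD atom records carry more fields.
atom_rows = [
    {"comp_id": "ALA", "atom_id": "N", "type_symbol": "N"},
    {"comp_id": "ALA", "atom_id": "CA", "type_symbol": "C"},
    {"comp_id": "GLY", "atom_id": "CA", "type_symbol": "C"},
]

# Index 1: component ID -> list of atom records for that component.
atoms_by_component = defaultdict(list)
# Index 2: (component ID, atom ID) -> the single matching atom record.
atom_by_key = {}

for row in atom_rows:
    atoms_by_component[row["comp_id"]].append(row)
    atom_by_key[(row["comp_id"], row["atom_id"])] = row

ala_atoms = atoms_by_component["ALA"]          # all atoms of one component
gly_ca = atom_by_key.get(("GLY", "CA"))        # one atom, constant-time lookup
missing = atom_by_key.get(("GLY", "XX"))       # None when the atom is unknown
```

Both indexes are built in a single pass, so each later lookup is a dictionary access rather than a scan over every row.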

__init__(ccd_folder: str | None = None)

Initialize the CCD data manager.

Parameters:

ccd_folder – Path to folder for storing CCD BinaryCIF files. If None, uses the user’s ~/.hbat/ccd-data directory.

ensure_files_exist() → bool

Ensure CCD BinaryCIF files exist, downloading if necessary.

Returns:

True if files are available, False if download failed

load_atoms_data() → bool

Load and parse atom data from CCD BinaryCIF file into memory.

Returns:

True if successful, False otherwise

load_bonds_data() → bool

Load and parse bond data from CCD BinaryCIF file into memory.

Returns:

True if successful, False otherwise
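Since the three methods above each report success as a boolean, a caller may want to chain them and stop at the first failure. The helper below is a minimal sketch of that pattern. `prepare_ccd` is a hypothetical name, not part of hbat, and this page does not say whether the lookup methods load data on their own, so the sketch calls each step explicitly.

```python
def prepare_ccd(manager) -> bool:
    """Return True only if the files are present and both tables loaded.

    `manager` is any object exposing the three documented methods.
    """
    if not manager.ensure_files_exist():
        return False  # download failed or files are unavailable
    if not manager.load_atoms_data():
        return False  # atom table could not be parsed
    return manager.load_bonds_data()


# Possible usage, assuming hbat is installed:
# from hbat.ccd.ccd_analyzer import CCDDataManager
# manager = CCDDataManager()  # None -> ~/.hbat/ccd-data
# if not prepare_ccd(manager):
#     raise RuntimeError("CCD data could not be prepared")
```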

get_component_atoms(comp_id: str) → List[Dict]

Get all atoms for a specific component.

Parameters:

comp_id – Component identifier (e.g., ‘ALA’, ‘GLY’)

Returns:

List of atom dictionaries for the component

get_component_bonds(comp_id: str) → List[Dict]

Get all bonds for a specific component.

Parameters:

comp_id – Component identifier (e.g., ‘ALA’, ‘GLY’)

Returns:

List of bond dictionaries for the component

get_atom_by_id(comp_id: str, atom_id: str) → Dict | None

Get a specific atom by component and atom ID.

Parameters:
  • comp_id – Component identifier

  • atom_id – Atom identifier

Returns:

Atom dictionary if found, None otherwise

get_bonds_involving_atom(comp_id: str, atom_id: str) → List[Dict]

Get all bonds involving a specific atom.

Parameters:
  • comp_id – Component identifier

  • atom_id – Atom identifier

Returns:

List of bond dictionaries involving the atom
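The keys of the returned bond dictionaries are not listed on this page. The sketch below assumes they follow the mmCIF `chem_comp_bond` item names (`atom_id_1`, `atom_id_2`, `value_order`); if hbat uses different keys, pass them through the `key_1` and `key_2` parameters. `bonded_partners` is a hypothetical helper that turns such a bond list into the names of an atom's bonded neighbours.

```python
def bonded_partners(bonds, atom_id, key_1="atom_id_1", key_2="atom_id_2"):
    """Return the atom IDs bonded to `atom_id`, in the order the bonds appear."""
    partners = []
    for bond in bonds:
        if bond.get(key_1) == atom_id:
            partners.append(bond.get(key_2))
        elif bond.get(key_2) == atom_id:
            partners.append(bond.get(key_1))
    return partners


# Hand-written sample data in the assumed shape (not real hbat output).
example_bonds = [
    {"atom_id_1": "N", "atom_id_2": "CA", "value_order": "SING"},
    {"atom_id_1": "CA", "atom_id_2": "C", "value_order": "SING"},
    {"atom_id_1": "C", "atom_id_2": "O", "value_order": "DOUB"},
]

partners = bonded_partners(example_bonds, "CA")  # ["N", "C"]
```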

get_available_components() → Set[str]

Get set of all available component IDs.

Returns:

Set of component identifiers

get_component_summary(comp_id: str) → Dict

Get summary information for a component.

Parameters:

comp_id – Component identifier

Returns:

Dictionary with component summary

extract_residue_bonds_data(residue_list: List[str]) → Dict[str, Dict]

Extract bond information for a list of residues in a format suitable for constants generation.

Parameters:

residue_list – List of residue codes to extract data for

Returns:

Dictionary mapping residue codes to their bond information
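The exact structure of each per-residue entry is not specified here. As a hand-rolled analogue built only from the documented lookup methods, the sketch below collects bonds for a list of residue codes and skips codes the dictionary does not contain. `collect_bonds` and the `"bonds"` key are assumptions for illustration and may not match what `extract_residue_bonds_data` actually returns.

```python
def collect_bonds(manager, residue_list):
    """Map each known residue code to a dict holding its bond list."""
    available = manager.get_available_components()
    result = {}
    for code in residue_list:
        if code not in available:
            continue  # unknown component: leave it out rather than fail
        result[code] = {"bonds": manager.get_component_bonds(code)}
    return result


# Possible usage, assuming hbat is installed and the data is loaded:
# from hbat.ccd.ccd_analyzer import CCDDataManager
# manager = CCDDataManager()
# bonds_by_residue = collect_bonds(manager, ["ALA", "GLY"])
```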