Core Data Structures

Molecular structure classes for HBAT.

This module contains the core data structures representing molecular entities including atoms, bonds, and residues from PDB files.

class hbat.core.structure.Bond(atom1_serial: int, atom2_serial: int, bond_type: str = 'covalent', distance: float | None = None, detection_method: str = 'distance_based')

Bases: object

Represents a chemical bond between two atoms.

This class stores information about atomic bonds, including the atoms involved and bond type/origin.

Parameters:
  • atom1_serial (int) – Serial number of first atom

  • atom2_serial (int) – Serial number of second atom

  • bond_type (str) – Type of bond ('covalent', 'explicit', etc.)

  • distance (Optional[float]) – Distance between bonded atoms in Angstroms

  • detection_method (str) – Method used to detect this bond

__init__(atom1_serial: int, atom2_serial: int, bond_type: str = 'covalent', distance: float | None = None, detection_method: str = 'distance_based') → None

Initialize a Bond object.

Parameters:
  • atom1_serial (int) – Serial number of first atom

  • atom2_serial (int) – Serial number of second atom

  • bond_type (str) – Type of bond ('covalent', 'explicit', etc.)

  • distance (Optional[float]) – Distance between bonded atoms in Angstroms

  • detection_method (str) – Method used to detect this bond

involves_atom(serial: int) → bool

Check if bond involves the specified atom.

Parameters:

serial (int) – Atom serial number

Returns:

True if bond involves this atom

Return type:

bool

get_partner(serial: int) → int | None

Get the bonding partner of the specified atom.

Parameters:

serial (int) – Atom serial number

Returns:

Serial number of bonding partner, None if atom not in bond

Return type:

Optional[int]
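
The two lookups above can be illustrated with a small stand-in class. This is only a sketch: `BondSketch` is a hypothetical name, and the method bodies are an assumption about the documented behaviour rather than HBAT's actual source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BondSketch:
    """Hypothetical stand-in that mirrors the documented Bond fields."""

    atom1_serial: int
    atom2_serial: int
    bond_type: str = "covalent"
    distance: Optional[float] = None
    detection_method: str = "distance_based"

    def involves_atom(self, serial: int) -> bool:
        # True when the serial matches either end of the bond.
        return serial in (self.atom1_serial, self.atom2_serial)

    def get_partner(self, serial: int) -> Optional[int]:
        # Return the opposite end, or None if the atom is not part of this bond.
        if serial == self.atom1_serial:
            return self.atom2_serial
        if serial == self.atom2_serial:
            return self.atom1_serial
        return None


bond = BondSketch(atom1_serial=1, atom2_serial=2, distance=1.33)
partner_of_1 = bond.get_partner(1)  # the other end of the bond
partner_of_9 = bond.get_partner(9)  # atom 9 is not in this bond
```

Returning None instead of raising lets callers scan a whole bond list without wrapping each lookup in exception handling.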

__iter__() → Iterator[Tuple[str, Any]]

Iterate over bond attributes as (name, value) pairs.

Returns:

Iterator of (attribute_name, value) tuples

Return type:

Iterator[Tuple[str, Any]]

to_dict() → Dict[str, Any]

Convert bond to dictionary.

Returns:

Dictionary representation of the bond

Return type:

Dict[str, Any]

classmethod fields() → List[str]

Get list of field names.

Returns:

List of field names

Return type:

List[str]
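
The `__iter__`, `to_dict`, and `fields` trio documented above can be sketched as one pattern: iteration yields (name, value) pairs, and the dictionary form is built from that iteration. The class below, `RecordSketch`, is hypothetical and uses a reduced field set; whether HBAT builds these methods on dataclasses is an assumption.

```python
from dataclasses import dataclass, fields as dataclass_fields
from typing import Any, Dict, Iterator, List, Optional, Tuple


@dataclass
class RecordSketch:
    """Hypothetical record with a subset of the documented Bond fields."""

    atom1_serial: int
    atom2_serial: int
    bond_type: str = "covalent"
    distance: Optional[float] = None

    @classmethod
    def fields(cls) -> List[str]:
        # Field names in declaration order.
        return [f.name for f in dataclass_fields(cls)]

    def __iter__(self) -> Iterator[Tuple[str, Any]]:
        # Yield (attribute_name, value) pairs, one per field.
        for name in self.fields():
            yield name, getattr(self, name)

    def to_dict(self) -> Dict[str, Any]:
        # dict() accepts any iterable of key/value pairs, so reuse __iter__.
        return dict(self)


record = RecordSketch(1, 2, distance=1.5)
as_dict = record.to_dict()
```

Building `to_dict` on top of `__iter__` keeps the two views from drifting apart when a field is added.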

__repr__() → str

String representation of the bond.

__eq__(other: object) → bool

Check equality with another Bond.

__hash__() → int

Hash function for Bond objects to make them hashable.

class hbat.core.structure.Atom(serial: int, name: str, alt_loc: str, res_name: str, chain_id: str, res_seq: int, i_code: str, coords: NPVec3D, occupancy: float, temp_factor: float, element: str, charge: str, record_type: str, residue_type: str = 'L', backbone_sidechain: str = 'S', aromatic: str = 'N')

Bases: object

Represents an atom from a PDB file.

This class stores all atomic information parsed from PDB format including coordinates, properties, and residue information.

Parameters:
  • serial (int) – Atom serial number

  • name (str) – Atom name

  • alt_loc (str) – Alternate location indicator

  • res_name (str) – Residue name

  • chain_id (str) – Chain identifier

  • res_seq (int) – Residue sequence number

  • i_code (str) – Insertion code

  • coords (NPVec3D) – 3D coordinates

  • occupancy (float) – Occupancy factor

  • temp_factor (float) – Temperature factor

  • element (str) – Element symbol

  • charge (str) – Formal charge

  • record_type (str) – PDB record type (ATOM or HETATM)

  • residue_type (str) – Residue type classification (P=Protein, D=DNA, R=RNA, L=Ligand)

  • backbone_sidechain (str) – Backbone/sidechain classification (B=Backbone, S=Sidechain)

  • aromatic (str) – Aromatic classification (A=Aromatic, N=Non-aromatic)

__init__(serial: int, name: str, alt_loc: str, res_name: str, chain_id: str, res_seq: int, i_code: str, coords: NPVec3D, occupancy: float, temp_factor: float, element: str, charge: str, record_type: str, residue_type: str = 'L', backbone_sidechain: str = 'S', aromatic: str = 'N') → None

Initialize an Atom object.

Parameters:
  • serial (int) – Atom serial number

  • name (str) – Atom name

  • alt_loc (str) – Alternate location indicator

  • res_name (str) – Residue name

  • chain_id (str) – Chain identifier

  • res_seq (int) – Residue sequence number

  • i_code (str) – Insertion code

  • coords (NPVec3D) – 3D coordinates

  • occupancy (float) – Occupancy factor

  • temp_factor (float) – Temperature factor

  • element (str) – Element symbol

  • charge (str) – Formal charge

  • record_type (str) – PDB record type (ATOM or HETATM)

  • residue_type (str) – Residue type classification (P=Protein, D=DNA, R=RNA, L=Ligand)

  • backbone_sidechain (str) – Backbone/sidechain classification (B=Backbone, S=Sidechain)

  • aromatic (str) – Aromatic classification (A=Aromatic, N=Non-aromatic)

is_hydrogen() → bool

Check if atom is hydrogen.

Returns:

True if atom is hydrogen or deuterium

Return type:

bool
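
Since the check also accepts deuterium, it amounts to a membership test on the element symbol. The function below is a hypothetical sketch; the case and whitespace normalisation is an assumption, not confirmed HBAT behaviour.

```python
def is_hydrogen_sketch(element: str) -> bool:
    """Hypothetical check: treat hydrogen (H) and deuterium (D) alike."""
    # Normalising case and padding is an assumption made for this sketch.
    return element.strip().upper() in {"H", "D"}


hydrogen_flags = [is_hydrogen_sketch(symbol) for symbol in ("H", "D", "He", "C")]
```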

is_metal() → bool

Check if atom is a metal.

Returns:

True if atom is a common metal ion

Return type:

bool

get_vdw_radius() → float

Get van der Waals radius of this atom in Angstroms.

Returns:

vdW radius (default 2.0 if element unknown)

Return type:

float

calculate_vdw_distance(other: Atom) → float

Calculate sum of van der Waals radii between this and another atom.

Parameters:

other (Atom) – The other atom

Returns:

Sum of vdW radii in Angstroms

Return type:

float
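
The radius lookup and the pairwise sum can be sketched with a small table and the documented 2.0 Å fallback. The table name and helper functions are hypothetical, and the radii are Bondi values chosen for illustration; HBAT's own table may differ.

```python
# Bondi van der Waals radii in Angstroms (illustrative assumption, not HBAT's table).
VDW_RADII_SKETCH = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
DEFAULT_VDW_RADIUS = 2.0  # documented fallback for unknown elements


def vdw_radius_sketch(element: str) -> float:
    # Look up the element symbol; fall back to the default when it is unknown.
    return VDW_RADII_SKETCH.get(element.strip().upper(), DEFAULT_VDW_RADIUS)


def vdw_distance_sketch(element_a: str, element_b: str) -> float:
    # Sum of the two radii: the separation at which the vdW spheres touch.
    return vdw_radius_sketch(element_a) + vdw_radius_sketch(element_b)


n_o_contact = vdw_distance_sketch("N", "O")
unknown_radius = vdw_radius_sketch("Xx")
```

Comparing an observed interatomic distance against this sum is a common way to decide whether two atoms are in closer-than-expected contact.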

find_bonded_atom(element: str | set, bonds: list, atoms: list) → Atom | None

Find the first bonded atom matching the given element(s).

Parameters:
  • element (Union[str, set]) – Element symbol string or set of element symbols to match

  • bonds (list) – List of Bond objects to search

  • atoms (list) – List of Atom objects to search

Returns:

First matching bonded atom, or None if not found

Return type:

Optional[Atom]
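
A minimal sketch of this search, assuming bonds carry only the two serial numbers and atoms are resolved through a serial-to-atom map. `AtomSketch`, `BondSketch`, and `find_bonded_atom_sketch` are hypothetical names, and the traversal order is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Union


@dataclass(frozen=True)
class AtomSketch:
    serial: int
    name: str
    element: str


@dataclass(frozen=True)
class BondSketch:
    atom1_serial: int
    atom2_serial: int


def find_bonded_atom_sketch(
    atom: AtomSketch,
    element: Union[str, Set[str]],
    bonds: List[BondSketch],
    atoms: List[AtomSketch],
) -> Optional[AtomSketch]:
    # Accept either a single symbol or a set of symbols.
    wanted = {element} if isinstance(element, str) else set(element)
    by_serial = {a.serial: a for a in atoms}
    for bond in bonds:
        # Pick the opposite end of any bond that touches this atom.
        if bond.atom1_serial == atom.serial:
            partner = by_serial.get(bond.atom2_serial)
        elif bond.atom2_serial == atom.serial:
            partner = by_serial.get(bond.atom1_serial)
        else:
            continue
        if partner is not None and partner.element in wanted:
            return partner  # first match wins
    return None


n = AtomSketch(1, "N", "N")
h = AtomSketch(2, "H", "H")
ca = AtomSketch(3, "CA", "C")
atoms = [n, h, ca]
bonds = [BondSketch(1, 2), BondSketch(1, 3)]

# Passing the set {"N", "O", "S"} mimics a donor lookup for a hydrogen.
donor = find_bonded_atom_sketch(h, {"N", "O", "S"}, bonds, atoms)
```

Building the serial map once per call keeps each bond check constant-time; a caller that runs many lookups might prefer to cache that map outside the function.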

get_bonded_donor(bonds: list, atoms: list) → Atom | None

Find the donor heavy atom (N/O/S) bonded to this hydrogen.

Parameters:
  • bonds (list) – List of Bond objects to search

  • atoms (list) – List of Atom objects to search

Returns:

Bonded donor atom, or None

Return type:

Optional[Atom]

get_bonded_carbon(bonds: list, atoms: list) → Atom | None

Find the carbon atom bonded to this atom (e.g. for halogen bonding).

Parameters:
  • bonds (list) – List of Bond objects to search

  • atoms (list) – List of Atom objects to search

Returns:

Bonded carbon atom, or None

Return type:

Optional[Atom]

get_bonded_hydrogen(bonds: list, atoms: list) → Atom | None

Find the hydrogen atom bonded to this donor atom.

Parameters:
  • bonds (list) – List of Bond objects to search

  • atoms (list) – List of Atom objects to search

Returns:

Bonded hydrogen atom, or None

Return type:

Optional[Atom]

classify_lone_pair_subtype(residue: Residue) → str

Classify this atom’s lone pair subtype for n→π* interaction analysis.

Parameters:

residue (Residue) – The residue containing this atom

Returns:

Subtype classification string

Return type:

str

__iter__() → Iterator[Tuple[str, Any]]

Iterate over atom attributes as (name, value) pairs.

Returns:

Iterator of (attribute_name, value) tuples

Return type:

Iterator[Tuple[str, Any]]

to_dict() → Dict[str, Any]

Convert atom to dictionary.

Returns:

Dictionary representation of the atom

Return type:

Dict[str, Any]

classmethod fields() → List[str]

Get list of field names.

Returns:

List of field names

Return type:

List[str]

__repr__() → str

String representation of the atom.

__eq__(other: object) → bool

Check equality with another Atom.

__hash__() → int

Hash function for Atom objects to make them hashable.

class hbat.core.structure.Residue(name: str, chain_id: str, seq_num: int, i_code: str, atoms: List[Atom])

Bases: object

Represents a residue containing multiple atoms.

This class groups atoms belonging to the same residue and provides methods for accessing and analyzing residue-level information.

Parameters:
  • name (str) – Residue name (e.g., 'ALA', 'GLY')

  • chain_id (str) – Chain identifier

  • seq_num (int) – Residue sequence number

  • i_code (str) – Insertion code

  • atoms (List[Atom]) – List of atoms in this residue

__init__(name: str, chain_id: str, seq_num: int, i_code: str, atoms: List[Atom]) → None

Initialize a Residue object.

Parameters:
  • name (str) – Residue name (e.g., 'ALA', 'GLY')

  • chain_id (str) – Chain identifier

  • seq_num (int) – Residue sequence number

  • i_code (str) – Insertion code

  • atoms (List[Atom]) – List of atoms in this residue

get_atom(atom_name: str) → Atom | None

Get specific atom by name.

Parameters:

atom_name (str) – Name of the atom to find

Returns:

The atom if found, None otherwise

Return type:

Optional[Atom]

get_atoms_by_element(element: str) → List[Atom]

Get all atoms of specific element.

Parameters:

element (str) – Element symbol (e.g., 'C', 'N', 'O')

Returns:

List of atoms matching the element

Return type:

List[Atom]
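
The two residue lookups can be sketched as a linear scan and a filter over the atom list. `AtomSketch` and `ResidueSketch` are hypothetical stand-ins with reduced fields; exact, case-sensitive matching is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class AtomSketch:
    serial: int
    name: str
    element: str


@dataclass
class ResidueSketch:
    """Hypothetical stand-in that mirrors the documented Residue fields."""

    name: str
    chain_id: str
    seq_num: int
    i_code: str
    atoms: List[AtomSketch] = field(default_factory=list)

    def get_atom(self, atom_name: str) -> Optional[AtomSketch]:
        # Return the first atom whose name matches exactly, else None.
        for atom in self.atoms:
            if atom.name == atom_name:
                return atom
        return None

    def get_atoms_by_element(self, element: str) -> List[AtomSketch]:
        # Keep every atom with the requested element symbol, in original order.
        return [atom for atom in self.atoms if atom.element == element]


gly = ResidueSketch(
    name="GLY",
    chain_id="A",
    seq_num=10,
    i_code="",
    atoms=[
        AtomSketch(1, "N", "N"),
        AtomSketch(2, "CA", "C"),
        AtomSketch(3, "C", "C"),
        AtomSketch(4, "O", "O"),
    ],
)
alpha_carbon = gly.get_atom("CA")
carbons = gly.get_atoms_by_element("C")
```

Note that the atom name "C" (the carbonyl carbon) and the element "C" are different keys: `get_atom("C")` returns one atom, while `get_atoms_by_element("C")` returns both carbons.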

get_carbonyl_groups(atom_to_index: Dict[Atom, int]) → List[Tuple[int, int, bool, str]]

Identify C=O groups in this residue.

Parameters:

atom_to_index (Dict[Atom, int]) – Mapping from Atom objects to their global indices

Returns:

List of (C_index, O_index, is_backbone, residue_id) tuples

get_lone_pair_donor_atoms() → List[Tuple[Atom, str, str]]

Return (atom, element, subtype) tuples for O/N/S lone pair donors in this residue.

get_aromatic_center() → NPVec3D | None

Calculate aromatic ring center if residue is aromatic.

For aromatic residues (PHE, TYR, TRP, HIS), calculates the geometric center of the aromatic ring atoms.

Returns:

Center coordinates of aromatic ring, None if not aromatic

Return type:

Optional[NPVec3D]
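
The geometric center described above is the arithmetic mean of the ring atom coordinates. The sketch below uses plain tuples in place of NPVec3D; the function name, the ring atom table, and the choice of the six-membered ring for TRP are all assumptions made for illustration.

```python
import math
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]

# Standard PDB ring atom names. Using only the six-membered ring for TRP is
# an assumption of this sketch, not confirmed HBAT behaviour.
RING_ATOMS_SKETCH = {
    "PHE": ("CG", "CD1", "CD2", "CE1", "CE2", "CZ"),
    "TYR": ("CG", "CD1", "CD2", "CE1", "CE2", "CZ"),
    "TRP": ("CD2", "CE2", "CE3", "CZ2", "CZ3", "CH2"),
    "HIS": ("CG", "ND1", "CD2", "CE1", "NE2"),
}


def aromatic_center_sketch(res_name: str, coords_by_name: Dict[str, Vec3]) -> Optional[Vec3]:
    ring = RING_ATOMS_SKETCH.get(res_name)
    if ring is None:
        return None  # not an aromatic residue
    points = [coords_by_name[name] for name in ring if name in coords_by_name]
    if not points:
        return None  # no ring atoms present in the structure
    count = len(points)
    # Average each coordinate axis independently.
    return (
        sum(p[0] for p in points) / count,
        sum(p[1] for p in points) / count,
        sum(p[2] for p in points) / count,
    )


# Idealised planar hexagon (radius 1.39 Angstroms) centred at (1, 2, 3).
phe_coords = {
    name: (1.0 + 1.39 * math.cos(k * math.pi / 3), 2.0 + 1.39 * math.sin(k * math.pi / 3), 3.0)
    for k, name in enumerate(RING_ATOMS_SKETCH["PHE"])
}
center = aromatic_center_sketch("PHE", phe_coords)
```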

__iter__() → Iterator[Tuple[str, Any]]

Iterate over residue attributes as (name, value) pairs.

Returns:

Iterator of (attribute_name, value) tuples

Return type:

Iterator[Tuple[str, Any]]

to_dict() → Dict[str, Any]

Convert residue to dictionary.

Returns:

Dictionary representation of the residue

Return type:

Dict[str, Any]

classmethod fields() → List[str]

Get list of field names.

Returns:

List of field names

Return type:

List[str]

__repr__() → str

String representation of the residue.

__eq__(other: object) → bool

Check equality with another Residue.

__hash__() → int

Hash function for Residue objects to make them hashable.