PDB Structure Fixer

Comprehensive PDB structure fixing and enhancement utilities for preparing molecular structures for analysis.

Module Overview#

This module adds missing hydrogen atoms to PDB structures using either OpenBabel or PDBFixer. It works with HBAT’s internal data structures and provides a clean interface for structure enhancement.

Beyond hydrogen addition, it supports missing atom addition, residue standardization, and structure validation, so that raw PDB structures can be prepared for accurate molecular interaction analysis.

exception hbat.core.pdb_fixer.PDBFixerError[source]#

Bases: Exception

Exception raised when PDB fixing operations fail.

class hbat.core.pdb_fixer.PDBFixer[source]#

Bases: object

Fix PDB structures by adding missing hydrogen atoms.

This class provides methods to add missing hydrogen atoms to protein structures using either OpenBabel or PDBFixer with OpenMM. It works with HBAT’s internal atom and residue data structures.

__init__() → None[source]#: Initialize PDB fixer.

add_missing_hydrogens(atoms: List[Atom], method: str = 'openbabel', pH: float = 7.0, **kwargs: Any) → List[Atom][source]#

Add missing hydrogen atoms to a list of atoms.

Takes a list of HBAT Atom objects and returns a new list with missing hydrogen atoms added using the specified method.

Parameters:

atoms (List[Atom]) – List of atoms to process
method (str) – Method to use (‘openbabel’ or ‘pdbfixer’)
pH (float) – pH value for protonation (pdbfixer only)
kwargs (Any) – Additional parameters for the fixing method

Returns:

List of atoms with hydrogens added

Return type:

List[Atom]

Raises:

PDBFixerError if fixing fails
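
The pH argument matters because titratable side chains change protonation state near their pKa. As a rough illustration only, and not necessarily how PDBFixer or OpenMM pick protonation states, the Henderson-Hasselbalch relation gives the protonated fraction of a group. The helper name protonated_fraction is hypothetical, and the histidine side-chain pKa of about 6.0 is an assumed textbook approximation.

```python
def protonated_fraction(pH: float, pKa: float) -> float:
    """Fraction of a titratable group that is protonated at the given pH.

    Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# Approximate textbook value, assumed for illustration only.
HIS_SIDE_CHAIN_PKA = 6.0

for pH in (5.0, 6.0, 7.0, 8.0):
    fraction = protonated_fraction(pH, HIS_SIDE_CHAIN_PKA)
    print(f"pH {pH:.1f}: {fraction:.1%} of histidine side chains protonated")
```

Under this simple model, at the default pH of 7.0 most histidine side chains are neutral, which is why changing pH can change which hydrogens are added.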

add_missing_heavy_atoms(atoms: List[Atom], method: str = 'pdbfixer', **kwargs: Any) → List[Atom][source]#

Add missing heavy atoms to a structure.

Uses PDBFixer to identify and add missing heavy atoms in residues. This is particularly useful for structures with incomplete side chains.

Parameters:

atoms (List[Atom]) – List of atoms to process
method (str) – Method to use (only ‘pdbfixer’ supports this)
kwargs (Any) – Additional parameters

Returns:

List of atoms with missing heavy atoms added

Return type:

List[Atom]

Raises:

PDBFixerError if fixing fails

convert_nonstandard_residues(atoms: List[Atom], custom_replacements: Dict[str, str] | None = None) → List[Atom][source]#

Convert non-standard residues to their standard equivalents using PDBFixer.

This method uses PDBFixer’s built-in findNonstandardResidues() and replaceNonstandardResidues() methods to properly handle non-standard residues.

Parameters:

atoms (List[Atom]) – List of atoms to process
custom_replacements (Optional[Dict[str, str]]) – Custom residue replacements to apply

Returns:

List of atoms with converted residue names

Return type:

List[Atom]

remove_heterogens(atoms: List[Atom], keep_water: bool = True) → List[Atom][source]#

Remove unwanted heterogens from the structure using PDBFixer.

Uses PDBFixer’s built-in removeHeterogens() method to properly handle heterogen removal with the option to keep water molecules.

Parameters:

atoms (List[Atom]) – List of atoms to process
keep_water (bool) – Whether to keep water molecules

Returns:

List of atoms with heterogens removed

Return type:

List[Atom]

fix_structure_file(input_path: str, output_path: str | None = None, method: str = 'openbabel', pH: float = 7.0, overwrite: bool = False, **kwargs: Any) → str[source]#

Fix a PDB file by adding missing hydrogen atoms.

Parameters:

input_path (str) – Path to input PDB file
output_path (Optional[str]) – Path for output file (optional)
method (str) – Method to use (‘openbabel’ or ‘pdbfixer’)
pH (float) – pH value for protonation (pdbfixer only)
overwrite (bool) – Whether to overwrite existing output file
kwargs (Any) – Additional parameters for the fixing method

Returns:

Path to the output file

Return type:

str

Raises:

PDBFixerError if fixing fails

fix_pdb_file_to_file(input_pdb_path: str, output_pdb_path: str, method: str = 'openbabel', add_hydrogens: bool = True, add_heavy_atoms: bool = False, convert_nonstandard: bool = False, remove_heterogens: bool = False, keep_water: bool = True, pH: float = 7.0, **kwargs: Any) → bool[source]#

Fix a PDB file and save the result to another file.

This method processes the original PDB file directly and saves the fixed structure to a new file, preserving proper PDB formatting.

Parameters:

input_pdb_path (str) – Path to the original PDB file
output_pdb_path (str) – Path where the fixed PDB should be saved
method (str) – Method to use (‘openbabel’ or ‘pdbfixer’)
add_hydrogens (bool) – Whether to add missing hydrogen atoms
add_heavy_atoms (bool) – Whether to add missing heavy atoms (pdbfixer only)
convert_nonstandard (bool) – Whether to convert nonstandard residues (pdbfixer only)
remove_heterogens (bool) – Whether to remove heterogens (pdbfixer only)
keep_water (bool) – Whether to keep water molecules when removing heterogens
pH (float) – pH value for protonation (pdbfixer only)
kwargs (Any) – Additional parameters

Returns:

True if fixing succeeded, False otherwise

Return type:

bool

Raises:

PDBFixerError if fixing fails

get_missing_hydrogen_info(atoms: List[Atom]) → Dict[str, Any][source]#

Analyze structure for missing hydrogen information.

Parameters:

atoms (List[Atom]) – List of atoms to analyze

Returns:

Dictionary with hydrogen analysis information

Return type:

Dict[str, Any]

hbat.core.pdb_fixer.add_missing_hydrogens(atoms: List[Atom], method: str = 'openbabel', pH: float = 7.0, **kwargs: Any) → List[Atom][source]#

Convenience function to add missing hydrogen atoms.

Parameters:

atoms (List[Atom]) – List of atoms to process
method (str) – Method to use (‘openbabel’ or ‘pdbfixer’)
pH (float) – pH value for protonation (pdbfixer only)
kwargs (Any) – Additional parameters for the fixing method

Returns:

List of atoms with hydrogens added

Return type:

List[Atom]

hbat.core.pdb_fixer.fix_pdb_file(input_path: str, output_path: str | None = None, method: str = 'openbabel', pH: float = 7.0, overwrite: bool = False, **kwargs: Any) → str[source]#

Convenience function to fix a PDB file.

Parameters:

input_path (str) – Path to input PDB file
output_path (Optional[str]) – Path for output file (optional)
method (str) – Method to use (‘openbabel’ or ‘pdbfixer’)
pH (float) – pH value for protonation (pdbfixer only)
overwrite (bool) – Whether to overwrite existing output file
kwargs (Any) – Additional parameters for the fixing method

Returns:

Path to the output file

Return type:

str

hbat.core.pdb_fixer.add_missing_heavy_atoms(atoms: List[Atom], method: str = 'pdbfixer', **kwargs: Any) → List[Atom][source]#

Convenience function to add missing heavy atoms.

Parameters:

atoms (List[Atom]) – List of atoms to process
method (str) – Method to use (only ‘pdbfixer’ supports this)
kwargs (Any) – Additional parameters

Returns:

List of atoms with missing heavy atoms added

Return type:

List[Atom]

hbat.core.pdb_fixer.convert_nonstandard_residues(atoms: List[Atom], custom_replacements: Dict[str, str] | None = None) → List[Atom][source]#

Convenience function to convert non-standard residues using PDBFixer.

Uses PDBFixer’s built-in findNonstandardResidues() and replaceNonstandardResidues() methods to properly handle non-standard residue conversion.

Parameters:

atoms (List[Atom]) – List of atoms to process
custom_replacements (Optional[Dict[str, str]]) – Custom residue replacements to apply

Returns:

List of atoms with converted residue names

Return type:

List[Atom]

hbat.core.pdb_fixer.remove_heterogens(atoms: List[Atom], keep_water: bool = True) → List[Atom][source]#

Convenience function to remove unwanted heterogens using PDBFixer.

Uses PDBFixer’s built-in removeHeterogens() method which only supports the option to keep or remove water molecules.

Parameters:

atoms (List[Atom]) – List of atoms to process
keep_water (bool) – Whether to keep water molecules

Returns:

List of atoms with heterogens removed

Return type:

List[Atom]

Main Classes#

PDBFixer#

class hbat.core.pdb_fixer.PDBFixer[source]#

Bases: object

Fix PDB structures by adding missing hydrogen atoms.

This class provides methods to add missing hydrogen atoms to protein structures using either OpenBabel or PDBFixer with OpenMM. It works with HBAT’s internal atom and residue data structures.

Comprehensive PDB structure enhancement engine with multiple fixing strategies.

Core Capabilities:

Missing Hydrogen Addition: Adds missing hydrogen atoms using chemical rules
Heavy Atom Reconstruction: Reconstructs missing heavy atoms in standard residues
Residue Standardization: Converts non-standard residues to standard forms
Heterogen Management: Removes or retains specific heterogens
Structure Validation: Comprehensive quality assessment and reporting

Fixing Modes:

The fixer supports various combinations of fixing operations:

NONE: No modifications (validation only)
ADD_HYDROGENS: Add missing hydrogen atoms
CONVERT_RESIDUES: Standardize residue names
REMOVE_HETEROGENS: Remove non-essential heterogens
ADD_HEAVY_ATOMS: Reconstruct missing heavy atoms
Combined modes for comprehensive fixing

Usage Examples:

from hbat.core.pdb_fixer import PDBFixer

# Basic hydrogen addition (OpenBabel is the default method)
fixer = PDBFixer()
output_path = fixer.fix_structure_file(
    "input.pdb",
    "output.pdb",
    method="openbabel",
)
print(f"Fixed structure written to {output_path}")

# Comprehensive fixing: add hydrogens and standardize residues
succeeded = fixer.fix_pdb_file_to_file(
    "raw_structure.pdb",
    "fixed_structure.pdb",
    method="pdbfixer",
    add_hydrogens=True,
    convert_nonstandard=True,
)

# Advanced fixing: also rebuild missing heavy atoms, keeping waters
succeeded = fixer.fix_pdb_file_to_file(
    "complex.pdb",
    "enhanced.pdb",
    method="pdbfixer",
    add_hydrogens=True,
    add_heavy_atoms=True,
    keep_water=True,
)
print(f"Fixing succeeded: {succeeded}")

Fixing Methods Comparison#

The PDBFixer supports two different backend methods for hydrogen addition, each with distinct characteristics:

OpenBabel Method:

Algorithm: Chemical rules-based hydrogen placement
Sensitivity: More aggressive hydrogen placement, finds ~1.85x more hydrogen bonds than PDBFixer
Quality: More permissive geometry, may include marginal bonds
Bond Perception: Enhanced with ConnectTheDots() and PerceiveBondOrders() for robust aromatic handling
Best For: Screening studies where sensitivity is prioritized
Typical Results: Higher hydrogen bond counts, broader interaction detection

PDBFixer Method:

Algorithm: Physics-based hydrogen placement using OpenMM force fields
Specificity: More conservative hydrogen placement, higher-quality geometry
Quality: Stricter geometric criteria, fewer false positives
Integration: Native support for missing heavy atoms and residue standardization
Best For: Detailed studies where accuracy is prioritized
Typical Results: Lower but higher-quality hydrogen bond counts

Method Selection Guidelines:

# For high-sensitivity screening
fixer = PDBFixer()
result = fixer.fix_structure_file(
    "structure.pdb", "fixed.pdb",
    method="openbabel"
)

# For high-accuracy analysis
fixer = PDBFixer()
result = fixer.fix_structure_file(
    "structure.pdb", "fixed.pdb",
    method="pdbfixer",
    add_heavy_atoms=True
)

Performance Characteristics:

OpenBabel: ~1.85x more hydrogen bonds than PDBFixer
Bond Quality: PDBFixer produces fewer short bonds (<2.0Å) and marginal angles
Computational Cost: Similar processing times for both methods
Memory Usage: Comparable memory requirements

Scientific Validation:

Both methods are scientifically valid but optimized for different use cases:

Research Publications: Document which method was used for reproducibility
Comparative Studies: Consider running both methods and comparing results
Quality Control: Monitor bond distance/angle distributions for quality assessment
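
As a minimal sketch of the quality-control step above, the following summarizes a list of hydrogen bond distances and reports the fraction below the 2.0 Å cutoff mentioned under Performance Characteristics. The function name summarize_hbond_distances and the sample distances are hypothetical and are not HBAT output.

```python
from statistics import mean, median


def summarize_hbond_distances(distances, short_cutoff=2.0):
    """Summarize hydrogen bond distances (in angstroms) for quality control."""
    if not distances:
        raise ValueError("no distances to summarize")
    short = [d for d in distances if d < short_cutoff]
    return {
        "count": len(distances),
        "mean": mean(distances),
        "median": median(distances),
        "short_fraction": len(short) / len(distances),
    }


# Made-up distances purely for illustration; not real measurements.
samples = {
    "openbabel": [1.8, 1.9, 2.1, 2.3, 2.4, 2.6],
    "pdbfixer": [2.0, 2.1, 2.2, 2.4],
}

for method, distances in samples.items():
    summary = summarize_hbond_distances(distances)
    print(f"{method}: n={summary['count']}, short fraction={summary['short_fraction']:.2f}")
```

Comparing the short-bond fraction between the two methods on the same structure is one simple way to see how much of the extra bond count comes from marginal geometry.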

Troubleshooting#

Common Warnings and Their Meanings:

OpenBabel Kekulization Warning:

*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders

Explanation: This warning occurs when OpenBabel cannot assign alternating single/double bonds to aromatic rings due to non-ideal geometry in the PDB structure. This is common with real crystal structures and does not prevent successful hydrogen addition.

Impact: Minimal - the analysis continues normally and hydrogen bonds are detected correctly.

Resolution: No action needed. This is expected behavior for structures with imperfect aromatic ring geometry.

PDBFixer pH Warnings:

PDBFixer may issue warnings about protonation states at extreme pH values. These are informational and indicate the tool is making reasonable chemical assumptions.

Bond Detection Warnings:

After structure fixing, the analyzer re-detects bonds to ensure proper connectivity. This process may generate informational messages about bond count changes, which are normal and expected.
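
To illustrate why bond counts change after fixing, here is a hedged sketch of distance-based bond detection. It is not HBAT's actual implementation: the covalent radii are commonly tabulated single-bond values, and the 0.45 Å tolerance and the detect_bonds helper are assumptions made for this example.

```python
import math

# Commonly tabulated single-bond covalent radii in angstroms (assumed values).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}


def detect_bonds(atoms, tolerance=0.45):
    """Return index pairs (i, j) of atoms close enough to be covalently bonded.

    atoms is a list of (element, (x, y, z)) tuples.
    """
    bonds = []
    for i in range(len(atoms)):
        element_i, position_i = atoms[i]
        for j in range(i + 1, len(atoms)):
            element_j, position_j = atoms[j]
            cutoff = COVALENT_RADII[element_i] + COVALENT_RADII[element_j] + tolerance
            if math.dist(position_i, position_j) <= cutoff:
                bonds.append((i, j))
    return bonds


# A water oxygen before and after its two hydrogens are added.
before = [("O", (0.0, 0.0, 0.0))]
after = before + [("H", (0.96, 0.0, 0.0)), ("H", (-0.24, 0.93, 0.0))]

print(f"Bonds before fixing: {len(detect_bonds(before))}")
print(f"Bonds after fixing: {len(detect_bonds(after))}")
```

Every added hydrogen contributes at least one new bond, so a higher bond count after fixing is the expected outcome.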

Exception Classes#

class hbat.core.pdb_fixer.PDBFixerError[source]#

Bases: Exception

Exception raised when PDB fixing operations fail.

Specialized exception for PDB fixing operations with detailed error reporting.

Error Categories:

File Errors: Input/output file issues
Chemical Errors: Invalid molecular structures
Geometric Errors: Impossible atomic coordinates
Constraint Errors: Unsatisfiable fixing requirements

Core Fixing Methods#

Hydrogen Addition#

PDBFixer.add_missing_hydrogens(atoms: List[Atom], method: str = 'openbabel', pH: float = 7.0, **kwargs: Any) → List[Atom][source]#

Add missing hydrogen atoms to a list of atoms.

Takes a list of HBAT Atom objects and returns a new list with missing hydrogen atoms added using the specified method.

Parameters:

atoms (List[Atom]) – List of atoms to process
method (str) – Method to use (‘openbabel’ or ‘pdbfixer’)
pH (float) – pH value for protonation (pdbfixer only)
kwargs (Any) – Additional parameters for the fixing method

Returns:

List of atoms with hydrogens added

Return type:

List[Atom]

Raises:

PDBFixerError if fixing fails

Add missing hydrogen atoms using chemical bonding rules and geometric optimization.

Algorithm Details:

Identify Missing Hydrogens: Compare with expected atom counts
Determine Bonding Geometry: Analyze local chemical environment
Calculate Positions: Use ideal bond lengths and angles
Geometric Optimization: Minimize steric clashes
Validation: Check for reasonable H-bond geometries

Chemical Rules:

sp³ Carbons: Tetrahedral geometry (109.5°)
sp² Carbons: Trigonal planar geometry (120°)
Nitrogens: Pyramidal or planar based on hybridization
Oxygens: Bent geometry for alcohols and water; carbonyl oxygens carry no hydrogens
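
The tetrahedral rule can be sketched for the simplest case: an sp³ center with three known neighbors and one missing hydrogen. This is an illustrative construction, not the code used by OpenBabel or PDBFixer. The helper names and the 1.09 Å C-H bond length are assumptions.

```python
import math


def place_sp3_hydrogen(center, neighbors, bond_length=1.09):
    """Place the one missing hydrogen on an sp3 center with three known neighbors.

    The hydrogen points opposite to the sum of the unit vectors toward the neighbors.
    """
    if len(neighbors) != 3:
        raise ValueError("expected exactly three bonded neighbors")
    direction = [0.0, 0.0, 0.0]
    for neighbor in neighbors:
        bond = [neighbor[k] - center[k] for k in range(3)]
        length = math.sqrt(sum(component ** 2 for component in bond))
        for k in range(3):
            direction[k] += bond[k] / length
    norm = math.sqrt(sum(component ** 2 for component in direction))
    if norm < 1e-8:
        raise ValueError("neighbor directions cancel out; hydrogen direction is undefined")
    return tuple(center[k] - bond_length * direction[k] / norm for k in range(3))


def angle_degrees(center, first, second):
    """Angle first-center-second in degrees."""
    u = [first[k] - center[k] for k in range(3)]
    v = [second[k] - center[k] for k in range(3)]
    cosine = sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


carbon = (0.0, 0.0, 0.0)
known = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0)]
hydrogen = place_sp3_hydrogen(carbon, known)
print(f"H position: {hydrogen}")
print(f"H-C-X angle: {angle_degrees(carbon, hydrogen, known[0]):.1f} degrees")
```

With ideally placed neighbors, the new hydrogen lands on the fourth tetrahedral vertex, so every H-C-X angle comes out at the 109.5° listed above.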

Heavy Atom Reconstruction#

PDBFixer.add_missing_heavy_atoms(atoms: List[Atom], method: str = 'pdbfixer', **kwargs: Any) → List[Atom][source]#

Add missing heavy atoms to a structure.

Uses PDBFixer to identify and add missing heavy atoms in residues. This is particularly useful for structures with incomplete side chains.

Parameters:

atoms (List[Atom]) – List of atoms to process
method (str) – Method to use (only ‘pdbfixer’ supports this)
kwargs (Any) – Additional parameters

Returns:

List of atoms with missing heavy atoms added

Return type:

List[Atom]

Raises:

PDBFixerError if fixing fails

Reconstruct missing heavy atoms in standard protein and nucleic acid residues.

Reconstruction Strategy:

Template Matching: Use ideal residue geometries
Coordinate Transformation: Align existing atoms with templates
Position Calculation: Place missing atoms using transformations
Clash Resolution: Adjust positions to avoid overlaps
Energy Minimization: Optimize local geometry
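
The template idea can be illustrated with a widely used shortcut for one missing atom: building a local frame from the backbone N, CA and C atoms and placing CB with fixed coefficients. This is a sketch under stated assumptions, not PDBFixer's actual reconstruction code. The coefficients are commonly quoted ideal-geometry values, and place_cb is a hypothetical helper.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def place_cb(n, ca, c):
    """Estimate the CB position from backbone N, CA and C coordinates."""
    b = tuple(ca[k] - n[k] for k in range(3))      # N -> CA
    c_vec = tuple(c[k] - ca[k] for k in range(3))  # CA -> C
    a = cross(b, c_vec)                            # normal to the backbone plane
    return tuple(
        -0.58273431 * a[k] + 0.56802827 * b[k] - 0.54067466 * c_vec[k] + ca[k]
        for k in range(3)
    )


# Idealized backbone coordinates with CA at the origin (assumed for illustration).
n_atom = (-0.525, 1.363, 0.0)
ca_atom = (0.0, 0.0, 0.0)
c_atom = (1.526, 0.0, 0.0)

cb_atom = place_cb(n_atom, ca_atom, c_atom)
print(f"CB position: {cb_atom}")
print(f"CA-CB distance: {math.dist(ca_atom, cb_atom):.2f} angstroms")
```

A full side-chain rebuild needs per-residue templates and clash handling as listed above; this only shows how existing atoms define the frame in which missing atoms are placed.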

Residue Standardization#

PDBFixer.convert_nonstandard_residues(atoms: List[Atom], custom_replacements: Dict[str, str] | None = None) → List[Atom][source]#

Convert non-standard residues to their standard equivalents using PDBFixer.

This method uses PDBFixer’s built-in findNonstandardResidues() and replaceNonstandardResidues() methods to properly handle non-standard residues.

Parameters:

atoms (List[Atom]) – List of atoms to process
custom_replacements (Optional[Dict[str, str]]) – Custom residue replacements to apply

Returns:

List of atoms with converted residue names

Return type:

List[Atom]

Convert non-standard and modified residues to their standard equivalents.

Conversion Rules:

Modified Amino Acids: Map to parent amino acid (e.g., MSE → MET)
Protonation States: Standardize histidine variants (HIS, HID, HIE)
Post-translational Modifications: Convert to unmodified forms
Non-standard Nucleotides: Map to canonical bases

Database Integration:

Uses comprehensive substitution tables from the constants module:

from hbat.constants import PROTEIN_SUBSTITUTIONS

# Example conversions
conversions = {
    "MSE": "MET",  # Selenomethionine → Methionine
    "CSO": "CYS",  # Cysteine sulfenic acid → Cysteine
    "HYP": "PRO",  # Hydroxyproline → Proline
    "PCA": "GLU"   # Pyroglutamic acid → Glutamic acid
}
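
As a minimal sketch of how custom_replacements could combine with a default table, the following merges the two mappings and renames residues. It only renames; per the description above, the real conversion goes through PDBFixer's replaceNonstandardResidues(). The function standardize_residue_names and the SEP entry are assumptions for illustration.

```python
from typing import Dict, List, Optional

# Default table copied from the example conversions above.
DEFAULT_REPLACEMENTS: Dict[str, str] = {
    "MSE": "MET",
    "CSO": "CYS",
    "HYP": "PRO",
    "PCA": "GLU",
}


def standardize_residue_names(
    names: List[str],
    custom_replacements: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Map non-standard residue names to standard ones; custom entries win."""
    table = dict(DEFAULT_REPLACEMENTS)
    if custom_replacements:
        table.update(custom_replacements)
    # Names without an entry are returned unchanged.
    return [table.get(name, name) for name in names]


residues = ["ALA", "MSE", "HYP", "SEP"]
print(standardize_residue_names(residues))
print(standardize_residue_names(residues, {"SEP": "SER"}))
```

Letting custom entries override the defaults keeps the common cases automatic while still allowing project-specific mappings.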

Heterogen Management#

PDBFixer.remove_heterogens(atoms: List[Atom], keep_water: bool = True) → List[Atom][source]#

Remove unwanted heterogens from the structure using PDBFixer.

Uses PDBFixer’s built-in removeHeterogens() method to properly handle heterogen removal with the option to keep water molecules.

Parameters:

atoms (List[Atom]) – List of atoms to process
keep_water (bool) – Whether to keep water molecules

Returns:

List of atoms with heterogens removed

Return type:

List[Atom]

Remove or retain specific heterogens based on analysis requirements.

Heterogen Categories:

Waters: HOH, WAT, DOD molecules
Ions: Metal ions and simple salts
Cofactors: Essential prosthetic groups
Ligands: Small molecule binding partners
Crystallographic Additives: PEG, glycerol, buffer components

Retention Strategies:

Keep All: Retain all heterogens for comprehensive analysis
Keep Essential: Retain only biologically relevant heterogens
Remove All: Remove all heterogens for protein-only analysis
Custom Filtering: User-defined retention criteria
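
The keep_water behaviour can be sketched with a simplified filter. This is an illustration only and does not reproduce PDBFixer's removeHeterogens(): AtomRecord is a hypothetical stand-in for HBAT's Atom class, and the water names are the ones listed under the categories above.

```python
from dataclasses import dataclass
from typing import List

WATER_NAMES = {"HOH", "WAT", "DOD"}


@dataclass(frozen=True)
class AtomRecord:
    """Hypothetical minimal atom record used only for this sketch."""
    name: str
    res_name: str
    is_hetatm: bool


def filter_heterogens(atoms: List[AtomRecord], keep_water: bool = True) -> List[AtomRecord]:
    """Drop heterogen atoms, optionally keeping water molecules."""
    kept = []
    for atom in atoms:
        if not atom.is_hetatm:
            kept.append(atom)  # polymer atoms always stay
        elif keep_water and atom.res_name in WATER_NAMES:
            kept.append(atom)  # waters stay only when requested
    return kept


structure = [
    AtomRecord("CA", "ALA", False),
    AtomRecord("O", "HOH", True),
    AtomRecord("C1", "GOL", True),
]

print([a.res_name for a in filter_heterogens(structure, keep_water=True)])
print([a.res_name for a in filter_heterogens(structure, keep_water=False)])
```

The "Keep Essential" and "Custom Filtering" strategies would need an extra allow-list of residue names, which the documented remove_heterogens() signature does not expose.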

File-Level Operations#

Structure File Fixing#

PDBFixer.fix_structure_file(input_path: str, output_path: str | None = None, method: str = 'openbabel', pH: float = 7.0, overwrite: bool = False, **kwargs: Any) → str[source]#

Fix a PDB file by adding missing hydrogen atoms.

Parameters:

input_path (str) – Path to input PDB file
output_path (Optional[str]) – Path for output file (optional)
method (str) – Method to use (‘openbabel’ or ‘pdbfixer’)
pH (float) – pH value for protonation (pdbfixer only)
overwrite (bool) – Whether to overwrite existing output file
kwargs (Any) – Additional parameters for the fixing method

Returns:

Path to the output file

Return type:

str

Raises:

PDBFixerError if fixing fails

High-level interface for fixing PDB files with comprehensive options.

Process Flow:

Parse Input: Load and validate input PDB structure
Apply Fixes: Execute requested fixing operations in optimal order
Validate Results: Check fixed structure for consistency
Write Output: Save enhanced structure to output file
Generate Report: Provide detailed fixing statistics

Output Formats:

Standard PDB: Traditional PDB format with fixed structure
Enhanced PDB: PDB with additional metadata and validation info
Statistics Report: Detailed log of all fixing operations
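
The interplay of output_path and overwrite can be sketched as follows. The "_fixed" suffix used when output_path is omitted and the use of FileExistsError are assumptions for this example. The documentation above does not state the default file name, and the library itself may raise PDBFixerError instead.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(
    input_path: str,
    output_path: Optional[str] = None,
    overwrite: bool = False,
) -> str:
    """Choose where the fixed structure is written and guard existing files."""
    source = Path(input_path)
    if output_path is None:
        # Hypothetical default: place "<name>_fixed<ext>" next to the input.
        target = source.with_name(f"{source.stem}_fixed{source.suffix}")
    else:
        target = Path(output_path)
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} exists; pass overwrite=True to replace it")
    return str(target)


print(resolve_output_path("structures/1abc.pdb"))
print(resolve_output_path("structures/1abc.pdb", "results/1abc_h.pdb"))
```

Checking for an existing file before any fixing work starts avoids discovering the conflict only after an expensive run.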

Convenience Functions#

Quick Fixing Operations#

Structure Validation#

Chemical Intelligence#

Bonding Rules Engine#

The fixer uses sophisticated chemical rules for accurate structure enhancement:

Hybridization Detection:

# Determine carbon hybridization
def determine_hybridization(carbon_atom, neighbors):
    if len(neighbors) == 4:
        return "sp3"  # Tetrahedral
    elif len(neighbors) == 3:
        return "sp2"  # Trigonal planar
    elif len(neighbors) == 2:
        return "sp"   # Linear
    else:
        raise ValueError("Invalid carbon coordination")

Hydrogen Placement:

Tetrahedral Centers: Use ideal tetrahedral angles (109.5°)
Planar Centers: Use trigonal planar geometry (120°)
Aromatic Systems: Place hydrogens in ring plane
Heteroatoms: Consider lone pair geometry

Energy Minimization:

The fixer includes basic energy minimization to resolve steric clashes:

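# Simple steepest descent minimization (illustrative sketch).
# positions: list of (x, y, z) tuples
# calculate_forces: callable returning one (fx, fy, fz) tuple per position
def minimize_hydrogen_positions(positions, calculate_forces,
                                max_iterations=100, step_size=0.01,
                                tolerance=1e-4):
    for _ in range(max_iterations):
        forces = calculate_forces(positions)

        # Converged once every force component is negligible
        largest = max((abs(f) for force in forces for f in force), default=0.0)
        if largest < tolerance:
            break

        # Move each coordinate a small step along its force
        positions = [
            tuple(p + step_size * f for p, f in zip(position, force))
            for position, force in zip(positions, forces)
        ]

    return positions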

Performance and Scalability#

Computational Complexity:

Hydrogen Addition: O(n) where n is number of heavy atoms
Heavy Atom Reconstruction: O(n log n) for template matching
Residue Conversion: O(n) linear scan and replacement
Validation: O(n) comprehensive structure checking

Memory Usage:

Minimal memory overhead beyond original structure
Efficient data structures for large protein complexes
Streaming processing for very large structures

Benchmarks:

Typical performance on modern hardware:

Small proteins (<1000 atoms): <100 ms fixing time
Medium proteins (1000-10000 atoms): 100-1000 ms fixing time
Large complexes (10000+ atoms): 1-10 seconds fixing time

Integration Examples#

Analysis Pipeline Integration#

from hbat.core.analyzer import MolecularInteractionAnalyzer
from hbat.core.pdb_fixer import PDBFixer, PDBFixerError
from hbat.constants import ParametersDefault

# Complete analysis pipeline with fixing
def analyze_structure_with_fixing(pdb_file):
    # Step 1: Fix structure (add hydrogens, standardize residues)
    fixer = PDBFixer()
    fixed_file = "temp_fixed.pdb"

    succeeded = fixer.fix_pdb_file_to_file(
        pdb_file,
        fixed_file,
        method="pdbfixer",
        add_hydrogens=True,
        convert_nonstandard=True,
    )
    if not succeeded:
        raise PDBFixerError(f"Could not fix {pdb_file}")

    print(f"Structure fixing completed: {fixed_file}")

    # Step 2: Analyze fixed structure
    analyzer = MolecularInteractionAnalyzer(ParametersDefault())
    results = analyzer.analyze_file(fixed_file)

    print("Analysis results:")
    print(f"  Hydrogen bonds: {len(results.hydrogen_bonds)}")
    print(f"  Halogen bonds: {len(results.halogen_bonds)}")

    return results, fixed_file

Batch Processing#

import os
from concurrent.futures import ProcessPoolExecutor
from functools import partial

from hbat.core.pdb_fixer import PDBFixer

def fix_single_file(pdb_file, output_dir):
    """Fix one PDB file.

    Defined at module level so ProcessPoolExecutor can pickle it.
    """
    fixer = PDBFixer()
    output_file = os.path.join(output_dir, f"fixed_{os.path.basename(pdb_file)}")

    try:
        result = fixer.fix_structure_file(pdb_file, output_file)
        return {"file": pdb_file, "success": True, "result": result}
    except Exception as e:
        return {"file": pdb_file, "success": False, "error": str(e)}

def fix_structure_batch(pdb_files, output_dir):
    """Fix multiple PDB structures in parallel."""
    # Process files in parallel; partial binds the shared output directory
    worker = partial(fix_single_file, output_dir=output_dir)
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(worker, pdb_files))

    # Summarize results
    successful = [r for r in results if r["success"]]
    failed = [r for r in results if not r["success"]]

    print(f"Successfully fixed {len(successful)} structures")
    print(f"Failed to fix {len(failed)} structures")

    return results

Quality Control#

Validation Metrics:

The fixer provides comprehensive quality metrics:

# Quality assessment after fixing
validation_result = fixer.validate_structure("fixed_structure.pdb")

print(f"Structure Quality Metrics:")
print(f"  Completeness: {validation_result.completeness:.1%}")
print(f"  Geometric validity: {validation_result.geometry_score:.2f}")
print(f"  Chemical consistency: {validation_result.chemistry_score:.2f}")
print(f"  Overall quality: {validation_result.overall_score:.2f}")

Common Issues and Solutions:

Missing Atoms: Automatically detected and reconstructed
Steric Clashes: Resolved through geometric optimization
Invalid Residues: Converted to standard equivalents
Chain Breaks: Flagged for manual inspection
Unusual Geometries: Validated against chemical expectations
