structural biology - How can I pare down a PDB file in Python to only include specific residues?

Sunday, 3 May 2015


ProDy works quite well, especially from within an existing Python script.

The following function reads an existing PDB file, applies a selection query to it, and writes the matching atoms to another file.

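```
import prody

def pdbsubset(inpdb, outpdb, selection):
    """Write the atoms of inpdb matched by selection to outpdb."""
    with open(inpdb) as protf:
        prot = prody.parsePDBStream(protf)
    # select() returns None when the query matches no atoms
    atoms = prot.select(selection)
    if atoms is None:
        raise ValueError("selection matched no atoms: " + selection)
    prody.writePDB(outpdb, atoms)
```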

An example selection query builder

residues is a list such as ['A12', 'A39'], with each element in the form <chain><residue number>. In my script the values were captured from the command line using argparse:
```
parser.add_argument('-i', '--residues', nargs='+')
```
so you would pass, for example, -i A12 A39.

pdb and outpdb are the input and output file paths.

radius is the distance in angstroms by which to expand the selection.
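
For completeness, here is a minimal sketch of how those four inputs might be collected with argparse. Only the -i/--residues option comes from the original script; the positional pdb and outpdb arguments, the -r/--radius option, and its default of 5.0 are assumptions made for illustration.

```python
import argparse

def build_parser():
    # Only -i/--residues is taken from the original script; the other
    # argument names and the radius default are hypothetical.
    parser = argparse.ArgumentParser(description="Write a subset of a PDB file.")
    parser.add_argument('pdb', help='input PDB file path')
    parser.add_argument('outpdb', help='output PDB file path')
    parser.add_argument('-i', '--residues', nargs='+', required=True,
                        help='residues as <chain><residue number>, e.g. A12 A39')
    parser.add_argument('-r', '--radius', type=float, default=5.0,
                        help='distance in angstroms to expand the selection by')
    return parser

# Parse an explicit argument list here so the example runs anywhere;
# a real script would call parse_args() with no arguments.
args = build_parser().parse_args(
    ['in.pdb', 'out.pdb', '-i', 'A12', 'A39', '-r', '4.5'])
pdb, outpdb, residues, radius = args.pdb, args.outpdb, args.residues, args.radius
```

Because nargs='+' collects every value after -i into one list, residues ends up in exactly the form the selection builder expects.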

```
reslist = ["(chid {0} and resid {1})".format(res[0], res[1:]) for res in residues]
selector = 'within {0} of ({1})'.format(radius, ' or '.join(reslist))

# and running it:
pdbsubset(pdb, outpdb, selector)
```
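
The query-building step can also be wrapped in a small function that rejects malformed residue tokens before they reach ProDy. This is only a sketch: the name build_selector and the validation pattern (one chain character followed by an unsigned residue number) are my own assumptions, not part of ProDy. It uses string handling only, so it runs without ProDy installed.

```python
import re

# Assumed token format: a single chain character followed by an unsigned
# residue number, e.g. 'A12'. Insertion codes and negative numbers are
# deliberately not handled in this sketch.
_RESIDUE = re.compile(r'^([A-Za-z0-9])(\d+)$')

def build_selector(residues, radius):
    """Return a selection string covering everything within radius of the residues."""
    clauses = []
    for res in residues:
        match = _RESIDUE.match(res)
        if match is None:
            raise ValueError(
                "expected <chain><residue number>, got {0!r}".format(res))
        chain, number = match.groups()
        clauses.append("(chid {0} and resid {1})".format(chain, number))
    return 'within {0} of ({1})'.format(radius, ' or '.join(clauses))

selector = build_selector(['A12', 'A39'], 5)
print(selector)
# within 5 of ((chid A and resid 12) or (chid A and resid 39))
```

The resulting string can be passed to pdbsubset as the selection argument, exactly like the selector built inline above.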

The documentation on ProDy selection queries is not the most straightforward, but the syntax is fairly analogous to PyMOL's, so it is workable.

Answer Desk
