Dynamics and Druggability log/20251028before
2025-10-15:
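This part now gets its own log, focused on implementing druggability evaluation and druggable-pocket identification. Script: 75.2~4Druggability/pockets_statistic.py
Command: python pockets_statistic.py -i ../pdbdynamics -o output/ -d 10 -t 0.1
Progress: apart from the structures table in Drugbank, which is still incomplete, the pockets_statistic script can now compare the SMILES structures of ligands and drugs. Open question: what is the logic that takes the ligand comparison to the next step?
A quick walk through the flow: fetch the drugs by pdbid and keep only the approved drugs d. Then fetch the ligand structures l.
In principle this should be a Cartesian product d×l, from which we pick the drug-ligand pairs whose structural similarity exceeds the threshold. Then generate pockets around those ligands, compare them with the predicted pockets, and separate the drugged pockets from the undrugged ones.
Then run cross-target (pan-target protein) statistics / feature extraction.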
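The d×l step in the flow above could be sketched as follows. This is only a minimal sketch, not the code in pockets_statistic.py: it assumes each drug and ligand has already been converted into a fingerprint represented as a set of on-bit indices (for example by a cheminformatics toolkit), and `tanimoto`, `similar_pairs`, and the toy fingerprints are hypothetical names and data used for illustration.

```python
from itertools import product


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(drugs, ligands, threshold):
    """Return (drug, ligand, similarity) for every pair in d x l above threshold.

    drugs / ligands: dict mapping a name to its fingerprint (set of ints).
    """
    hits = []
    for (d_name, d_fp), (l_name, l_fp) in product(drugs.items(), ligands.items()):
        sim = tanimoto(d_fp, l_fp)
        if sim > threshold:
            hits.append((d_name, l_name, sim))
    # Highest similarity first so the best matches are easy to inspect.
    return sorted(hits, key=lambda h: h[2], reverse=True)


# Toy fingerprints (made up for illustration, not real molecules).
drugs = {"drug_A": {1, 2, 3, 4}, "drug_B": {10, 11}}
ligands = {"lig_X": {1, 2, 3, 9}, "lig_Y": {20, 21}}
pairs = similar_pairs(drugs, ligands, threshold=0.1)
```

Keeping the similarity function separate from the pairing loop means the fingerprint type or metric can be swapped without touching the d×l logic.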
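The ligand-drug similarity comparison step now handles multiple ligands against the drugs. For the record: ver/pockets_statistic_1.py still matches the drugs against a single ligand.
The thing I say most these days: when you are truly speechless, you really do laugh. The threshold is already down to 0.1 and only a few pairs pass.
The next problem was the classic PocketStablility one, hahaha. Solved.
Next is computing the features, still the same set: volume, surface area, depth, hydrophobicity, and electrostatic potential energy.
One important thing has to be pinned down: the formulas that the feature calculations depend on.
Volume:
Surface area:
Depth:
Hydrophobicity:
Electrostatic potential energy: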
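2025-10-18: The pipeline runs end to end again. Only at similarity 0.1 do I get 11 matches, and with distance=10 there is not a single drugged pocket.
Try similarity=0.1, distance=20.
The ligand verification distance in the results above was wrong: the parameters were passed incorrectly, and similarity_threshold and distance_threshold were not kept separate.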
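One way to make this kind of mix-up harder is to give each threshold its own named option and pass both by keyword only. The sketch below is an assumption about how the CLI could be wired, not the actual pockets_statistic.py: the short flags mirror the command used in this log (assuming -d is the distance and -t is the similarity threshold), while the long option names, `build_parser`, and `run` are hypothetical.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="pocket statistics (sketch)")
    parser.add_argument("-i", "--input", required=True)
    parser.add_argument("-o", "--output", required=True)
    parser.add_argument("-d", "--distance-threshold", type=float, default=10.0,
                        help="ligand-to-pocket distance cutoff in angstroms")
    parser.add_argument("-t", "--similarity-threshold", type=float, default=0.1,
                        help="minimum drug-ligand similarity, between 0 and 1")
    return parser


def run(*, input_dir, output_dir, distance_threshold, similarity_threshold):
    """Keyword-only arguments: the two thresholds cannot be swapped by position."""
    if not 0.0 <= similarity_threshold <= 1.0:
        raise ValueError("similarity_threshold must be in [0, 1]; "
                         "was a distance passed by mistake?")
    if distance_threshold <= 0:
        raise ValueError("distance_threshold must be positive")
    return {"input": input_dir, "output": output_dir,
            "distance": distance_threshold, "similarity": similarity_threshold}


args = build_parser().parse_args(
    ["-i", "../pdbdynamics", "-o", "output/", "-d", "10", "-t", "0.1"])
config = run(input_dir=args.input, output_dir=args.output,
             distance_threshold=args.distance_threshold,
             similarity_threshold=args.similarity_threshold)
```

The range check on the similarity value catches the most likely swap (a distance such as 10 landing in the similarity slot) at startup rather than after a full run.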
Now running the comparison with a 10 Å distance and 0.1 similarity.
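For reference, the distance check could look roughly like the sketch below. It assumes the distance threshold is applied between each predicted pocket's center and the atoms of the matched ligands, which may not be how pockets_statistic.py defines it; `label_pockets` and the coordinates are made up for illustration.

```python
import math


def label_pockets(pocket_centers, ligand_atoms, distance_threshold):
    """Label each predicted pocket as drugged (True) or undrugged (False).

    pocket_centers: dict pocket_id -> (x, y, z) in angstroms.
    ligand_atoms: list of (x, y, z) atom coordinates of the matched ligands.
    A pocket counts as drugged if any ligand atom lies within the cutoff.
    """
    labels = {}
    for pocket_id, center in pocket_centers.items():
        nearest = min((math.dist(center, atom) for atom in ligand_atoms),
                      default=math.inf)
        labels[pocket_id] = nearest <= distance_threshold
    return labels


pockets = {"P1": (0.0, 0.0, 0.0), "P2": (30.0, 0.0, 0.0)}
ligand = [(3.0, 4.0, 0.0), (5.0, 5.0, 5.0)]
labels = label_pockets(pockets, ligand, distance_threshold=10.0)
```

With an empty ligand list every pocket comes out undrugged, which looks the same as a cutoff that is too small, so it is worth checking the ligand count before tuning the distance.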
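2025-10-24: Started working through the legacy left by senior labmate Zhaoqiang: 35 PDB proteins, all included in MISATO.
The MISATO trajectory files seem to be unusable: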
Traceback (most recent call last):
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/core/trajectory.py", line 446, in load
t = loader(tmp_file, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "mdtraj/formats/xtc/xtc.pyx", line 167, in mdtraj.formats.xtc.load_xtc
File "mdtraj/formats/xtc/xtc.pyx", line 174, in mdtraj.formats.xtc.load_xtc
File "mdtraj/formats/xtc/xtc.pyx", line 345, in mdtraj.formats.xtc.XTCTrajectoryFile.read_as_traj
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/core/trajectory.py", line 1332, in __init__
self.xyz = xyz
^^^^^^^^
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/core/trajectory.py", line 988, in xyz
value = ensure_type(
^^^^^^^^^^^^
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/utils/validation.py", line 157, in ensure_type
raise error
ValueError: xyz must be shape (Any, 22272, 3). You supplied (100, 23852, 3)
Try the original frame files; topfile refers to the original pdb file.
Use 75.4:test_rf (the version that reads frame files).
nohup python main.py -o ~ -p ./data/~.pdb -f ./data/~traj-output &
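It feels like the topology file is wrong. Confirmed: the first one cannot be used as the topology file.
2025-10-25:
Try 75.4:/home/dddc/gxxu/test_rf/
Data: data/1a4g_A
Topology file: 1a4g.cif
Trajectory file: 1a4g_A_traj.xtc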
nohup python main.py -o ./output/1a4g/ -p ./data/1a4g_A/1a4g.cif -x ./data/1a4g_A/1a4g_A_traj.xtc &
Problem: ValueError: xyz must be shape (Any, 6641, 3). You supplied (100, 23852, 3)
The topology and the trajectory files might not contain the same atoms
The input topology must contain all atoms even if you want to select a subset of them with atom_indices
What I found: the problem seems to be that 1a4g.cif is a dimer while 1a4g_A_traj.xtc contains a tetramer.
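A quick way to confirm this kind of mismatch before calling md.load is to compare atom counts directly. The sketch below is a stand-alone diagnostic, not part of test_rf: it counts ATOM/HETATM records in the first model of a PDB file and reads the atom count from the first frame header of an XTC file, on the understanding that an XTC frame starts with two big-endian 32-bit integers (a magic number, then the number of atoms). The function names are hypothetical, and mmCIF topologies such as 1a4g.cif are not handled.

```python
import struct


def pdb_atom_count(path):
    """Count ATOM/HETATM records in the first model of a PDB file."""
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(("ATOM", "HETATM")):
                count += 1
            elif line.startswith("ENDMDL"):
                break  # only the first model defines the topology
    return count


def xtc_atom_count(path):
    """Read the atom count from the header of the first XTC frame."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if len(header) < 8:
        raise ValueError("file too short to be an XTC trajectory")
    magic, n_atoms = struct.unpack(">ii", header)
    if magic not in (1995, 2023):  # assumed XTC magic numbers
        raise ValueError(f"unexpected magic number {magic}")
    return n_atoms


def check_match(top_pdb, traj_xtc):
    """Return (n_top, n_traj, matches) so a mismatch is visible up front."""
    n_top, n_traj = pdb_atom_count(top_pdb), xtc_atom_count(traj_xtc)
    return n_top, n_traj, n_top == n_traj
```

For a .pdb topology and an .xtc trajectory, check_match(top, traj) returns both counts, so a topology that is missing chains, solvent, or hydrogens shows up as a plain number difference instead of a traceback.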
Switch to 3ml5 and try.
Data: data/3ml5_A
Topology file: 3ml5_A.pdb
Trajectory file: 3ml5_A_traj.xtc
nohup python main.py -o ./output/3ml5/ -p ./data/3ml5_A/3ml5_A.pdb -x ./data/3ml5_A/3ml5_A_traj.xtc &
Problem: ValueError: xyz must be shape (Any, 3803, 3). You supplied (100, 4071, 3)
ValueError: The topology and the trajectory files might not contain the same atoms
The input topology must contain all atoms even if you want to select a subset of them with atom_indices
Switch the topology file to 3ml5.cif
Problem: ValueError: xyz must be shape (Any, 2354, 3). You supplied (100, 4071, 3)
ValueError: The topology and the trajectory files might not contain the same atoms
The input topology must contain all atoms even if you want to select a subset of them with atom_indices
Switch the topology file to 3ml5_A_frame.0.pdb
Problem: Traceback (most recent call last):
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/core/trajectory.py", line 446, in load
t = loader(tmp_file, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "mdtraj/formats/xtc/xtc.pyx", line 167, in mdtraj.formats.xtc.load_xtc
File "mdtraj/formats/xtc/xtc.pyx", line 174, in mdtraj.formats.xtc.load_xtc
File "mdtraj/formats/xtc/xtc.pyx", line 345, in mdtraj.formats.xtc.XTCTrajectoryFile.read_as_traj
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/core/trajectory.py", line 1332, in __init__
self.xyz = xyz
^^^^^^^^
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/core/trajectory.py", line 988, in xyz
value = ensure_type(
^^^^^^^^^^^^
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/utils/validation.py", line 157, in ensure_type
raise error
ValueError: xyz must be shape (Any, 3803, 3). You supplied (100, 4071, 3)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/dddc/gxxu/test_rf/main.py", line 229, in <module>
main(args)
File "/home/dddc/gxxu/test_rf/main.py", line 45, in main
pdbfiles = pocket_detector.cluster(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dddc/gxxu/test_rf/pocket_detect/pocket_detector.py", line 44, in cluster
traj=md.load(self.trajfile,top=self.topfile)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dddc/software/miniforge3/envs/D3pockets/lib/python3.11/site-packages/mdtraj/core/trajectory.py", line 462, in load
raise ValueError(
ValueError: The topology and the trajectory files might not contain the same atoms
The input topology must contain all atoms even if you want to select a subset of them with atom_indices
# The tests below replace the trajectory file with frame files
1. 3ml5_A.pdb × frame files
It runs, but then fails with: Traceback (most recent call last):
File "/home/dddc/gxxu/test_rf/main.py", line 229, in <module>
main(args)
File "/home/dddc/gxxu/test_rf/main.py", line 67, in main
densityball_clu(args.output)
File "/home/dddc/gxxu/test_rf/utils/densityball_clu.py", line 51, in densityball_clu
kd=cKDTree(pointss)
^^^^^^^^^^^^^^^^
File "scipy/spatial/_ckdtree.pyx", line 556, in scipy.spatial._ckdtree.cKDTree.__init__
ValueError: data must be of shape (n, m), where there are n points of dimension m
2. 3ml5_A_frame.0.pdb × frame files: the same problem occurs.
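As far as I know, cKDTree raises this error whenever its input is not a 2-D array, and an empty point list is one way to end up there (a list with no points converts to a 1-D array). Whether that is what happens in densityball_clu.py has not been verified. A minimal guard, with the hypothetical helper name `validate_points`, could check the points before the tree is built:

```python
def validate_points(points, dim=3):
    """Return points as a list of float tuples, or raise a descriptive error.

    Intended to run just before building a KD-tree, so that an empty or
    ragged point set fails with a message that names the real cause.
    """
    cleaned = [tuple(float(c) for c in p) for p in points]
    if not cleaned:
        raise ValueError("no pocket points found: the input frames may have "
                         "produced an empty point set")
    bad = [i for i, p in enumerate(cleaned) if len(p) != dim]
    if bad:
        raise ValueError(f"points at indices {bad[:5]} do not have {dim} coordinates")
    return cleaned


points = validate_points([(0, 0, 0), (1.5, 2, 3)])
```

If the guard reports an empty point set, the thing to look at is the upstream pocket detection on the frame files rather than the clustering step itself.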
So I plan to modify D3Pockets_WEB_UCB\D3Pockets_WEB_UCB\read_traj_single_frames.py under gxxu on 85.23
and then test again.
Problem: /home/dddc/software/miniforge3/envs/D3pockets/lib/python2.7/site-packages/mdtraj/core/trajectory.py:419: UserWarning: top= kwarg ignored since file contains topology information
warnings.warn('top= kwarg ignored since file contains topology information')
Analysis: with frame files there is no need to specify a top file.
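When single-frame files stand in for a trajectory, the order of the files matters, and a plain alphabetical sort puts frame.10 before frame.2. A small sketch that sorts by the numeric frame index, assuming the `<name>_frame.<index>.pdb` naming seen in 3ml5_A_frame.0.pdb holds for every frame (`frame_index` and `sort_frames` are hypothetical helpers):

```python
import re

_FRAME_RE = re.compile(r"_frame\.(\d+)\.pdb$")


def frame_index(filename):
    """Extract the integer frame index from a name like 3ml5_A_frame.12.pdb."""
    match = _FRAME_RE.search(filename)
    if match is None:
        raise ValueError(f"not a frame file: {filename}")
    return int(match.group(1))


def sort_frames(filenames):
    """Sort frame files by frame index rather than alphabetically."""
    return sorted(filenames, key=frame_index)


frames = sort_frames(["3ml5_A_frame.10.pdb", "3ml5_A_frame.2.pdb", "3ml5_A_frame.0.pdb"])
```

The sorted list can then be handed to the loader; to my knowledge mdtraj's md.load accepts a list of filenames, and for PDB frames no top argument is needed.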
It now runs to completion. 85.23:~/gxxu/D3P*/*/druggability/*
bash D3Pockets_MD_Batc*
2025-10-28:
75.2 ~/gxxu/Backup/M*dyna*/D3pocketsMD: this path stores the MD dataset used for the D3pockets calculations.
