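KL div Python version: comparison report
1. Approach:
1) Dihedral calculation: since the computed dihedrals do not match, trace back to the inputs and the calculation {
- Inputs: {
- m.[chain,HigherOrder,Path,ReSortPath,ResIds,StartFrame]
- p.[u_ref,n_frames_ref,residues_ref]
- # roughly the same
- }
- Calculation: {
- ## most likely the calculation itself; even the dimensions do not match.
- ### the residues do not match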
- }
- }
}
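Visualize the distributions from the Python version
2) KL div calculation:
To pin down where the error is, save the dihedrals computed in Python in .mat format, load them in MATLAB, and check the result.
Scripts:
1) Added code to kldiv_example_usuage.py that writes the .mat files: /variables/new_format/*.mat
(the angles must be converted to 0-2pi)
2) Checked the content and format of the written files with inspect_mat_format.py:
Result: dihedrals is written correctly; reSort is not (see log1)
reSort is now written correctly as well (see log2)
3) Loaded the data from new_format/*mat. First attempt, skipping reconcileDihedralList for the time being: did not work. Second attempt, without skipping reconcileDihedralList: the residues of the test system and the reference system do not match, and the Python code indeed never uses the database. Modify the code.
Conclusion: for the same system the dihedral distributions differ, so the dihedral calculation is inconsistent between the two versions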
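To put a number on how much two dihedral distributions differ, one option is a histogram-based KL divergence per dihedral. The sketch below is a generic textbook estimate, not the estimator used by the MATLAB code; the function name `kl_div_angles`, the bin count, and the pseudocount are all assumptions made for illustration.

```python
import numpy as np

def kl_div_angles(ref, test, n_bins=36, pseudocount=1e-6):
    """Histogram estimate of KL(ref || test) for angles in [0, 2*pi).

    NaN entries (missing dihedrals) are dropped. n_bins and pseudocount
    are illustrative choices, not values taken from the original code.
    """
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    ref = ref[~np.isnan(ref)]
    test = test[~np.isnan(test)]
    edges = np.linspace(0.0, 2.0 * np.pi, n_bins + 1)
    p, _ = np.histogram(ref, bins=edges)
    q, _ = np.histogram(test, bins=edges)
    # The pseudocount keeps empty bins from producing log(0) or a division by zero.
    p = (p + pseudocount) / (p + pseudocount).sum()
    q = (q + pseudocount) / (q + pseudocount).sum()
    return float(np.sum(p * np.log(p / q)))

# Synthetic samples, only to exercise the function.
rng = np.random.default_rng(0)
a = rng.uniform(0.0, 2.0 * np.pi, 5000)
b = np.mod(rng.normal(np.pi, 0.3, 5000), 2.0 * np.pi)
print(kl_div_angles(a, a))  # identical samples
print(kl_div_angles(a, b))  # clearly different samples
```

Both inputs must already be in radians on 0-2pi; feeding one array in degrees would put most of its samples outside the bin range and distort the result.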
Note: test file bundle: kldiv_lyw_tested.tar.gz
Useful information:
dihedral.mat data structure:
Reference data .mat file structure:
Keys: dict_keys(['__header__', '__version__', '__globals__', 'dihedrals', 'reSort'])
dihedrals: type=<class 'numpy.ndarray'>, shape=(278, 1)
reSort: type=<class 'numpy.ndarray'>, shape=(1066, 6)
Test data .mat file structure:
Keys: dict_keys(['__header__', '__version__', '__globals__', 'dihedrals', 'reSort'])
dihedrals: type=<class 'numpy.ndarray'>, shape=(278, 1)
reSort: type=<class 'numpy.ndarray'>, shape=(1066, 6)
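Writing the dihedrals with the 0-2pi conversion and then checking the file layout could look roughly like the sketch below. It is not the code in kldiv_example_usuage.py or inspect_mat_format.py; the helper names (`to_0_2pi`, `write_dihedrals`, `describe`), the toy values, and the assumption that the Python dihedrals start out in degrees are all illustrative. Only `scipy.io.savemat` and `scipy.io.loadmat` are real library calls.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

def to_0_2pi(deg):
    """Map angles in degrees (e.g. -180..180) to radians in [0, 2*pi); NaN stays NaN."""
    return np.mod(np.deg2rad(np.asarray(deg, dtype=float)), 2.0 * np.pi)

def write_dihedrals(path, per_residue_deg, re_sort):
    """Save one (n_frames, n_dihedrals) array per residue as an (n_residues, 1) cell column."""
    cell = np.empty((len(per_residue_deg), 1), dtype=object)
    for i, block in enumerate(per_residue_deg):
        cell[i, 0] = to_0_2pi(block)
    savemat(path, {"dihedrals": cell, "reSort": np.asarray(re_sort, dtype=np.uint16)})

def describe(path):
    """Return {variable: (dtype, shape)} for the non-metadata variables of a .mat file."""
    mat = loadmat(path)
    return {k: (str(v.dtype), v.shape) for k, v in mat.items() if not k.startswith("__")}

# Toy data: two residues, three frames each; the angle values are made up.
toy = [np.array([[-90.0, 45.0, np.nan]] * 3), np.array([[170.0, -10.0, 60.0]] * 3)]
# Two reSort rows copied from log2, used here only as filler.
toy_resort = [[2, 1, 14, 16, 18, 31], [2, 2, 16, 18, 31, 33]]

path = os.path.join(tempfile.mkdtemp(), "dihedralsToy.mat")
write_dihedrals(path, toy, toy_resort)
info = describe(path)
print(info)
```

An object array of shape (n, 1) is what `loadmat` returns for a MATLAB cell column, which matches the `(278, 1)` object arrays listed above; the explicit `uint16` cast mirrors the dtype of the original reSort.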
2. Summary of issues
1) Variables produced by the dihedral calculation {
- |dihedralsRef|: (0,10)
- |dihedralsTest|:(0,10)
- } versus {
- |dihedralsRef/Test| (0,...180
- }
- Python probably used degrees √
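2) What does Resort.txt contain, what does the reSortKLDiv.mat variable represent, what do reSortCommonRef and reSortCommon represent, and what exactly is reSort?
Resort is the file that maps residues to dihedrals.
3) The m code mentions an alignment step (reconcileDihedralList). Could it have an effect? Probably yes, but it should not be a large one.
4) One fairly big issue must be the mapping between dihedrals and residues, as represented by Resort.txt and dihedral.mat: each residue has a varying number of dihedrals (1 to 8). Annoyingly, the reSort files sit in each system's own output directory; in addition, the previously saved dihedrals.mat files already contain reSort, and the test system's reSort differs somewhat from the reference system's
5) Could the dihedral clustering step have an effect? No
6) Could it be because Python does not use radians? No, although the values get converted later
7) MATLAB organizes the data by dihedral type; Python organizes it by residue
8) The reSort in the Python-generated mat file was all zeros. Fixed by adjusting the assignment
9) The remaining question is why the kldiv code never uses the 'database', i.e. the reconcileDihedralList step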
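For issues 2), 3), 4) and 9), a small sketch of how two reSort tables could be compared. It assumes column 0 is the residue index and column 1 the dihedral type (0=SC, 1=phi, 2=psi), which is what the entries printed in log2 suggest. The matching rule in `common_keys` is a hypothetical stand-in and not the actual logic of reconcileDihedralList.

```python
from collections import Counter

# First reSort entries as printed in log2: original-format file and new-format file.
ref_rows = [
    (1, 2, 1, 7, 15, 17), (1, 0, 1, 7, 9, 12), (1, 0, 4, 7, 9, 12),
    (2, 1, 15, 17, 19, 32), (2, 2, 17, 19, 32, 34),
    (2, 0, 17, 19, 21, 26), (2, 0, 19, 21, 24, 26),
]
new_rows = [
    (2, 1, 14, 16, 18, 31), (2, 2, 16, 18, 31, 33), (2, 0, 16, 18, 20, 25),
]

def dihedrals_per_residue(rows):
    """Count rows per residue, assuming column 0 holds the residue index."""
    return Counter(row[0] for row in rows)

def keyed(rows):
    """Key each row by (residue, type, rank of that type within the residue)."""
    seen = Counter()
    out = {}
    for row in rows:
        res_type = (row[0], row[1])
        out[(row[0], row[1], seen[res_type])] = row
        seen[res_type] += 1
    return out

def common_keys(rows_a, rows_b):
    """Keys present in both tables (hypothetical stand-in for a reconcile step)."""
    return sorted(keyed(rows_a).keys() & keyed(rows_b).keys())

print(dihedrals_per_residue(ref_rows))
print(common_keys(ref_rows, new_rows))
```

On these excerpts the new-format table has no entry for residue 1 and only one side-chain dihedral for residue 2, and its atom indices are exactly one lower than the original ones (14 16 18 31 versus 15 17 19 32), which would be consistent with 0-based versus 1-based indexing.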
Note:
calcalldihedralsfromtrajs, which computeDihedrals uses, comes from MDprot.
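Issues 1), 6) and 8) above (degrees versus radians, all-zero reSort) are cheap to guard against before any comparison is run. The checks below are a minimal sketch with hypothetical helper names; the sample rows are taken from the logs in this report.

```python
import math

def looks_like_degrees(values):
    """Heuristic: a finite |angle| above 2*pi cannot be radians in [0, 2*pi)."""
    return any(abs(v) > 2.0 * math.pi for v in values if not math.isnan(v))

def in_0_2pi(values):
    """True when every non-NaN value lies in [0, 2*pi)."""
    return all(0.0 <= v < 2.0 * math.pi for v in values if not math.isnan(v))

def resort_is_empty(rows):
    """True when every entry of the reSort table is zero."""
    return all(x == 0 for row in rows for x in row)

nan = float("nan")
row_deg = [-78.03645168, -3.86214436, -61.29069864, nan]  # new-format row in degrees
row_rad = [3.64495732, 2.42104095, 1.54115443, nan]       # new-format row in radians

print(looks_like_degrees(row_deg), in_0_2pi(row_deg))
print(looks_like_degrees(row_rad), in_0_2pi(row_rad))
print(resort_is_empty([[0, 0, 0, 0, 0, 0]]))
print(resort_is_empty([[1, 2, 1, 7, 15, 17]]))
```

The degree heuristic only works on whole arrays: a row whose degree values all happen to be below about 6.28 would pass as radians, so it should be applied to all frames of a dihedral rather than to a single row.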
log1: inspect_mat_format output
python inspect_mat_format.py
Inspecting original .mat file formats
Inspecting file: variables/dihedralsRef.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (278, 1)
Data type: object
First few elements:
[array([[ nan, 6.26505222, 0.60820916, 0.41011049],
[ nan, 3.11268983, 0.68137676, 0.4586515 ],
[ nan, 3.29773689, 5.73342441, 5.85807488],
...,
[ nan, 2.95955215, 5.54917202, 5.84895467],
[ nan, 3.03202124, 5.58968382, 5.8660925 ],
[ nan, 2.46877949, 0.38443584, 0.37765127]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (1066, 6)
Data type: uint16
First few elements:
[ 1 2 1 7 15 17]
Inspecting file: variables/dihedralsTest.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (278, 1)
Data type: object
First few elements:
[array([[ nan, 6.16714437, 0.45286268, 0.43257636],
[ nan, 2.69939788, 6.02492648, 5.98582133],
[ nan, 3.05658081, 0.57866149, 0.30948201],
...,
[ nan, 3.265389 , 5.67503575, 5.87973518],
[ nan, 2.88017197, 6.08011377, 6.25613019],
[ nan, 2.98105289, 5.53090046, 5.81683091]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (1066, 6)
Data type: uint16
First few elements:
[ 1 2 1 7 15 17]
Inspecting new format .mat files
Inspecting file: variables/new_format/dihedralsRef.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (295, 1)
Data type: object
First few elements:
[array([[3.64495732, 2.42104095, 1.54115443, nan],
[1.18972157, 1.15738416, 2.07942048, nan],
[1.90693256, 5.72842837, 5.6320921 , nan],
...,
[1.77526998, 1.5740378 , 2.12850539, nan],
[2.97226818, 2.5168138 , 0.36831886, nan],
[1.09061554, 6.22481215, 1.69429451, nan]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (1066, 6)
Data type: uint16
First few elements:
[0 0 0 0 0 0]
Inspecting file: variables/new_format/dihedralsTest.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (295, 1)
Data type: object
First few elements:
[array([[2.75087261, 5.87442114, 5.72130917, nan],
[6.25840421, 3.30164357, 2.23651414, nan],
[5.68586318, 3.89117689, 4.28636175, nan],
...,
[0.97210598, 2.58374201, 4.17642761, nan],
[1.28241095, 2.97127412, 4.97082563, nan],
[1.30699512, 5.13538554, 5.81914576, nan]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (1066, 6)
Data type: uint16
First few elements:
[0 0 0 0 0 0]
Output from another run of the same script, where the new-format dihedrals are in degrees:
Inspecting new format .mat files
Inspecting file: variables/new_format/dihedralsRef.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (295, 1)
Data type: object
First few elements:
[array([[ -78.03645168, -3.86214436, -61.29069864, nan],
[ -93.05805803, 120.537905 , -60.75243259, nan],
[-130.03995889, 24.57798429, -63.48294628, nan],
...,
[ -98.75569494, 14.14040841, -54.42016237, nan],
[ -91.27551143, 84.19822279, -87.59627544, nan],
[ -86.87397876, 87.90622114, -61.13755856, nan]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (1066, 6)
Data type: uint16
First few elements:
[0 0 0 0 0 0]
Inspecting file: variables/new_format/dihedralsTest.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (295, 1)
Data type: object
First few elements:
[array([[ -72.64735107, -12.97513478, -63.39372921, nan],
[-125.68848724, 15.86801418, -54.31215363, nan],
[ -69.7123605 , 135.83806834, -45.9791207 , nan],
...,
[ -68.1429324 , 115.68107754, -77.50498138, nan],
[ 64.11426402, 46.95357127, -57.86102744, nan],
[ 57.85566288, 42.83449738, -44.4463367 , nan]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (1066, 6)
Data type: uint16
First few elements:
[0 0 0 0 0 0]
log2:
python inspect_mat_format.py
Inspecting original .mat file formats
Inspecting file: variables/dihedralsRef.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (278, 1)
Data type: object
First few elements:
[array([[ nan, 6.26505222, 0.60820916, 0.41011049],
[ nan, 3.11268983, 0.68137676, 0.4586515 ],
[ nan, 3.29773689, 5.73342441, 5.85807488],
...,
[ nan, 2.95955215, 5.54917202, 5.84895467],
[ nan, 3.03202124, 5.58968382, 5.8660925 ],
[ nan, 2.46877949, 0.38443584, 0.37765127]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (1066, 6)
Data type: uint16
Detailed reSort array analysis:
Shape: (1066, 6)
Total entries: 1066
Column descriptions (MATLAB reference):
Column 0: First atom index
Column 1: Dihedral type (0=SC, 1=BB1/phi, 2=BB2/psi)
Columns 2-4: Other atom indices forming the dihedral
Column 5: Additional parameter (if present)
Found 278 unique residues
First 10 entries with full details:
Entry 0: [ 1 2 1 7 15 17]
Entry 1: [ 1 0 1 7 9 12]
Entry 2: [ 1 0 4 7 9 12]
Entry 3: [ 2 1 15 17 19 32]
Entry 4: [ 2 2 17 19 32 34]
Entry 5: [ 2 0 17 19 21 26]
Entry 6: [ 2 0 19 21 24 26]
Entry 7: [ 3 1 32 34 36 53]
Entry 8: [ 3 2 34 36 53 55]
Entry 9: [ 3 0 34 36 38 41]
Dihedral type counts:
SC: 512
BB1/phi: 277
BB2/psi: 277
Inspecting file: variables/dihedralsTest.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (278, 1)
Data type: object
First few elements:
[array([[ nan, 6.16714437, 0.45286268, 0.43257636],
[ nan, 2.69939788, 6.02492648, 5.98582133],
[ nan, 3.05658081, 0.57866149, 0.30948201],
...,
[ nan, 3.265389 , 5.67503575, 5.87973518],
[ nan, 2.88017197, 6.08011377, 6.25613019],
[ nan, 2.98105289, 5.53090046, 5.81683091]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (1066, 6)
Data type: uint16
Detailed reSort array analysis:
Shape: (1066, 6)
Total entries: 1066
Column descriptions (MATLAB reference):
Column 0: First atom index
Column 1: Dihedral type (0=SC, 1=BB1/phi, 2=BB2/psi)
Columns 2-4: Other atom indices forming the dihedral
Column 5: Additional parameter (if present)
Found 278 unique residues
First 10 entries with full details:
Entry 0: [ 1 2 1 7 15 17]
Entry 1: [ 1 0 1 7 9 12]
Entry 2: [ 1 0 4 7 9 12]
Entry 3: [ 2 1 15 17 19 32]
Entry 4: [ 2 2 17 19 32 34]
Entry 5: [ 2 0 17 19 21 26]
Entry 6: [ 2 0 19 21 24 26]
Entry 7: [ 3 1 32 34 36 53]
Entry 8: [ 3 2 34 36 53 55]
Entry 9: [ 3 0 34 36 38 41]
Dihedral type counts:
SC: 512
BB1/phi: 277
BB2/psi: 277
Inspecting new format .mat files
Inspecting file: variables/new_format/dihedralsRef.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (295, 1)
Data type: object
First few elements:
[array([[3.64495732, 2.42104095, 1.54115443, nan],
[1.18972157, 1.15738416, 2.07942048, nan],
[1.90693256, 5.72842837, 5.6320921 , nan],
...,
[1.77526998, 1.5740378 , 2.12850539, nan],
[2.97226818, 2.5168138 , 0.36831886, nan],
[1.09061554, 6.22481215, 1.69429451, nan]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (775, 6)
Data type: uint16
Detailed reSort array analysis:
Shape: (775, 6)
Total entries: 775
Column descriptions (MATLAB reference):
Column 0: First atom index
Column 1: Dihedral type (0=SC, 1=BB1/phi, 2=BB2/psi)
Columns 2-4: Other atom indices forming the dihedral
Column 5: Additional parameter (if present)
Found 295 unique residues
First 10 entries with full details:
Entry 0: [ 2 1 14 16 18 31]
Entry 1: [ 2 2 16 18 31 33]
Entry 2: [ 2 0 16 18 20 25]
Entry 3: [ 3 1 31 33 35 52]
Entry 4: [ 3 2 33 35 52 54]
Entry 5: [ 3 0 33 35 37 40]
Entry 6: [ 4 1 52 54 56 66]
Entry 7: [ 4 2 54 56 66 68]
Entry 8: [ 4 0 54 56 58 61]
Entry 9: [ 5 1 66 68 70 87]
Dihedral type counts:
SC: 185
BB1/phi: 295
BB2/psi: 295
Inspecting file: variables/new_format/dihedralsTest.mat
Keys in the .mat file:
['__header__', '__version__', '__globals__', 'dihedrals', 'reSort']
Structure of each variable:
Variable: dihedrals
Type: <class 'numpy.ndarray'>
Shape: (295, 1)
Data type: object
First few elements:
[array([[2.75087261, 5.87442114, 5.72130917, nan],
[6.25840421, 3.30164357, 2.23651414, nan],
[5.68586318, 3.89117689, 4.28636175, nan],
...,
[0.97210598, 2.58374201, 4.17642761, nan],
[1.28241095, 2.97127412, 4.97082563, nan],
[1.30699512, 5.13538554, 5.81914576, nan]])]
Variable: reSort
Type: <class 'numpy.ndarray'>
Shape: (775, 6)
Data type: uint16
Detailed reSort array analysis:
Shape: (775, 6)
Total entries: 775
Column descriptions (MATLAB reference):
Column 0: First atom index
Column 1: Dihedral type (0=SC, 1=BB1/phi, 2=BB2/psi)
Columns 2-4: Other atom indices forming the dihedral
Column 5: Additional parameter (if present)
Found 295 unique residues
First 10 entries with full details:
Entry 0: [ 2 1 14 16 18 31]
Entry 1: [ 2 2 16 18 31 33]
Entry 2: [ 2 0 16 18 20 25]
Entry 3: [ 3 1 31 33 35 52]
Entry 4: [ 3 2 33 35 52 54]
Entry 5: [ 3 0 33 35 37 40]
Entry 6: [ 4 1 52 54 56 66]
Entry 7: [ 4 2 54 56 66 68]
Entry 8: [ 4 0 54 56 58 61]
Entry 9: [ 5 1 66 68 70 87]
Dihedral type counts:
SC: 185
BB1/phi: 295
BB2/psi: 295
