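Setting Up the Glycam Website Locally
Environment preparation
Building glycam from source requires the C++17 standard. The gcc/g++ currently on the server is version 4.8.5, which does not support C++17, and installing a newer gcc/g++ would mean building it from source together with a newer glibc, which is a lot of trouble. A conda environment is used instead, with a newer gcc/g++ installed inside it. glycam also depends on swig, but the server's swig is version 2.0.10, which triggers a bug in glycam, so a newer swig is installed in the conda environment as well. To avoid conda package conflicts, all required packages are installed together when the environment is created.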
- (base) [chpeng@localhost ~]$ conda create -n glycam libboost gcc_linux-64 gxx_linux-64 python=3.8 swig=3.0.12
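Before starting the build, it can be useful to confirm which compiler actually accepts -std=c++17. The following is a minimal sketch, not part of glycam: the function name supports_cxx17 is hypothetical, and the conda compiler path in the commented example is an assumption based on the layout used later in these notes.

```shell
# Hypothetical helper: succeeds if the given compiler accepts -std=c++17.
supports_cxx17() {
    # Feed a trivial program on stdin; -fsyntax-only avoids writing any output file.
    printf 'int main() { return 0; }\n' |
        "$1" -std=c++17 -x c++ -fsyntax-only - >/dev/null 2>&1
}

# Example (run after "conda activate glycam"; paths are assumptions):
# supports_cxx17 g++ || echo "system g++ does not accept -std=c++17"
# supports_cxx17 "$CONDA_PREFIX/bin/x86_64-conda-linux-gnu-c++" && echo "conda g++ is usable"
```

The check only looks at the compiler's exit status, so it works the same way for the system g++ and for the conda-provided one.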
Getting the glycam source
First clone the gems source. Because github is not accessible, use a mirror site by replacing github.com with github.com.cnpmjs.org.
- (glycam) [chpeng@localhost glycam]$ git clone https://github.com.cnpmjs.org/GLYCAM-Web/gems.git
- Cloning into 'gems'...
- remote: Enumerating objects: 6859, done.
- remote: Counting objects: 100% (2042/2042), done.
- remote: Compressing objects: 100% (929/929), done.
- remote: Total 6859 (delta 1531), reused 1582 (delta 1100), pack-reused 4817
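- Receiving objects: 100% (6859/6859), 19.82 MiB | 10.87 MiB/s, done.
- Resolving deltas: 100% (4766/4766), done.
Once cloning finishes, a gems directory exists in the current directory. Enter the gems directory and clone the gmml source.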
- (glycam) [chpeng@localhost gems]$ git clone https://github.com.cnpmjs.org/GLYCAM-Web/gmml.git
- Cloning into 'gmml'...
- remote: Enumerating objects: 28527, done.
- remote: Counting objects: 100% (4976/4976), done.
- remote: Compressing objects: 100% (1863/1863), done.
- remote: Total 28527 (delta 3752), reused 4046 (delta 3018), pack-reused 23551
- Receiving objects: 100% (28527/28527), 92.35 MiB | 12.19 MiB/s, done.
- Resolving deltas: 100% (18018/18018), done.
Compilation
Set the environment variables
- (glycam) [chpeng@localhost gems]$ export PYTHON_HOME=/home/chpeng/anaconda3/envs/glycam/include/python3.8
- (glycam) [chpeng@localhost gems]$ export GEMSHOME=/home/chpeng/glycam/gems
- (glycam) [chpeng@localhost gems]$ export GEMSMAKEPROCS=20
- (glycam) [chpeng@localhost gems]$ export CPLUS_INCLUDE_PATH=/home/chpeng/anaconda3/envs/glycam/include:$CPLUS_INCLUDE_PATH
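PYTHON_HOME points to the python3.8 headers directory of the current conda environment. GEMSHOME points to the gems source directory. GEMSMAKEPROCS sets how many cores are used for compilation. CPLUS_INCLUDE_PATH adds the headers under /home/chpeng/anaconda3/envs/glycam/include to the search path, mainly so that the boost library can be found.
Modify the make.sh script
The make.sh script clones source code from github, so make.sh must be modified to replace github.com with github.com.cnpmjs.org.
Part of the code after the replacement: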
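The four variables can also be derived from the environment prefix rather than typed as absolute paths. This is a minimal sketch under assumptions: the function name set_gems_env and the default of 4 jobs are hypothetical, and CONDA_PREFIX is assumed to be set by an activated conda environment. It also avoids leaving a trailing ':' in CPLUS_INCLUDE_PATH when the variable was previously empty, since GCC treats an empty path element as the current directory.

```shell
# Hypothetical helper: derive the build variables from an environment prefix
# and a GEMS checkout instead of hard-coding absolute paths.
set_gems_env() {
    env_prefix="$1"   # e.g. "$CONDA_PREFIX" once the glycam environment is active
    gems_dir="$2"     # directory that contains the gems make.sh
    procs="${3:-4}"   # parallel jobs; the default of 4 is an arbitrary assumption

    export PYTHON_HOME="$env_prefix/include/python3.8"
    export GEMSHOME="$gems_dir"
    export GEMSMAKEPROCS="$procs"
    # Append the old value only when it is non-empty, so no trailing ':' remains.
    export CPLUS_INCLUDE_PATH="$env_prefix/include${CPLUS_INCLUDE_PATH:+:$CPLUS_INCLUDE_PATH}"
}

# Example: set_gems_env "$CONDA_PREFIX" "$HOME/glycam/gems" 20
```

The function has to be defined and called in the same shell that later runs make.sh, because exported variables are only inherited by child processes.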
- if [ ! -d "${theDir}/.git" ]; then
- echo ""
- echo "MD_Utils repo does not exist. Attempting to clone."
- git clone -b ${mdBranch} https://github.com.cnpmjs.org/GLYCAM-Web/MD_Utils.git ${theDir}
- if [ ! -d "${theDir}/.git" ]; then
- echo ""
- echo "Error: Unable to clone MD_Utils. Some functions will be unavailable."
- echo "You can try again on your own using the following command."
- echo "You will not need to remake GEMS or GMML after cloning."
- echo ""
- echo "git clone -b ${mdBranch} https://github.com.cnpmjs.org/GLYCAM-Web/MD_Utils.git ${theDir}"
- return 1
- fi
- echo "Cloning of MD_Utils was successfil"
- return 0
- fi
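A plain search-and-replace of github.com is not safe to run twice, because the second run would turn github.com.cnpmjs.org into github.com.cnpmjs.org.cnpmjs.org. The sketch below shows one way to make the substitution repeatable. The function name rewrite_to_mirror is hypothetical, and the approach is an illustration rather than anything make.sh provides.

```shell
# Hypothetical helper: print a script with github.com rewritten to the mirror.
# The optional group also matches an already rewritten host, so applying the
# substitution more than once gives the same result.
rewrite_to_mirror() {
    sed -E 's#github\.com(\.cnpmjs\.org)?#github.com.cnpmjs.org#g' "$1"
}

# Example (writing back with cat keeps the executable bit of make.sh):
# rewrite_to_mirror make.sh > make.sh.new && cat make.sh.new > make.sh && rm make.sh.new
```

If editing the script is undesirable, git's url.<base>.insteadOf configuration may achieve the same redirection for every clone, though that route was not tried in these notes.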
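Run the build
Once installed, conda's gcc_linux-64 and gxx_linux-64 packages automatically set the CC and CXX variables to the newly installed gcc and g++, but they do not change what the gcc and g++ commands themselves resolve to. The glycam build does not use the CC and CXX variables. glycam needs to compile the gmml source, and gmml uses gmml/make.sh to generate its Makefile by calling qmake. We can therefore use the qmake variables QMAKE_CC and QMAKE_CXX to specify the paths of gcc and g++, and QMAKE_LINK to make the newly installed g++ the linker. Open gmml/make.sh, find the line that generates the Makefile (around line 163), and add the QMAKE_CC, QMAKE_CXX and QMAKE_LINK variables. After the change it looks like this: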
- qmake -project -t lib -o gmml.pro "QMAKE_CXXFLAGS += -Wall -W -std=c++17 ${DEBUGOPTIONS} ${OPTIMIZE}" "QMAKE_CFLAGS += -Wall -W ${DEBUGOPTIONS}" "${NO_OPTIMIZE}" "DEFINES += _REENTRANT" "CONFIG = no_lflag_merge" "unix:LIBS = -L/usr/lib/x86_64-linux-gnu -lpthread" "OBJECTS_DIR = build" "DESTDIR = lib" "QMAKE_CC = /home/chpeng/anaconda3/envs/glycam/bin/x86_64-conda-linux-gnu-cc" "QMAKE_CXX = /home/chpeng/anaconda3/envs/glycam/bin/x86_64-conda-linux-gnu-c++" "QMAKE_LINK = /home/chpeng/anaconda3/envs/glycam/bin/x86_64-conda-linux-gnu-c++" -r src/ includes/ -nopwd
Save the change and exit, then run make.sh in the gems directory.
- (glycam) [chpeng@localhost gems]$ pwd
- /home/chpeng/glycam/gems
- (glycam) [chpeng@localhost gems]$ ./make.sh
About halfway through the build, the following error appears:
- /home/chpeng/anaconda3/envs/glycam/bin/../lib/gcc/x86_64-conda-linux-gnu/11.2.0/../../../../x86_64-conda-linux-gnu/bin/ld: build/main.o: in function `main':
- /home/chpeng/glycam/gems/gmml/internalPrograms/GlycoproteinBuilder/main.cpp:6: multiple definition of `main'; build/main.o:/home/chpeng/glycam/gems/gmml/internalPrograms/GlycoproteinBuilder/main.cpp:6: first defined here
- collect2: error: ld returned 1 exit status
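- make: *** [lib/libgmml.so.1.0.0] Error 1
This happens because the OBJECTS variable in gmml/Makefile contains build/main.o twice. Edit gmml/Makefile, delete one of the two build/main.o entries so that only one remains, and save. Then run make -j 10 -f Makefile in the gmml directory to continue the build. (gmml/Makefile is generated automatically by make.sh and is overwritten with a fresh file on every run of make.sh, so the only option is to edit the Makefile directly, let gmml finish compiling, and then run make.sh again so that it carries on with the steps that follow the gmml build.)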
- (glycam) [chpeng@localhost gmml]$ pwd
- /home/chpeng/glycam/gems/gmml
- (glycam) [chpeng@localhost gmml]$ make -j 10 -f Makefile
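To confirm which entries are duplicated before editing, the OBJECTS list can be scanned with a small script. This is a sketch that assumes the usual qmake layout, where OBJECTS spans several lines joined by trailing backslashes; the function name duplicate_objects is hypothetical. It only reports duplicates and does not modify the Makefile.

```shell
# Hypothetical helper: print every object file that is listed more than once
# in the OBJECTS assignment of a qmake-generated Makefile.
duplicate_objects() {
    awk '
        /^OBJECTS[[:space:]]*=/ { in_block = 1 }
        in_block {
            for (i = 1; i <= NF; i++)
                # Report a name once, at the moment its second occurrence is seen.
                if ($i ~ /\.o$/ && seen[$i]++ == 1) print $i
            # The assignment ends at the first line without a trailing backslash.
            if ($0 !~ /\\$/) in_block = 0
        }
    ' "$1"
}

# Example: duplicate_objects Makefile
```

An empty result means the OBJECTS list has no repeated entries and the linker error has some other cause.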
After gmml has finished compiling, return to the gems directory and run make.sh to continue the build.
- (glycam) [chpeng@localhost gmml]$ cd ..
- (glycam) [chpeng@localhost gems]$ pwd
- /home/chpeng/glycam/gems
- (glycam) [chpeng@localhost gems]$ ./make.sh
The following error then appears:
- Compiling wrapped gmml library in python ...
- g++: error: unrecognized command line option ‘-std=c++17’
- Warning: gmml python interface has not been compiled correctly
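This is because make.sh calls g++ directly when compiling the Python module, and g++ resolves to the old system g++. Change g++ in make.sh to the newly installed g++. After the change it looks something like this: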
- if [ -f $PYTHON_FILE ]; then
- echo "Using $PYTHON_FILE header file."
- if [ -f "gmml_wrap.cxx" ]; then
- echo "Compiling wrapped gmml library in python ..."
- /home/chpeng/anaconda3/envs/glycam/bin/x86_64-conda-linux-gnu-c++ -std=c++17 -O3 -fPIC -c gmml_wrap.cxx -I"$PYTHON_HEADER_HOME"
- else
- echo "Warning: gmml_wrap.cxx does not exist."
- fi
- else
- echo "Warning: $PYTHON_FILE not found !"
- fi
- echo ""
- if [[ -f "gmml_wrap.o" ]]; then
- echo "Building python interface ..."
- /home/chpeng/anaconda3/envs/glycam/bin/x86_64-conda-linux-gnu-c++ -std=c++17 -shared gmml/build/*.o gmml_wrap.o -o _gmml.so
- else
- echo "Warning: gmml python interface has not been compiled correctly."
- fi
Then continue the build with make.sh
- (glycam) [chpeng@localhost gems]$ alias g++="/home/chpeng/anaconda3/envs/glycam/bin/x86_64-conda-linux-gnu-c++"
- (glycam) [chpeng@localhost gems]$ ./make.sh
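A shell alias is generally not visible to scripts such as make.sh, because aliases are not inherited by child processes. An alternative to editing make.sh, shown here only as a sketch, is to place small wrapper scripts named gcc and g++ in a directory at the front of PATH. The function name make_compiler_shims and the shim directory are hypothetical; the compiler file names are the ones used in the qmake line earlier in these notes.

```shell
# Hypothetical helper: create wrapper scripts named gcc and g++ that forward
# all arguments to the real compilers found in toolchain_bin.
make_compiler_shims() {
    toolchain_bin="$1"   # e.g. "$CONDA_PREFIX/bin"
    shim_dir="$2"        # any directory that will be put first on PATH
    mkdir -p "$shim_dir"
    for pair in gcc:x86_64-conda-linux-gnu-cc g++:x86_64-conda-linux-gnu-c++; do
        name="${pair%%:*}"
        real="$toolchain_bin/${pair#*:}"
        # The wrapper replaces itself with the real compiler via exec.
        printf '#!/bin/sh\nexec "%s" "$@"\n' "$real" > "$shim_dir/$name"
        chmod +x "$shim_dir/$name"
    done
}

# Example (the shim directory is an arbitrary choice):
# make_compiler_shims "$CONDA_PREFIX/bin" "$HOME/glycam/shims"
# export PATH="$HOME/glycam/shims:$PATH"
```

With the shim directory first on PATH, any script that calls g++ by name picks up the conda compiler, so the generated files do not need to be patched again after they are overwritten.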
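gems/gmml/internalPrograms/GlycoproteinBuilder/bin/gpBuilder and /home/chpeng/glycam/gems/gmml/internalPrograms/CarbohydrateBuilder/bin/carbBuilder ship with the gmml source code. They were compiled by the gmml authors and cannot be used as they are (they fail with shared library and glibc errors), so we need to delete them and compile new ones.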
- (lywu) [chpeng@localhost CarbohydrateBuilder]$ pwd
- /home/chpeng/glycam/gems/gmml/internalPrograms/CarbohydrateBuilder
- (lywu) [chpeng@localhost CarbohydrateBuilder]$ export GEMSHOME=/home/chpeng/glycam/gems
- (lywu) [chpeng@localhost CarbohydrateBuilder]$ g++ -std=c++17 -I $GEMSHOME/gmml -L$GEMSHOME/gmml/bin/ -Wl,-rpath,$GEMSHOME/gmml/lib/ main.cpp -lgmml -o bin/carbBuilder
- (lywu) [chpeng@localhost GlycoproteinBuilder]$ pwd
- /home/chpeng/glycam/gems/gmml/internalPrograms/GlycoproteinBuilder
- (lywu) [chpeng@localhost GlycoproteinBuilder]$ g++ -std=c++17 -I $GEMSHOME/gmml -L$GEMSHOME/gmml/bin/ -Wl,-rpath,$GEMSHOME/gmml/lib/ main.cpp -lgmml -o bin/gpBuilder
Testing
In the gems directory, run test_installation.bash to test the installation.
- (glycam) [chpeng@localhost gems]$ pwd
- /home/chpeng/glycam/gems
- (glycam) [chpeng@localhost gems]$ ./test_installation.bash
- This test should take less than 10 seconds to run on most modern computers.
- This test will compare these files:
- updated_pdb.txt -- this is the file the test should generate
- test_pdb.txt.save -- this is the file to which it should be identical
- Beginning test.
- Checking for diffeences between test output and the standard output.
- The test passed.
Usage
CarbohydrateBuilder example
- (lywu) [chpeng@localhost CarbohydrateBuilder]$ pwd
- /home/chpeng/glycam/gems/gmml/internalPrograms/CarbohydrateBuilder
- (lywu) [chpeng@localhost CarbohydrateBuilder]$ mkdir outputs
- (lywu) [chpeng@localhost CarbohydrateBuilder]$ ./bin/carbBuilder exampleLibrary.txt _ outputs/ ../../dat/prep/GLYCAM_06j-1.prep
GlycoproteinBuilder example
- (lywu) [chpeng@localhost GlycoproteinBuilder]$ pwd
- /home/chpeng/glycam/gems/gmml/internalPrograms/GlycoproteinBuilder
- (lywu) [chpeng@localhost GlycoproteinBuilder]$ ./bin/gpBuilder input.txt tests/tough/
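The two arguments of gpBuilder, input.txt and tests/tough/, are joined to form the final input file path, tests/tough/input.txt. Presumably the first argument gives the input file name and the second gives the directory that contains it. Because plain string concatenation is used, the trailing '/' of tests/tough/ is required; otherwise the program reports an error.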
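The effect of the missing '/' can be reproduced with plain shell string concatenation. The sketch below illustrates the behaviour described above and is not gpBuilder's own code; the helper name with_trailing_slash is hypothetical.

```shell
# Hypothetical helper: return the directory with a trailing '/', adding one
# only when it is missing.
with_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

dir="tests/tough"
echo "${dir}input.txt"                          # prints tests/toughinput.txt
echo "$(with_trailing_slash "$dir")input.txt"   # prints tests/tough/input.txt
```

Normalising the directory argument this way before calling ./bin/gpBuilder keeps the call correct whether or not the directory was typed with its final '/'.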
Docker image
- FROM ubuntu:20.04
- LABEL maintainer "fanxp <897488736@qq.com>"
- # set env
- ENV TZ=Asia/Shanghai
- ENV LD_LIBRARY_PATH=/opt/gems/gmml/lib:$LD_LIBRARY_PATH
- ENV PATH=/opt/gems/gmml/internalPrograms/GlycoproteinBuilder/bin:$PATH
- # Set the timezone
- RUN apt-get update && apt-get install -y --no-install-recommends --no-install-suggests tzdata && \
- ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone && \
- dpkg-reconfigure --frontend noninteractive tzdata && \
- rm -rf /var/lib/apt/lists/*
- # add qt4 repository and install dependences
- RUN apt-get update && \
- apt-get install -y --no-install-recommends --no-install-suggests software-properties-common && \
- add-apt-repository -y ppa:rock-core/qt4 && \
- apt-get install -y --no-install-recommends --no-install-suggests openssl git python3 \
- python3-dev swig qt4-qmake qt4-dev-tools qt4-default build-essential \
- ca-certificates vim libboost-dev && \
- rm -rf /var/lib/apt/lists/*
- # compile glycam
- RUN cd /opt && git clone https://github.com.cnpmjs.org/GLYCAM-Web/gems.git && \
- cd gems && git clone https://github.com.cnpmjs.org/GLYCAM-Web/gmml.git && \
- export PYTHON_HOME=/usr/include/python3.8 && \
- export GEMSHOME=/opt/gems && \
- export GEMSMAKEPROCS=20 && \
- sed -i 's/github\.com/github\.com\.cnpmjs\.org/g' make.sh && \
- ./make.sh
This Docker image does not recompile gpBuilder and carbBuilder, which in the image are located at /opt/gems/gmml/internalPrograms/GlycoproteinBuilder/bin/gpBuilder and /opt/gems/gmml/internalPrograms/CarbohydrateBuilder/bin/carbBuilder.