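This post introduces spconv, explains how to install spconv 1.x and spconv 2.x, and covers the errors I ran into along the way. For more details, see the official GitHub repository.
spconv is a spatially sparse convolution library: a project that provides highly optimized sparse convolution implementations with Tensor Core support. It is designed to process sparse data containing a large proportion of zero elements efficiently, and it is used mainly for convolutions over 3D point clouds.
My own hardware configuration is as follows:
First, install PyTorch, CUDA, and the related packages: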
conda create --name env_name python=3.7 cmake=3.22.1
conda activate env_name
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
conda install cudnn -c conda-forge
conda install boost
Open the bashrc file with vim ~/.bashrc and set the environment variables:
export PATH=/usr/local/cuda/bin:$PATH
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export C_INCLUDE_PATH=$C_INCLUDE_PATH:/[$YourUserName]/anaconda3/envs/[$YourEnvName]/include
export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/[$YourUserName]/anaconda3/envs/[$YourEnvName]/include
export LIBRARY_PATH=$LIBRARY_PATH:/[$YourUserName]/anaconda3/envs/[$YourEnvName]/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/[$YourUserName]/anaconda3/envs/[$YourEnvName]/lib
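Before building anything, it can help to sanity-check these variables. The following is a minimal sketch, not part of spconv or any official tool: the function name check_cuda_env and its messages are my own, and it assumes the usual layout in which nvcc lives in the bin directory under CUDA_HOME.

```python
import os


def check_cuda_env(env, exists=os.path.exists):
    """Return a list of problems found in the CUDA-related variables of `env`.

    `env` is a mapping such as os.environ. `exists` is injectable so the
    check can be exercised without a real CUDA installation.
    """
    problems = []
    cuda_home = env.get("CUDA_HOME", "")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
    elif os.pathsep in cuda_home:
        # CUDA_HOME names one installation directory; it is not a search path.
        problems.append("CUDA_HOME should be a single directory, not a path list")
    else:
        bin_dir = os.path.join(cuda_home, "bin")
        nvcc = os.path.join(bin_dir, "nvcc")
        if not exists(nvcc):
            problems.append("nvcc not found at " + nvcc)
        if bin_dir not in env.get("PATH", "").split(os.pathsep):
            problems.append(bin_dir + " is not on PATH")
    return problems
```

Calling check_cuda_env(os.environ) in a fresh shell returns an empty list when the variables look consistent, and a list of messages otherwise.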
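spconv 1.x has been deprecated and is no longer maintained upstream, so spconv 2.x is recommended. spconv 2.x is also optimized relative to spconv 1.x, with improvements in speed and efficiency. If you have no choice but to install spconv 1.x because of version requirements, you have to build it from source.
Below are the official steps for installing spconv 1.2.1: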
# STEP 1: get source code
git clone https://github.com/traveller59/spconv.git
cd spconv
git checkout v1.2.1
git submodule update --init --recursive
# STEP 2: compile
python setup.py bdist_wheel
# STEP 3: install
cd ./dist
pip install spconv-1.2.1-cp37-cp37m-linux_x86_64.whl
# check that the installation succeeded
python -c "import spconv"
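The import check can also be scripted so that a failure shows the actual error line instead of a whole traceback. This is only a sketch: try_import is a hypothetical helper of mine, and it runs the import in a child interpreter so that a crash inside a compiled extension cannot take down the calling script.

```python
import subprocess
import sys


def try_import(module_name):
    """Import `module_name` in a child Python process.

    Returns (True, "") on success, or (False, last line of stderr) on failure.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True, ""
    lines = result.stderr.strip().splitlines()
    return False, lines[-1] if lines else "unknown error"
```

For example, try_import("spconv") should return (True, "") after a successful install; on failure, the second element carries the final error line, such as an ImportError message.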
spconv 2.x can now be installed with pip install. Pick the command below that matches your environment:
# install cumm and timm first
pip install cumm timm
# cpu
pip install spconv
# cuda 10.2
pip install spconv-cu102
# cuda 11.3
pip install spconv-cu113
# cuda 11.4
pip install spconv-cu114
# cuda 11.6
pip install spconv-cu116
# cuda 11.7
pip install spconv-cu117
# cuda 11.8
pip install spconv-cu118
# cuda 12.0
pip install spconv-cu120
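Choosing the right package name can be automated from the output of nvcc --version. The sketch below is only an illustration: the table contains just the builds listed in this post (others may exist upstream), the helper names are my own, and it assumes nvcc prints a line of the form "release 11.3, V11.3.109".

```python
import re

# CUDA builds listed in this post; this is not an exhaustive upstream list.
SPCONV_CUDA_BUILDS = {
    "10.2": "spconv-cu102",
    "11.3": "spconv-cu113",
    "11.4": "spconv-cu114",
    "11.6": "spconv-cu116",
    "11.7": "spconv-cu117",
    "11.8": "spconv-cu118",
    "12.0": "spconv-cu120",
}


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return "{}.{}".format(match.group(1), match.group(2)) if match else None


def spconv_package(cuda_version):
    """Map a CUDA version string to a pip package name.

    None (no CUDA) maps to the CPU package; an unlisted version maps to None.
    """
    if cuda_version is None:
        return "spconv"
    return SPCONV_CUDA_BUILDS.get(cuda_version)
```

A None result for an installed CUDA version means this table has no matching entry, in which case the official README is the place to check.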
This section lists the errors I hit while installing spconv 1.x, together with their fixes.
error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
Running gcc --version showed that my gcc version was 4.8.5. The fix is to install gcc 5.4.0; I installed it inside the conda virtual environment:
conda install https://anaconda.org/brown-data-science/gcc/5.4.0/download/linux-64/gcc-5.4.0-0.tar.bz2
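The version check can be scripted as well. This is a rough sketch with hypothetical helper names; the default threshold of 5.4.0 is simply the version I installed, not an official minimum stated by spconv.

```python
import re


def parse_gcc_version(version_output):
    """Return (major, minor, patch) from the first line of `gcc --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", first_line)
    return tuple(int(part) for part in match.groups()) if match else None


def meets_minimum(version, minimum=(5, 4, 0)):
    """True if `version` was parsed and is at least `minimum`."""
    return version is not None and version >= minimum
```

The output text can be obtained with subprocess.run(["gcc", "--version"], capture_output=True, text=True).stdout and passed to parse_gcc_version.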
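This package has one pitfall. In miniconda3/envs/env_name/lib, its libstdc++.so and libstdc++.so.6 symlinks point to the shared library libstdc++.so.6.0.21, and they overwrite the symlinks that were already in the environment. From then on, everything in the environment uses the shared library shipped with this package.
As a result, if your PyTorch version (or another compiled package, such as open3d in the log below) is relatively new, libstdc++.so.6.0.21 is too old and you get an error like this: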
ImportError: /xxx/miniconda3/envs/env_name/lib/libstdc++.so.6: version `GLIBCXX_3.4.22' not found (required by /xxx/miniconda3/envs/env_name/lib/python3.7/site-packages/open3d/open3d.cpython-37m-x86_64-linux-gnu.so)
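The fix is to go into miniconda3/envs/env_name/lib and repoint the symlinks libstdc++.so and libstdc++.so.6 at a newer shared library, for example libstdc++.so.6.0.32 in the same directory. The commands are:
# first remove the old symlinks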
rm libstdc++.so
rm libstdc++.so.6
# then create the new symlinks
ln -s libstdc++.so.6.0.32 libstdc++.so
ln -s libstdc++.so.6.0.32 libstdc++.so.6
Reference: https://blog.csdn.net/j___t/article/details/107308883
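To see whether a given libstdc++ file provides the GLIBCXX version named in the error, you can scan it for version strings, much like running strings on the file and filtering for GLIBCXX. A minimal sketch, with function names of my own choosing:

```python
import re


def glibcxx_versions(lib_path):
    """Return the sorted GLIBCXX versions embedded in a libstdc++ file.

    Each version is a tuple of ints, e.g. (3, 4, 21) for GLIBCXX_3.4.21.
    """
    with open(lib_path, "rb") as handle:
        data = handle.read()
    found = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", data))
    return sorted(tuple(int(p) for p in v.decode().split(".")) for v in found)


def provides(lib_path, required):
    """True if the library at `lib_path` lists GLIBCXX_<required>."""
    wanted = tuple(int(p) for p in required.split("."))
    return wanted in glibcxx_versions(lib_path)
```

For the error above, provides("libstdc++.so.6", "3.4.22") run inside the environment's lib directory should return False before the symlinks are repointed and True afterwards, assuming the newer library really includes that version.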
/xxx/spconv/include/tensorview/tensorview.h:741:23: error: template declaration of ‘constexpr const char* const tv::type_s’
constexpr const char *type_s = detail::TypeToString<T>::value;
^
/xxx/spconv/include/tensorview/tensorview.h: In member function ‘std::string tv::TensorView<T, Rank, PtrTraits, Tindex>::repr(Os&) const’:
/xxx/spconv/include/tensorview/tensorview.h:1124:26: error: ‘type_s’ was not declared in this scope
ss << "Tensor[" << type_s<T> << "]" << std::endl;
^
/xxx/spconv/include/tensorview/tensorview.h:1124:34: error: expected primary-expression before ‘>’ token
ss << "Tensor[" << type_s<T> << "]" << std::endl;
^
/xxx/spconv/include/tensorview/tensorview.h:1124:36: error: expected primary-expression before ‘<<’ token
ss << "Tensor[" << type_s<T> << "]" << std::endl;
^
/xxx/spconv/include/tensorview/tensorview.h:1135:24: error: ‘type_s’ was not declared in this scope
ss << "Tensor[" << type_s<T> << "]: shape=" << shape()
^
/xxx/spconv/include/tensorview/tensorview.h:1135:32: error: expected primary-expression before ‘>’ token
ss << "Tensor[" << type_s<T> << "]: shape=" << shape()
^
/xxx/spconv/include/tensorview/tensorview.h:1135:34: error: expected primary-expression before ‘<<’ token
ss << "Tensor[" << type_s<T> << "]: shape=" << shape()
^
/xxx/spconv/include/tensorview/tensorview.h: At global scope:
/xxx/spconv/include/tensorview/tensorview.h:1270:23: error: template declaration of ‘constexpr const char* const tv::detail::type_printf_format_v’
constexpr const char *type_printf_format_v = TypePrintfFormat<T>::value;
^
/xxx/spconv/include/tensorview/tensorview.h: In function ‘void tv::printTensorView(tv::TensorView<T, Rank, PtrTraits, Tindex>)’:
/xxx/spconv/include/tensorview/tensorview.h:1343:34: error: ‘type_printf_format_v’ is not a member of ‘tv::detail’
return printTensorView(tensor, detail::type_printf_format_v<Traw>);
^
/xxx/spconv/include/tensorview/tensorview.h:1343:67: error: expected primary-expression before ‘>’ token
return printTensorView(tensor, detail::type_printf_format_v<Traw>);
^
/xxx/spconv/include/tensorview/tensorview.h:1343:68: error: expected primary-expression before ‘)’ token
return printTensorView(tensor, detail::type_printf_format_v<Traw>);
^
/xxx/spconv/include/tensorview/tensorview.h: In function ‘void tv::printTensorView(const T*, tv::Shape)’:
/xxx/spconv/include/tensorview/tensorview.h:1349:26: error: ‘type_printf_format_v’ is not a member of ‘tv::detail’
detail::type_printf_format_v<Traw>);
^
/xxx/spconv/include/tensorview/tensorview.h:1349:59: error: expected primary-expression before ‘>’ token
detail::type_printf_format_v<Traw>);
^
/xxx/spconv/include/tensorview/tensorview.h:1349:60: error: expected primary-expression before ‘)’ token
detail::type_printf_format_v<Traw>);
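The fix was simply to delete the build folder and run python setup.py bdist_wheel again. If that does not help, you can try copying boost/. into spconv/include/, but I have not tried this myself and am not sure it works.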
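If you rebuild often, the cleanup step can be wrapped in a small helper. This is just a sketch: remove_build_dir is a hypothetical name, and it assumes the build folder sits directly under the spconv checkout.

```python
import shutil
from pathlib import Path


def remove_build_dir(repo_dir):
    """Delete `<repo_dir>/build` if it exists; return True if it was removed."""
    build_dir = Path(repo_dir) / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        return True
    return False
```

After remove_build_dir("spconv") returns, rerun python setup.py bdist_wheel from the checkout.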
Both of the following errors are caused by CUDA environment variables that are missing or incorrect:
CMake Error at /disk1/zhuhe/miniconda3/envs/env_name/lib/python3.7/site-packages/cmake/data/share/cmake-3.28/Modules/CMakeDetermineCUDACompiler.cmake:270 (message):
Failed to detect a default CUDA architecture.
Compiler output:
Call Stack (most recent call first):
CMakeLists.txt:9 (project)
CMake Error at /xxx/miniconda3/envs/env_name/lib/python3.7/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:87 (message):
FindCUDA says CUDA version is (usually determined by nvcc), but the CUDA
headers say the version is ERROR: ld.so: object
10.2. This often occurs when you set both CUDA_HOME and
CUDA_NVCC_EXECUTABLE to non-standard locations, without also setting PATH
to point to the correct nvcc. Perhaps, try re-running this command again
with PATH=/usr/local/cuda-10.2/bin:$PATH. See above log messages for more
diagnostics, and see https://github.com/pytorch/pytorch/issues/8092 for
more details.
Call Stack (most recent call first):
/xxx/miniconda3/envs/env_name/lib/python3.7/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
/xxx/miniconda3/envs/env_name/lib/python3.7/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:22 (find_package)
Open the bashrc file with vim ~/.bashrc and set the CUDA environment variables correctly:
export PATH=/usr/local/cuda/bin:$PATH
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
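The second error message hints that PATH may resolve to a different nvcc than the one under CUDA_HOME. The following sketch checks for that mismatch. The function name is my own, and it relies only on shutil.which from the standard library; symlinks such as /usr/local/cuda pointing at a versioned directory are resolved before comparing.

```python
import os
import shutil


def nvcc_matches_cuda_home(path_value, cuda_home):
    """Check that the nvcc found via `path_value` is the one under `cuda_home`.

    Returns (True, resolved nvcc path) or (False, explanation).
    """
    found = shutil.which("nvcc", path=path_value)
    if found is None:
        return False, "nvcc not found on PATH"
    expected = os.path.join(cuda_home, "bin", "nvcc")
    if os.path.realpath(found) != os.path.realpath(expected):
        return False, "PATH resolves nvcc to {}, expected {}".format(found, expected)
    return True, found
```

Typical usage after reloading the shell configuration would be nvcc_matches_cuda_home(os.environ.get("PATH", ""), os.environ.get("CUDA_HOME", "")).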