Installing Theano on Windows 8 with GPU enabled
Problem description
I understand that Theano support for Windows 8.1 is only at an experimental stage, but I wonder if anyone has had any luck resolving my issues. Depending on my config, I get three distinct types of errors. I assume that resolving any one of them would solve my problem.
I installed Python using the 32-bit WinPython distribution, with MinGW set up as described here. The contents of my .theanorc file are as follows:
[global]
openmp=False
device = gpu
[nvcc]
flags=-LC:\TheanoPython\python-2.7.6\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
[blas]
ldflags =
When I run import theano, the error is as follows:
nvcc fatal : nvcc cannot find a supported version of Microsoft Visual Studio.
Only the versions 2010, 2012, and 2013 are supported
['nvcc', '-shared', '-g', '-O3', '--compiler-bindir', 'C:\Program Files (x86)\
Microsoft Visual Studio 10.0\VC\bin# flags=-m32 # we have this hard coded for
now', '-Xlinker', '/DEBUG', '-m32', '-Xcompiler', '-DCUDA_NDARRAY_CUH=d67f7c8a21
306c67152a70a88a837011,/Zi,/MD', '-IC:\TheanoPython\python-2.7.6\lib\site-pa
ckages\theano\sandbox\cuda', '-IC:\TheanoPython\python-2.7.6\lib\site-pac
kages\numpy\core\include', '-IC:\TheanoPython\python-2.7.6\include', '-o',
'C:\Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel6
4_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray
.pyd', 'mod.cu', '-LC:\TheanoPython\python-2.7.6\libs', '-LNone\lib', '-LNon
e\lib64', '-LC:\TheanoPython\python-2.7.6', '-lpython27', '-lcublas', '-lcuda
rt']
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return st
atus', 1, 'for cmd', 'nvcc -shared -g -O3 --compiler-bindir C:\Program Files (x
86)\Microsoft Visual Studio 10.0\VC\bin# flags=-m32 # we have this hard coded
for now -Xlinker /DEBUG -m32 -Xcompiler -DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a
70a88a837011,/Zi,/MD -IC:\TheanoPython\python-2.7.6\lib\site-packages\thean
o\sandbox\cuda -IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\co
re\include -IC:\TheanoPython\python-2.7.6\include -o C:\Users\Matej\AppDa
ta\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepp
ing_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd mod.cu -LC:\TheanoP
ython\python-2.7.6\libs -LNone\lib -LNone\lib64 -LC:\TheanoPython\python-2
.7.6 -lpython27 -lcublas -lcudart')
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not availabl
e
I also tested it with Visual Studio 12.0, which is installed on my system, and got the following error:
mod.cu
nvlink fatal : Could not open input file 'C:/Users/Matej/AppData/Local/Temp/tm
pxft_00001b70_00000000-28_mod.obj'
['nvcc', '-shared', '-g', '-O3', '--compiler-bindir', 'C:\Program Files (x86)\
Microsoft Visual Studio 12.0\VC\bin\', '-Xlinker', '/DEBUG', '-m32', '-Xcompi
ler', '-LC:\TheanoPython\python-2.7.6\libs,-DCUDA_NDARRAY_CUH=d67f7c8a21306c6
7152a70a88a837011,/Zi,/MD', '-IC:\TheanoPython\python-2.7.6\lib\site-package
s\theano\sandbox\cuda', '-IC:\TheanoPython\python-2.7.6\lib\site-packages
\numpy\core\include', '-IC:\TheanoPython\python-2.7.6\include', '-o', 'C:
Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Fam
ily_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd'
, 'mod.cu', '-LC:\TheanoPython\python-2.7.6\libs', '-LNone\lib', '-LNone\li
b64', '-LC:\TheanoPython\python-2.7.6', '-lpython27', '-lcublas', '-lcudart']
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return st
atus', 1, 'for cmd', 'nvcc -shared -g -O3 --compiler-bindir C:\Program Files (x
86)\Microsoft Visual Studio 12.0\VC\bin\ -Xlinker /DEBUG -m32 -Xcompiler -LC
:\TheanoPython\python-2.7.6\libs,-DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a70a88
a837011,/Zi,/MD -IC:\TheanoPython\python-2.7.6\lib\site-packages\theano\sa
ndbox\cuda -IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\core\i
nclude -IC:\TheanoPython\python-2.7.6\include -o C:\Users\Matej\AppData\L
ocal\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepping_3
_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd mod.cu -LC:\TheanoPython
\python-2.7.6\libs -LNone\lib -LNone\lib64 -LC:\TheanoPython\python-2.7.6
-lpython27 -lcublas -lcudart')
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not availabl
e
With the latter error, several pop-up windows ask me how I would like to open a (.res) file before the error is thrown.
cl.exe is present in both folders (i.e. VS 2010 and VS 2013).
Finally, if I add VS 2013 to the environment path and set the .theanorc contents as follows:
[global]
base_compiledir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
openmp=False
floatX = float32
device = gpu
[nvcc]
flags=-LC:\TheanoPython\python-2.7.6\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
[blas]
ldflags =
I get the following error:
c:\theanopython\python-2.7.6\include\pymath.h(22): warning: dllexport/dllimport conflict with "round"
c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\math_functions.h(2455): here; dllimport/dllexport dropped
mod.cu(954): warning: statement is unreachable
mod.cu(1114): error: namespace "std" has no member "min"
mod.cu(1145): error: namespace "std" has no member "min"
mod.cu(1173): error: namespace "std" has no member "min"
mod.cu(1174): error: namespace "std" has no member "min"
mod.cu(1317): error: namespace "std" has no member "min"
mod.cu(1318): error: namespace "std" has no member "min"
mod.cu(1442): error: namespace "std" has no member "min"
mod.cu(1443): error: namespace "std" has no member "min"
mod.cu(1742): error: namespace "std" has no member "min"
mod.cu(1777): error: namespace "std" has no member "min"
mod.cu(1781): error: namespace "std" has no member "min"
mod.cu(1814): error: namespace "std" has no member "min"
mod.cu(1821): error: namespace "std" has no member "min"
mod.cu(1853): error: namespace "std" has no member "min"
mod.cu(1861): error: namespace "std" has no member "min"
mod.cu(1898): error: namespace "std" has no member "min"
mod.cu(1905): error: namespace "std" has no member "min"
mod.cu(1946): error: namespace "std" has no member "min"
mod.cu(1960): error: namespace "std" has no member "min"
mod.cu(3750): error: namespace "std" has no member "min"
mod.cu(3752): error: namespace "std" has no member "min"
mod.cu(3784): error: namespace "std" has no member "min"
mod.cu(3786): error: namespace "std" has no member "min"
mod.cu(3789): error: namespace "std" has no member "min"
mod.cu(3791): error: namespace "std" has no member "min"
mod.cu(3794): error: namespace "std" has no member "min"
mod.cu(3795): error: namespace "std" has no member "min"
mod.cu(3836): error: namespace "std" has no member "min"
mod.cu(3838): error: namespace "std" has no member "min"
mod.cu(4602): error: namespace "std" has no member "min"
mod.cu(4604): error: namespace "std" has no member "min"
31 errors detected in the compilation of "C:/Users/Matej/AppData/Local/Temp/tmpxft_00001d84_00000000-10_mod.cpp1.ii".
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return status', 2, 'for cmd', 'nvcc -shared -g -O3 -Xlinker /DEBUG -m32 -Xcompiler -DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a70a88a837011,/Zi,/MD -IC:\TheanoPython\python-2.7.6\lib\site-packages\theano\sandbox\cuda -IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\core\include -IC:\TheanoPython\python-2.7.6\include -o C:\Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd mod.cu -LC:\TheanoPython\python-2.7.6\libs -LNone\lib -LNone\lib64 -LC:\TheanoPython\python-2.7.6 -lpython27 -lcublas -lcudart')
ERROR:theano.sandbox.cuda:Failed to compile cuda_ndarray.cu: ('nvcc return status', 2, 'for cmd', 'nvcc -shared -g -O3 -Xlinker /DEBUG -m32 -Xcompiler -DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a70a88a837011,/Zi,/MD -IC:\TheanoPython\python-2.7.6\lib\site-packages\theano\sandbox\cuda -IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\core\include -IC:\TheanoPython\python-2.7.6\include -o C:\Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd mod.cu -LC:\TheanoPython\python-2.7.6\libs -LNone\lib -LNone\lib64 -LC:\TheanoPython\python-2.7.6 -lpython27 -lcublas -lcudart')
mod.cu
['nvcc', '-shared', '-g', '-O3', '-Xlinker', '/DEBUG', '-m32', '-Xcompiler', '-DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a70a88a837011,/Zi,/MD', '-IC:\TheanoPython\python-2.7.6\lib\site-packages\theano\sandbox\cuda', '-IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\core\include', '-IC:\TheanoPython\python-2.7.6\include', '-o', 'C:\Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd', 'mod.cu', '-LC:\TheanoPython\python-2.7.6\libs', '-LNone\lib', '-LNone\lib64', '-LC:\TheanoPython\python-2.7.6', '-lpython27', '-lcublas', '-lcudart']
If I run import theano without the GPU option on, it runs without a problem. The CUDA samples also run without a problem.
Solution
Theano is a great tool for machine learning applications, yet I found that installing it on Windows is not trivial, especially for programming beginners (like myself). In my case, my scripts run 5-6x faster on a GPU, so it was definitely worth the hassle.
I wrote this guide based on my own installation procedure. It is meant to be verbose and hopefully complete, even for people with no prior experience of building programs in a Windows environment. Most of this guide is based on these instructions, but I had to change some of the steps to get it working on my system. If anything I do is not optimal or doesn't work on your machine, please let me know and I will try to modify this guide accordingly.
These are the steps (in order) I followed when installing Theano with GPU enabled on my Windows 8.1 machine:
CUDA can be downloaded from here. In my case, I chose the 64-bit Notebook version for my NVIDIA Optimus laptop with a GeForce 750M.
Verify that your installation was successful by launching deviceQuery from the command line. In my case it was located in the following folder: C:\ProgramData\NVIDIA Corporation\CUDA Samples\v6.5\bin\win64\Release. If successful, you should see PASS at the end of the test.
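The check above can also be scripted. The following is a minimal sketch, assuming deviceQuery ends its output with a line reading "Result = PASS" on success. The helper names device_query_passed and run_device_query, and the sample output string, are made up for illustration and are not part of CUDA.

```python
import subprocess


def device_query_passed(output):
    """Return True if deviceQuery output contains a passing result line."""
    return any(line.strip() == "Result = PASS" for line in output.splitlines())


def run_device_query(exe_path):
    """Run deviceQuery (path supplied by the caller) and return its text output."""
    return subprocess.check_output([exe_path]).decode("utf-8", "replace")


# Illustrative output tail, not captured from a real run.
sample = "deviceQuery, CUDA Driver = CUDART, NumDevs = 1\nResult = PASS\n"
print(device_query_passed(sample))
```

To check a real installation, pass the full path of deviceQuery.exe to run_device_query and feed the result to device_query_passed.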
I installed Visual Studio 2010 via DreamSpark. If you are a student, you are entitled to a free version. If not, you can still install the Express version, which should work just as well. After the install is complete, you should be able to launch Visual Studio Command Prompt 2010 from the Start menu.
At the time of writing, Theano on GPU only supports 32-bit floats and is primarily built for Python 2.7. Theano requires most of the basic scientific Python libraries, such as scipy and numpy. I found that the easiest way to install these was via WinPython. It installs all the dependencies in a self-contained folder, which makes reinstalling easy if something goes wrong during installation, and you also get some useful tools such as IPython Notebook and Spyder installed for free. For ease of use, you might want to add the path to your python.exe and the path to your Scripts folder to the environment variables.
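Since the nvcc commands in the question are invoked with -m32 against a 32-bit WinPython, it can help to confirm which interpreter is actually picked up from the path. This is a small sketch using only the standard library; the function names python_bits and describe_interpreter are illustrative.

```python
import struct
import sys


def python_bits():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def describe_interpreter():
    """Summarise version and bitness, e.g. '2.7.6 (32-bit)'."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return "%s (%d-bit)" % (version, python_bits())


print(describe_interpreter())
```

Running it from the same command prompt used for the build shows whether the interpreter on the path is the one you intended to install Theano into.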
Found here.
The MinGW setup file is here. I checked all the base installation files during the installation process. This is required if you run into the g++ error described below.
You can find Cygwin here. I basically used this utility only to extract the PyCUDA tar file; tar is already provided in the base install, so the installation should be straightforward.
Open msvc9compiler.py, located in the /lib/distutils/ directory of your Python installation. In my case, line 641 reads: ld_args.append ('/IMPLIB:' + implib_file). Add the following after this line (same indentation):
ld_args.append('/MANIFEST')
PyCUDA installation
The PyCUDA source is here.
Steps:
Open Cygwin, navigate to the PyCUDA folder (i.e. /cygdrive/c/etc/etc) and execute tar -xzf pycuda-2012.1.tar.gz.
Open Visual Studio Command Prompt 2010, navigate to the directory where the tarball was extracted and execute python configure.py
Open ./siteconf.py and change the values so that it reads (for CUDA 6.5, for instance):
BOOST_INC_DIR = []
BOOST_LIB_DIR = []
BOOST_COMPILER = 'gcc43'
USE_SHIPPED_BOOST = True
BOOST_PYTHON_LIBNAME = ['boost_python']
BOOST_THREAD_LIBNAME = ['boost_thread']
CUDA_TRACE = False
CUDA_ROOT = 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v6.5'
CUDA_ENABLE_GL = False
CUDA_ENABLE_CURAND = True
CUDADRV_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CUDADRV_LIBNAME = ['cuda']
CUDART_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CUDART_LIBNAME = ['cudart']
CURAND_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CURAND_LIBNAME = ['curand']
CXXFLAGS = ['/EHsc']
LDFLAGS = ['/FORCE']
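Before building, it may be worth confirming that the library directories in siteconf.py point at real folders. The sketch below assumes the ${CUDA_ROOT} placeholder is resolved by plain string substitution, which is an assumption about the build rather than a description of PyCUDA's own code. The helpers expand_dirs and missing_dirs are hypothetical.

```python
import os
from string import Template


def expand_dirs(dirs, cuda_root):
    """Substitute ${CUDA_ROOT} in each directory entry."""
    return [Template(d).substitute(CUDA_ROOT=cuda_root) for d in dirs]


def missing_dirs(dirs):
    """Return the entries that do not exist as directories on this machine."""
    return [d for d in dirs if not os.path.isdir(d)]


# Values copied from the siteconf.py shown above.
cuda_root = "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v6.5"
lib_dirs = expand_dirs(["${CUDA_ROOT}/lib/Win32"], cuda_root)
print(lib_dirs)
print(missing_dirs(lib_dirs))
```

An empty second list means every expanded directory exists; anything listed there is a path to fix in siteconf.py before running the build.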
Execute the following commands at the VS2010 command prompt:
set VS90COMNTOOLS=%VS100COMNTOOLS%
python setup.py build
python setup.py install
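The set command is needed because Python 2.7's distutils looks for the Visual Studio 2008 variable VS90COMNTOOLS, so pointing it at the VS2010 tools lets the build find a compiler. A minimal sketch for seeing which of these variables are defined follows. The function name visual_studio_tool_vars and the fake_env dictionary are illustrative; pass os.environ to inspect the real environment.

```python
import os


def visual_studio_tool_vars(environ):
    """Return the VS*COMNTOOLS variables present in an environment mapping."""
    return dict(
        (name, value)
        for name, value in environ.items()
        if name.upper().startswith("VS") and name.upper().endswith("COMNTOOLS")
    )


# Hypothetical environment for illustration only.
fake_env = {
    "VS100COMNTOOLS": "C:\\Program Files (x86)\\Microsoft Visual Studio 10.0\\Common7\\Tools\\",
    "PATH": "C:\\Windows",
}
found = visual_studio_tool_vars(fake_env)
print(sorted(found))
print("VS90COMNTOOLS" in found)
```

If VS90COMNTOOLS is absent in the prompt where you run setup.py, the set command above has not been applied in that session.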
Create this Python file and verify that you get a result:
# from: http://documen.tician.de/pycuda/tutorial.html
import pycuda.gpuarray as gpuarray
import pycuda.driver as cuda
import pycuda.autoinit
import numpy
a_gpu = gpuarray.to_gpu(numpy.random.randn(4,4).astype(numpy.float32))
a_doubled = (2*a_gpu).get()
print a_doubled
print a_gpu
Install Theano
Open a git bash shell, choose a folder in which you want to place the Theano installation files and execute:
git clone git://github.com/Theano/Theano.git
cd Theano
python setup.py install
Try opening python in the VS2010 command prompt and running import theano
If you get a g++ related error, open the MinGW msys.bat (in my case installed here: C:\MinGW\msys\1.0) and try importing theano in the MinGW shell. Then retry importing theano from the VS2010 command prompt; it should now work.
Create a file in WordPad (NOT Notepad!), name it .theanorc.txt and put it in C:\Users\Your_Name, or wherever your user folder is located:
[global]
device = gpu
floatX = float32
[nvcc]
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
# flags=-m32 # we have this hard coded for now
[blas]
ldflags =
# ldflags = -lopenblas # placeholder for openblas support
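The first error log in the question shows the text "# flags=-m32 # we have this hard coded for now" inside the --compiler-bindir argument, which suggests the comment line ended up as part of the compiler_bindir value. The following is a minimal sketch of a sanity check for that, assuming the file can be read with Python's standard config parser; whether Theano parses the file identically is not verified here. check_theanorc and the demo path C:\VS\VC\bin are hypothetical.

```python
import os
import tempfile

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def check_theanorc(path):
    """Return a list of human-readable problems found in a .theanorc file."""
    parser = configparser.RawConfigParser()
    parser.read(path)
    if not parser.has_option("nvcc", "compiler_bindir"):
        return ["[nvcc] compiler_bindir is not set"]
    problems = []
    bindir = parser.get("nvcc", "compiler_bindir")
    if "#" in bindir:
        problems.append("comment text leaked into compiler_bindir: %r" % bindir)
    if not os.path.isfile(os.path.join(bindir, "cl.exe")):
        problems.append("cl.exe not found in %r" % bindir)
    return problems


# Demo file reproducing the symptom: the comment sits on the same line as the value.
demo = "[nvcc]\ncompiler_bindir=C:\\VS\\VC\\bin# flags=-m32\n"
with tempfile.NamedTemporaryFile("w", suffix=".theanorc", delete=False) as handle:
    handle.write(demo)
problems = check_theanorc(handle.name)
os.remove(handle.name)
for problem in problems:
    print(problem)
```

Calling check_theanorc on your own .theanorc.txt should return an empty list when compiler_bindir is a clean path to a folder containing cl.exe.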
Create a test Python script and run it:
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time
vlen = 10 * 30 * 768 # 10 x #cores x # threads per core
iters = 1000
rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print f.maker.fgraph.toposort()
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print 'Looping %d times took' % iters, t1 - t0, 'seconds'
print 'Result is', r
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print 'Used the cpu'
else:
    print 'Used the gpu'
Verify that the output ends with Used the gpu, and you're done!