在 Windows 上的 Qt Creator 中编译 Cuda 代码
我几天来一直在尝试让 Qt 项目文件在 32 位 Windows 7 系统上运行,我希望/需要在其中包含 Cuda 代码.事情的这种组合要么很简单,以至于没有人会费心在网上放一个例子,要么很困难,似乎没有人成功过.不管怎样,我发现的唯一有用的论坛主题是 Linux 或 Mac 上的相同问题,或者 Windows 上的 Visual Studio.然而,所有这些都会产生各种不同的错误,无论是由于链接或冲突库,还是文件名中的空格或 Windows 版本的 Cuda SDK 中不存在的文件夹.有没有人提供明确的 .pro
文件来解决问题?
I have been trying for days to get a Qt project file running on a 32-bit Windows 7 system, in which I want/need to include Cuda code. This combination of things is either so simple that no one ever bothered to put an example online, or so difficult that nobody ever succeeded, it seems. Whatever way, the only helpful forum threads I found were the same issue on Linux or Mac, or with Visual Studio on a Windows.
All of these give all sorts of different errors, however, whether due to linking or clashing libraries, or spaces in file names or non-existing folders in the Windows version of the Cuda SDK.
Is there someone who has a clear .pro
file to offer that does the trick?
我的目标是用 Qt 风格的普通 C++ 代码和 Qt 4.8 库编译一个简单的程序,这些库引用了 .cu 文件中的几个 Cuda 模块.某种形式的东西:
I am aiming to compile a simple programme with ordinary C++ code in Qt style, with Qt 4.8 libraries, which reference several Cuda modules in .cu files. Something of the form:
TestCUDA
TestCUDA.pro
main.cpp
test.cu
推荐答案
所以我终于设法组装了一个 .pro
文件,该文件适用于我的并且可能适用于所有 Windows 系统.以下是一个简单的测试程序,应该可以解决问题.以下是一个小项目文件加上至少在我的系统上运行的测试程序.
So I finally managed to assemble a .pro
file that works on my and probably on all Windows systems. The following is an easy test programme that should probably do the trick. The following is a small project file plus test programme that works at least on my system.
文件系统如下:
TestCUDA
TestCUDA.pro
main.cpp
vectorAddition.cu
项目文件内容为:
TARGET = TestCUDA
# Define output directories
DESTDIR = release
OBJECTS_DIR = release/obj
CUDA_OBJECTS_DIR = release/cuda
# Source files
SOURCES += src/main.cpp
# This makes the .cu files appear in your project
OTHER_FILES += vectorAddition.cu
# CUDA settings <-- may change depending on your system
CUDA_SOURCES += src/cuda/vectorAddition.cu
CUDA_SDK = "C:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 4.2/C" # Path to cuda SDK install
CUDA_DIR = "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v4.2" # Path to cuda toolkit install
SYSTEM_NAME = Win32 # Depending on your system either 'Win32', 'x64', or 'Win64'
SYSTEM_TYPE = 32 # '32' or '64', depending on your system
CUDA_ARCH = sm_11 # Type of CUDA architecture, for example 'compute_10', 'compute_11', 'sm_10'
NVCC_OPTIONS = --use_fast_math
# include paths
INCLUDEPATH += $$CUDA_DIR/include
$$CUDA_SDK/common/inc/
$$CUDA_SDK/../shared/inc/
# library directories
QMAKE_LIBDIR += $$CUDA_DIR/lib/$$SYSTEM_NAME
$$CUDA_SDK/common/lib/$$SYSTEM_NAME
$$CUDA_SDK/../shared/lib/$$SYSTEM_NAME
# Add the necessary libraries
LIBS += -lcuda -lcudart
# The following library conflicts with something in Cuda
QMAKE_LFLAGS_RELEASE = /NODEFAULTLIB:msvcrt.lib
QMAKE_LFLAGS_DEBUG = /NODEFAULTLIB:msvcrtd.lib
# The following makes sure all path names (which often include spaces) are put between quotation marks
CUDA_INC = $$join(INCLUDEPATH,'" -I"','-I"','"')
# Configuration of the Cuda compiler
CONFIG(debug, debug|release) {
# Debug mode
cuda_d.input = CUDA_SOURCES
cuda_d.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
cuda_d.commands = $$CUDA_DIR/bin/nvcc.exe -D_DEBUG $$NVCC_OPTIONS $$CUDA_INC $$LIBS --machine $$SYSTEM_TYPE -arch=$$CUDA_ARCH -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
cuda_d.dependency_type = TYPE_C
QMAKE_EXTRA_COMPILERS += cuda_d
}
else {
# Release mode
cuda.input = CUDA_SOURCES
cuda.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
cuda.commands = $$CUDA_DIR/bin/nvcc.exe $$NVCC_OPTIONS $$CUDA_INC $$LIBS --machine $$SYSTEM_TYPE -arch=$$CUDA_ARCH -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
cuda.dependency_type = TYPE_C
QMAKE_EXTRA_COMPILERS += cuda
}
注意 QMAKE_LFLAGS_RELEASE =/NODEFAULTLIB:msvcrt.lib
:我花了很长时间才弄清楚,但这个库似乎与 Cuda 中的其他东西发生冲突,这会产生奇怪的链接警告和错误.如果有人对此有解释,并且可能有更好的解决方法,我想听听.
Note the QMAKE_LFLAGS_RELEASE = /NODEFAULTLIB:msvcrt.lib
: it took me a long time to figure out, but this library seems to clash with other things in Cuda, which produces strange linking warnings and errors. If someone has an explanation for this, and potentially a prettier way to get around this, I'd like to hear it.
此外,由于 Windows 文件路径通常包含空格(NVIDIA 的 SDK 默认也包含空格),因此需要人为地在包含路径周围添加引号.同样,如果有人知道解决这个问题的更优雅的方法,我很想知道.
Also, since Windows file paths often include spaces (and NVIDIA's SDK by default does so too), it is necessary to artificially add quotation marks around the include paths. Again, if someone knows a more elegant way of solving this problem, I'd be interested to know.
main.cpp
文件如下所示:
#include <cuda.h>
#include <builtin_types.h>
#include <drvapi_error_string.h>
#include <QtCore/QCoreApplication>
#include <QDebug>
// Forward declare the function in the .cu file
void vectorAddition(const float* a, const float* b, float* c, int n);
void printArray(const float* a, const unsigned int n) {
QString s = "(";
unsigned int ii;
for (ii = 0; ii < n - 1; ++ii)
s.append(QString::number(a[ii])).append(", ");
s.append(QString::number(a[ii])).append(")");
qDebug() << s;
}
int main(int argc, char* argv [])
{
QCoreApplication(argc, argv);
int deviceCount = 0;
int cudaDevice = 0;
char cudaDeviceName [100];
unsigned int N = 50;
float *a, *b, *c;
cuInit(0);
cuDeviceGetCount(&deviceCount);
cuDeviceGet(&cudaDevice, 0);
cuDeviceGetName(cudaDeviceName, 100, cudaDevice);
qDebug() << "Number of devices: " << deviceCount;
qDebug() << "Device name:" << cudaDeviceName;
a = new float [N]; b = new float [N]; c = new float [N];
for (unsigned int ii = 0; ii < N; ++ii) {
a[ii] = qrand();
b[ii] = qrand();
}
// This is the function call in which the kernel is called
vectorAddition(a, b, c, N);
qDebug() << "input a:"; printArray(a, N);
qDebug() << "input b:"; printArray(b, N);
qDebug() << "output c:"; printArray(c, N);
if (a) delete a;
if (b) delete b;
if (c) delete c;
}
Cuda 文件 vectorAddition.cu
,描述了一个简单的向量加法,如下所示:
The Cuda file vectorAddition.cu
, which describes a simple vector addition, look like this:
#include <cuda.h>
#include <builtin_types.h>
extern "C"
__global__ void vectorAdditionCUDA(const float* a, const float* b, float* c, int n)
{
int ii = blockDim.x * blockIdx.x + threadIdx.x;
if (ii < n)
c[ii] = a[ii] + b[ii];
}
void vectorAddition(const float* a, const float* b, float* c, int n) {
float *a_cuda, *b_cuda, *c_cuda;
unsigned int nBytes = sizeof(float) * n;
int threadsPerBlock = 256;
int blocksPerGrid = (n + threadsPerBlock - 1) / threadsPerBlock;
// allocate and copy memory into the device
cudaMalloc((void **)& a_cuda, nBytes);
cudaMalloc((void **)& b_cuda, nBytes);
cudaMalloc((void **)& c_cuda, nBytes);
cudaMemcpy(a_cuda, a, nBytes, cudaMemcpyHostToDevice);
cudaMemcpy(b_cuda, b, nBytes, cudaMemcpyHostToDevice);
vectorAdditionCUDA<<<blocksPerGrid, threadsPerBlock>>>(a_cuda, b_cuda, c_cuda, n);
// load the answer back into the host
cudaMemcpy(c, c_cuda, nBytes, cudaMemcpyDeviceToHost);
cudaFree(a_cuda);
cudaFree(b_cuda);
cudaFree(c_cuda);
}
如果你让它工作,那么更复杂的例子是不言而喻的,我认为.
If you get this to work, then more complicated examples are self-evident, I think.
编辑 (24-1-2013): 我添加了 QMAKE_LFLAGS_DEBUG =/NODEFAULTLIB:msvcrtd.lib
和 CONFIG(debug)
带有额外的 D_DEBUG
标志,这样它也可以在调试模式下编译.
Edit (24-1-2013): I added the QMAKE_LFLAGS_DEBUG = /NODEFAULTLIB:msvcrtd.lib
and the CONFIG(debug)
with the extra D_DEBUG
flag, such that it also compiles in debug mode.
相关文章