I am trying to separate a CUDA program into two separate .cu files in an effort to edge closer to writing a real app in C++. I have a simple little program that:
Allocates memory on the host and the device.
Initializes the host array to a series of numbers.
Copies the host array to a device array.
Finds the square of all the elements in the array using a device kernel.
Copies the device array back to the host array.
Prints the results.
This works great if I put it all in one .cu file and run it. When I split it into two separate files, I start getting linking errors. Like all my recent questions, I know this is something small, but what is it?
KernelSupport.cu
#ifndef _KERNEL_SUPPORT_
#define _KERNEL_SUPPORT_

#include <iostream>
#include <MyKernel.cu>

int main( int argc, char** argv)
{
    int* hostArray;
    int* deviceArray;
    const int arrayLength = 16;
    const unsigned int memSize = sizeof(int) * arrayLength;

    hostArray = (int*)malloc(memSize);
    cudaMalloc((void**) &deviceArray, memSize);

    std::cout << "Before device\n";
    for(int i=0;i<arrayLength;i++)
    {
        hostArray[i] = i+1;
        std::cout << hostArray[i] << "\n";
    }
    std::cout << "\n";

    cudaMemcpy(deviceArray, hostArray, memSize, cudaMemcpyHostToDevice);
    TestDevice <<< 4, 4 >>> (deviceArray);
    cudaMemcpy(hostArray, deviceArray, memSize, cudaMemcpyDeviceToHost);

    std::cout << "After device\n";
    for(int i=0;i<arrayLength;i++)
    {
        std::cout << hostArray[i] << "\n";
    }

    cudaFree(deviceArray);
    free(hostArray);

    std::cout << "Done\n";
}
#endif
MyKernel.cu
#ifndef _MY_KERNEL_
#define _MY_KERNEL_

__global__ void TestDevice(int *deviceArray)
{
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    deviceArray[idx] = deviceArray[idx]*deviceArray[idx];
}

#endif
Build log:
1>------ Build started: Project: CUDASandbox, Configuration: Debug x64 ------
1>Compiling with CUDA Build Rule...
1>"C:CUDAin64
vcc.exe" -arch sm_10 -ccbin "C:Program Files (x86)Microsoft Visual Studio 9.0VCin" -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MT " -maxrregcount=32 --compile -o "x64DebugKernelSupport.cu.obj" "d:StuffProgrammingVisual Studio 2008ProjectsCUDASandboxCUDASandboxKernelSupport.cu"
1>KernelSupport.cu
1>tmpxft_000016f4_00000000-3_KernelSupport.cudafe1.gpu
1>tmpxft_000016f4_00000000-8_KernelSupport.cudafe2.gpu
1>tmpxft_000016f4_00000000-3_KernelSupport.cudafe1.cpp
1>tmpxft_000016f4_00000000-12_KernelSupport.ii
1>Linking...
1>KernelSupport.cu.obj : error LNK2005: __device_stub__Z10TestDevicePi already defined in MyKernel.cu.obj
1>KernelSupport.cu.obj : error LNK2005: "void __cdecl TestDevice__entry(int *)" (?TestDevice__entry@@YAXPEAH@Z) already defined in MyKernel.cu.obj
1>D:\Stuff\Programming\Visual Studio 2008\Projects\CUDASandbox\x64\Debug\CUDASandbox.exe : fatal error LNK1169: one or more multiply defined symbols found
1>Build log was saved at "file://d:\Stuff\Programming\Visual Studio 2008\Projects\CUDASandbox\CUDASandbox\x64\Debug\BuildLog.htm"
1>CUDASandbox - 3 error(s), 0 warning(s)
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
I am running Visual Studio 2008 on Windows 7 64-bit.
I think I need to elaborate on this a little bit. The end result I am looking for here is to have a normal C++ application with something like Main.cpp with the int main() entry point, and have things run from there. At certain points in my .cpp code I want to be able to reference CUDA bits. So my thinking (and correct me if there is a more standard convention here) is that I will put the CUDA kernel code into its own .cu files, and then have a supporting .cu file that will take care of talking to the device and calling kernel functions and whatnot.
You are including mykernel.cu in kernelsupport.cu, so when you try to link, the compiler sees mykernel.cu twice. You'll have to create a header defining TestDevice and include that instead.
Re comment:
Something like this should work
// MyKernel.h
#ifndef mykernel_h
#define mykernel_h
__global__ void TestDevice(int* devicearray);
#endif
Then change the include line to
//KernelSupport.cu
#ifndef _KERNEL_SUPPORT_
#define _KERNEL_SUPPORT_
#include <iostream>
#include <MyKernel.h>
// ...
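The answer doesn't show it, but MyKernel.cu itself would typically also include the new header so the declaration and definition stay in sync. A minimal sketch, assuming the file otherwise stays as in the question (the old include guard is no longer needed once the file is compiled on its own rather than #included):

// MyKernel.cu
#include "MyKernel.h"

__global__ void TestDevice(int* deviceArray)
{
    // One thread per element: square the value in place.
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    deviceArray[idx] = deviceArray[idx]*deviceArray[idx];
}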
Re edit:
As long as the header you use in C++ code doesn't have any CUDA-specific stuff (__kernel__, __global__, etc.), you should be fine linking C++ and CUDA code.
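To tie this back to the Main.cpp layout described in the question's edit, one common arrangement is to expose a plain host wrapper function from the supporting .cu file and declare it in a header that contains no CUDA keywords, so an ordinary .cpp file can include it and call across at link time. The sketch below uses hypothetical names (KernelSupport.h, RunTestDevice) and assumes the fixed 16-element array from the question; it is one possible shape, not the only convention.

// KernelSupport.h - plain C++ header, no CUDA keywords, safe to include from .cpp files
#ifndef KERNEL_SUPPORT_H
#define KERNEL_SUPPORT_H

// Hypothetical wrapper: squares 'length' ints in place using the GPU.
// (extern "C" could be added here and on the definition if host/nvcc name mangling ever disagrees.)
void RunTestDevice(int* hostArray, int length);

#endif

// KernelSupport.cu - compiled by nvcc; the only place that sees CUDA syntax
#include "MyKernel.h"
#include "KernelSupport.h"

void RunTestDevice(int* hostArray, int length)
{
    const unsigned int memSize = sizeof(int) * length;
    int* deviceArray;

    cudaMalloc((void**)&deviceArray, memSize);
    cudaMemcpy(deviceArray, hostArray, memSize, cudaMemcpyHostToDevice);

    // Mirrors the question's launch: 4 blocks of 4 threads, so it assumes length == 16.
    TestDevice<<<4, 4>>>(deviceArray);

    cudaMemcpy(hostArray, deviceArray, memSize, cudaMemcpyDeviceToHost);
    cudaFree(deviceArray);
}

// Main.cpp - ordinary C++ translation unit compiled by the host compiler
#include <iostream>
#include "KernelSupport.h"

int main()
{
    int data[16];
    for (int i = 0; i < 16; i++)
        data[i] = i + 1;

    RunTestDevice(data, 16);

    for (int i = 0; i < 16; i++)
        std::cout << data[i] << "\n";

    return 0;
}

With this split, only the .cu files go through the CUDA build rule; Main.cpp is compiled by the regular C++ compiler, and everything is joined at link time against the CUDA runtime library.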