This repository was archived by the owner on Nov 17, 2023. It is now read-only.

FusedOp Failing Static Linked Build #16765

@zachgk

Description

The statically linked build used for Scala Maven publishing is currently failing. This blocks the current nightly snapshot and must be fixed before the release jars can be built.

build/src/operator/fusion/fused_op_gpu.o: In function `void mxnet::FusedOp::Forward<mshadow::gpu>(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)':

tmpxft_00008748_00000000-5_fused_op.compute_70.cudafe1.cpp:(.text._ZN5mxnet7FusedOp7ForwardIN7mshadow3gpuEEEvRKN4nnvm9NodeAttrsERKNS_9OpContextERKSt6vectorINS_5TBlobESaISC_EERKSB_INS_9OpReqTypeESaISH_EESG_+0x1287): undefined reference to `cuLaunchKernel'

tmpxft_00008748_00000000-5_fused_op.compute_70.cudafe1.cpp:(.text._ZN5mxnet7FusedOp7ForwardIN7mshadow3gpuEEEvRKN4nnvm9NodeAttrsERKNS_9OpContextERKSt6vectorINS_5TBlobESaISC_EERKSB_INS_9OpReqTypeESaISH_EESG_+0x12a6): undefined reference to `cuGetErrorString'

collect2: error: ld returned 1 exit status

make: *** [bin/im2rec] Error 1

make: *** Waiting for unfinished jobs....

2019-11-01 20:27:09,794 - root - INFO - Waiting for status of container 00fc4568b4c9 for 600 s.

2019-11-01 20:27:10,060 - root - INFO - Container exit status: {'StatusCode': 2, 'Error': None}

2019-11-01 20:27:10,061 - root - ERROR - Container exited with an error 😞

2019-11-01 20:27:10,061 - root - INFO - Executed command for reproduction:

See full log at http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-publish-artifacts/detail/master/287/pipeline/
Main Scala nightly pipeline at http://jenkins.mxnet-ci.amazon-ml.com/job/restricted-publish-artifacts/job/master/

It seems to be a result of #15167. The pip build has also been failing since then, possibly for the same reason.
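For triage, note that both missing symbols, `cuLaunchKernel` and `cuGetErrorString`, belong to the CUDA driver API (provided by libcuda) rather than the runtime API (libcudart), so one thing worth checking is whether the static link line pulls in the driver library. The following is a minimal sketch, not part of the MXNet build, that extracts undefined symbols from GNU ld output and flags the ones that look like driver API calls. The helper names and the naming heuristic are assumptions made for illustration, and the sample log is abbreviated from the output above.

```python
import re

# Matches the symbol name in GNU ld's "undefined reference to `name'" message.
UNDEFINED_REF = re.compile(r"undefined reference to `([^']+)'")


def undefined_symbols(linker_output):
    """Return the distinct undefined symbols in first-seen order."""
    seen = []
    for name in UNDEFINED_REF.findall(linker_output):
        if name not in seen:
            seen.append(name)
    return seen


def looks_like_driver_api(symbol):
    # Heuristic (assumption): driver API names are "cu" followed by a capital
    # letter (cuLaunchKernel); runtime API names start with "cuda" (cudaMalloc).
    return re.match(r"cu[A-Z]", symbol) is not None


# Abbreviated sample of the linker output quoted in this issue.
log = """
fused_op.cpp:(.text+0x1287): undefined reference to `cuLaunchKernel'
fused_op.cpp:(.text+0x12a6): undefined reference to `cuGetErrorString'
collect2: error: ld returned 1 exit status
"""

symbols = undefined_symbols(log)
driver_symbols = [s for s in symbols if looks_like_driver_api(s)]
print(driver_symbols)  # ['cuLaunchKernel', 'cuGetErrorString']
```

If every undefined symbol falls into the driver API group, as it does here, the linker flags for the static build are a reasonable first place to look.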

To Reproduce

This version of the build can be run by following the instructions at https://github.com/apache/incubator-mxnet/tree/master/tools/staticbuild. The Scala build uses the cu92mkl variant by default, but other CUDA builds should have the same problem.
The build currently runs in an Ubuntu 14.04 Docker container using https://github.com/apache/incubator-mxnet/blob/master/ci/docker/Dockerfile.publish.ubuntu1404_cpu.
