[AMD GPU] Issue with dispatch dimension beyond 32 Qubits #993

@josemonsalve2

Describe the issue

Hi all,

We're running qsim on AMD GPUs and have hit an issue with the kernel dispatch dimensions: with more than 31 qubits, the launch configuration goes beyond the device limit. Please see more details below.

This is running on an MI300X (192 GB memory) with ROCm 7.1.0.

When running with more than 31 qubits, we get:

CUDA error: invalid configuration argument
[PATH] ... /statespace_cuda.h:93

When running with AMD_LOG_LEVEL=5, we get the following debug output:

:3:hip_module.cpp           :744 : 3639998735920 us: [pid:627351 tid: 0x73b21e4f6740]  hipLaunchKernel ( 0x73b10daa38a0, {8388608,1,1}, {512,1,1}, 0x7ffebbdd0d90, 4096, stream:<null> ) 
:3:hip_module.cpp           :745 : 3639998735931 us: [pid:627351 tid: 0x73b21e4f6740] hipLaunchKernel: Returned hipErrorInvalidConfiguration : : duration: 11 us

The launch fails because of this check on the dispatch dimensions in the HIP runtime:

https://github.com/ROCm/clr/blob/0f2d6024245abde73eaff463cdc1f10f193395b1/hipamd/src/hip_module.cpp#L518
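
As a sanity check on the numbers in the log above: the launch requests 8388608 blocks of 512 threads, i.e. 2^23 × 2^9 = 2^32 work-items in x, which is one more than the largest 32-bit unsigned value. The sketch below reproduces that arithmetic in plain C++. It assumes the linked check rejects a launch when gridDim.x * blockDim.x does not fit in 32 bits; `FitsIn32Bits` is a hypothetical helper for illustration only, not part of qsim or HIP.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not a qsim or HIP function).
// Assumption: a launch is rejected when gridDim.x * blockDim.x
// does not fit in an unsigned 32-bit integer.
constexpr bool FitsIn32Bits(std::uint64_t grid_x, std::uint64_t block_x) {
  return grid_x * block_x <= std::numeric_limits<std::uint32_t>::max();
}

// Values taken from the hipLaunchKernel log: {8388608,1,1} x {512,1,1}.
static_assert(!FitsIn32Bits(8388608, 512), "2^23 * 2^9 = 2^32 does not fit");

// Illustrative only: a grid half that size stays inside the 32-bit range.
static_assert(FitsIn32Bits(4194304, 512), "2^22 * 2^9 = 2^31 fits");
```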

I believe the solution is simply to change the dispatch dimensions when the number of qubits is large.
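
One possible way to change the dispatch dimensions, sketched under assumptions: spread the blocks over a 2D grid so that no single dimension exceeds the 32-bit range. This assumes the limit applies to each grid dimension separately, and that the affected kernels can be changed to rebuild a linear block index from `blockIdx.x` and `blockIdx.y`. `Grid2D` and `SplitGrid` are hypothetical names, not qsim code, and this is untested on the device.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical type and helper for illustration; not part of qsim.
struct Grid2D {
  std::uint32_t x;
  std::uint32_t y;
};

// Splits a 1D block count into a 2D grid such that
// x * threads_per_block fits in 32 bits and x * y >= num_blocks.
// Precondition: threads_per_block > 0.
// Assumption: the runtime checks each grid dimension separately.
constexpr Grid2D SplitGrid(std::uint64_t num_blocks,
                           std::uint32_t threads_per_block) {
  const std::uint64_t limit = std::numeric_limits<std::uint32_t>::max();
  // Largest grid.x for which grid.x * threads_per_block <= limit.
  const std::uint64_t max_x = limit / threads_per_block;
  if (num_blocks <= max_x) {
    return Grid2D{static_cast<std::uint32_t>(num_blocks), 1u};
  }
  // Number of rows needed (ceiling division), then rebalance x so the
  // grid overshoots num_blocks by as little as possible.
  const std::uint64_t y = (num_blocks + max_x - 1) / max_x;
  const std::uint64_t x = (num_blocks + y - 1) / y;
  return Grid2D{static_cast<std::uint32_t>(x), static_cast<std::uint32_t>(y)};
}

// Inside the kernel, the linear block index would then be
//   blockIdx.y * gridDim.x + blockIdx.x
// (computed in 64 bits), with an early return when it is >= num_blocks,
// because x * y can be slightly larger than num_blocks.
```

With the values from the log, `SplitGrid(8388608, 512)` gives a 4194304 × 2 grid. A grid-stride loop inside the kernel would be another option.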

If anyone is interested in working on this, I may be able to provide access to the GPU for debugging. Reach out to me.

Tell us how to reproduce the issue

Run any circuit on an AMD GPU with more than 31 qubits.

Tell us the version of qsim or qsimcirq (if relevant)

No response

Tell us the computing environment (if relevant)

No response
