This repository has been archived by the owner on Oct 31, 2024. It is now read-only.

Issue with compiling Fortran OpenACC code with rocm/5.3 environment on Joey #10

Open
ilkhomab opened this issue Dec 12, 2022 · 1 comment

Comments

@ilkhomab

I am having an issue compiling OpenACC Fortran code with the rocm/5.3 environment on Joey. Here is what I get:
```
ilkhom@nid001000:/scratch/pawsey0001/ilkhom/TESTS> rocm-smi

======================= ROCm System Management Interface =======================
================================= Concise Info =================================
GPU  Temp   AvgPwr  SCLK    MCLK     Fan  Perf  PwrCap  VRAM%  GPU%
0    30.0c  80.0W   800Mhz  1600Mhz  0%   auto  560.0W  0%     0%
1    31.0c  N/A     800Mhz  1600Mhz  0%   auto  0.0W    0%     0%
2    27.0c  88.0W   800Mhz  1600Mhz  0%   auto  560.0W  0%     0%
3    32.0c  N/A     800Mhz  1600Mhz  0%   auto  0.0W    0%     0%
4    29.0c  86.0W   800Mhz  1600Mhz  0%   auto  560.0W  0%     0%
5    33.0c  N/A     800Mhz  1600Mhz  0%   auto  0.0W    0%     0%
6    25.0c  84.0W   800Mhz  1600Mhz  0%   auto  560.0W  0%     0%
7    30.0c  N/A     800Mhz  1600Mhz  0%   auto  0.0W    0%     0%

============================= End of ROCm SMI Log ==============================
ilkhom@nid001000:/scratch/pawsey0001/ilkhom/TESTS> module use /software/projects/pawsey0001/cdipietrantonio/joey/software/rocm/modulefiles/
ilkhom@nid001000:/scratch/pawsey0001/ilkhom/TESTS> module load PrgEnv-cray/8.3.3 craype-accel-amd-gfx90a rocm/5.3.0
ilkhom@nid001000:/scratch/pawsey0001/ilkhom/TESTS> ftn -h acc acc_cray_v2.f90 -o check_acc_v2
Warning: Cannot find all neccessary path for loaded rocm version!!!
lld: error: undefined hidden symbol: __ockl_get_local_size

referenced by /tmp/cooltmpdir-Xf48qJ/check_acc_v2-cce-openmp__llc.amdgpu:(main_$ck_L30_1)
referenced by /tmp/cooltmpdir-Xf48qJ/check_acc_v2-cce-openmp__llc.amdgpu:(main_$ck_L30_1)

lld: error: undefined symbol: __ockl_get_num_groups

referenced by /tmp/cooltmpdir-Xf48qJ/check_acc_v2-cce-openmp__llc.amdgpu:(main_$ck_L30_1)
referenced by /tmp/cooltmpdir-Xf48qJ/check_acc_v2-cce-openmp__llc.amdgpu:(main_$ck_L30_1)
```

@dipietrantonio
Collaborator

Hi Ilkhom,

I think `ftn` hardcodes the paths to the ROCm components it needs.
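One way to probe this, sketched under assumptions: the `__ockl_*` symbols in the `lld` errors are normally defined by the ROCm device library `ockl.bc`. The helper below (`check_rocm_bitcode` is a hypothetical name) only tests whether a given ROCm tree contains that file at `amdgcn/bitcode/ockl.bc`, which is the layout I would expect from a ROCm 5.x install and may differ in a custom build.

```shell
#!/bin/sh
# Hypothetical helper: report whether a ROCm tree contains the device
# bitcode library that normally defines the __ockl_* symbols.
check_rocm_bitcode() {
    rocm_root="$1"
    # Assumed layout: <root>/amdgcn/bitcode/ockl.bc (typical for ROCm 5.x;
    # a custom build may place the file elsewhere).
    bitcode="$rocm_root/amdgcn/bitcode/ockl.bc"
    if [ -f "$bitcode" ]; then
        echo "found: $bitcode"
        return 0
    fi
    echo "missing: $bitcode"
    return 1
}
```

Running `check_rocm_bitcode "$ROCM_PATH"` once with our module loaded and once with the Cray-provided module loaded would show whether the two trees differ in this respect. It does not prove where `ftn` actually looks.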

If I change `ROCM_PATH` in the modulefile for our ROCm build to point to the Cray-provided installation, the `lld` undefined-symbol errors go away, but other errors understandably appear:

cdipietrantonio@joey-01:/scratch/pawsey0001/cdipietrantonio/TESTS> ftn -h acc acc_cray.f90 -o check_acc 
/opt/cray/pe/cce/14.0.3/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: /software/projects/pawsey0001/cdipietrantonio/joey/software/rocm/rocm-5.3.0rev2/lib/libamd_comgr.so.2: undefined reference to `std::__exception_ptr::exception_ptr::_M_release()@CXXABI_1.3.13'
/opt/cray/pe/cce/14.0.3/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: /software/projects/pawsey0001/cdipietrantonio/joey/software/rocm/rocm-5.3.0rev2/lib/libamd_comgr.so.2: undefined reference to `std::condition_variable::wait(std::unique_lock<std::mutex>&)@GLIBCXX_3.4.30'
/opt/cray/pe/cce/14.0.3/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: /software/projects/pawsey0001/cdipietrantonio/joey/software/rocm/rocm-5.3.0rev2/lib/libamd_comgr.so.2: undefined reference to `std::__throw_bad_array_new_length()@GLIBCXX_3.4.29'

This suggests that ftn is looking for specific subdirectories and files within $ROCM_PATH that cannot be found in our ROCm build. I will investigate this soon.
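The `@GLIBCXX_3.4.30` and `@CXXABI_1.3.13` suffixes in the linker errors are libstdc++ symbol versions, so it may also be worth checking whether the libstdc++ that the CCE link step resolves provides them. The following is a minimal sketch, not a confirmed diagnosis: `has_symbol_version` is a hypothetical helper that searches a shared library for a version string as a whole word. It assumes a `grep` that supports `-a` (treat binary as text), as GNU grep does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the library file contains the given
# symbol-version string (e.g. GLIBCXX_3.4.30) as a whole word.
has_symbol_version() {
    lib="$1"
    version="$2"
    # -a: treat the binary as text; -F: fixed string, so dots are literal;
    # -w: whole-word match, so GLIBCXX_3.4.3 does not match GLIBCXX_3.4.30.
    grep -a -q -w -F -- "$version" "$lib"
}
```

For example, `has_symbol_version /usr/lib64/libstdc++.so.6 GLIBCXX_3.4.30 && echo present` (the library path here is only an illustration) would print `present` if that libstdc++ is new enough for our `libamd_comgr.so.2`.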
