quda 0.8.0 and milc 7.7.13 "ERROR: Solve precision ..." #5

lcosmai · 2016-02-21T15:13:30Z

I compiled the target su3_rhmc_hisq for ks_imp_rhmc
in the latest stable release of MILC
(https://github.com/milc-qcd/milc_qcd.git Branch:master)
with quda v0.8.0
(https://github.com/lattice/quda.git Branch:master)
using the following Makefile:
https://drive.google.com/file/d/0BxE4mI8SH7wsSnZEaDQyeEQzVVE/view?usp=sharing

Then I performed a short test using 1 GPU.
The job aborted with the following error message:
    ERROR: Solve precision 4 doesn't match gauge precision 8 (rank 0, host node496, interface_quda.cpp:1904 in checkGauge())
    last kernel called was (name=N4quda22HeavyQuarkResidualNormI7double37double2S2_EE,volume=4x8x8x8,aux=vol=2048,stride=2304,precision=8)

Note that if I instead use quda v0.7.2, the same test job is completed without errors.

mathiaswagner · 2016-02-22T13:16:54Z

Just to clarify: which MILC version did you use?

I think (https://github.com/milc-qcd/milc_qcd.git Branch:master) corresponds to MILC 7.7.13, not 7.8.0 as mentioned in the bug title.

7.8.0 might well not be compatible with quda 0.8 as there have been quite a few changes affecting MILC.

lcosmai · 2016-02-22T13:26:07Z

You are right. I have just changed 7.8.0 to 7.7.13 in the title.

detar · 2016-02-22T13:29:27Z

Hi Mathias,

To provide a more stable definition of MILC code versions on github,
last week we created two new branches, milc_qcd-7.7.13 and
milc_qcd-7.8.0. They are supposed to be release versions of the code.
The branch 7.7.13 is closest to the one Leonardo was using, and the
branch 7.8.0 is close to the current master branch. The master branch
is the development branch, so it will continue to evolve. It is
unlikely we will make any changes to the milc_qcd-7.7.13 and
milc_qcd-7.8.0 branches unless they are to fix critical bugs.
Eventually, we will copy the master branch to a new release branch. (I
think this is the model you also prefer.)

Best,
Carleton


mathiaswagner · 2016-02-22T13:37:03Z

Hi Carleton,

Thanks for the correction. It looks like I was confused here.
I will try to check QUDA 0.8 with

milc_qcd-7.7.13
milc_qcd-7.8.0

and try to reproduce the issue.

Sidenote: For QUDA we use a branch called develop for development and copy that over to a new release. We use master for the most recent release version (currently 0.8). This is to make sure that a git clone gives you a (hopefully) stable quda version.

Mathias


mathiaswagner · 2016-02-22T16:13:23Z

@lcosmai Can you share some more of the surrounding MILC output as well as your input file to MILC?
This makes it easier to track down where the error was triggered.

mathiaswagner · 2016-02-22T16:14:13Z

@maddyscientist Not sure whether you are already reading so just wanted to make sure you are aware.

lcosmai · 2016-02-22T19:44:20Z

I have shared the directory where the job was launched:
https://drive.google.com/folderview?id=0BxE4mI8SH7wsY2lianUyRDlCWG8&usp=sharing
The same directory also contains a README file with more details.



mathiaswagner · 2016-02-23T09:10:27Z

OK, I managed to reproduce the issue with the MILC-provided sample input:

    ~/milc_qcd/ks_imp_rhmc/test$ ../su3_rhmc_hisq su3_rhmc_hisq.2.sample-in

using quda 0.8 and MILC 7.7.13.

mathiaswagner · 2016-02-23T09:15:02Z

Since this might be an issue in either MILC or QUDA, I also created
lattice/quda#439
to have a pointer in the QUDA issue tracker.

mathiaswagner · 2016-02-23T11:21:48Z

Setting

    prec_pbp 2

seems to be a workaround. Still need to check why this worked with quda 0.7.2.
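
For readers landing here, a rough sketch of why the workaround plausibly helps. It assumes that MILC precision codes 1 and 2 mean single and double, and that the 4 and 8 in the error message are precisions in bytes. The names `MILC_PREC_TO_BYTES` and `check_precision` are hypothetical and are not part of the MILC or QUDA API; the real check lives in `checkGauge()` in `interface_quda.cpp`.

```python
# Illustrative only: mimics the kind of consistency check that raised the error.
# Assumption: MILC precision code 1 = single (4 bytes), 2 = double (8 bytes).
MILC_PREC_TO_BYTES = {1: 4, 2: 8}


def check_precision(solve_prec, gauge_prec):
    """Return an error string in the style of the reported message, or None.

    Both arguments are MILC-style precision codes (1 or 2). This is a
    hypothetical helper, not a MILC or QUDA function.
    """
    solve_bytes = MILC_PREC_TO_BYTES[solve_prec]
    gauge_bytes = MILC_PREC_TO_BYTES[gauge_prec]
    if solve_bytes != gauge_bytes:
        return (
            f"Solve precision {solve_bytes} doesn't match "
            f"gauge precision {gauge_bytes}"
        )
    return None


# A single-precision solve against a double-precision gauge field is rejected.
print(check_precision(1, 2))
# With the solve precision raised to 2 the two precisions agree.
print(check_precision(2, 2))
```

Under these assumptions, `prec_pbp 2` makes the requested solve precision agree with the double-precision gauge field, so the check passes.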

mathiaswagner · 2016-02-24T10:37:51Z

This will be fixed with quda 0.8.1. For now, please stick to the workaround and refer to lattice/quda#439.


lcosmai changed the title ~~quda 0.8.0 and milc 7.8.0 "ERROR: Solve precision ..."~~ quda 0.8.0 and milc 7.7.13 "ERROR: Solve precision ..." Feb 22, 2016

mathiaswagner mentioned this issue Feb 23, 2016

MILC precision mismatch in rhmc (MILC 7.7.13 / quda 0.8.0) lattice/quda#439

Closed

detar pushed a commit that referenced this issue Oct 4, 2017

Merge pull request #5 from milc-qcd/develop

5a17bcc

Gauge force typo

This was referenced Oct 4, 2017

Hotfix/warnings #14

Merged

Feature/quda interface optimize #15

Merged

detar pushed a commit that referenced this issue Feb 7, 2024

Merge pull request #5 from ylin910095/production/omega.QudaGSmear

aa52da6

merge quda smear again
