getStatus() for job reference of finished light job should throw JobNotFoundException [HZ-2753] #19257
Comments
When I run the following test, it passes:
    @Test
    public void jobStatusTest() {
        HazelcastInstance inst = createHazelcastInstance();
        HazelcastInstance client = createHazelcastClient();
        Job job1 = inst.getJet().newLightJob(streamingDag());
        assertJobExecuting(job1, inst);
        Job job2 = client.getJet().getJob(job1.getId());
        job1.cancel();
        assertThatThrownBy(job1::join)
                .isInstanceOf(CancellationException.class);
        sleepSeconds(1);
        assertEquals(job1.getStatus(), FAILED);
        assertEquals(job2.getStatus(), RUNNING);
        assertNull(inst.getJet().getJob(job1.getId()));
        assertNull(client.getJet().getJob(job1.getId()));
    }
🤔
@burakgok this is a very old issue; it may already be fixed. |
The title is incorrect because job proxies can only be obtained for existing jobs and that's why they cannot throw |
The job proxy is obtained before the light job finishes, but the method on it is invoked after the job has finished. |
Job proxies cannot claim that the job to which they refer doesn't exist. If they cannot perform a function because the job is completed and ultimately removed, they have to report the situation as such. For |
A slightly modified test from @fbarotov demonstrates one strange behaviour of the API:
    @Test
    public void jobStatusTest() {
        HazelcastInstance inst = createHazelcastInstance();
        HazelcastInstance client = createHazelcastClient();
        Job job1 = inst.getJet().newLightJob(streamingDag());
        sleepSeconds(1); //TODO: use assert with timeout
        assertJobExecuting(job1, inst);
        Job job2 = client.getJet().getJob(job1.getId());
        job1.cancel();
        assertThatThrownBy(job1::join)
                .isInstanceOf(CancellationException.class);
        sleepSeconds(1);
        assertEquals(job1.getStatus(), FAILED);
        assertEquals(job2.getStatus(), RUNNING); // succeeds!
        sleepSeconds(2);
        assertEquals(job2.getStatus(), RUNNING); // fails
        assertNull(inst.getJet().getJob(job1.getId()));
        assertNull(client.getJet().getJob(job1.getId()));
    }
This is because first |
Currently they sometimes do, because the join future is created lazily when the proxy is obtained by ID. For example:
    public static DAG blockingBatchDag() {
        DAG dag = new DAG();
        dag.newVertex("v", () -> new MockP().initBlocks());
        return dag;
    }

    @Test
    public void jobStatusTest() {
        HazelcastInstance inst = createHazelcastInstance();
        inst.getConfig().getJetConfig().setCooperativeThreadCount(1);
        HazelcastInstance client = createHazelcastClient();
        Job job1 = inst.getJet().newLightJob(blockingBatchDag());
        sleepSeconds(1);
        // the proxy is obtained by ID while the job is still running
        Job job2 = client.getJet().getJob(job1.getId());
        assertNotNull(job2);
        MockP.unblock();
        job1.join();
        sleepSeconds(1);
        job2.join(); // throws JobNotFoundException
    }
Also, in the last example the client reports the wrong job status when queried twice with a sleep in between: first RUNNING, then FAILED. |
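To make the lazy-future point above concrete, here is a minimal plain-Java sketch. It does not use Hazelcast: `LightJobRegistry`, `LazyJobProxy`, the string statuses, and the `IllegalStateException` standing in for `JobNotFoundException` are all hypothetical names for illustration. It assumes only what the comment above states, namely that a proxy obtained by ID creates its join future on first use, and that a finished light job is no longer known to the member.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a member that only remembers running light jobs. */
final class LightJobRegistry {
    private final Map<Long, CompletableFuture<Void>> running = new ConcurrentHashMap<>();

    /** Registers a job and returns the future the submitter holds from the start. */
    CompletableFuture<Void> submit(long jobId) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        running.put(jobId, future);
        return future;
    }

    /** Completes the job normally and forgets it, as assumed for light jobs. */
    void complete(long jobId) {
        CompletableFuture<Void> future = running.remove(jobId);
        if (future != null) {
            future.complete(null);
        }
    }

    /** Lookup by ID; once the job is forgotten, the returned future is already failed. */
    CompletableFuture<Void> lookup(long jobId) {
        CompletableFuture<Void> future = running.get(jobId);
        if (future != null) {
            return future;
        }
        CompletableFuture<Void> failed = new CompletableFuture<>();
        // IllegalStateException is a placeholder for JobNotFoundException
        failed.completeExceptionally(new IllegalStateException("job not found: " + jobId));
        return failed;
    }
}

/** Hypothetical proxy obtained by ID: its join future is resolved on first use. */
final class LazyJobProxy {
    private final LightJobRegistry registry;
    private final long jobId;
    private CompletableFuture<Void> joinFuture; // null until first needed

    LazyJobProxy(LightJobRegistry registry, long jobId) {
        this.registry = registry;
        this.jobId = jobId;
    }

    synchronized CompletableFuture<Void> joinFuture() {
        if (joinFuture == null) {
            joinFuture = registry.lookup(jobId);
        }
        return joinFuture;
    }

    /** Derives a status purely from the join future's state. */
    String status() {
        CompletableFuture<Void> future = joinFuture();
        if (!future.isDone()) {
            return "RUNNING";
        }
        return future.isCompletedExceptionally() ? "FAILED" : "COMPLETED";
    }
}
```

In this model, a proxy whose future is resolved while the job is still running later reports COMPLETED, whereas a proxy whose first call happens after the job has finished reports FAILED with the not-found exception as the cause, even though the job itself completed normally. Whether the real proxy behaves this way in every code path is not something the sketch establishes.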
Internal Jira issue: HZ-2753 |
JetInstance.getJob() for a finished light job doesn't throw JobNotFoundException, but instead reports the job as failed with the JNFE as the cause, which is wrong. The JNFE should be thrown from the getJob() method.
Reproducer: