Can't get 100% accuracy in Sub-Goal evaluation with ground-truth actions and masks. #19

bhkim94 · 2020-03-27T10:30:52Z

I'm trying to produce Sub-Goal evaluation results with ground-truth actions and masks.
However, I get index-out-of-bound errors and can't reach 100% PLW for some trajectories in the seen and unseen validation sets. (Both SR and PLW should be 100%, since the evaluation uses ground truth.)

These are the only changes I made, both in eval_subgoals.py (lines 69 and 128):

...

 68: expert_init_actions = [a['discrete_action'] for a in traj_data['plan']['low_actions'] if a['high_idx'] < eval_idx]
 69: expert_init_actions_gt = [a['discrete_action'] for a in traj_data['plan']['low_actions']]

...

127: mask = np.squeeze(mask, axis=0) if model.has_interaction(action) else None
128: action = expert_init_actions_gt[t]['action']  # override the predicted action with the ground-truth one
     compressed_mask = expert_init_actions_gt[t]['args']['mask'] if 'mask' in expert_init_actions_gt[t]['args'] else None
     mask = env.decompress_mask(compressed_mask) if compressed_mask is not None else None
129: # debug
130:     if args.debug:

...

If these changes are the right way to use ground-truth actions and masks, do you have any idea why I can't get 100% PLW?
I also don't understand why I get index-out-of-bound errors with ground-truth trajectories.

Thanks for replying!


MohitShridhar · 2020-03-28T19:17:21Z

Hi @bhkim94, can you post the index-out-of-bound error?

We did have a related bug, but it's fixed in the latest master.

bhkim94 · 2020-03-29T08:25:42Z

Hi, thanks for replying.

These are the index errors I got.

...

No. of trajectories left: 805
Resetting ThorEnv
Task: Place two salt shakers in the drawer.
Evaluating: data/json_feat_2.1.0/pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117
Subgoal GotoLocation (0)
Instr: Look down, turn left, walk straight, turn left to face the fridge, walk straight, turn right when you reach the fridge, walk straight, turn left to face the counter with the bread on the counter and look up to the cabinet.
-------------
GotoLocation ==========
SR: 40/40 = 1.000
PLW S: 0.994
------------
No. of trajectories left: 805
Resetting ThorEnv
Task: Place two salt shakers in the drawer.
Evaluating: data/json_feat_2.1.0/pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117
Subgoal GotoLocation (2)
Instr: Look down, turn right, walk straight, turn right to face the pot on the counter and turn right.
Traceback (most recent call last):
  File "/home/user/Desktop/alfred/models/eval/eval_subgoals.py", line 49, in run
    cls.evaluate(env, model, eval_idx, r_idx, resnet, traj, args, lock, successes, failures, results)
  File "/home/user/Desktop/alfred/models/eval/eval_subgoals.py", line 132, in evaluate
    action = expert_init_actions_gt[t]['action']
IndexError: list index out of range
Error: IndexError('list index out of range',)
No. of trajectories left: 804
Resetting ThorEnv
Task: Put two shakers in the second drawer.
Evaluating: data/json_feat_2.1.0/pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117
Subgoal GotoLocation (0)
Instr: Make a right and step forward then turn left at the island.
-------------
GotoLocation ==========
SR: 41/41 = 1.000
PLW S: 0.988
------------
No. of trajectories left: 804
Resetting ThorEnv
Task: Put two shakers in the second drawer.
Evaluating: data/json_feat_2.1.0/pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117
Subgoal GotoLocation (2)
Instr: Turn right then walk forward. Turn around once you are past the sink.
Traceback (most recent call last):
  File "/home/user/Desktop/alfred/models/eval/eval_subgoals.py", line 49, in run
    cls.evaluate(env, model, eval_idx, r_idx, resnet, traj, args, lock, successes, failures, results)
  File "/home/user/Desktop/alfred/models/eval/eval_subgoals.py", line 132, in evaluate
    action = expert_init_actions_gt[t]['action']

...

I found that the agent takes more actions for a subgoal than the validation trajectory contains.
For example, the first subgoal (GotoLocation) of "pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117" consists of 19 low-level actions, but the agent took 23 actions to complete it.
I think this could cause the errors: the next subgoal (i.e. the second one) then starts from the 24th action, which does not belong to that subgoal, so the agent eventually fails to complete the subsequent subgoals.
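A minimal sketch of why extra actions pull PLW below 100% even when SR is 100%. It assumes the path-length-weighted score has the form success × L_expert / max(L_expert, L_agent); the function name `path_length_weighted` is hypothetical and is not taken from the repository.

```python
def path_length_weighted(success, expert_len, agent_len):
    """Weight a 0/1 success score by the ratio of expert to agent path length.

    Assumed form: success * L_expert / max(L_expert, L_agent).
    """
    return float(success) * expert_len / max(expert_len, agent_len)


# Exact replay of the 19 expert actions keeps the full score.
print(path_length_weighted(1, 19, 19))

# Taking 23 actions for the same subgoal lowers the score to 19/23.
print(path_length_weighted(1, 19, 23))
```

Under this assumption, any subgoal that needs more steps than the expert trajectory contributes less than 1 to the averaged PLW, which would be consistent with SR staying at 1.000 while PLW drops slightly.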

+ I also evaluated a model with GT actions and masks in the same way as above, and got 100% for neither SR nor PC on either the seen or the unseen validation set.

On the seen validation set:
SR: 818 / 820 = 0.998
PC: 2104 / 2109 = 0.998
PLW SR: 0.998
PLW PC: 0.999

On the unseen validation set:
SR: 819 / 821 = 0.998
PC: 2118 / 2120 = 0.999
PLW SR: 0.998
PLW PC: 0.999

I think three trajectories were left out of validation in #7, but I used the whole validation set (820 seen and 821 unseen).

MohitShridhar · 2020-03-31T01:43:03Z

@bhkim94 something seems strange. If there are 19 low-level actions in the expert trajectory, I don't know why the agent takes 23 actions when you are simply replaying the expert actions. Could you share your fork so I can get a better understanding of what's happening?

bhkim94 · 2020-03-31T05:09:10Z

Here is the fork: https://github.com/bhkim94/alfred

The only changes are in eval_task.py and eval_subgoals.py, and they are marked with # comments in the code.
I also got rid of the index errors by adding an explicit exit condition, because a trajectory has no explicit stop token:

if t == len(expert_init_actions_gt): # GT doesn't have the STOP token.
    break

The index error itself is not the main point, though. It is only a symptom: the errors occur because a subgoal is not completed within the number of actions given in the validation trajectory.
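The bounds check above can be folded into a small lookup helper. This is only a sketch: `gt_step` is a hypothetical name, the toy plan is made-up data that mimics the `high_idx` / `discrete_action` layout used in the snippets earlier in this thread, and the mask value is a placeholder, not a real compressed mask.

```python
def gt_step(low_actions, t):
    """Return (action, compressed_mask) for step t, or None when the plan is exhausted."""
    if t >= len(low_actions):
        # The plan has no STOP token, so running out of actions ends the replay.
        return None
    discrete = low_actions[t]['discrete_action']
    args = discrete.get('args', {})
    # Navigation actions carry no mask, so the second element may be None.
    return discrete['action'], args.get('mask')


# Toy plan (made-up data for illustration only).
plan = [
    {'high_idx': 0, 'discrete_action': {'action': 'MoveAhead_25', 'args': {}}},
    {'high_idx': 1, 'discrete_action': {'action': 'PickupObject',
                                        'args': {'mask': [[0, 3]]}}},
]

t = 0
while (step := gt_step(plan, t)) is not None:
    action, compressed_mask = step
    print(t, action, compressed_mask is not None)
    t += 1
```

Using `>=` instead of `==` in the check also covers the case where `t` has already run past the end of the plan.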

MohitShridhar · 2020-04-08T18:24:37Z

@bhkim94 sorry about the delay. Will take a look at this soon.

Meanwhile, you can just ignore these 5 trajectories, while I try to figure out what's happening. Since the SR is 99.8% anyway, this issue shouldn't hinder your progress.

MohitShridhar · 2020-04-09T05:59:15Z

@bhkim94, it seems like the issue is being caused by some non-deterministic behavior in AI2THOR.

For instance, when PutObject is used to place a Knife, the object occasionally slips and falls, invalidating the GT mask in the dataset:

No Slip:

Slip (rare occurrence):

We will take a look at fixing this, but it needs to be done on the simulator side. Fortunately, it's a rare occurrence, so you still get 99.8% SR. This shouldn't affect your modeling progress for now.

Thanks again for pointing this out!

MohitShridhar added the "bug" label · 2020-10-26
