turning on flip #11

Open
mateuszwyszynski opened this issue Feb 19, 2024 · 3 comments
@mateuszwyszynski
Owner

Training the model using configs shared with the pretrained models:

data:
  amass_dir: ./amass_samples/
  data_dir: ./training_data/
  flip: true
  num_pts: 10000
  single: false
experiment:
  bodymodel: smpl
  data_name: PoseData
  exp_name: small
  inp_name: single
  num_part: 21
  root_dir: ./posendf/replicate-version2/
  test: false
  type: BaseTrainer
  val: false
model:
  DFNet:
    act: softplus
    beta: 100
    dims: 256, 512, 1024, 512, 256, 64
    ff_enc: false
    in_dim: 126
    name: DFNet
    num_layers: 5
    num_parts: 21
    total_dim: 960
  StrEnc:
    act: softplus
    beta: 100
    ff_enc: false
    in_dim: 84
    name: StructureEncoder
    num_layers: 2
    num_part: 21
    out_dim: 6
    pose_enc: false
    use: true
train:
  abs: true
  batch_size: 4
  body_enc: true
  clamp_dist: 0.0
  continue_train: true
  device: cuda
  disp_reg: true
  dist: 0.5
  eikonal: 0.1
  eval: false
  grad: false
  loss_type: l1
  man_loss: 0.1
  max_epoch: 200000
  num_worker: 4
  optimizer: Adam
  optimizer_param: 1.0e-05
  pde: false
  square: false
  train_stage_1: 100000
  train_stage_2: 100000

results in the following error:

Traceback (most recent call last):
  File "/home/mateusz.wyszynski/Code/PoseNDF/trainer.py", line 37, in <module>
    train(opt, args.config, args.test)
  File "/home/mateusz.wyszynski/Code/PoseNDF/trainer.py", line 21, in train
    loss,epoch_loss = trainer.train_model(i)
  File "/home/mateusz.wyszynski/Code/PoseNDF/model/train_posendf.py", line 90, in train_model
    for i, inputs in enumerate(self.train_dataset):
  File "/opt/conda/envs/posendf/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/opt/conda/envs/posendf/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/opt/conda/envs/posendf/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/opt/conda/envs/posendf/lib/python3.9/site-packages/torch/_utils.py", line 461, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/envs/posendf/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/envs/posendf/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/envs/posendf/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/mateusz.wyszynski/Code/PoseNDF/model/load_data.py", line 71, in __getitem__
    amass_poses, _  = quat_flip(amass_poses)
  File "/home/mateusz.wyszynski/Code/PoseNDF/model/load_data.py", line 15, in quat_flip
    is_neg = pose_in[:,:,0] <0
IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed

Setting flip to false allows training to start, so the error is most likely caused by the flip functionality. According to the paper, the flip gives the network the additional information that two quaternions always represent the same rotation (i.e. q and -q).
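For reference, the q / -q equivalence mentioned above can be checked numerically. This is only a minimal sketch: `quat_to_rotmat` is a hypothetical helper written for this note (it is not part of PoseNDF), and the (w, x, y, z) component order is an assumption.

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a quaternion in (w, x, y, z) order to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

# Arbitrary example quaternion (assumed order: w, x, y, z).
q = np.array([0.5, -0.5, 0.5, 0.5])

# Every entry of the matrix is quadratic in q, so negating q changes nothing.
same_rotation = np.allclose(quat_to_rotmat(q), quat_to_rotmat(-q))
```

Because both signs map to the same rotation, a dataset can either pick one canonical sign or show the network both.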

@mateuszwyszynski mateuszwyszynski added the bug Something isn't working label Feb 19, 2024
@mateuszwyszynski mateuszwyszynski self-assigned this Feb 19, 2024
@mateuszwyszynski
Owner Author

mateuszwyszynski commented Feb 19, 2024

Ok, so the error is caused by lines 70-71 in load_data.py.

I believe the poses sampled from the AMASS dataset are not represented as quaternions (one can see that each pose has 63 values rather than 21 x 4). I've checked the original repo, and the code there is the same, so I think we have to fix it ourselves.

No idea how they obtained the original results, though. An open question is how long they trained the network. Maybe I should continue training without the flip and wait until I get similar results.
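The shape mismatch can be reproduced in isolation, and one possible fix is to convert the poses to quaternions before flipping. The sketch below assumes the 63 values are 21 axis-angle vectors and that quaternions are stored as (w, x, y, z); both are assumptions, not confirmed by the repo. `axis_angle_to_quat` and `quat_flip_sketch` are hypothetical helpers; only the `is_neg` line is taken from the traceback.

```python
import numpy as np

def axis_angle_to_quat(aa):
    """Convert axis-angle vectors (..., 3) to unit quaternions (..., 4), order (w, x, y, z)."""
    angle = np.linalg.norm(aa, axis=-1, keepdims=True)
    half = 0.5 * angle
    # sin(angle / 2) / angle tends to 0.5 as angle -> 0, so guard the division.
    safe = np.where(angle > 1e-8, angle, 1.0)
    scale = np.where(angle > 1e-8, np.sin(half) / safe, 0.5)
    return np.concatenate([np.cos(half), aa * scale], axis=-1)

def quat_flip_sketch(quats):
    """Negate every quaternion whose first component is negative (hypothetical sketch)."""
    is_neg = quats[:, :, 0] < 0  # same indexing as the line in the traceback
    flipped = quats.copy()
    flipped[is_neg] *= -1
    return flipped, is_neg

# Dummy batch of 5 poses with 63 values each (placeholder data, not AMASS).
poses = np.random.default_rng(0).normal(size=(5, 63))

# Indexing the 2-D array with three indices reproduces the reported IndexError.
try:
    poses[:, :, 0]
    index_error_raised = False
except IndexError:
    index_error_raised = True

# Reshape to (batch, 21, 3) and convert, so the three-index access is valid.
quats = axis_angle_to_quat(poses.reshape(-1, 21, 3))  # shape (5, 21, 4)
flipped, is_neg = quat_flip_sketch(quats)
```

Whether the conversion belongs in `__getitem__` or in the preprocessing script is a separate design choice; the sketch only shows that the flip needs a (batch, 21, 4) input.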

@mateuszwyszynski
Owner Author

Another thing is that I should probably use indices 3:66 to generate the pose. The position of the root is of no interest to us, but the orientation impacts the pose. This is how the problem is handled in vposer (line 102).

This should be revised with viser after #8 is closed

mateuszwyszynski added a commit that referenced this issue Feb 19, 2024
The one published by the authors is ok. One simply has to turn off the flip (check issue #11 for a more detailed discussion) and use correct paths to the data
@mateuszwyszynski
Owner Author

mateuszwyszynski commented Feb 19, 2024

Another thing is that I should probably use indices 3:66 to generate the pose. The position of the root is of no interest to us, but the orientation impacts the pose. This is how the problem is handled in vposer (line 102).

This should be revised with viser after #8 is closed

Ok, so based on my preliminary observations using viser, we should actually use indices 0:63 for samples generated from AMASS with data/sample_poses.py, and indices 3:66 for the original raw AMASS data. I still have to understand the role of the remaining parameters better, but I wanted to save a note for future reference.
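The index convention from this note could be captured in a small helper so the two data sources are not mixed up. This is a sketch only: `BODY_POSE_SLICES`, `body_pose`, and the source names are hypothetical, and the array widths used for the dummy data (156 for raw frames, 63 for sampled poses) are assumptions.

```python
import numpy as np

# Hypothetical names for the two data sources discussed above.
BODY_POSE_SLICES = {
    "sampled": slice(0, 63),    # samples produced by data/sample_poses.py
    "raw_amass": slice(3, 66),  # raw AMASS frames, first three values skipped
}

def body_pose(frames, source):
    """Return the 63 body-pose values per frame, reshaped to (N, 21, 3)."""
    frames = np.asarray(frames)
    body = frames[:, BODY_POSE_SLICES[source]]
    if body.shape[1] != 63:
        raise ValueError(f"expected 63 pose values, got {body.shape[1]}")
    return body.reshape(-1, 21, 3)

# Placeholder arrays; the widths are assumptions, the values are meaningless.
raw_frames = np.arange(2 * 156, dtype=float).reshape(2, 156)
sampled_frames = np.zeros((4, 63))

raw_body = body_pose(raw_frames, "raw_amass")        # shape (2, 21, 3)
sampled_body = body_pose(sampled_frames, "sampled")  # shape (4, 21, 3)
```

Keeping the slices in one place means a later change (for example, once the role of the remaining parameters is understood) only touches the dictionary.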
