
tensorboard in feedforward scheduler and fixed mu (trainer_basic) #507

Open
smilesun opened this issue Oct 9, 2023 · 7 comments
smilesun commented Oct 9, 2023

We want to compare the feedforward mu controller (exponential warm-up) against a fixed mu for the loss ell() + mu*R(); see the YAML file below:

https://github.com/marrlab/DomainLab/blob/fbopt/examples/benchmark/benchmark_fbopt_mnist_jigen.yaml

It would be interesting to see how the two behave in tensorboard with respect to the ell loss and the R loss.
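As a rough sketch of the comparison, the two mu schedules for ell() + mu*R() could look like the following. The function names and the warm-up formula here are illustrative assumptions, not DomainLab's actual implementation:

```python
# Illustrative sketch only: these names and the warm-up formula are
# assumptions, not DomainLab's actual implementation.

def mu_fixed(mu_final, epoch):
    """Fixed mu: the penalty weight is constant from the first epoch."""
    return mu_final

def mu_warmup_exponential(mu_final, epoch, total_steps, mu_init=1e-4):
    """Exponential warm-up: grow mu geometrically from mu_init to mu_final."""
    ratio = min((epoch + 1) / total_steps, 1.0)
    return mu_init * (mu_final / mu_init) ** ratio

# with a fixed mu the penalty R() acts at full strength immediately,
# while the warm-up lets the task loss ell() dominate early on
print(mu_fixed(1.0, 0))                   # 1.0
print(mu_warmup_exponential(1.0, 0, 10))  # small, roughly 2.5e-4
```

Plotting ell and mu*R per epoch under both schedules in tensorboard would make the difference visible directly.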

@smilesun smilesun added the fbopt label Oct 9, 2023
@smilesun smilesun changed the title tensorboard in feedforward scheduler tensorboard in feedforward scheduler and fixed mu (trainer_basic) Oct 10, 2023
@smilesun smilesun added the priority Further information is requested label Oct 25, 2023
@smilesun smilesun assigned smilesun and unassigned agisga Oct 31, 2023
smilesun commented Oct 31, 2023

To reuse the tensorboard code we only need to change line 85 of domainlab/algos/trainers/train_fbopt_b.py, which is

self.set_scheduler(scheduler=HyperSchedulerFeedback)

and replace HyperSchedulerFeedback with HyperSchedulerWarmup or HyperSchedulerWarmupExponetial.

Both classes can be imported from domainlab/algos/trainers/hyper_scheduler.py.
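A minimal sketch of the one-line swap described above. The classes here are empty stand-ins for the DomainLab ones; only the set_scheduler call pattern is the point:

```python
# Empty stand-ins for the DomainLab classes; only the set_scheduler
# call pattern is the point here.

class HyperSchedulerFeedback:
    pass

class HyperSchedulerWarmup:
    pass

class TrainerFbOpt:
    def set_scheduler(self, scheduler):
        # the trainer stores the scheduler *class* and instantiates it later
        self.scheduler_class = scheduler

trainer = TrainerFbOpt()
# before: trainer.set_scheduler(scheduler=HyperSchedulerFeedback)
trainer.set_scheduler(scheduler=HyperSchedulerWarmup)
```

Since only the class handed to set_scheduler changes, the rest of the tensorboard logging path stays untouched.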

smilesun commented Nov 2, 2023

https://github.com/marrlab/DomainLab/pull/624/files

@agisga, could you compare the phase portraits of the two settings?

smilesun commented Nov 2, 2023

sh run_fbopt_mnist_feedforward.sh

@agisga it is not working yet; I will see if I can fix it.

Traceback (most recent call last):
  File "/home/playtime/domainlab/main_out.py", line 17, in <module>
    exp.execute()
  File "/home/playtime/domainlab/domainlab/compos/exp/exp_main.py", line 68, in execute
    self.trainer.before_tr()
  File "/home/playtime/domainlab/domainlab/algos/trainers/train_fbopt_b.py", line 91, in before_tr
    self.set_model_with_mu()  # very small value
  File "/home/playtime/domainlab/domainlab/algos/trainers/train_fbopt_b.py", line 119, in set_model_with_mu
    self.model.hyper_update(epoch=None, fun_scheduler=HyperSetter(self.hyper_scheduler.mmu))
AttributeError: 'HyperSchedulerWarmup' object has no attribute 'mmu'
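The feedback scheduler apparently carries its current mu values in an mmu attribute that the warm-up scheduler lacks. One plausible fix, sketched here with stand-in names rather than the real DomainLab classes, is to give the warm-up scheduler the same attribute, initialised to the warm-up starting value:

```python
# Stand-in sketch: give the warm-up scheduler the `mmu` attribute the
# trainer expects from the feedback scheduler (names are assumptions).

class HyperSchedulerWarmup:
    def __init__(self, mu_init=1e-4):
        # same attribute name the trainer reads via self.hyper_scheduler.mmu,
        # initialised to the warm-up starting value
        self.mmu = {"mu": mu_init}

sched = HyperSchedulerWarmup()
print(sched.mmu["mu"])  # no AttributeError now
```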

smilesun commented Nov 2, 2023

smilesun added a commit that referenced this issue Nov 7, 2023
smilesun commented Nov 7, 2023

sh run_fbopt_mnist_feedforward.sh

/home/sunxd/domainlab/domainlab/arg_parser.py:252: UserWarning: no algorithm conf specified, going to use default
  warnings.warn("no algorithm conf specified, going to use default")

using device: cuda

/home/sunxd/anaconda3/lib/python3.9/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
/home/sunxd/anaconda3/lib/python3.9/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: Metric `AUROC` will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)
b'cdf0c565'
model name: mnistcolor10_te_rgb_31_119_180_jigen_bcdf0c565_2023md_11md_07_15_43_03_seed_0

 Experiment start at: 2023-11-07 15:43:03.725952
Traceback (most recent call last):
  File "/home/sunxd/domainlab/main_out.py", line 17, in <module>
    exp.execute()
  File "/home/sunxd/domainlab/domainlab/compos/exp/exp_main.py", line 68, in execute
    self.trainer.before_tr()
  File "/home/sunxd/domainlab/domainlab/algos/trainers/train_fbopt_b.py", line 92, in before_tr
    self.set_model_with_mu()  # very small value
  File "/home/sunxd/domainlab/domainlab/algos/trainers/train_fbopt_b.py", line 121, in set_model_with_mu
    self.model.hyper_update(epoch=None, fun_scheduler=self.hyper_scheduler)
  File "/home/sunxd/domainlab/domainlab/models/model_dann.py", line 70, in hyper_update
    dict_rst = fun_scheduler(epoch)  # the __call__ method of hyperparameter scheduler
  File "/home/sunxd/domainlab/domainlab/algos/trainers/hyper_scheduler.py", line 39, in __call__
    dict_rst[key] = self.warmup(val_setpoint, epoch)
  File "/home/sunxd/domainlab/domainlab/algos/trainers/hyper_scheduler.py", line 31, in warmup
    ratio = ((epoch+1) * 1.) / self.total_steps
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
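Here before_tr() calls the scheduler with epoch=None, so (epoch + 1) raises the TypeError. Treating None as the first epoch is one plausible fix, sketched below; the ratio computation mirrors the line in the traceback, while the clamping to the setpoint is an assumption, not DomainLab's code:

```python
# Defensive sketch: before_tr() calls the scheduler with epoch=None, so
# (epoch + 1) raises TypeError. Treating None as epoch 0 is one plausible
# fix; the clamping to the setpoint is an assumption.

def warmup(val_setpoint, epoch, total_steps):
    if epoch is None:
        epoch = 0  # before training starts, behave like the first epoch
    ratio = ((epoch + 1) * 1.0) / total_steps
    return min(val_setpoint * ratio, val_setpoint)

print(warmup(1.0, None, 100))  # 0.01 instead of a TypeError
```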

smilesun added a commit that referenced this issue Nov 7, 2023
smilesun commented Nov 7, 2023

 Experiment start at: 2023-11-07 15:49:52.197946
Traceback (most recent call last):
  File "/home/sunxd/domainlab/main_out.py", line 17, in <module>
    exp.execute()
  File "/home/sunxd/domainlab/domainlab/compos/exp/exp_main.py", line 68, in execute
    self.trainer.before_tr()
  File "/home/sunxd/domainlab/domainlab/algos/trainers/train_fbopt_b.py", line 99, in before_tr
    self.hyper_scheduler.set_setpoint(
AttributeError: 'HyperSchedulerWarmup' object has no attribute 'set_setpoint'
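This is the same interface mismatch as the mmu error: the trainer calls set_setpoint() on its scheduler, but only the feedback scheduler defines it. A no-op with a compatible signature would let the feed-forward warm-up scheduler be dropped in; the names below are stand-ins, not the real DomainLab classes:

```python
# Sketch: the trainer calls set_setpoint() on its scheduler, which only
# the feedback scheduler defines. A no-op with a compatible signature lets
# the feed-forward warm-up scheduler be dropped in (names are assumptions).

class HyperSchedulerWarmup:
    def set_setpoint(self, *args, **kwargs):
        # warm-up is feed-forward: it follows its fixed schedule and
        # ignores setpoints entirely
        pass

sched = HyperSchedulerWarmup()
sched.set_setpoint([0.5], [0.1])  # no AttributeError now
```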

smilesun commented Nov 7, 2023

pr: #626

@smilesun smilesun removed the priority Further information is requested label Jul 12, 2024