Results issue #36

Open
Maryam483 opened this issue Apr 5, 2024 · 5 comments

Comments

@Maryam483

Hi,
I have trained the supervised model following the steps you give (steps 1-4). After training the supervised baseline model on the COCO dataset, I ran semi_dist_test.sh with the corresponding file paths to measure the performance of the supervised model, and the result is 12% (I used 10% of the data as the labelled split), but in Table 1 of your paper the result is "23.70 ± 0.22". How do I solve this issue?

Secondly, I am training the model on a single GPU. This is the only difference.

I am waiting for your positive response and guidance. Thanks.

@chenbinghui1
Owner

The learning rate and the number of training epochs might need to be adjusted for a different number of GPUs.
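
For context (a sketch, not a setting confirmed in this thread): MMDetection-style configs commonly follow the linear scaling rule, where the SGD learning rate is scaled in proportion to the total batch size (num_gpus x samples_per_gpu). Assuming the released config targets 8 GPUs with samples_per_gpu=2, the arithmetic looks like this:

# Illustrative only: linear scaling rule for the SGD learning rate.
base_lr = 0.01                 # lr in the released 8-GPU config
base_batch = 8 * 2             # 8 GPUs x 2 images per GPU
my_batch = 1 * 2               # 1 GPU  x 2 images per GPU
scaled_lr = base_lr * my_batch / base_batch
print(scaled_lr)               # 0.00125 -- a starting point, not a verified setting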

@Maryam483
Author

Maryam483 commented Apr 8, 2024

I did not understand: should I keep the learning rate low or high with one GPU?
My result is 12% while yours is 23%. I have added my config file below; please tell me which parameters I should tune.
Secondly, does the number of GPUs matter for reaching the same performance? You used 8 GPUs, but I used 1 GPU.

==================================
Code

# model settings
model = dict(
    type='FCOS',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='caffe',
        init_cfg=dict(
            type='Pretrained',
            checkpoint='open-mmlab://detectron2/resnet50_caffe')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_output',  # use P5
        num_outs=5,
        relu_before_extra_convs=True),
    bbox_head=dict(
        type='FCOSHead',
        num_classes=80,
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        strides=[8, 16, 32, 64, 128],
        norm_on_bbox=True,
        centerness_on_reg=True,
        dcn_on_last_conv=False,
        center_sampling=True,
        conv_bias=True,
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='GIoULoss', loss_weight=1.0),
        loss_centerness=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
    # training and testing settings
    train_cfg=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.4,
            min_pos_iou=0,
            ignore_iof_thr=-1),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=1000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.5),
        max_per_img=100))

img_norm_cfg = dict(
    mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(1333, 640), (1333, 800)],
        multiscale_mode='value',
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

dataset_type = 'CocoDataset'

# recommend to use absolute path
# data_root = '/gruntdata1/bhchen/factory/data/semicoco/'
data_root = 'I:/DSL4/data/semicoco/'
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file='I:/DSL4/DSL-main/data_list/coco_semi/semi_supervised/[email protected]',
        img_prefix=data_root + 'images/full/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file='I:/DSL4/DSL-main/data_list/coco_semi/semi_supervised/instances_val2017.json',
        img_prefix=data_root + 'valid_images/full/',
        pipeline=test_pipeline),
    ## The block below is used for generating the initial pseudo-labels for unlabeled images when running tools/inference_unlabeled_coco_data.sh
    test=dict(
        type=dataset_type,
        ann_file='I:/DSL4/DSL-main/data_list/coco_semi/semi_supervised/[email protected]',
        img_prefix=data_root + 'images/full/',
        pipeline=test_pipeline)
    ## The block below is used for testing model performance on the COCO dataset; if you use tools/semi_dist_test.sh, please uncomment it and comment out the block above
    #test=dict(
    #    type=dataset_type,
    #    ann_file='data_list/coco_semi/semi_supervised/instances_val2017.json',
    #    img_prefix=data_root + 'valid_images/full/',
    #    pipeline=test_pipeline,
    #    )
)

# optimizer
optimizer = dict(
    type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001,
    paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
optimizer_config = dict(
    # _delete_=True,
    grad_clip=None)

# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    # 0.01/0.02 data 150-180-200; 0.05 data 90-120-140; 0.1 data 50-80-100; full data 16-22-24
    step=[50, 80])
runner = dict(type='EpochBasedRunner', max_epochs=100)
evaluation = dict(interval=1, metric='bbox')

checkpoint_config = dict(interval=5)

# yapf:disable
log_config = dict(
    interval=10,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable

custom_hooks = [dict(type='NumClassCheckHook')]

dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
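
Note that the config's own comments say that, when evaluating with tools/semi_dist_test.sh, the active test dict (which points at the unlabeled split for tools/inference_unlabeled_coco_data.sh) should be commented out and replaced by the val2017 block. A sketch of the evaluation-time test dict, reusing the paths from the config above:

# Evaluation-time test dict for tools/semi_dist_test.sh, per the config's own comment;
# the paths follow the layout used in the config above.
test=dict(
    type=dataset_type,
    ann_file='I:/DSL4/DSL-main/data_list/coco_semi/semi_supervised/instances_val2017.json',
    img_prefix=data_root + 'valid_images/full/',
    pipeline=test_pipeline)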

@chenbinghui1
Owner

@Maryam483 Sorry, I didn't test on 1 GPU, so I cannot tell you how to change the values. You can try it yourself, good luck~

@Maryam483
Author

Training the semi-supervised model requires 158 days on my single GPU. Why?

You used 8 GPUs, but I have 1 GPU.

I have a Titan Xp (12 GB VRAM) GPU with 16 GB of RAM. I have added a screenshot below.

(screenshot of the training log showing the ETA)
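
For rough intuition about the ETA (every number below is an assumption for illustration, not a measurement from this repo): with an EpochBasedRunner, total time is roughly iterations-per-epoch x epochs x seconds-per-iteration, and iterations-per-epoch grows as fewer GPUs share the per-iteration batch:

# Back-of-the-envelope ETA; all numbers are illustrative assumptions.
num_images = 118000            # roughly the size of COCO train2017 (labeled + unlabeled)
samples_per_gpu = 2            # from the config above
num_gpus = 1                   # vs. 8 in the paper's setup
max_epochs = 100               # supervised schedule above; the semi stage may differ

iters_per_epoch = num_images // (samples_per_gpu * num_gpus)
sec_per_iter = 2.3             # assumed Titan Xp speed; read the real value from your log
eta_days = iters_per_epoch * max_epochs * sec_per_iter / 86400
print(round(eta_days))         # ~157 days with these assumed numbers; 8 GPUs cut
                               # iters_per_epoch by 8x, and a V100 is also faster per iteration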

@chenbinghui1
Owner

@Maryam483
Please check:
(1) whether the model is actually being moved to the GPU for computation.

If it is, then, hmm, you may have to switch tasks; a Titan Xp might not be good enough for the detection task.

I use 8 NVIDIA V100s for training.
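
A quick way to check point (1) with standard PyTorch calls (here, model stands for whatever detector object the training script builds; it is not a name from this repo):

import torch

print(torch.cuda.is_available())        # should print True
print(torch.cuda.get_device_name(0))    # e.g. "TITAN Xp"

# Inside the training code, after the detector has been built:
# print(next(model.parameters()).device)  # should be cuda:0, not cpu

Watching nvidia-smi while training runs is another quick check that GPU utilization is non-zero.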
