About the model explanation #10
I see only a small difference in the backbone: the paper uses a ViT, while this work uses a CNN.
Hey, thanks for the feedback. This work is inspired by Facebook AI's DETR (Detection Transformer), which aims to do object detection with transformers. The paper you've linked is very recent work on a similar topic, but they have not provided any implementation.
Thank you for your reply. I think I understand the structure of your work. Thank you!!
Hi @saahiluppal, I am trying to understand where the object detection part is occurring in the code, and what exact algorithm you're using.
Hey, the image is fed to a ResNet, and this backbone gives us the feature embedding along with the corresponding mask for the image. That is the versatility of the attention mechanism.
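For anyone trying to locate that pipeline in the code, here is a minimal PyTorch sketch (not the repository's actual code) of the idea: a ResNet backbone produces a feature map, which is flattened into image tokens and fed to a transformer encoder-decoder that decodes caption tokens. The class name `CaptionSketch`, the vocabulary/sequence sizes, and the layer widths are illustrative assumptions, and the image padding mask mentioned above is omitted for brevity.

```python
# Minimal sketch of a ResNet-backbone + transformer captioning model (illustrative only).
import torch
import torch.nn as nn
import torchvision

class CaptionSketch(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, max_len=128):
        super().__init__()
        # ResNet-50 without its pooling/classification head -> 2048-channel feature map
        resnet = torchvision.models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.input_proj = nn.Conv2d(2048, d_model, kernel_size=1)  # project to transformer width
        self.transformer = nn.Transformer(d_model=d_model, batch_first=True)
        self.token_embed = nn.Embedding(vocab_size, d_model)
        self.pos_embed = nn.Embedding(max_len, d_model)  # learned positions for caption tokens
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, images, captions):
        # images: (B, 3, H, W); captions: (B, T) token ids
        feats = self.input_proj(self.backbone(images))          # (B, d_model, h, w)
        memory = feats.flatten(2).permute(0, 2, 1)              # (B, h*w, d_model) image "tokens"
        pos = torch.arange(captions.size(1), device=captions.device)
        tgt = self.token_embed(captions) + self.pos_embed(pos)  # (B, T, d_model)
        # causal mask so each caption position only attends to earlier tokens
        tgt_mask = self.transformer.generate_square_subsequent_mask(
            captions.size(1)).to(captions.device)
        out = self.transformer(src=memory, tgt=tgt, tgt_mask=tgt_mask)
        return self.head(out)                                   # (B, T, vocab_size) logits

model = CaptionSketch()
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 30522, (2, 16)))
print(logits.shape)  # torch.Size([2, 16, 30522])
```

The key point is that no separate object detector is involved: the decoder's cross-attention attends directly over the flattened backbone features when predicting each caption token.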
PS: Recent research suggests that doing "Object Detection" prior to "Image Captioning" doesn't bring any additional improvement; it just increases complexity.
Hi. Would you let me know which paper you referenced? Thank you.
I read it in the ablation studies of some paper; I'm not sure which one.
Have you found which paper the structure of this code is based on? Thanks
Hi. Thank you for your impressive work.
I've read your work and want to understand your model clearly.
From #2, I know there is no paper, but I found a paper similar to your work.
Does the figure below explain your work?
Thank you!