DIFFMOD

DIFFMOD is an image captioning model. Today, most image captioning models are owned by large companies such as Instagram, Facebook and Google, and the models that are available are either monetised or do not work at all.

DIFFMOD is different: we want our model to be publicly available. Inspired by open source projects such as Stable Diffusion and Auto-GPT, we want DIFFMOD to be an open source library that revolutionises the image captioning community.

Executable File : app.py
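
Since the project uses Flask and is started from app.py, the serving side might look roughly like the sketch below. This is only an illustration, not the contents of the actual app.py: the /caption route, the "image" form field and the generate_caption helper are all hypothetical names, and the helper returns a placeholder instead of calling the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def generate_caption(image_bytes):
    # Hypothetical stand-in for the real model call; it only reports
    # how many bytes were received so the sketch runs without weights.
    return f"(caption placeholder for {len(image_bytes)} bytes)"


@app.route("/caption", methods=["POST"])  # hypothetical route name
def caption():
    # Expect a multipart upload under the (assumed) field name "image".
    upload = request.files.get("image")
    if upload is None:
        return jsonify(error="no image uploaded"), 400
    return jsonify(caption=generate_caption(upload.read()))
```

With this saved as a module, Flask's development server can be started with the flask command-line tool, or by adding an app.run() call at the bottom of the file.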

Tech Stack

Python: Keras, TensorFlow, Flask, OpenCV

Model: EfficientNet
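
As a rough illustration of how EfficientNet can serve as the image encoder in a captioning pipeline, the sketch below builds a Keras feature extractor. It rests on several assumptions that this README does not state: the B0 variant, a 224x224 input size, and flattening the spatial grid into a sequence of feature vectors for a downstream caption decoder. The build_encoder name is hypothetical, and weights=None is used only so the sketch runs without downloading anything.

```python
import numpy as np
import tensorflow as tf


def build_encoder(image_size=224):
    # Assumption: EfficientNetB0 without its classification head.
    # Pass weights="imagenet" to start from pretrained weights instead.
    base = tf.keras.applications.EfficientNetB0(
        include_top=False,
        weights=None,
        input_shape=(image_size, image_size, 3),
    )
    # Flatten the spatial feature map into a sequence of feature vectors,
    # one per grid cell, which a caption decoder could attend over.
    channels = base.output.shape[-1]
    features = tf.keras.layers.Reshape((-1, channels))(base.output)
    return tf.keras.Model(base.input, features)


encoder = build_encoder()

# Keras EfficientNet models expect float pixel values in the range [0, 255].
dummy = np.random.randint(0, 256, size=(1, 224, 224, 3)).astype("float32")
features = encoder(dummy, training=False)
print(features.shape)
```

The decoder that turns these features into text is not shown here, since the README does not describe it.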

Demo:

Future Plans

We trained the model on Flickr8k. We now plan to scale it up and train it on MS COCO, which has over 330,000 images. We also plan to deploy it online on one of our subdomains.

Authors

Avdhan
Satya
Mansi
