Let model authors specify filetypes for inputs and outputs (audio, video, image, etc) #1341

Open
zeke opened this issue Oct 18, 2023 · 2 comments

zeke commented Oct 18, 2023

The cog.Path object is used to get files in and out of models. It represents a path to a file on disk. Path is used for all files, regardless of whether they're text files, zip files, videos, images, audio files, etc.

What kind of file does the model want? 🤷🏼

When looking at the schema for a model, it's not easy to tell what type of file is expected:

$ curl -s -H "Authorization: Token $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/stability-ai/sdxl | jq ".latest_version.openapi_schema.components.schemas.Input.properties.mask"

SDXL's mask input expects an image file, but that's not clear from the schema. Unless the model author writes a description that says what kind of file is expected, users of the model can't reliably know what's expected:

{
  "type": "string",
  "title": "Mask",
  "format": "uri",
  "x-order": 3,
  "description": "Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted."
}
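To illustrate the gap, here is a minimal sketch of what a client can learn from the schema today. The `uri_inputs` helper and the `prompt` entry are hypothetical and only for illustration; the `mask` entry is copied from the output above. The schema tells you that an input is a file (a string with `"format": "uri"`), but nothing about what kind of file.

```python
from typing import Any, Dict, List


def uri_inputs(properties: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the names of inputs that the schema marks as URI strings (files)."""
    return sorted(
        name
        for name, spec in properties.items()
        if spec.get("type") == "string" and spec.get("format") == "uri"
    )


# "mask" is taken from the schema above; "prompt" is an illustrative plain-string input.
properties = {
    "mask": {
        "type": "string",
        "title": "Mask",
        "format": "uri",
        "x-order": 3,
        "description": "Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted.",
    },
    "prompt": {"type": "string", "title": "Prompt", "x-order": 0},
}

file_like = uri_inputs(properties)
# Nothing in the "mask" entry says image, audio, or video; only the description hints at it.
```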

Being explicit about file types

What if, instead of defining the mask in the predictor as a Path, it could be an ImagePath, which would really just be a Path under the hood with some extra constraints?

# ImagePath is the type proposed in this issue; it is hypothetical.
from cog import BasePredictor, Input, ImagePath

class Predictor(BasePredictor):
    def predict(
        self,
        mask: ImagePath = Input(
            description="Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted.",
            default=None,
        ),
    ):
        ...  # prediction logic omitted
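As a rough sketch of what "a Path under the hood with some extra constraints" could mean, the following uses only the Python standard library and does not reflect how Cog is implemented. The `ImagePath` class, its `mime_prefix` attribute, and its `validate` method are all assumptions made for illustration; the file type is guessed from the filename extension, which is a weak check compared to inspecting file contents.

```python
import mimetypes
from pathlib import PurePosixPath


class ImagePath(PurePosixPath):
    """Hypothetical path type that only accepts filenames guessed to be image/*."""

    # Assumed attribute: the MIME type prefix this path type accepts.
    mime_prefix = "image/"

    @classmethod
    def validate(cls, value: str) -> "ImagePath":
        # guess_type looks only at the filename extension, not the file contents.
        guessed, _ = mimetypes.guess_type(value)
        if guessed is None or not guessed.startswith(cls.mime_prefix):
            raise ValueError(
                f"expected a {cls.mime_prefix}* file, got {guessed!r} for {value!r}"
            )
        return cls(value)


mask = ImagePath.validate("mask.png")  # accepted: guessed as image/png

try:
    ImagePath.validate("notes.txt")  # rejected: guessed as text/plain
    rejected = False
except ValueError:
    rejected = True
```

An `AudioPath` or `VideoPath` could follow the same pattern by overriding `mime_prefix`, which would keep the constraint declarative enough to surface in the schema.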

This may be a naive suggestion about how to approach making input and output types more apparent to model consumers, but I'm open to other ideas that address the issue.

Related issues:

@zeke zeke changed the title Introducing more specific input and output types for images, audio, video, etc Let model authors specify filetypes for inputs and outputs (audio, video, image, etc) Oct 18, 2023

zeke commented Nov 16, 2023

Maybe it could be a property of the existing Path, like a list of mimetypes or something.
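A minimal sketch of that alternative, assuming the accepted types are published in the schema under an extension key next to `x-order`. The `x-mimetypes` key and the `file_input_schema` and `accepts` helpers are hypothetical names, not part of Cog or the Replicate API.

```python
import fnmatch
import mimetypes
from typing import Any, Dict, List


def file_input_schema(
    title: str, description: str, order: int, accepted: List[str]
) -> Dict[str, Any]:
    """Build a schema entry for a file input, with a hypothetical x-mimetypes list."""
    return {
        "type": "string",
        "title": title,
        "format": "uri",
        "x-order": order,
        "description": description,
        "x-mimetypes": accepted,  # hypothetical extension key, e.g. ["image/*"]
    }


def accepts(schema: Dict[str, Any], filename: str) -> bool:
    """Check a filename against the schema's accepted MIME type patterns."""
    patterns = schema.get("x-mimetypes")
    if patterns is None:
        return True  # no constraint declared, so any file is allowed
    guessed, _ = mimetypes.guess_type(filename)  # extension-based guess only
    if guessed is None:
        return False
    return any(fnmatch.fnmatchcase(guessed, pattern) for pattern in patterns)


mask_schema = file_input_schema(
    "Mask",
    "Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted.",
    3,
    ["image/*"],
)
```

With something like this, a client could call `accepts(mask_schema, "mask.png")` before uploading, and existing models that declare no list would keep working unchanged.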


zeke commented Oct 23, 2024

Related: #2014
