Let model authors specify filetypes for inputs and outputs (audio, video, image, etc) #1341

zeke · 2023-10-18T22:30:28Z

The cog.Path object is used to get files in and out of models. It represents a path to a file on disk. Path is used for all files, regardless of whether they're text files, zip files, videos, images, audio files, etc.

What kind of file does the model want? 🤷🏼

When looking at the schema for a model, it's not easy to tell what type of file is expected:

$ curl -s -H "Authorization: Token $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/stability-ai/sdxl | jq ".latest_version.openapi_schema.components.schemas.Input.properties.mask"

SDXL's mask input expects an image file, but the schema doesn't say so. It only reports a string in uri format, so unless the model author spells out the file type in the description, users of the model have no reliable way to know what to send:

{
  "type": "string",
  "title": "Mask",
  "format": "uri",
  "x-order": 3,
  "description": "Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted."
}
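To make the gap concrete, here is a minimal sketch of what a client can and can't learn from that schema fragment. The helper names are made up for illustration. `contentMediaType` is a standard JSON Schema keyword, but it is only an assumption that a file type hint would live there; the schema above doesn't emit it.

```python
# Schema fragment copied from the API response above.
mask_schema = {
    "type": "string",
    "title": "Mask",
    "format": "uri",
    "x-order": 3,
    "description": "Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted.",
}


def looks_like_file_input(prop: dict) -> bool:
    """Hypothetical helper: a uri-formatted string is all a client sees for a file."""
    return prop.get("type") == "string" and prop.get("format") == "uri"


def declared_media_type(prop: dict):
    """Hypothetical helper: return a declared media type, or None if absent.

    Assumes a hint would use the JSON Schema `contentMediaType` keyword.
    """
    return prop.get("contentMediaType")


print(looks_like_file_input(mask_schema))
print(declared_media_type(mask_schema))
```

The first call reports that the input is probably a file. The second finds nothing, because no machine-readable file type is present.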

Being explicit about file types

What if, instead of defining the mask in the predictor as a Path, it could be an ImagePath, which would really just be a Path under the hood with some extra constraints?

# ImagePath is the proposed type; it does not exist in cog today.
from cog import BasePredictor, Input, ImagePath

class Predictor(BasePredictor):
    def predict(
        self,
        mask: ImagePath = Input(
            description="Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted.",
            default=None,
        ),
    ) -> None:
        ...  # model logic omitted

This may be a naive suggestion about how to approach making input and output types more apparent to model consumers, but I'm open to other ideas that address the issue.
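As a rough sketch of the "Path under the hood with some extra constraints" idea, the snippet below defines a stand-in `ImagePath` using only the standard library. It is not Cog's implementation. The class, its `validate` method, and the choice to guess the type from the filename extension are all assumptions for illustration.

```python
import mimetypes
from pathlib import PurePosixPath


class ImagePath(PurePosixPath):
    """Hypothetical: a path that is only valid if it looks like an image."""

    # Prefix the guessed MIME type must start with (assumption for this sketch).
    media_prefix = "image/"

    @classmethod
    def validate(cls, value: str) -> "ImagePath":
        path = cls(value)
        # Guess from the extension only; the file contents are not inspected.
        guessed, _ = mimetypes.guess_type(path.name)
        if guessed is None or not guessed.startswith(cls.media_prefix):
            raise ValueError(f"{value!r} does not look like an image (guessed {guessed!r})")
        return path


print(ImagePath.validate("mask.png"))
```

Because the constraint lives on the type, the same information could in principle be written into the generated schema, which is what would make it visible to model consumers.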

Related issues:

Design: file handling #496

zeke · 2023-11-16T17:37:33Z

Maybe it could be a property of the existing Path, like a list of mimetypes or something.
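A minimal sketch of that alternative, assuming the constraint is a list of MIME type patterns attached to a file input. `FileConstraint`, its methods, and the `x-mimetypes` schema key are hypothetical names, not part of Cog.

```python
import mimetypes
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class FileConstraint:
    """Hypothetical: allowed MIME type patterns for a file input."""

    allowed: list = field(default_factory=list)

    def accepts(self, filename: str) -> bool:
        if not self.allowed:
            return True  # no constraint declared, so behave like a plain Path
        guessed, _ = mimetypes.guess_type(filename)
        # Patterns may use wildcards, e.g. "image/*".
        return guessed is not None and any(fnmatch(guessed, p) for p in self.allowed)

    def to_schema(self) -> dict:
        schema = {"type": "string", "format": "uri"}
        if self.allowed:
            schema["x-mimetypes"] = list(self.allowed)  # hypothetical extension key
        return schema


mask = FileConstraint(allowed=["image/*"])
print(mask.accepts("mask.png"))
print(mask.to_schema())
```

Compared with a dedicated type per media kind, a list keeps a single Path type and lets an input accept several kinds of file at once.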

zeke · 2024-10-23T15:26:03Z

Related: #2014

zeke changed the title ~~Introducing more specific input and output types for images, audio, video, etc~~ Let model authors specify filetypes for inputs and outputs (audio, video, image, etc) Oct 18, 2023

zeke mentioned this issue Oct 23, 2024

Add an image mask file type #2014

Draft