Ideas for datasets to use? (WIT just came out) #110

@afiaka87

Hey all,

I'm compiling a list of the various datasets we'll need and how to download them:

Keep in mind that not all of these datasets ship with captions. However, many of them do ship with a class descriptor of some type. I've only done mild testing with this, but you can usually generate labels from a template such as "an image of {class_name}". I'm not sure what the best way to go about that would be, though.
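For anyone who wants to try the template idea, here is a rough sketch of what it could look like. The function name `caption_from_class` and the label cleanup (replacing underscores and hyphens, lowercasing) are my own assumptions for illustration and not something any of these datasets prescribe; only the "an image of {class_name}" template comes from the post above.

```python
def caption_from_class(class_name: str,
                       template: str = "an image of {class_name}") -> str:
    """Build a synthetic caption from a class label (hypothetical helper)."""
    # Assumption: labels may use underscores or hyphens as word separators,
    # so turn them into spaces and collapse any repeated whitespace.
    words = class_name.replace("_", " ").replace("-", " ").split()
    readable = " ".join(words).lower()
    return template.format(class_name=readable)


# Example labels are made up for illustration.
labels = ["golden_retriever", "Fire-Truck", "tabby cat"]
captions = [caption_from_class(label) for label in labels]
```

Keeping the template as a parameter makes it easy to try other phrasings (for example "a photo of {class_name}") without touching the cleanup logic.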

#109

As it stands, this is turning out to be humongous. I just added the new Wikipedia dataset (11 million images).

Does anyone know of other captioned datasets we could use?
