Looking forward to when/if there is a fully openly-trained model some day.
There's one in the works called Public Diffusion. Much of the data is taken from Wikimedia Commons, so it does have some problematic images dotted about, whether due to quirks of the site (cosplay of copyrighted characters is allowed for some reason), differences in copyright terms across countries, or just blatant copyright infringement that didn't get caught. They also use a scrape-trained language model for captioning and another for interpreting the captions, which may or may not matter copyright-wise.
(Another project, Elan Mitsua, is stricter on both counts, but the terms of use are likely too strict for OGA; aside from public-domain images, it's also trained on works submitted specifically for training. It's also not quite there in terms of quality for those who want to generate ready-to-use assets, but it can serve as inspiration, if nothing else.)
it's nice art in general, but the lines are often not perfectly 2:1 diagonals, which makes them look bumpy and uneven (commonly known as jaggies)
here's my attempt at cleaning it up (i hereby place my modifications under cc0). i'm not sure what to do about the inner walls of the ceiling though (could just drop the lower lines down a pixel, but i'm not super sure about that)
mannequin vibes, i like it!
it really whips the llama's ass
here's a text-to-speech engine with a bunch of voices under various cc licenses, some foss-compatible: https://github.com/rhasspy/piper
(though most of them were trained by starting from the model for "lessac", which has a restrictive research license; idk how much that matters though)
and here's an asset pack made with it: https://rancidbacon.itch.io/dialogue-tool-for-larynx-text-to-speech
another format i've seen in use is a single grayscale image, where each pixel's brightness determines at what point in the transition that pixel switches to the second image
it's natively supported in ren'py and it seems to be used somewhat widely outside of it too (godot tutorial, game maker shader, rpg maker plugin, compatible generator)
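just to sketch how that format works (this is my own illustration, not code from any of those engines, and the names are made up): each pixel's brightness, normalized to 0..1, acts as a per-pixel threshold compared against the overall transition progress

```python
import numpy as np

def rule_transition(old, new, control, t):
    """Blend `old` into `new` using a grayscale control image.

    old, new: (H, W, 3) uint8 arrays of equal size
    control:  (H, W) uint8 array; darker pixels switch to `new` earlier
    t:        transition progress in [0, 1]
    """
    # A pixel shows the new image once the progress has passed
    # that pixel's control brightness (normalized to 0..1).
    mask = (control.astype(np.float32) / 255.0) < t
    # Broadcast the (H, W) mask over the color channels.
    out = np.where(mask[..., None], new, old)
    return out.astype(np.uint8)
```

calling this repeatedly with t going from 0 to 1 produces the wipe; engines like ren'py also add a soft ramp around the threshold instead of a hard cutoff, but the hard-cutoff version above is the core idea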