cloned voices as open source assets

Tuesday, July 25, 2023 - 05:54

Would it be possible to share cloned voices as assets on this site, or is there any incompatibility with the rules that I am not aware of?

Just wondering, because voices to use in games are one of the things that are currently possible to make but hard to find. Since finding voice actors without a budget is quite difficult, I was wondering if it would be a good idea to render them with machine learning for use in open source projects...

Umplix

joined 3 years 6 months ago

Tuesday, July 25, 2023 - 06:37

Well, I think it depends on what program you used. I believe OGA would only allow AI generated art/music/voices if both the following are true:

1. The company permits commercial use.

2. The AI was trained on a public domain dataset.

There are some out there, e.g. Replica Studios (an AI voice generation app) uses a dataset licensed CC-BY 4.0, and I believe you can upload your own voice samples to it. They do have a free trial if you want to try it out.

___________________________________________________________________

No mind to think;

No will to break;

No voice to cry suffering.

Ragnar Random

joined 4 years 1 month ago

Tuesday, July 25, 2023 - 06:37

without the consent of the recording artist (the voice that is being cloned) and without the recorded work (the actual audio files being used to clone the voice) being open content, i would say this falls under the same peril as other AI art. using a dataset that consists of copyrighted material is stealing. thats my opinion, courts haven't really decided yet how this will be treated.

for now, i would think that submitting voice clones made using a non-open-content dataset would be cause for a "files temporarily unavailable" flag on the asset.

with that said, creating an open content set of voice clones would be a great idea. it would be difficult but not impossible to find voiceover recording artists to sign off on their voices being cloned, because it would be a one-time-pay-me-now-then-i-am-obsolete type of deal.

do you know much about voice cloning ai, and how to train datasets? if so, i would encourage you to start a project looking for volunteer voiceover actors to record audio for the dataset. i imagine with some work you could create something that would be a worthwhile asset to indie devs and open source projects.

i tried doing some voice cloning on some of the cc0 voiceover assets submitted by Kenney, but with very poor results. you need quite a lot of recorded audio to get a good dataset.
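A rough way to check how much recorded audio a folder of clips actually holds before trying to train on it is to add up the clip lengths. The following is only a minimal sketch, assuming the clips are uncompressed PCM WAV files; the function names are made up for illustration, and how many minutes count as "enough" depends on the cloning tool, so no threshold is hard-coded.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path: Path) -> float:
    """Return the length of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        # duration = number of audio frames / frames per second
        return wav_file.getnframes() / wav_file.getframerate()


def total_duration_seconds(folder: Path) -> float:
    """Sum the durations of every .wav file found under folder (recursively)."""
    return sum(wav_duration_seconds(p) for p in sorted(folder.rglob("*.wav")))
```

Calling `total_duration_seconds(Path("my_voice_clips")) / 60` (the folder name is a placeholder) gives the total in minutes. Compressed formats such as OGG or MP3 would need a different reader than the standard library `wave` module.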

i don't have a good microphone at the moment, but i could get one in the future. if we can come up with a standard script that would be good for training a dataset, and get voice actors to read that script, this could become a thing. i will do some research on voice cloning datasets, and see if we can come up with a script. then all we would need would be voice actors willing to participate under the terms that their voiceovers would be part of an open content dataset. i would be interested in participating in this project, but i would be an advocate for CC0 licensing and not -BY or -SA. the fewer restrictions there are on content the better it is for the world, in my opinion.
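One way the "terms" part could be tracked is a small manifest that records, per clip, who spoke it, which license it was released under, and whether the speaker agreed to cloning. This is only a sketch under assumptions: the `Clip` fields and the `eligible_clips` helper are hypothetical and not an existing OGA or dataset format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    path: str               # audio file, relative to the dataset root
    speaker: str            # voice actor who recorded the clip
    license: str            # license the recording was released under, e.g. "CC0"
    consent_to_clone: bool  # True only if the speaker explicitly agreed to voice cloning


def eligible_clips(clips, allowed_licenses=frozenset({"CC0"})):
    """Keep only clips whose speaker consented and whose license is allowed."""
    return [
        clip
        for clip in clips
        if clip.consent_to_clone and clip.license in allowed_licenses
    ]
```

Defaulting `allowed_licenses` to CC0 only matches the preference stated above; a project that accepts -BY or -SA recordings would pass a wider set.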

glitchart

joined 3 years 2 months ago

Tuesday, July 25, 2023 - 06:51

I've never done machine learning training myself; I've only used pre-trained weights to test the technology, but I think I could do it. I think the problem of the A.I. being trained on non-open datasets goes away if you only use a voice from someone who permits his / her voice to be cloned; correct me if I'm wrong. As Ragnar Random mentioned above, the challenges in building a collection of good quality cloned voices would be getting a large, high quality collection of audio and finding people willing to let their voice be "open sourced".

Other issues come to mind, such as someone misusing those cloned voices even without breaking the license. Nevertheless, I think a set of cloned voices has a lot of potential for creative projects, and it could eventually enhance their quality if used well...

Ragnar Random

joined 4 years 1 month ago

Tuesday, July 25, 2023 - 07:23

yeah the issue is the dataset for the most part

the algorithm that uses the dataset has a license of course, but that algorithm is software. if you draw a picture in Photoshop, the art produced does not inherit the Photoshop license.

if the algorithm is run on an online webhost, then the person hosting the algorithm can have their own Terms of Use that could conflict with licensing the generated works as open content even if you train a dataset using open content on their platform.

would be nice if we could get someone in the OGA community with good knowledge of python to team up, most of the ai algorithms i am familiar with use python....

glitchart

joined 3 years 2 months ago

Tuesday, July 25, 2023 - 21:42

Ragnar Random: scripts make life easier, especially for a technology like this that most of the time doesn't even have a GUI, so if you create one that works please let me know, I'm very interested.

eugeneloza

joined 10 years 8 months ago

Wednesday, July 26, 2023 - 00:32

> using a dataset that consists of copyrighted material is stealing. thats my opinion, courts haven't really decided yet how this will be treated.

Well, when using databases of copyrighted material benefited Google and other big search engines, who used it for commercial purposes, it was not stealing but fair use. Now that the threat arises that it may benefit individuals, courts will decide that it's stealing and will impose limitations that only large corporations will be able to comply with (because suddenly Microsoft is leading the parade of C2PA ;) and the mass hysteria in social media may just be a proper marketing effort).

IMHO it's OK to train a model on whatever material, as long as the output doesn't contain anything from that material in a recognizable form. The problem with current AI datasets is that when you ask for a "cute little monster" (we've talked about that in a different topic :D) you can easily get something that looks close enough to some Pokémon, which will get you into trouble with Nintendo, and as such it's better to avoid using such datasets for this specific reason. My idea of using AI generated pictures as portraits in a JRPG quickly cooled down when I got Nick from Zootopia as one of the results :D

But let's not dwell on that :) Too much has already been said and nobody was asking any of us.

> voice actors willing to participate under the terms

If you do get to some good result with that, ping me. I have a more or less good mic and some experience in voice acting. Not sure if I'll manage to pull out enough dedication to finish the job (I know I'm quick to promise but often fail in the end, so no promises), but let's try.

Plus if it works I can do a dozen of different voices.

> come up with a script

I'm afraid there can be some problems with actual in-game voice. Reading a monotonous text is one thing. Having a computer game character react emotionally to in-game events is a different thing. I'd suggest checking out the experience of https://github.com/DanRuta/xVA-Synth - they've created an in-app editor for tempo and pitch (which allows controlling emotion). Not sure how easy this generator is to use, though.

They also have a dataset at least from Bethesda's games. So, using a dataset from some of the opensource games (like FreeDroid RPG, Valyria Tear or Dink Smallwood) can be a good (tested with time) option.

I didn't study the issue too deeply, but the xVASynth license looks like GPLv3, so we could even use this tool, just to generate a copyright-clean database.

EDIT: I've just realized that training on LibriVox recordings may be a good start. It won't give any expressiveness, but the public domain recordings are already there, together with the public domain texts of the works. It just needs someone to listen to them and prepare them for processing. This way the model can already have literally dozens to hundreds of voice actors ready. Still, voiced reading (or even real-time synthesis) of in-game text may be beneficial as a game asset.

glitchart

joined 3 years 2 months ago

Wednesday, July 26, 2023 - 02:27

eugeneloza, in order to get good results, first I need a good collection of audio to train the A.I.

One thing I noticed is that you can make a voice speak any language, but the cloned voice makes the same pronunciation mistakes as a non-native speaker who hasn't mastered the language: bad pronunciation of vowels and consonants due to missing examples in the given language. That's why, in my opinion, it would be important to have voices in several languages.

I've never trained an A.I. to clone voices, but it would be an interesting (and eventually useful) experiment.

I've already tested and experimented with cloned voices and they sound quite good, but I'm not sure how much training and data were needed to create them.

glitchart

joined 3 years 2 months ago

Monday, May 20, 2024 - 00:56

I have another question related to this topic, so I decided not to open a new thread: what about cloned voices for singing / adding vocals, if they are free? I noticed that many voices from https://vsinger.com/ and https://vocadb.net/, which are used in well-known software such as Vocaloid and similar apps to sing melodies over tracks, have a free license. Does anyone know what that free license means? Is it possible to release tracks with them and license them as public domain?

Here is another piece of software that uses A.I. and makes use of some of those cloned vocals, along with info about the license: https://support.acestudio.ai/article/23-introduction-of-ace-studio-licen...

As can be seen on that site, many virtual singers are offered as "license free". I would be interested to know what the legal situation is when releasing tracks using those vocals. Could they be public domain too?

MedicineStorm

joined 12 years 8 months ago

Monday, May 20, 2024 - 09:13

Ace Studio: The "License Free" license says "use them for commercial or non-commercial purposes for free" but that is not the same as PD or CC0. Often such licenses have extra conditions like "...but you may not redistribute them as-is nor resell them". Reasonable terms, and you could use such assets in your own project, but it would make them ineligible for hosting on OGA as we are technically a stock asset hosting service and all licenses on OGA permit resale, et cetera. I don't actually know if those extra conditions are present because they don't list the actual license text. It seems you must "apply" for a copy of the full license text by filling out personal details, which I am unwilling to do. I'd love to take a look at the actual license text if anyone else is willing to apply for it and share it here.
However! The license granted by Ace Studio is not the same thing as a license (if any) granted by the owners of the training data. As is common in AI these days, AI trainers will scrape publicly available assets without obtaining permission for their use. "Publicly available" is not the same as "Public Domain". e.g. images from Google Image Search are publicly available, but 90% are copyrighted and non-free. As eugeneloza mentioned, this may be considered Fair-Use.... Buuuuut 1) Fair-Use is not Public Domain and it comes with caveats on how it can be used, and 2) This Fair-Use defense is an assumption being generally made by AI trainers. Everyone is just assuming the courts will conclude it's ok to not ask permission from the owners of the training data. OGA can make no such assumptions.
Voca DB: I can't seem to find any clear indications on the terms of use nor any information about their training data. That doesn't mean it isn't there, I just didn't find it. If you see what I'm missing, by all means direct me to the details. However, in the absence of that information, we must assume the terms are "non-free" despite blurbs or license deeds simply saying "it's free!". As with Ace Studio, we can't trust statements of freedom without seeing the full license text.
Vsinger: I wasn't able to find any terms of use at all. In fact, the page scared my malware protection system and halted the site from fully loading. Not a great endorsement of trust to start with, but let me know if anyone else has better luck locating the details of the licensing and training data origins.

These are assessments from the perspective of OGA policy and do not necessarily mean individual users would be unable to legally use such assets in their projects. What OGA is allowed to do is not the same as what you are allowed to do. That being said, until we have more details on those licenses and training dataset origins, the answer to this question:

"Would it be possible to share cloned voices as assets in this site...?"

... is "no", unfortunately.

--Medicine Storm

glitchart

joined 3 years 2 months ago

Monday, May 20, 2024 - 20:18

Thanks for the clarification MedicineStorm, I agree with you: if the license is not publicly available and you have to request it privately, and the sites do not state PD or CC0 for their tools and cloned voices, they should not qualify for OGA.

I would not publish any work using them, here or anywhere else, without a clear, publicly available license that clearly states what I can do with the software and voices.

I'm simply curious about and interested in the potential of those technologies, and since I don't have much idea of what those self-advertised "free license" statements mean, I wanted more feedback, which you kindly provided.

I still believe voices could be trained and provided in a way that is completely compatible with FOSS and public domain licenses, and that the results of such new technologies could make a great impact for the better on those creations, but the ones I asked about are definitely not created and provided in such a way.

Even if such trained data and tools could be made completely compatible with CC0 licenses, it's completely ok not to want A.I. generated content on a site. I was genuinely interested in feedback regarding those questions and I'm grateful for your response.

hecko

joined 2 years 8 months ago

Wednesday, May 22, 2024 - 00:14

here's a text-to-speech engine with a bunch of voices under various cc licenses, some foss-compatible: https://github.com/rhasspy/piper
(though most of them were trained by starting from the model for "lessac", which has a restrictive research license; idk how much that matters though)
and here's an asset pack made with it: https://rancidbacon.itch.io/dialogue-tool-for-larynx-text-to-speech

glitchart

joined 3 years 2 months ago

Wednesday, May 22, 2024 - 01:55

Hey, thanks for sharing hecko, I find it very interesting. The voices in the demo sound almost real to me, if it were not for some slight robotic sounds, and the intonation also needs some improvement, but I find it totally usable and some of those problems could at least be masked with some post processing effects.

I'm thrilled by how those technologies have improved and how they can be used to improve creations, despite all the dangers and potential misuses they will also carry.
