Thank you medicineStorm, I'm glad you like it and that I chose the right license this time :)
Sorry, I'm still learning about licenses. I changed it to CC-BY 4.0; is it correct now?
Hey, thanks for sharing, hecko. I find it very interesting. The voices in the demo sound almost real to me, apart from some slight robotic artifacts, and the intonation also needs some improvement. Still, I find it totally usable, and some of those problems could at least be masked with post-processing effects.
I'm thrilled by how much these technologies have improved and how they can be used to enhance creations, despite all the dangers and potential misuses they also carry.
I'm really pleased to know you liked the track and that it fits in your project. Thanks for the feedback!
This mix would not be possible without the original tracks created by: https://opengameart.org/users/centurionofwar
Thanks for the clarification, medicine storm. I agree with you: if the license is not publicly available and has to be requested privately, and the sites do not state PD or CC0 for their tools and cloned voices, they should not qualify for OGA.
I would not publish any work here or anywhere else using them without a clear, publicly available license that states what I am allowed to do with the software and voices.
I'm simply curious about the potential of these technologies, and since I don't have much idea what those self-advertised "free license" statements mean, I wanted more feedback, which you kindly provided.
I still believe voices could be trained and provided in a way that is completely compatible with FOSS and public domain licenses, and that the results of such new technologies could have a great positive impact on those creations. But the ones I asked about are definitely not created and provided that way.
Even if such trained data and tools could be made completely compatible with CC0, it's completely OK not to want A.I.-generated content on a site. I was genuinely interested in feedback on those questions, and I'm grateful for your response.
I have another question related to this topic, so I decided not to open a new thread: what about cloned voices for singing / adding vocals, if they are free? I noticed that many voices from https://vsinger.com/ and https://vocadb.net/, which are used in well-known software such as Vocaloid and similar apps to sing melodies along with the tracks, have a free license. Does anyone know what that free license means? Is it possible to release tracks made with them and license them as public domain?
Here is another program that uses A.I. and some of those cloned vocals, along with information about its license: https://support.acestudio.ai/article/23-introduction-of-ace-studio-licen...
As can be seen on that site, many virtual singers are offered as "license free". I would like to know what the legal situation is when releasing tracks that use those vocals. Could they be public domain too?
Hi, good idea. I think it is absolutely possible to achieve what you intend.
Here is some advice based on my own experience; hopefully you find some of it useful:
Training a network to achieve your goals is a horrendous amount of work and energy, probably a lot more work than drawing all the assets yourself.
A.I. is still unable to follow direction and to achieve coherence. The only way I know to get coherent A.I. art is dual conditioning: prompts + base images. Another thing that helps when the image is too complex (like the smoking dragon you mentioned in your thread) is to split it into meaningful objects: you can generate the dragon and the cigarette or smoke in separate steps and, if you get the perspective right, composite them manually.
There are enough trained networks for most art styles, so it is a lot more practical to find and use the closest one than to train one yourself, unless you really need very specific outputs.
In my experience, conditioning the A.I. at such low resolutions (16x16 or 32x32) results in garbage most of the time. I would suggest creating high-resolution base images and then finding the right way to reduce the size.
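For the size reduction step, here is a minimal sketch of one possible approach, not the only one and not something I have tuned for any particular art style. It assumes the image is already loaded as a 2D list of colour tuples and that its size is an exact multiple of the shrink factor. The function name downscale_by_mode is made up for this example. Each block of pixels becomes one output pixel that takes the most common colour in the block, so no new blended colours are introduced.

```python
from collections import Counter


def downscale_by_mode(pixels, factor):
    """Shrink a 2D grid of colours by an integer factor.

    Each factor x factor block of the source becomes one output pixel,
    coloured with the most common colour found in that block.
    """
    height = len(pixels)
    width = len(pixels[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of factor")

    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Collect every colour inside the current block.
            block = [
                pixels[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            # Keep the colour that appears most often in the block.
            row.append(Counter(block).most_common(1)[0][0])
        result.append(row)
    return result


# Tiny 4x4 example reduced to 2x2.
RED = (255, 0, 0)
BLUE = (0, 0, 255)
image = [
    [RED, RED, BLUE, BLUE],
    [RED, BLUE, BLUE, BLUE],
    [BLUE, BLUE, RED, RED],
    [BLUE, BLUE, RED, RED],
]
small = downscale_by_mode(image, 2)
```

With a real image you would first read the pixels with an image library and pick the factor so that, for example, a 512x512 base image ends up at 32x32 (factor 16). Averaging the block instead of taking the most common colour is another option, but it tends to produce in-between colours that you would then have to map back to your palette.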
So, in summary, the only way to save time instead of spending more time (and energy) on getting the results you want is to find the closest set of weights that fits your needs, to use dual conditioning (prompts + base images), and to use high-resolution, high-quality base images for the conditioning.
I hope some of this feedback helps. Good luck with your project, and I hope you achieve great results.
Beautiful work! I used it in a chess fork, which can be seen in this repository: https://codeberg.org/glitchapp/love-chessboard
Thanks for the great attention to detail!
Here you are:
Code: https://ufile.io/4fwlpdii
Just to clarify, I don't know how Drupal modules are created, and this is just a simple implementation in pure JavaScript / HTML. But it works, and even if you don't create a Drupal module you can inject it into the code (although that's harder to maintain afterwards).
I have no GitHub account, and since this is not a Drupal module (which I think would be the best way to add such a feature), I'm sharing it here in the hope that you find it useful and / or inspiring!
I forgot to give some context about the main JavaScript library used for the player. Many examples and documentation can be found here: https://wavesurfer.xyz/examples/?zoom.js
The code is open source and has a BSD license: https://github.com/katspaugh/wavesurfer.js?tab=BSD-3-Clause-1-ov-file#re...
I'm not sure whether there is any incompatibility with that license, as I had never heard of it before.