Using AI to tag videos
-
I have accumulated about 20TB of gay movies and clips, which I try to tag in a database (Videonizer) with attributes like "big dick", "hanging balls", "nice asshole". I can then search for any combination of attributes I want to see on that occasion. However, I have only had time to thoroughly scan about 20% of my stash. I wondered if I could use AI to automate this process by "training" the AI on some still images of attractive attributes and then letting it loose on my whole database. Has anyone had any experience with this, or can anyone offer any tips?
-
@underchubs Have you not tried using scrapers to grab that sort of data from online sources?
-
Realistically, no. AI models excel at sorting stuff, but the amount of data required to train them is huge. To train a model for your particular needs, you would need to sort and tag all of your collection manually first.
Currently there is no pretrained AI model capable of spotting "big dick, hanging balls and nice assholes", and the amount of data required to train one would be at least as much as what you have unsorted. If your collection were already sorted, you could use it to train a model and keep it for posterity, but with an unsorted collection there's nothing you can do.
If you try to train a model on the small portion you have sorted, it won't be enough to make it reliable: it would missort and mistag a lot of content, which would be much harder to fix later. This happens because models tend to overgeneralize when they are undertrained or trained on too little data, especially when you are training them for highly specific and subjective traits. That alone would sink the idea, but here's an even better reason:
Training AI models is a tedious experience. Take the "nice ass" tag, for example: if you feed your model ONLY nice asses, it will see absolutely any ass as nice. So the data you feed your model must be properly labelled with the traits you're training it for, but it must also be labelled in equal detail with "not big dick" or "not nice ass", because this is how AI models learn. So even if you wanted to train a model on the little content you have already sorted, you would need to sort that content further to add counterexamples.
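To make the counterexample point concrete, here is a minimal sketch of how a training set for a single tag could be assembled. It assumes a hypothetical export where every clip maps each tag to an explicit True or False, and a missing tag means "never reviewed". The names `build_binary_dataset`, `labels` and `tag_a` are made up for illustration and are not part of Videonizer.

```python
import random

def build_binary_dataset(labels, tag, seed=0):
    """Return shuffled (item, label) pairs for one tag, balanced between classes.

    labels: dict mapping item name -> dict of tag -> bool (hypothetical format).
    Items where the tag is missing are skipped: "not reviewed" is not the
    same thing as an explicit "not <tag>" counterexample.
    """
    positives = [item for item, tags in labels.items() if tags.get(tag) is True]
    negatives = [item for item, tags in labels.items() if tags.get(tag) is False]

    # Downsample the larger class so the model sees as many "no" as "yes".
    n = min(len(positives), len(negatives))
    rng = random.Random(seed)
    dataset = [(item, 1) for item in rng.sample(positives, n)]
    dataset += [(item, 0) for item in rng.sample(negatives, n)]
    rng.shuffle(dataset)
    return dataset

# Hypothetical hand-made labels, for illustration only.
labels = {
    "clip_001.mp4": {"tag_a": True},
    "clip_002.mp4": {"tag_a": True},
    "clip_003.mp4": {"tag_a": False},
    "clip_004.mp4": {},                      # never reviewed for tag_a
    "clip_005.mp4": {"tag_a": True, "tag_b": False},
}

dataset = build_binary_dataset(labels, "tag_a")
print(len(dataset))  # 2: one positive and the single explicit negative
```

With three positives and only one explicit negative, the usable balanced set shrinks to two items, which is exactly why the missing "not X" labels are the bottleneck.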
Not only that, but we are talking primarily about videos. There is modern material filmed in 4K with great lighting, and there are crappy Flash videos from the early 2000s or even older, and that variation makes the task painfully harder.
It does help that you're tagging based on physical attributes rather than the action depicted; that is easier to spot and harder to get wrong than action recognition, but it still requires a lot of presorted data.
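If you want to see for yourself how much a model trained on the sorted portion would mistag, one option is to keep some hand-tagged clips out of training and compare the model's guesses against them. The sketch below only does the counting; `truth` and `predicted` are made-up example values, not real results, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(truth, predicted):
    """Compare hand-made labels with model guesses for one tag (1 = has tag)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)        # tagged correctly
    fp = sum(1 for t, p in pairs if not t and p)    # tagged, but shouldn't be
    fn = sum(1 for t, p in pairs if t and not p)    # missed
    precision = tp / (tp + fp) if tp + fp else 0.0  # how many applied tags are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # how many true tags were found
    return precision, recall

# Made-up held-out labels and model guesses, for illustration only.
truth     = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 1, 0, 0, 1, 1]

precision, recall = precision_recall(truth, predicted)
print(precision, recall)  # 0.6 0.75
```

A precision of 0.6 would mean four in ten of the tags the model applied are wrong, which across a 20TB collection is a lot of cleanup.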
-
@ianfontinell-0
Very helpful - thanks for such a reasoned and detailed response. Guess I'll just have to sit down and grind through my database manually. Maybe one day I'll be able to monetize my beautifully tagged database! (dream on...... )