Animate Anyone 2: High-Fidelity Character Image Animation
(humanaigc.github.io) 164 points by ToJans 5 days ago
I think we have already crossed that threshold. There are videos today that you'd have a hard time telling if they are AI generated. Not always, but sometimes.
We haven't been able to blindly trust images and videos in court for decades; image and video manipulation is almost as old as photography and cinema.
And paper documents are still being used in court today despite being trivial to counterfeit.
Why? Because courts never trust documents blindly: the defendant can always object that a document is fake and question its origin. If the court deems the concerns legitimate, the document will be rejected (and an investigation will follow, with the producer of the fake document charged heavily).
It depends on the court. Typically evidence is shared with the opposing parties before the court assembles. If there is damning evidence like video of you dancing with the Joker, your side will have an opportunity to preview the video, examine it for signs of deepfaking, and either try to exclude it or let it slide so that your experts can expose it as a fake at trial, which could be a strong point in your favor.
From my experience juries are not smart, but if you can show them they've been lied to they will destroy the side that lied to them, not to mention the punishments for lawyers that try to use AI technology to obtain verdicts in their favor by deception.
This also vastly overestimates the intelligence of the average juror.
This sounds insulting but it is intended to be a frank statement.
In 4 years of working cases I would estimate 1 in 6 jurors are above the 85-115 IQ range of average intelligence, and maybe half are at or below the 100 line.
Add in that anyone over the age of 55 is on average far more susceptible to deepfake technology simply because they don't have the life experience and perceptual skills needed to discern the tells in the video, and you have a recipe for disaster.
If you are in court and your opponent might use fabricated video evidence against you, you better hope that your jury is younger or that your lawyers and judges have the expertise needed to expose any deepfake technologies like this for what they are, or you might be cooked.
> In 4 years of working cases I would estimate 1 in 6 jurors are above the 85-115 IQ range of average intelligence, and maybe half are at or below the 100 line.
Maybe I'm missing the joke, but isn't IQ meant to follow a normal distribution with a mean/median of 100 with a standard deviation of 15, in which case you'd expect half of jurors to be below 100 and ~15% to be above 115, which is pretty close to what you've seen?
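For anyone who wants to check that arithmetic, here is a minimal sketch using Python's standard library. It assumes the conventional IQ scaling named above (normal, mean 100, standard deviation 15); the variable names are just for illustration.

```python
from statistics import NormalDist

# Conventional IQ scaling: normal distribution, mean 100, SD 15 (assumed).
iq = NormalDist(mu=100, sigma=15)

below_100 = iq.cdf(100)               # share at or below the mean
above_115 = 1 - iq.cdf(115)           # share more than one SD above the mean
in_85_115 = iq.cdf(115) - iq.cdf(85)  # share within one SD of the mean

print(f"at or below 100: {below_100:.1%}")
print(f"above 115:       {above_115:.1%}")
print(f"within 85-115:   {in_85_115:.1%}")
print(f"1 in 6:          {1 / 6:.1%}")
```

Under that assumption roughly 15.9% of people score above 115, against the parent's estimate of 1 in 6 (about 16.7%), so the observation is close to what the general population would give.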
What I mean to say is that if a person of average intelligence, especially one over the age of 50-55, is very susceptible to believing deepfake video, then any video you can successfully show a jury of twelve people over 50 will likely fool half of them by default, 10 of 12 more likely than not, and all 12 at least half the time.
If you're in a case where there is the possibility of deepfakes being used against you, you had better hope that either your jury is mostly in the 25-45 range and above average intelligence or that your lawyer knows how to deal with those videos since they'll get to review them before they are shown.
It's better than Viggle, but Alibaba doesn't release their models like Tencent does.
Waiting for the Tencent version of this. It feels like Tencent releases a new model or two every single week. If they do release an equivalent, everyone will be able to clone Viggle's entire product offering.
Models are becoming commodity faster than ever these days. We had five foundation video models come out last week. I don't know how Runway ML, with their $300M of fundraising, will be able to stay ahead or raise again. They don't have any special magic, and there's nothing spectacular about their product.
I wouldn't want to be a foundation model company these days. China and any third string company release weights openly to gain network effects and salt the earth for foundation model value accretion. Product and brand awareness are all that matters.
Edit: if anyone from Alibaba or Ant Group is reading, can you release your code and weights? Pretty please?
> Tech has come a long way since Monty-Python style JibJab from 2004.
No doubt! I'm surprised people even remember JibJab.
Besides just being really cool, what are some of the clearest use-cases for this?
Where I work in VFX (RSP) we have an ML department and a lot of what they do is about replacing things in footage (face replacement onto double-actors, de-ageing and so on). What they achieve (that brings significant value) seems fairly adjacent to the tech in the article. Our ML crew have the benefit of a strong compositing department (and other support) that can help integrate the results nicely and smooth over any rough edges. The ML team were nominated for a VES award for their work on Furiosa recently!
If this has anything close to usable rigging data (from what I briefly read, it doesn't seem to), this would be a cheap pipeline for generating animations for real time applications out of video references.
As it is, it seems like it might help for video editing and VFX. Being able to isolate a human being lets you do all sorts of editing with fewer issues, and the result can then be composited over later for a final shot.
100% of the capability could be absorbed by YouTubers looking to improve engagement (read: profit) with more interesting visuals. Since they have a bottomless need for that, this would get widely used within days of a commercial release.
Since we live in a social media age, anything that looks "really cool" has basically immediate commercial applications. Even that phrase should set alarm bells off.
Even though the examples have a clear green-screen effect, I would think some of the tricks/filters they used in old video games to make them look "realistic/cinematic" could work here, then advertisers would be able to plug in a picture of a model wearing their latest outfit to a stock video for social media and digital ads.
Memes, people will use it for the memes. I saw the NBA footage and yes, I want a gif of me posterizing Ben Simmons.
Catfishing
Convincing older people that their children/relatives are in jail/hospital/kidnapped and need money
Disinformation/chaos by faking important speeches by politicians
Fake celebrity endorsements of products and causes
Rewriting historical events with "newly uncovered" footage
Other creative ways of ripping people off and lying
> Disinformation/chaos by faking important speeches by politicians
I do wonder what the end game here would be. A race to the bottom where any televised speech is brought into question. I wonder if that would destabilize all power bases to the point that people will finally look at the actions of their politicians rather than their words. Wishful thinking perhaps.
Unreleased models should not be treated as interesting
I really wouldn't use Donald Trump as an example character model...
Because that scary state that you read about in dystopian novels and hear about in dystopian histories is a state that Americans live in at the moment. We are all actually people that are living in an important moment, and should be aware of Normalcy Bias.
My guess is that it would cause people like myself who intensely dislike him to avoid your technology because of the association.
If I don't like it, I won't show it to my boss, he won't see it, and our company won't use your technology.
The same could be said for showing Hillary Clinton or Kamala Harris for the other side, or Putin or Kim Jong Un or Mohammad and Jesus.
A smart tactic would be to avoid sensitive figures in general if your goal is to have your technology adopted by a wide audience.
I never thought Americans were this weak and pathetic. Y'all always screamed about "muh guns" and stuff like that, but now... You're just sitting idly while your new king takes over, even discouraging others from doing anything that may upset your supreme ruler?
So I guess we're just about 2 or 3 iterations away from no longer being able to trust video as evidence in a courtroom. Just gotta fix the lighting and facial issues and bam! Precedent will be set.