Comment by nothrowaways a day ago
Where do they get the video training data?
> in-house stock video dataset
Wonder if "iCloud backups" would be counted as "stock video" there? ;)
Stock video means stock video.
That has nothing to do with it. Apple wouldn’t train on user content; they’re not Google. If they ever did, it would be opt-in at the very least. There’s a reason they’re walking and observing rather than running to be the leading cloud AI company, like some others.
Why should I buy this "ethical Apple" argument?
They shared Siri audio recordings with contractors in 2019. That became opt-in only after backlash, as with other privacy controversies.
This shows that they clearly prioritize not being sued or caught, which is slightly different from prioritizing user choices.
From the paper:
> Datasets. We construct a diverse and high-quality collection of video datasets to train STARFlow-V. Specifically, we leverage the high-quality subset of Panda (Chen et al., 2024b) mixed with an in-house stock video dataset, with a total number of 70M text-video pairs.