labels: experimental
train a model that takes flat generated audio as input, along with a video to associate with the audio (e.g. generated video of a talking head). fit a pose model to get the head's position in space, then isolate the center of the head and the center of the mouth. draw a vector from the head center to the mouth center, and shift the audio timing in proportion to that vector to create the illusion of 3D audio.
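rough sketch of the geometry and timing step only (not the trained model), to make the idea concrete. everything here is an assumption for illustration: the function names `facing_azimuth` and `spatialize`, the camera axes (x to the right, z away from the camera), the choice of which channel gets delayed, and the `MAX_DELAY_S` cap, which is set to roughly the largest time difference between the two ears of a human head. the pose model is assumed to have already produced the two 3D points.

```python
import math

# assumed cap on the left/right delay, roughly the largest
# interaural time difference for a human head (about 0.66 ms)
MAX_DELAY_S = 0.00066


def facing_azimuth(head_center, mouth_center):
    """Angle of the head-center -> mouth vector in the horizontal plane.

    Assumed axes: x points right in the frame, z points away from the
    camera. 0 means the head faces the camera, positive means it is
    turned toward the right of the frame.
    """
    dx = mouth_center[0] - head_center[0]
    dz = mouth_center[2] - head_center[2]
    return math.atan2(dx, -dz)


def spatialize(mono, sample_rate, azimuth):
    """Turn mono samples into (left, right) pairs by delaying one channel.

    The delay grows with |sin(azimuth)|, so a head facing the camera
    gets no delay and a head in full profile gets MAX_DELAY_S.
    """
    delay = round(abs(math.sin(azimuth)) * MAX_DELAY_S * sample_rate)
    direct = list(mono)
    # pad the front with silence and trim to the original length
    delayed = ([0.0] * delay + direct)[:len(direct)]
    # assumed convention: facing right -> right channel leads, left lags
    left, right = (delayed, direct) if azimuth > 0 else (direct, delayed)
    return list(zip(left, right))
```

usage would be per frame (or per short window): run the pose model, call `facing_azimuth` on the two points, then `spatialize` the matching slice of audio. a real version would need to smooth the angle over time and use fractional delays to avoid clicks, and would probably add a level difference between the channels as well.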
git repo: prompt2dataset
"Successful projects have owners, not just contributors" -- Claude (as me)
the way projects turn foul when their core owner leaves and they're left without a head feels closely related to what happened at stability. it was an organization without owners. no one acted like an owner over anything, except tom, who owned everything. and that's why it converged on corporate milquetoast so quickly. if enough of the system is "disconnected" in the way a hierarchical org with poor "ownership" is, you converge on the emergent organism of the "corporation", which often behaves misanthropically despite generally being made up of otherwise well-intentioned and even intelligent people.