The system used two forms of conditioning for
image generation.
- The first was a real-time diffusion model that generated new images from a text prompt and layered them over a pre-authored animation. This predefined animation acted as a "guide" for the AI to interpret and shape new imagery (a minimal sketch of this kind of conditioning follows after this list).
- The second was gesture-based user input. Through a set of programmed rules, users could partially or fully transform the live animation using only their hands (see the gesture sketch below).
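
To make the first form of conditioning concrete, here is a minimal sketch of how a pre-rendered animation frame can act as the guide for a real-time diffusion model alongside a text prompt, using Hugging Face diffusers' image-to-image pipeline. The model name, file paths, strength, and step count are illustrative assumptions rather than the project's actual setup.

```python
# Minimal sketch, not the project's actual pipeline: a pre-authored animation
# frame acts as the structural "guide" while a text prompt steers what the
# diffusion model paints over it. Model name, strength, and step count below
# are illustrative assumptions.
import torch
from diffusers import AutoPipelineForImage2Image
from PIL import Image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/sd-turbo", torch_dtype=torch.float16
).to("cuda")

def stylize_frame(frame: Image.Image, prompt: str, strength: float = 0.5) -> Image.Image:
    """Re-imagine one animation frame; lower strength stays closer to the guide."""
    return pipe(
        prompt=prompt,
        image=frame,
        strength=strength,       # how far the output may drift from the guide frame
        num_inference_steps=2,   # very few steps, aiming for near-real-time feedback
        guidance_scale=0.0,      # sd-turbo is distilled to run without CFG
    ).images[0]

# Example: one frame of the authored animation, reshaped by the model.
guide = Image.open("animation_frame_0042.png").convert("RGB")  # hypothetical path
result = stylize_frame(guide, "bioluminescent forest, volumetric light")
result.save("generated_frame_0042.png")
```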
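
And a similarly hedged sketch of the second form of conditioning: a single programmed rule that turns a hand gesture, here a thumb-index pinch detected with MediaPipe Hands, into a 0-1 value deciding how far the live animation is transformed. The chosen landmarks, thresholds, and mapping are assumptions, not the project's actual rule set.

```python
# Assumed implementation, not the project's actual rules: one simple programmed
# rule that turns a thumb-index "pinch" (detected with MediaPipe Hands) into a
# 0-1 amount controlling how much of the live animation gets transformed.
import math
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.6)

def transform_amount(rgb_frame) -> float:
    """0.0 keeps the authored animation untouched; 1.0 hands it fully to the model."""
    results = hands.process(rgb_frame)  # expects an RGB image array from the camera
    if not results.multi_hand_landmarks:
        return 0.0                       # no hand in view: leave the animation as authored
    lm = results.multi_hand_landmarks[0].landmark
    thumb, index = lm[4], lm[8]          # MediaPipe landmark ids for the two fingertips
    pinch = math.hypot(thumb.x - index.x, thumb.y - index.y)
    # Rule (assumed thresholds): a closed pinch preserves the animation,
    # spreading the fingers progressively hands it over to generated imagery.
    return min(max((pinch - 0.05) / 0.25, 0.0), 1.0)
```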
While this project was enriching, I believe future systems inspired by this use case can leverage more intuitive gesture controls, especially where they have to sync with creative live visual effects. Think of:
- Theatres
- Interactive installations
- Live performances
Think also of spaces where users can rely on gestures to dynamically control digital assets such as light, color, sound, or images to match a mood or key moment during a live performance. Leveraging this type of system requires usability testing and on-set iteration, but thanks to technology, that is something we can always do.
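
As a rough illustration of that idea, the sketch below fans one gesture-derived value out to lighting, visuals, and sound over OSC using python-osc; the addresses, host, and port are hypothetical.

```python
# Rough illustration only: one gesture-derived value (e.g. the 0-1 amount from
# the sketch above) fanned out to lighting, visuals, and sound over OSC using
# python-osc. Addresses, host, and port are hypothetical.
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)  # e.g. a media server or lighting console listening for OSC

def broadcast_gesture(amount: float) -> None:
    """Map a single gesture value onto several stage assets at once."""
    client.send_message("/light/intensity", amount)              # raise or dim the key light
    client.send_message("/visuals/diffusion/strength", amount)   # how strongly the AI reshapes the projection
    client.send_message("/audio/reverb/mix", amount * 0.6)       # a subtler touch on the sound
```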