The system used two forms of conditioning for
image generation.
- The first was a real-time diffusion model that generated new images from a text prompt and layered them over a pre-authored animation. This predefined animation acted as a "guide" for the AI to interpret and reshape into new imagery.
- The second was gesture-based user input. Through a set of programmed rules, users could partially or fully transform the live animation with their bare hands.
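To make the two conditioning ideas concrete, here is a minimal sketch of how a gesture signal might modulate the balance between a pre-authored guide frame and an AI-generated frame. All names are hypothetical and the logic is illustrative only; the project's actual rules and diffusion calls are not shown here.

```python
import numpy as np

def gesture_to_strength(hand_openness: float, lo: float = 0.2, hi: float = 0.9) -> float:
    """Map a normalized hand-openness value (0 = closed fist, 1 = open palm)
    to a conditioning/blend strength clamped to the range [lo, hi]."""
    t = min(max(hand_openness, 0.0), 1.0)
    return lo + t * (hi - lo)

def blend_frames(guide: np.ndarray, generated: np.ndarray, strength: float) -> np.ndarray:
    """Linearly blend the pre-authored guide frame with the generated frame.
    strength = 0 keeps only the guide; strength = 1 shows only the AI output."""
    return (1.0 - strength) * guide + strength * generated

# Toy 2x2 grayscale frames standing in for real video frames.
guide = np.zeros((2, 2))       # the pre-authored animation frame
generated = np.ones((2, 2))    # the diffusion model's output frame
out = blend_frames(guide, generated, gesture_to_strength(0.5))
```

In a real pipeline the `generated` frame would come from the diffusion model each tick, and `hand_openness` from a hand-tracking library; the point is simply that the gesture rules reduce to a continuous parameter feeding the render loop.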
While this project was enriching, I believe future systems inspired by this use case can leverage more intuitive gesture controls, especially in scenarios where these systems have to sync with live creative visual effects. Think of:
- Theatres
- Interactive installations
- Live performances
Projects like these are the perfect scenario for building my way up as a pipeline developer who understands these cutting-edge AI vision models as an extension of my current creative toolkit. "AI" didn't create my artwork for the exhibition; instead, it became one piece inside my pipeline, helping me enhance the visual-effects rendering for the animations.
How do we build credibility beyond the AI hype?
Simple: by asking questions like these:
- Does it reduce labor time?
- Is the architecture scalable and modular?
If yes, then take those AI models, fold them into your pipeline, and be happy. You just saved time and energy for yourself and, possibly, for your team ;)