Out of the Kinetic Narratives project came the idea to wrap this gesture-based system into a tool that others working with real-time visual pipelines may find useful.
This tool has been built around two core strengths:
- Adaptability
A highly configurable and modular system that allows multimodal and customized inputs to drive different behaviors of digital assets.
- Accessibility
No proprietary hardware, no expensive licenses, just open-source frameworks and a regular webcam.
The key value of this tool lies in its rule-based architecture, where creators can define custom mappings between body movement and real-time control of 2D/3D assets. For instance:
- Landmark-based tracking (via MediaPipe) allows you to capture hand gestures, facial expressions, or joint positions.
- Image-based tracking (via YOLO, MoveNet, etc.) enables more complex pose recognition using RGB data.
- Color-based motion tracking translates the movement of physical props into 3D spatial motion authoring.
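To make the rule-based idea concrete, here is a minimal sketch of what a mapping rule could look like. The names (`wrist.y`, `/light/intensity`) and ranges are hypothetical examples, not part of the tool's actual API; the only real convention assumed is that MediaPipe landmarks are normalized to 0..1 with y growing downward.

```python
def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly remap a tracked value (e.g. wrist height) to an asset
    parameter, clamping the result to the output range."""
    t = (value - in_min) / (in_max - in_min)
    t = max(0.0, min(1.0, t))
    return out_min + t * (out_max - out_min)

# Hypothetical rule: a raised wrist (low y in MediaPipe's coordinates)
# drives a light's intensity from 0 to 100.
rules = [
    {
        "source": "wrist.y",                 # landmark to read each frame
        "target": "/light/intensity",        # OSC address to write to
        "map": lambda y: map_range(y, 0.9, 0.1, 0.0, 100.0),
    },
]
```

Each frame, the tracker would read `source`, pass it through `map`, and emit the result on `target`; new rules are added without touching the tracking code, which is what makes the architecture configurable.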
The system is also hardware- and software-agnostic, relying on open-source libraries and protocols such as OpenCV, MediaPipe, and OSC over UDP. Its seamless integration with 2D/3D environments
like Unreal Engine, Unity, or TouchDesigner makes it suitable for:
- Live performance and theater
- Interactive installations
- Virtual production
- Creative learning environments
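As a sketch of the OSC side of that integration: an OSC 1.0 float message can be framed with only the standard library, which is what makes the bridge to Unreal Engine, Unity, or TouchDesigner lightweight. The address `/light/intensity` and port below are hypothetical examples; in practice a library such as python-osc handles this encoding for you.

```python
import socket
import struct

def _pad(b: bytes) -> bytes:
    # OSC strings are null-terminated and padded to a 4-byte boundary.
    return b + b"\x00" * (4 - len(b) % 4)

def osc_message(address: str, *args: float) -> bytes:
    """Encode a minimal OSC 1.0 message carrying float32 arguments."""
    msg = _pad(address.encode("ascii"))
    msg += _pad(("," + "f" * len(args)).encode("ascii"))  # type tag string
    for value in args:
        msg += struct.pack(">f", value)  # big-endian float32
    return msg

def send_osc(address: str, *args: float, host="127.0.0.1", port=9000):
    """Fire an OSC message over UDP to a listener (e.g. TouchDesigner)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(osc_message(address, *args), (host, port))

# Usage (assumes something is listening on port 9000):
# send_osc("/light/intensity", 72.5)
```

Because the payload is plain UDP, any host application with an OSC input node can consume it without extra plugins.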
I see this tool as a way to reclaim gesture as a performative language, turning movement, motion, and expression into meaningful data. It surfaces internal states and translates them into
dynamic, immersive storytelling.