With Marco Pisano, we've been exploring how to connect large language models (LLMs) to Unity and use them to control real-time systems.
We developed a working prototype that runs a local LLM (in this case Mistral) and uses it to control 3D assets in Unity in real time. By connecting Python and C# and tuning the communication pipeline between them, we achieved smooth interaction between the LLM and our virtual environment.
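As a rough illustration of the Python side of such a bridge, here is a minimal sketch that sends a command to Unity as a JSON datagram over local UDP. The transport, the port number, and the message fields (target, action, value) are assumptions made for this example, not details of our actual pipeline; the C# side would need a matching listener.

```python
import json
import socket

# Hypothetical address of a UDP listener running inside the Unity scene.
UNITY_HOST = "127.0.0.1"
UNITY_PORT = 5005


def send_command(command: dict, host: str = UNITY_HOST, port: int = UNITY_PORT) -> bytes:
    """Serialize a command as JSON and send it as a single UDP datagram.

    Returns the encoded payload so callers can log or inspect it.
    """
    payload = json.dumps(command).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

A call such as send_command({"target": "Cube", "action": "rotate", "value": 45}) would deliver one small message per command. UDP keeps latency low on a single machine but gives no delivery guarantee, so a TCP or WebSocket connection may be a better fit when commands must not be lost.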
This involved customizing the model file and experimenting with core inference settings such as the system prompt, temperature, top_p, and top_k to find the right balance between coherence and creativity. Finding the sweet spot was key to getting responses that are not only context-aware but also stable enough to build response logic on in Unity.
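To make the effect of these parameters concrete, here is a simplified sketch of how temperature, top_k, and top_p narrow down the next-token candidates. It is an illustration only: the toy logits are made up, and real inference engines may apply these steps in a different order or with extra samplers.

```python
import math
import random
from typing import Optional


def filter_candidates(logits: dict, temperature: float, top_k: int, top_p: float) -> dict:
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    if temperature <= 0 or top_k < 1 or not 0 < top_p <= 1:
        raise ValueError("need temperature > 0, top_k >= 1 and 0 < top_p <= 1")
    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda pair: pair[1], reverse=True)
    # top_k: keep only the k most likely tokens.
    ranked = ranked[:top_k]
    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}


def sample_token(logits: dict, temperature: float = 0.7, top_k: int = 40,
                 top_p: float = 0.9, rng: Optional[random.Random] = None) -> str:
    """Draw one token from the filtered distribution."""
    rng = rng or random.Random()
    candidates = filter_candidates(logits, temperature, top_k, top_p)
    return rng.choices(list(candidates), weights=list(candidates.values()), k=1)[0]
```

With a low temperature or a small top_k, only the most likely tokens survive, which makes the output more predictable and easier to map to commands. Higher values leave more candidates in play and produce more varied text.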
The result was a fully local setup running on a laptop, where our language model responds intelligently and almost instantly to user inputs, converts them into commands, and sends them to control the 3D assets within a Unity scene. This opens doors for AI-driven, dynamic storytelling, all without relying on cloud-based APIs.
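One way to turn model output into commands is to ask the model for JSON and validate the reply before anything reaches Unity. The sketch below assumes a hypothetical command schema (target, action, value) and a made-up action vocabulary; it shows the general idea rather than our exact logic.

```python
import json
from typing import Optional

# Hypothetical command vocabulary; a real project would define its own.
ALLOWED_ACTIONS = {"move", "rotate", "scale"}


def parse_command(reply: str) -> Optional[dict]:
    """Extract a validated command from raw model output, or return None."""
    # Models often wrap JSON in extra text, so look for the outermost braces.
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None
    action = data.get("action")
    target = data.get("target")
    value = data.get("value", 0.0)
    # Reject anything outside the known vocabulary instead of guessing.
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return None
    if not isinstance(target, str):
        return None
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return {"target": target, "action": action, "value": float(value)}
```

Returning None for anything that fails validation lets the calling code ignore the reply or ask the model again, so a malformed response never moves an object in the scene.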
Open-source communities, large language models, and the ability to run them locally are key to developing innovative projects with greater privacy, control, freedom, and customization.