There’s a lot of noise right now about AI investment cycles, potential bubbles, and concerns about the scaling of mega models. While those conversations are happening, I find myself increasingly convinced of something simpler: the models and capabilities we have right now are interesting enough that there’s just a ton to do. Even if progress stopped today, there would be at least a few years of work in deploying current capabilities in a way that works well for everyone, and none of that depends on the next new innovation.
Exploring Voice Canvas
I’ve been deep in exploration mode with voice-first interfaces and local model usage. The centerpiece of this exploration is Openscout, a transcription app I’m building as the foundation for something larger: what I’m calling a “voice canvas.”
Why Voice Feels Different
There’s something about voice interaction that feels fundamentally more powerful and delightful than typing. Pairing LLMs with text-to-speech and speech-to-text models gives us dynamic, flexible systems rather than rigid decision trees, with quality on par with what we expect from generative AI, and a lot of it can run on device, which is a huge unlock. It’s one of the most profound net new high-quality interface paradigms to emerge from this AI wave. Speaking is immediate, natural, and captures nuance in ways that typing often misses. For me, it frees up my eyes and my mind.
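To make the on-device point concrete, here’s a minimal sketch of local transcription using Apple’s Speech framework, which supports fully on-device recognition on recent iOS releases. It’s illustrative rather than how Openscout is wired up: the locale is hard-coded, error handling is a placeholder, and it assumes speech-recognition permission has already been granted.

```swift
import Speech

// Minimal sketch of on-device transcription with Apple's Speech framework.
// Assumes permission was already granted via SFSpeechRecognizer.requestAuthorization;
// the locale and error handling here are placeholders.
func transcribeOnDevice(audioURL: URL) {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
          recognizer.supportsOnDeviceRecognition else {
        print("On-device recognition isn't available for this locale")
        return
    }

    let request = SFSpeechURLRecognitionRequest(url: audioURL)
    request.requiresOnDeviceRecognition = true  // audio never leaves the device

    // In a real app you'd keep a reference to the task so you can cancel it.
    _ = recognizer.recognitionTask(with: request) { result, error in
        if let result = result, result.isFinal {
            print(result.bestTranscription.formattedString)
        } else if let error = error {
            print("Transcription failed: \(error.localizedDescription)")
        }
    }
}
```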
I believe we’re going to see a cultural shift here. The stigma of speaking publicly into your phone or headset is going to fade. People will get comfortable with it. Even offices—traditionally quiet, keyboard-focused environments—are going to adapt to this new mode of interaction.

The Vision: Voice + Local + Private + AI
My next exploration is bringing together these elements:
- Voice-first interfaces that feel natural and powerful
- Local model execution for privacy and speed
- AI capabilities woven into the workflow
- Asynchronous processing that works with your rhythm
Openscout is currently a transcription tool, but I’m building it as the foundation for a canvas: a set of apps that work together. The iOS component will be able to capture voice recordings (think the Voice Memos app, but connected) and take asynchronous actions based on what you say.
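To give a sense of the shape this could take, here’s a simplified sketch of the flow: capture a recording now, transcribe it locally, then derive an action asynchronously. The types and the `transcribe` and `deriveAction` helpers are hypothetical placeholders, not Openscout’s actual code.

```swift
import Foundation

// A simplified, hypothetical sketch of the voice-canvas flow:
// capture now, transcribe and act later, all off the critical path.

struct VoiceNote {
    let audioURL: URL
    let recordedAt: Date
}

enum NoteAction {
    case reminder(String)   // e.g. "remind me to call the plumber"
    case summary(String)    // default: summarize what was said
}

// Placeholder for a local speech-to-text step (see the sketch above).
func transcribe(_ note: VoiceNote) async throws -> String {
    return "transcribed text"
}

// Placeholder for an AI step that decides what to do with what you said.
func deriveAction(from transcript: String) async -> NoteAction {
    if transcript.lowercased().contains("remind me") {
        return .reminder(transcript)
    }
    return .summary(transcript)
}

// Asynchronous processing: the recording is captured immediately;
// everything else happens when the device gets around to it.
func process(_ note: VoiceNote) async throws -> NoteAction {
    let transcript = try await transcribe(note)
    return await deriveAction(from: transcript)
}
```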
Building Now
While others debate the future of AI scaling, I’m focused on building with what we have. The capabilities are here. The interfaces are emerging. The opportunity is now.
More updates to come as this exploration continues.