OpenAI's Symphony post made the agent future feel less like better prompting and more like a new way to organize work.
Every experiment.
Failures included.
Recent entries. Green marks a breakthrough.
This week reminded me that the hard part is not making the product work. It is making people get it right away.
I started exporting my dictations and realized I am not really taking notes anymore. I am working out loud.
I adapted Andrej Karpathy's AutoResearch loop and pointed it at my diarization error rate. Went from 34.1% to 19.2% — a 44% improvement. No new model, no new features. Just a loop that found the one knob that mattered.
I said I'd build the diarization AutoResearch loop. I built it. The loop is running right now. Baseline DER: 34.1%. Here's the full story including all the stuff that broke.
every.to just shipped plus one. the personal agent pattern is converging fast and i'm not sure anyone has the moat they think they do.
i have 6 claude agents running overnight. the competitor digest is great. the content batch can't save files. the idea collision engine can't find the vault. shipping is still manual.
karpathy's autoresearch ran 700 experiments in 2 days. my app breaks with more than 2 speakers. what if i pointed that loop at my problem.
Found a library claiming 7x faster speaker diarization than pyannote on CPU. Haven't tested it yet. But if it's real, Transcripted's pipeline changes.
Notion shipped custom instructions for AI meeting notes plus background mode. When the gorilla enters your space, you have to think about what you actually are.
rewrote draft's entire ui from swiftui to appkit. should have done it weeks ago.
Found a Show HN that's basically Transcripted. Local transcription, Obsidian vault integration, Claude Code skills. Same niche, same tools.
Six agents running overnight. Zero features shipped to the actual product.
Scheduled Claude Code tasks ran overnight on Transcripted while I slept. Security audit, CLAUDE.md sync, refactor sweep, code review. Four PRs. All merged before I had coffee.
I spent weeks trying to get a local agent system reliable. It wasn't. Now I'm trying something completely different.
The floating pill is completely rebuilt. Onboarding is smarter. Model downloads actually tell you what's happening. And the nightly agents caught two security issues I would have missed.
I've been thinking about Transcripted wrong.
I've been building the wrong version of this app.
I spent most of today trying to get a local AI agent system running and breaking it in new ways every hour.
Eight PRs merged today. That number is misleading.
Shipped Transcripted v0.3.0 — replaced Sortformer with PyAnnote for offline diarization, cut onboarding from 6 screens to 3, cleaned up the floating pill and transcript UI, and fixed a batch of edge-case bugs.
Incremental improvements to things that shipped rough. The core is solid. The surface layer is catching up.
A Claude Code session that started with 'install Draft' and ended with a feature I might scrap entirely.
Shipped Transcripted v0.2.0 — aurora recording animation is now the only UI, removed idle branding for a cleaner pill, and the whole ML pipeline processes 11 minutes of audio in 5.4 seconds on M5.
transcripted.app is live. Free meeting recorder for Mac. 100% local, no cloud, no bot in your meeting. Speaker diarization that learns your team.
Bought the domain. Connected Cloudflare. Deployed from GitHub. The whole thing took one Claude Code session. First green bar for the site itself.
Ran 50 users through Draft's onboarding flow. 12% made it to their first generated message. The mic permission prompt is killing momentum.
Zero paid conversions in the first week at $9/month. The free tier is too good — nobody hits the paywall. Time to rethink the wedge.
After 200+ accepted drafts from 10 beta users, Draft's style matching hit 94% accuracy. The RLHF loop is working. Users are editing less over time.
GUIs were always a compromise. Now that AI agents are real, the command line is back — not because developers are nostalgic, but because it's the only interface that actually works for AI.