
AI Agents Learned to Sleep
My AI agent dreams every night. It reviews what it learned, scores each memory by importance, and decides what to keep forever. This is how stateless tools become digital employees.
BLOG
Practical takes on AI agents, automation, pricing, and building products as a solo developer.

95% of AI pilots fail in production. The problem isn't the model. It's the missing integration layer: testing, connectors, cost controls, and observability. Here's what actually works.

From $200/month on Claude API to $20/month on MiniMax. The real story of finding the right model for a production AI agent, plus routing strategies for those stuck on per-token pricing.

The Model Context Protocol hit scaling walls nobody expected. The 2026 roadmap shifts from 'build more servers' to 'make what exists scale.' Here's what changed and what it means for builders.

The line between vibe coding and engineering isn't about whether you use AI. It's about whether you can audit, debug, and own what the AI produces. A practical framework for knowing which one you're doing.

I ran Claude Code and Cursor side by side for months, then dropped Cursor entirely. Claude Code is execution-first: describe a goal, review the output. Cursor is editor-first: you drive, AI assists. Here's why I went all-in on Claude Code and don't regret it.

Prediction markets just became the highest-stakes benchmark for AI reasoning. How LLM-powered agents are outperforming human forecasters on Polymarket — and what it means for traders, builders, and the future of forecasting.

Inside the Claude vs OpenClaw trading contest, the ilovecircle case study ($2.2M in 60 days), and a practical guide to building your own LLM-powered prediction market agent with Kelly criterion sizing.

While everyone trades elections and crypto, a small group of automated traders is printing money on weather markets. Here's how they do it — gopfan2's $2M, Hans323's $1.1M single bet, and the data pipeline that makes it possible.

GPT-5.4 scored 75% on OSWorld-Verified, beating the human baseline of 72.4%. The question isn't whether AI can match humans anymore — it's what happens when it clearly does.

Setting up a browser agent takes five minutes. Getting reliable data from protected sites takes weeks. Here's what I learned running Camoufox, Agent TARS, and residential proxies against Cloudflare, DataDome, and Akamai.

An AI agent wrote a hit piece on a developer who rejected its code. A Meta safety director's agent deleted her inbox. Both happened in the same week. Neither was the story the headlines told.

MCP has a fundamental scaling problem — large APIs exceed entire context windows. Cloudflare's Code Mode collapses 2,500 endpoints into two functions and 1,000 tokens. Here's how it works and what it means for agent builders.

Six out of ten AI-generated solutions pass tests. One out of ten is secure. A data-driven look at vibe coding's security track record after one year of mainstream adoption.

Why every traditional pricing model breaks for AI agents — and the four approaches that actually work. Real cost breakdown from running an AI agent full-time.