Writing

Practical takes on AI agents, automation, pricing, and building products as a solo developer.

Five Production AI Agent Failures That Have Nothing to Do With the Model

Replit's agent wiped production and fabricated 4,000 fake users to hide it. n8n broke tool schemas for OpenAI and Anthropic in a single update. LangSmith ran on an expiring SSL cert nobody monitored. Five lessons from real incidents, and none of them are LLM problems.

May 4, 2026 ai-agents production circuit-breakers

Security/6 min read

The Recruiter Was an AI Agent, and the Test Assignment Was Malware

An AI agent pitched me a job, added my GitHub account to a private repo, and shipped malware through a VS Code auto-run task. Here's the full breakdown of the scheme and how to block it.

April 21, 2026 ai-security malware social-engineering

AI Dev Tools/6 min read

The AI Developer Velocity Crisis: Why Writing More Code Faster Is Creating More Problems

93% of developers use AI coding tools. Actual productivity gains? About 10%. The bottleneck moved from writing code to reviewing it, and nobody redesigned their workflows to match.

April 18, 2026 ai-coding developer-productivity code-review

AI Agents/6 min read

AI Agents Learned to Sleep

My AI agent dreams every night. It reviews what it learned, scores each memory by importance, and decides what to keep forever. This is how stateless tools become digital employees.

April 6, 2026 ai-agents memory digital-employees

AI Agents/8 min read

AI Agents Don't Fail Because They're Dumb. They Fail Because They're Alone.

95% of AI pilots fail in production. The problem isn't the model. It's the missing integration layer: testing, connectors, cost controls, and observability. Here's what actually works.

March 28, 2026 ai-agents production infrastructure

Solo Dev/6 min read

I Cut My AI Bill 70% With Three Lines of Logic

From $200/month on Claude API to $20/month on MiniMax. The real story of finding the right model for a production AI agent, plus routing strategies for those stuck on per-token pricing.

March 28, 2026 ai-costs model-routing solo-dev

MCP & Frameworks/7 min read

MCP Outgrew Its Own Design. Here's the Fix.

The Model Context Protocol hit scaling walls nobody expected. The 2026 roadmap shifts from 'build more servers' to 'make what exists scale.' Here's what changed and what it means for builders.

March 19, 2026 mcp agents infrastructure

AI Dev Tools/7 min read

Vibe Coding vs Engineering: Where the Line Actually Is in 2026

The line between vibe coding and engineering isn't about whether you use AI. It's about whether you can audit, debug, and own what the AI produces. A practical framework for knowing which one you're doing.

March 17, 2026 vibe-coding agentic-engineering ai-coding

AI Dev Tools/6 min read

Claude Code vs Cursor (2026): The Real Difference Between Execution AI and Editor AI

I ran both for months, then dropped Cursor entirely. Claude Code is execution-first: describe a goal, review the output. Cursor is editor-first: you drive, AI assists. Here's why I went all-in on Claude Code and don't regret it.

March 12, 2026 claude-code cursor ai-coding

AI & Crypto/9 min read

AI Agents Are Coming for Prediction Markets

Prediction markets just became the highest-stakes benchmark for AI reasoning. How LLM-powered agents are outperforming human forecasters on Polymarket — and what it means for traders, builders, and the future of forecasting.

March 11, 2026 ai-agents polymarket prediction-markets

AI & Crypto/9 min read

How an LLM Turned $1,000 into $14,000 on Polymarket in 48 Hours

Inside the Claude vs OpenClaw trading contest, the ilovecircle case study ($2.2M in 60 days), and a practical guide to building your own LLM-powered prediction market agent with Kelly criterion sizing.

March 11, 2026 claude polymarket llm

AI & Crypto/9 min read

The Weather Bots Quietly Making Millions on Polymarket

While everyone trades elections and crypto, a small group of automated traders is printing money on weather markets. Here's how they do it — gopfan2's $2M, Hans323's $1.1M single bet, and the data pipeline that makes it possible.

March 11, 2026 polymarket weather ai-agents

AI Models/6 min read

GPT-5.4 and the "Good Enough" Threshold

GPT-5.4 scored 75% on OSWorld-Verified, beating the human baseline of 72.4%. The question isn't whether AI can match humans anymore — it's what happens when it clearly does.

March 10, 2026 gpt-5 openai benchmarks

Browser Automation/5 min read

Browser Agents in Production: What Actually Works in 2026

Setting up a browser agent takes five minutes. Getting reliable data from protected sites takes weeks. Here's what I learned running Camoufox, Agent TARS, and residential proxies against Cloudflare, DataDome, and Akamai.

March 5, 2026 browser-agents web-scraping camoufox

AI Agents/6 min read

When AI Agents Go Rogue: Two Incidents That Expose Real Failure Modes

An AI agent wrote a hit piece on a developer who rejected its code. A Meta safety director's agent deleted her inbox. Both happened in the same week. Neither was the story the headlines told.

March 3, 2026 ai-agents openclaw safety

MCP & Frameworks/9 min read

Cloudflare Code Mode: From 1.17 Million Tokens to 1,000

MCP has a fundamental scaling problem — large APIs exceed entire context windows. Cloudflare's Code Mode collapses 2,500 endpoints into two functions and 1,000 tokens. Here's how it works and what it means for agent builders.

February 26, 2026 mcp cloudflare agents

AI Dev Tools/8 min read

Vibe Coding Is Producing Legacy Code at Startup Speed

Six out of ten AI-generated solutions pass tests. One out of ten is secure. A data-driven look at vibe coding's security track record after one year of mainstream adoption.

February 24, 2026 vibe-coding security ai-agents

AI Agents/10 min read

The Three-Body Problem of AI Agent Pricing

Why every traditional pricing model breaks for AI agents — and the four approaches that actually work. Real cost breakdown from running an AI agent full-time.

February 19, 2026 ai-agents pricing saas