Software.

Everything I've written about software. Start at the top; the list compounds.

Domain: Software
Posts: 71
Cadence: Weekly
Format: Essay · Note · Case
02
LATEST

software posts.

Showing 41-50 of 71.

Claude Sonnet vs Opus: A Practitioner's Guide to Choosing the Right Model

The Claude 4 family (Sonnet 4.6, Opus 4.6, and Opus 4.7) gives builders three meaningful tiers to choose from. Pick the wrong one and you're either burning money on a task that didn't need the firepower, or shipping a feature that almost works. The right choic

Gemini 3.1 Pro Long Context: Patterns That Hold Up in Production

Long context is the dimension where Gemini's family has consistently distinguished itself. With Gemini 3.1 Pro, the ability to process very large inputs in a single call is mature enough to ship into production for serious analytical workloads: codebase reason

Claude Sonnet 4.6 for Production AI Features: A Builder's Guide

Claude Sonnet 4.6 is the model most production AI features should be built on. It's the workhorse of the Claude 4 family: strong enough to handle complex reasoning, fast enough to drive real-time features, and priced for the volume that production usage actual

Claude Opus 4.6 for Complex Reasoning Tasks: When and How to Use It

Opus 4.6 is the model you reach for when the answer matters and the question is hard. It's slower than Sonnet, costs more per token, and you should be deliberate about every call you make to it. But when the task is genuinely complex: multi-step reasoning, dee

Building With the Claude Agent SDK: Production Patterns for 2026

The Claude Agent SDK is Anthropic's higher-level layer for building agents, handling the loop, tool execution, session management, and the surrounding infrastructure that you'd otherwise build yourself. Combined with Opus 4.7's improvements in long-running sta

Building Production Agents with Claude Opus and Tool Use

The gap between "an LLM with tool use" and "a production agent that does real work" is wider than the demos suggest. The model can call your tools, but making it do so reliably, recovering when tools fail, knowing when to stop, and shipping outputs your users

Claude Prompt Caching: Production Patterns That Cut Costs 80%

Prompt caching is the single highest-ROI feature in the Claude API for production workloads. Used well, it cuts the cost of high-traffic endpoints by 70-90% and shaves hundreds of milliseconds off latency. Used poorly, or ignored, it leaves the equivalent of a

LLM Tier Economics: Flash Lite vs Pro vs Frontier - A Decision Framework

Every major LLM provider now offers three tiers: a small/lite model (Gemini 3.1 Flash Lite, Claude Haiku, GPT mini variants), a mid-tier workhorse (Gemini 3.1 Pro, Claude Sonnet 4.6, GPT 5.4/5.5), and a frontier flagship (Claude Opus 4.7, reasoning-tier models

Streaming LLM Responses in FastAPI: SSE, WebSockets, and Real-Time AI

LLM responses are fundamentally different from traditional API responses. A typical database query returns in under 100ms. A GPT-4 completion for a long prompt can take 15-30 seconds to fully generate. Users will abandon a blank screen after 2-3 seconds. The s

Next.js Caching: A Production Deep Dive (Fetch, Router, ISR, Edge)

Next.js has four caches. They interact. They don't always invalidate the way you'd guess. Most production incidents I've debugged in the last year on Next.js apps trace back to a misunderstanding of one of the four, usually the Data Cache or the Full Route Cac