Writing.
Essays, playbooks, and short notes on software, digital marketing, and startup operations. One new piece each week.
By domain.
All posts.
The Operating Cadence I Run for SMB Clients (and Why Weekly Beats Daily)
When I started taking on fractional CTO and operational advisory engagements, I defaulted to the cadence I knew from agency life: daily standups, weekly status reports, and monthly strategy reviews. Three months into my first few engagements, I noticed something…
AI Customer Support That Doesn't Make Your Customers Hate You
I have deployed AI customer support for a Karachi restaurant chain, a Dubai property management company, and a Riyadh e-commerce brand. The failure modes across all three were almost identical. The customers who ended up hating the AI support did not hate it…
Why Most AI Agents Fail in Production, and the Boring Architecture That Doesn't
I have watched more AI agent projects fail than succeed. I have contributed to some of those failures. After enough of them, the failure modes stop being surprising and start being predictable, which means they are preventable. The agents that fail all share a…
Building a Reliable AI Agent: Why Tool-Use Beats Reasoning for SMB Workflows
Most AI agent demos I see work by having the model reason its way through a problem, decide what to do next, and take action. This works beautifully in controlled demos and fails reliably in production environments with real business data, real edge cases, and…
Migrating From GPT 5.4 to GPT 5.5: A Practical Migration Playbook
Migrating production LLM features from one model version to the next is mostly mechanical: change a string, ship the build. Until it isn't. The places it bites: subtle output differences your prompt-test suite missed, cost shifts you didn't notice, and…
ChatGPT 5.5 Multimodal Patterns: Vision, Audio, and Mixed Inputs
Multimodal LLM features have moved from "interesting demo" to "real production capability" over the last two years. With GPT 5.5, vision and audio inputs are reliable enough to ship into customer-facing features for use cases like document analysis, visual support…
ChatGPT 5.5 for Coding Tasks: Where It Wins and Where It Doesn't
Asking "is GPT 5.5 good at coding?" is the wrong question. The right question is: for which coding tasks does it produce reliable, useful output, and when should you reach for a different tool? The answer is more granular than the marketing makes it sound, and…
ChatGPT 5.5: What Changed for Developers
Each minor GPT version brings a mix of broadly-better and use-case-specific improvements. This article focuses on what GPT 5.5 changes for builders: the capability shifts that justify migration effort, the patterns that newly become practical, and the places…
Claude Opus 4.7 1M Context Window: Patterns and Pitfalls
The 1M-token context window in Claude Opus 4.7 is a genuine capability shift, not a marketing increment. But "you can fit it" and "you should fit it" are different questions, and the production patterns for long context are non-obvious. This article walks through…
Claude Opus 4.7: What's New, What It Changes for Builders
Claude Opus 4.7 is the current frontier of the Claude model family. The headline upgrade from 4.6 is the 1M-token context window, five times the size, but the more practical wins are in long-context recall, agentic stability over long sessions, and a noticeable…
Build Internal Tools Before Customer-Facing Ones: The Order That Saved Me Six Months
When I was setting up my third company, a B2B SaaS serving SMB clients in the real estate sector, I made a decision that felt wrong at the time and turned out to be the most operationally correct thing I did in the first year. Instead of building the customer-facing…
Automating Without Disrupting: A Phased Rollout for Teams That Can't Afford Downtime
The biggest risk in automation for a small team is not the technology. It is the deployment. I have seen technically sound automations fail in production not because the code was wrong but because the rollout was wrong: the team was not prepared, the old process…
The Human-in-the-Loop Pattern: Where Automation Should Always Stop
The most reliable automations I have built all have one thing in common: they know where to stop. Not because the technology failed, but because the system was designed to pause at specific points and hand control to a human. This sounds like a limitation…
From Zapier to Custom Code: When the Migration Pays for Itself
I have migrated five clients from Zapier or Make to custom code. Three of those migrations paid for themselves in under four months. One was a break-even after twelve months. One I should not have done. Here is the pattern behind each outcome and how to read…
Cost-Optimising ChatGPT 5.4 Production Deployments
The fastest path from a working LLM feature to a financially sustainable LLM feature is a set of cost optimisations that don't compromise quality. For most production deployments of GPT 5.4, these patterns cut spend by 60-85% with no measurable user-facing impact…
Gemini 3.1 Pro vs Other Frontier Models: A Practitioner's Comparison
The frontier-model market is now a genuine multi-provider one: Anthropic's Claude, OpenAI's GPT, Google's Gemini, plus serious open-weight models. The "best model" varies by use case, by week, and by which evaluation suite you trust. The question that actually…
ChatGPT 5.4 for Builders: Capability Patterns and Production Notes
When a new GPT model lands, the first wave of "what's new" coverage focuses on benchmark deltas. The second wave, the one builders actually need, is about which production patterns the model unlocks, where it changes the cost-quality calculus, and what to migrate…
High-Volume Classification and Extraction with Gemini Flash Lite
The two highest-volume LLM use cases in production today are classification (assign a category to an input) and extraction (pull structured fields from unstructured input). For both, small-tier models like Gemini 3.1 Flash Lite often produce identical-quality…
ChatGPT 5.4: When to Use Reasoning Models vs Standard Chat
OpenAI now ships two distinct families of models for builders to choose between: standard chat models like GPT 5.4, and reasoning-tier models that produce longer, more deliberate outputs by spending more compute per request. They're not interchangeable, and…
Gemini 3.1 Flash Lite: When Fast and Cheap Wins
The frontier-model conversation gets the headlines. The small-tier models do the work. In production AI systems with real volume, the lite-class models (Gemini 3.1 Flash Lite, Claude Haiku, GPT mini variants) handle the bulk of requests, while the frontier tier…
Gemini 3.1 Pro for Builders: Strengths, Use Cases, and Production Patterns
Google's Gemini line has always been positioned as a frontier alternative to OpenAI and Anthropic: strong capabilities, deep integration with Google Cloud, and a willingness to lean into long-context and multimodal differentiators. Gemini 3.1 Pro is the current…
Claude Sonnet vs Opus: A Practitioner's Guide to Choosing the Right Model
The Claude 4 family (Sonnet 4.6, Opus 4.6, and Opus 4.7) gives builders three meaningful tiers to choose from. Pick the wrong one and you're either burning money on a task that didn't need the firepower, or shipping a feature that almost works. The right choice…
Gemini 3.1 Pro Long Context: Patterns That Hold Up in Production
Long context is the dimension where Gemini's family has consistently distinguished itself. With Gemini 3.1 Pro, the ability to process very large inputs in a single call is mature enough to ship into production for serious analytical workloads: codebase reasoning…
Claude Sonnet 4.6 for Production AI Features: A Builder's Guide
Claude Sonnet 4.6 is the model most production AI features should be built on. It's the workhorse of the Claude 4 family: strong enough to handle complex reasoning, fast enough to drive real-time features, and priced for the volume that production usage actually…
Claude Opus 4.6 for Complex Reasoning Tasks: When and How to Use It
Opus 4.6 is the model you reach for when the answer matters and the question is hard. It's slower than Sonnet, costs more per token, and you should be deliberate about every call you make to it. But when the task is genuinely complex: multi-step reasoning, deep…
Process Mapping Before Automation: A One-Hour Method That Prevents Six-Month Failures
One hour of process mapping before I start building has prevented at least three projects that would have taken six months and delivered nothing useful. The clients who skip it give me a problem statement that sounds specific ("automate our invoice approval process")…
The Hidden Cost of n8n and Make: When Code Becomes the Cheaper Option
n8n costs $20 per month on the cloud plan. Make costs $9 per month on the basic tier. These numbers lead teams to choose them over writing code. They are also the least important numbers in the decision. After running both platforms at production scale for clients…
The Automation Audit: How to Decide What to Automate vs. Leave Alone
I have run this audit more than forty times across clinics, law firms, e-commerce brands, logistics companies, and agencies. The clients who skip it spend six months automating the wrong things and then blame the technology. The audit itself takes about three…
Localising Marketing Across Five Countries Without Five Agencies
In late 2023 I took on an engagement that was described to me as "expanding our paid and organic marketing into four new markets." The client was a SaaS product that had found product-market fit in the UK and wanted to add Germany, France, UAE, and Pakistan…
When to Stop Bootstrapping and Raise: A Decision Tree With Real Numbers
Most founders I talk to treat the bootstrapping-versus-raising decision as a values question. Do you want to maintain control? Do you believe in the VC model? Are you building a lifestyle business or a venture-scale company? These are real questions worth answering…
Killing a Product Line Without Killing the Company
In 2022, I helped a SaaS company shut down a product line that represented 35% of their revenue. It was the right decision. It was also terrifying, slow, and handled worse than it should have been. By the time we finished the wind-down, we had lost two customers…
Hiring a Fractional CTO: The Scope, Cadence, and Red Flags
I have been the fractional CTO for over 20 SMB and SaaS companies at various stages. I have also watched founders hire the wrong person for this role repeatedly, in ways that are entirely predictable and largely preventable. This note covers what the role actually…
Hiring Your First Engineer: The Trial Project That Actually Predicts Performance
The standard software engineering interview was designed to filter hundreds of candidates for a large company with predictable, well-scoped work. It was not designed for a three-person startup with an ambiguous problem and a codebase that changes direction every…
The First 90 Days as a Solo Technical Founder: A Week-by-Week Playbook
The first 90 days as a solo technical founder are the most leveraged and most wasted period in any startup's life. You have no customers to disappoint, no investors to report to, and no team to coordinate. You have total freedom and total ambiguity…
Why 'Talk to Customers' Is Bad Advice, and the Five-Question Interview That Replaces It
"Talk to your customers" is the most repeated piece of startup advice and one of the least actionable. I have told founders this myself, in exactly those words, and then watched them come back from ten customer conversations having learned almost nothing useful…
Cofounder Disputes: The Operating Agreement Clauses Most Templates Forget
Most cofounder disputes are not about personality. They are about ambiguity. Two people build a company together, write an LLC operating agreement (or do not write one), and then discover at a critical moment that they have different assumptions about who owns…
The Pre-Seed Stack: What to Build Yourself, What to Buy, What to Skip Entirely
The first engineering decision at most startups is made wrong. Not because the founders are inexperienced, but because the framing is off. "What should we build?" is the wrong question. The right question is: what is the smallest set of things we must build to…
Equity Splits That Survive Five Years: A Framework Beyond "Let's Just Go 50/50"
The 50/50 split is not a decision. It is a postponed decision. Two founders who cannot agree on relative contribution, relative risk, and relative future commitment choose 50/50 because it avoids a conversation they are not ready to have. The problem is that…
Building With the Claude Agent SDK: Production Patterns for 2026
The Claude Agent SDK is Anthropic's higher-level layer for building agents, handling the loop, tool execution, session management, and the surrounding infrastructure that you'd otherwise build yourself. Combined with Opus 4.7's improvements in long-running stability…
Building Production Agents with Claude Opus and Tool Use
The gap between "an LLM with tool use" and "a production agent that does real work" is wider than the demos suggest. The model can call your tools, but making it do so reliably, recovering when tools fail, knowing when to stop, and shipping outputs your users…
Runway Math for Non-Finance Founders: The Spreadsheet I Wish I'd Had at Seed
Most non-finance founders get their first real lesson in runway math from an investor who says "your burn looks high." By that point, you are usually three months from the problem, not three months from the solution. I have acted as the de facto CFO…
The CAC-to-LTV Sanity Check Most Founders Skip Until Month 9
The conversation happens in almost every early-stage engagement I take on. A founder shows me their performance dashboard. CAC is trending down. They are pleased. I ask what the LTV is. They give me a number. I ask how they calculated it. They cite average order…
Claude Prompt Caching: Production Patterns That Cut Costs 80%
Prompt caching is the single highest-ROI feature in the Claude API for production workloads. Used well, it cuts the cost of high-traffic endpoints by 70-90% and shaves hundreds of milliseconds off latency. Used poorly, or ignored, it leaves the equivalent of a…
Why Most Performance Marketing Reports Lie, and the Three Metrics That Don't
I have audited a lot of marketing accounts on behalf of founders who suspected something was wrong. The accounts usually come with a reporting package: monthly PDFs, Google Data Studio dashboards, weekly email summaries. The reports uniformly look good. Green…
The Landing Page Audit I Run Before Spending a Dollar on Paid Traffic
I have been brought in to fix underperforming paid campaigns more times than I can count, and in roughly half of those engagements the problem was not in the campaign. It was on the landing page. The ads were fine. The targeting was acceptable. The budget was…
Email Lists Still Outperform Everything. Here's How to Build One Without a Lead Magnet Treadmill
Email is the channel that never dies and never gets the credit it deserves. Every few years a new platform emerges and the industry declares email obsolete. Then the data comes back: email still produces $35-$42 for every dollar invested, which is higher than…
Brand Positioning for SMBs: A One-Page Framework That Survives Contact With Reality
Every agency positioning engagement I have seen follows the same arc. Week one is a discovery workshop. Weeks two through four are competitive audits, archetype selection, and brand voice development. Week six is a 40-slide presentation deck. The client nods…
TikTok Ads for B2B: When It Actually Works (and the Three Times It Doesn't)
The first time I ran TikTok ads for a B2B client, it was something close to an accident. The client was a SaaS tool for restaurant managers. We were running Meta and Google, both performing reasonably well, and the marketing director asked me to try TikTok because…
Organic SEO After AI Overviews: What Stopped Working and What Replaced It
Google's AI Overviews rolled out to US search results in May 2024. By Q3 2024 I was looking at Google Search Console data for a content-heavy client site and watching informational traffic drop 35% over three months with no meaningful change in technical health…
The Creative Testing Cadence That Beat My Agency-Built Funnel by 40%
In early 2023 I was paying an agency $4,800 per month to run Meta ads for a SaaS product I had an equity stake in. They were competent. The campaigns were structured correctly. The ROAS was defensible. But after 12 months the CPL had drifted from $41 to $63…
Attribution Is Broken. Here's the Three-Number Dashboard I Use Instead
I have sat through dozens of attribution model debates and come to a firm conclusion: every attribution model is wrong, most are also useless, and the energy spent perfecting them would be better invested in improving the marketing itself. Last-click attribution…
Google Ads vs SEO: How to Split a $10k/Month Budget When You Can't Do Both Properly
The $10,000 per month marketing budget question is the one I hear most often from founders and early-stage marketing leads. It sounds like a tactical question. It is actually a business strategy question in disguise, because the right answer depends entirely on…
Meta Ads in 2026: What Still Works After Advantage+ Ate the Manual Playbook
I spent the better part of 2024 fighting Advantage+ on behalf of clients who had carefully built manual campaign structures. I lost. Not because the clients were wrong to want control, but because Meta had redesigned the entire ad platform around their learning…
Document Extraction Pipelines: Combining Vision Models, OCR, and Validation
Three clients, three different industries, nearly identical problems. A London law firm could not extract clause information from contracts arriving in mixed formats (PDF, DOCX, scanned images). A Manchester manufacturer needed structured data from supplier invoices…
OCR Pipelines That Actually Work in Production: Lessons From 2M Documents
Over the last two years I have processed roughly two million documents across five client projects: medical records for a Karachi cardiology clinic, customs forms for a Berlin logistics company, insurance claims for a Bahrain broker, property contracts for a Dubai…
Self-Hosted vs. API LLMs for Internal Tools: A Real Cost Breakdown
This is the question I get most often from clients who have been running LLM-powered internal tools for six months: should we keep paying OpenAI or move to a self-hosted model? The honest answer is that it depends on four specific variables, and I can give you…
LLM Tier Economics: Flash Lite vs Pro vs Frontier - A Decision Framework
Every major LLM provider now offers three tiers: a small/lite model (Gemini 3.1 Flash Lite, Claude Haiku, GPT mini variants), a mid-tier workhorse (Gemini 3.1 Pro, Claude Sonnet 4.6, GPT 5.4/5.5), and a frontier flagship (Claude Opus 4.7, reasoning-tier models)…
Streaming LLM Responses in FastAPI: SSE, WebSockets, and Real-Time AI
LLM responses are fundamentally different from traditional API responses. A typical database query returns in under 100ms. A GPT-4 completion for a long prompt can take 15-30 seconds to fully generate. Users will abandon a blank screen after 2-3 seconds…
Next.js Caching: A Production Deep Dive (Fetch, Router, ISR, Edge)
Next.js has four caches. They interact. They don't always invalidate the way you'd guess. Most production incidents I've debugged in the last year on Next.js apps trace back to a misunderstanding of one of the four, usually the Data Cache or the Full Route Cache…
Next.js Edge Runtime - A Production Reality Check
The pitch for the Edge Runtime sounds irresistible: your code runs in 300+ cities, the cold start is under 50ms, and your users always hit a server within a few hundred miles of them. Latency disappears. The reality, after building several apps that targeted Edge…
Next.js App Router in Production: Patterns That Actually Scale
The Next.js App Router has been generally available for over two years now. The early "should we migrate?" debates have settled: the answer for new projects is yes, and the migration patterns for existing apps are mature. But the App Router rewards a different…
React Server Components: A Mental Model That Actually Sticks
Most React developers I've onboarded onto an App Router project have the same first reaction: "Oh, Server Components are like SSR." This is wrong in a way that takes weeks to unlearn. The mistake leads to apps that ship JavaScript they don't need, fetch data…
React 19 Hooks: use(), useOptimistic, useActionState in Production
React 19 introduced three hooks that fundamentally change how production React apps handle async data and forms: use(), useOptimistic, and useActionState. Used together, they replace huge amounts of boilerplate that previously took libraries like SWR, React Query…
React Rendering Performance: When to memo, When Not to (2026 Edition)
For most of React's history, the standard performance advice was: profile, find re-renders, wrap things in useMemo, useCallback, and React.memo. The advice produced codebases full of memoisation that couldn't pass a real audit: half of it was wrong…
Go Concurrency Patterns That Actually Hold Up in Production
Go's concurrency model is famously approachable: go func() and you have a goroutine. The trap is that easy to write is not the same as easy to write correctly at scale. Most production Go incidents I've debugged trace back to one of three things: leaked goroutines…
Go HTTP Services in 2026: net/http vs Gin, Echo, Chi, Fiber
For years, the standard advice for Go HTTP services was "use Chi" or "use Gin": anything to escape net/http's missing features. The standard library couldn't do path parameters, method routing was awkward, middleware composition was painful. Frameworks closed…
Go gRPC in Production: Patterns for Reliable Microservice Communication
When Go services talk to each other across a network, gRPC is the default for good reason: schema-first contracts, generated client and server code, streaming support, and a binary wire format that's faster and smaller than JSON. But gRPC's defaults are not production-ready…
Spring Boot AI Error Analyzer: One Annotation, Plain-English Stack Traces
Every Java team I've worked with loses hours per week to the same ritual. An exception fires in production, an engineer copies the stack trace into a chat, scrolls past framework noise to find the one line that matters, then walks back through the code to figure…
Spring Boot Modular Monolith: Better Than Microservices for Most Teams
The pendulum has swung back. After a decade of teams over-correcting from monoliths to microservices and discovering the operational tax (distributed tracing, network failures, eventual consistency, deploy-time coupling pretending to be runtime decoupling)…
OpenTelemetry in Spring Boot: A Production Observability Setup
OpenTelemetry has become the default observability stack for modern Java services. It's vendor-neutral (you can ship to Datadog, Honeycomb, Grafana Tempo, or Jaeger with the same code), it covers traces, metrics, and logs in one SDK, and Spring Boot's integration story is…
Spring WebFlux vs Virtual Threads: Which Concurrency Model in 2026
For five years, Spring teams chasing high throughput had one answer: WebFlux. Reactive streams, non-blocking I/O, the whole reactive programming model. The cost was steep: every dependency had to be reactive, with R2DBC instead of JDBC, reactive Kafka clients, reactive…
Spring Boot + Project Loom: Virtual Threads for High-Throughput Java Services
Java 21 shipped Project Loom as a production feature. Virtual threads, lightweight user-mode threads managed by the JVM rather than the OS, fundamentally change the performance profile of blocking I/O applications. For Spring Boot developers, this means near-WebFlux…
Java Spring Boot: The Complete Guide to Building Production REST APIs
Spring Boot is the most widely deployed Java framework in the world. It powers banking systems, healthcare platforms, e-commerce giants, and the overwhelming majority of enterprise microservices. If you're building anything serious in Java, Spring Boot is your…
Python FastAPI: The Complete Guide to Building Production APIs
FastAPI is the fastest-growing Python API framework, and for good reason. It combines Python type hints with automatic OpenAPI documentation, Pydantic v2 validation, and genuine async support. Teams routinely see 2-3× the throughput of Flask/Django for I/O-bound…
Firestore Data Modeling That Survives Scale: Patterns, Pitfalls, and Production Lessons
Firestore's most common cause of failure isn't technical; it's data modeling. Bad Firestore schemas produce expensive queries, hit document size limits, require full collection scans, or make certain features structurally impossible. Good schemas are designed…
Supabase RLS Patterns for Multi-Tenant SaaS: The Complete Playbook
Row Level Security (RLS) is Postgres's mechanism for enforcing data access rules at the database level. In Supabase, it's the primary security boundary between your application and your data. When implemented correctly, RLS makes it structurally impossible for…
Supabase: The Complete Developer Guide for Modern Full-Stack Apps
Supabase is the open-source Firebase alternative built on Postgres. It gives you a hosted Postgres database, REST and GraphQL APIs auto-generated from your schema, real-time subscriptions, built-in authentication, file storage, and serverless Edge Functions…
Firebase for Modern App Developers: The Complete 2026 Guide
Firebase is Google's application development platform: a fully managed suite of backend services designed for mobile and web apps. At its core: Firestore (a NoSQL document database), Authentication, Realtime Database, Cloud Storage, Cloud Functions, and Hosting…
PostgreSQL for Application Developers: The Complete Guide
PostgreSQL is the world's most advanced open-source relational database. It's the default choice for serious applications: powerful enough to handle the most complex data requirements, reliable enough for financial and healthcare systems, and open enough to…
PostgreSQL JSONB: Indexing Strategies and Query Performance Deep-Dive
PostgreSQL's JSONB type is one of its most powerful features, and one of the most misunderstood. Teams reach for JSONB to store flexible data, then discover their JSONB queries are slow, their indexes aren't being used, or their query planner is making bad choices…
Software hub →
Everything I’ve written on software. Start here if you want the deep stack in one place.
Marketing hub →
Everything I’ve written on marketing. Start here if you want the deep stack in one place.
Startup hub →
Everything I’ve written on startups. Start here if you want the deep stack in one place.
Automation hub →
Everything I’ve written on automation. Start here if you want the deep stack in one place.