What we're thinking about.
Short, honest notes on software delivery, AI systems, local models, data pipelines, and the choices that actually matter once something has to work in production.
Human-in-the-loop scraping: when automation hits its ceiling
Some scraping problems cannot be fully automated. CAPTCHA walls, ambiguous extraction, consent screens, phone verification, payment flows, a...
19 Apr 2026 · 6 min read
Cloud Run Jobs, Compute Engine, and Selenium: how we actually run scrapers in production
The three tools people actually reach for when running scrapers on Google Cloud are Cloud Run Jobs, Compute Engine, and a browser automation...
19 Apr 2026 · 7 min read
When to use an MLP vs. a Transformer for tabular data
For most structured tabular problems, a well-tuned MLP (or a gradient-boosted tree) still beats a Transformer on accuracy, training time, an...
19 Apr 2026 · 6 min read
Goose vs. GitHub Copilot: which agent actually ships code?
GitHub Copilot autocompletes. Block's Goose ships. That is the fastest honest summary of where these tools sit in 2026. Copilot lives inside...
19 Apr 2026 · 5 min read
Karpathy's AutoResearch and what it could mean for neuroscience data
Andrej Karpathy's AutoResearch project points toward a future where AI agents run the grunt work of scientific investigation autonomously. I...
12 Apr 2026 · 4 min read
Block's Goose: an open-source coding agent worth running locally
Block's open-source AI coding agent Goose has been picking up serious momentum. It runs as a local agent on your machine, connects to your t...
11 Apr 2026 · 3 min read
Gemma's second act: not ready for agents, but worth watching
We spent a week running the new Gemma models through the same agent workloads we push to Claude every day. Short version: if you were hoping...
10 Apr 2026 · 4 min read
What is a Large Language Model (LLM)? A practical guide for Australian businesses
A Large Language Model (LLM) is an AI system trained on massive amounts of text to predict, generate, and reason with language. They power t...
8 Apr 2026 · 5 min read
How AI agents work: a plain-English guide for Australian businesses
An AI agent is a system that uses a language model as its reasoning engine, gives it access to tools (APIs, databases, browsers), and lets i...
7 Apr 2026 · 5 min read
What is RAG (Retrieval-Augmented Generation) and when should you use it?
RAG (Retrieval-Augmented Generation) is a technique that gives an LLM access to a knowledge base at query time, so it can answer questions u...
6 Apr 2026 · 5 min read
What is an MLP (Multi-Layer Perceptron)? The foundational neural network explained
A Multi-Layer Perceptron (MLP) is the simplest form of neural network: layers of neurons connected by weights, trained to map inputs to outp...
5 Apr 2026 · 4 min read
What is a CNN (Convolutional Neural Network)? How convolutions learn features
A Convolutional Neural Network (CNN) is a neural network architecture designed to process grid-structured data like images by learning local...
4 Apr 2026 · 4 min read
Transformer architecture explained: the model behind every modern LLM
The transformer is the neural network architecture that powers GPT, Claude, Gemini, and every major LLM. Introduced in the 2017 paper 'Atten...
3 Apr 2026 · 5 min read
What is an LSTM (Long Short-Term Memory)? Sequential modelling explained
An LSTM (Long Short-Term Memory) is a type of recurrent neural network designed to learn patterns in sequential data over long time spans. B...
2 Apr 2026 · 4 min read
What is a BiLSTM (Bidirectional LSTM) and when does bidirectionality matter?
A BiLSTM (Bidirectional LSTM) runs two LSTM layers over the same sequence: one forward (left to right) and one backward (right to left). The...
1 Apr 2026 · 3 min read
What is ResNet (Residual Network) and what did residual connections solve?
ResNet (Residual Network) introduced skip connections that allow gradients to flow directly through deep networks, solving the vanishing gra...
31 Mar 2026 · 4 min read
What is CatBoost and why is it still the go-to for tabular data in production?
CatBoost is a gradient boosting library developed by Yandex that handles categorical features natively, trains fast, and consistently outper...
30 Mar 2026 · 4 min read