In my last post, I wrote about how using LLMs had given me back the curiosity I felt as an intern. The permission to not know things. The energy that comes from actually figuring something out.
That post was mostly about how I learn (check out the professor skill I built for that). This one is about what I’ve been reading and why I’ve started to think that reading is one of the most useful things I do.
I’m talking about the commute. The lunch break. The idle ten minutes before a meeting. All of it, lately, has been filled with papers, blog posts, and research writeups. Some directly applicable to my work. Some from fields I was in, in another lifetime. All of it, though, genuinely interesting.
When science builds on itself in public
The thing that caught my attention recently wasn’t a single article. It was a chain of them.
It started with Karpathy’s autoresearch: an experiment where a coding agent autonomously improved a neural network training script. No human in the loop between experiments. The agent brainstormed from code context alone, ran the experiments, checked the metrics, and kept or discarded the results. How cool is that?
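To make the shape of that loop concrete for myself, here’s a toy sketch in Python. To be clear, this isn’t Karpathy’s actual code; the helpers (propose_change, run_benchmark, the train.py invocation) are entirely my own stand-ins. The point is just the rhythm: propose, run, compare, keep or discard.

```python
import shutil
import subprocess
import tempfile


def run_benchmark(repo_dir: str) -> float:
    """Hypothetical helper: run the training script and return the metric to maximise."""
    out = subprocess.run(
        ["python", "train.py", "--report-metric"],  # made-up flag, stands in for "get the score"
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return float(out.stdout.strip())


def autoresearch_loop(repo_dir: str, propose_change, n_experiments: int = 50) -> float:
    """Propose a change, run it, keep it only if the metric improves."""
    best = run_benchmark(repo_dir)                      # establish the baseline first
    for _ in range(n_experiments):
        with tempfile.TemporaryDirectory() as trial:
            shutil.copytree(repo_dir, trial, dirs_exist_ok=True)
            propose_change(trial)                       # the agent edits code from context alone
            try:
                score = run_benchmark(trial)
            except (subprocess.CalledProcessError, ValueError):
                continue                                # broken experiment: discard it
            if score > best:                            # improvement: keep the change
                best = score
                shutil.copytree(trial, repo_dir, dirs_exist_ok=True)
    return best
```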
Then came pi-autoresearch, which took that loop and generalized it: any project with a benchmark and a test suite. Shopify’s CEO Tobi Lütke ran it on Liquid, the Ruby template engine that processes $292B in annual merchandise volume. The agent ran ~120 experiments, producing 93 commits that cut parse and render time by 53% and allocations by 61%, with zero regressions across 974 unit tests. Just to emphasise: CUTTING RENDER TIME BY >50%! 🤯
And then this week I read SkyPilot’s Research-Driven Agents post, which extends the loop with a step that, in hindsight, seems obvious: let the agent read first.
So they added a literature phase. Before running experiments, the agent reads papers, studies competing forks, and looks at what other backends already do. The same prep a senior engineer would do before touching unfamiliar code. In practice, it turned up findings that were interesting but not obvious. I won’t spoil them here, but you can read the post yourself.
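Sketched against my toy loop from above, the extension is almost embarrassingly small, which I suspect is the point. The function names (gather_sources, summarize) and the context argument are mine, not SkyPilot’s.

```python
def research_driven_loop(repo_dir, propose_change, gather_sources, summarize,
                         n_experiments=50):
    """Same loop as before, but the agent reads before it experiments."""
    notes = summarize(gather_sources(repo_dir))    # papers, competing forks, other backends

    def informed_propose(trial):
        propose_change(trial, context=notes)       # proposals now condition on the notes

    return autoresearch_loop(repo_dir, informed_propose, n_experiments)
```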
What I can’t get over is the lineage. Karpathy publishes autoresearch → the pi-autoresearch team generalizes it → SkyPilot extends it with a literature phase → all of it is public, linkable, reproducible. Follow the thread, build on it, explore to your heart’s content. That’s science doing what science is supposed to do, just faster, and without any of the usual friction around access.
The pleasure of reading things you half-understand
I want to say something that might sound a little embarrassing: I’ve been genuinely enjoying reading papers and research posts again, even when I don’t fully understand them (especially when I don’t fully understand them).
Michael Chavinda’s post on what category theory teaches us about DataFrames did this for me. The premise is that pandas has 200+ methods, and most developers (me included) have memorized just enough of them to get things done. But if you look at what they actually do to a DataFrame’s schema, almost all of them collapse into three fundamental operations — Delta, Sigma, and Pi — which turn out to be concepts from category theory, specifically from Fong and Spivak’s work on database migration functors.
select, exclude, rename: all Delta. You’re reshaping the schema without inventing new data.
groupBy followed by an aggregation: Sigma. Many rows collapse to one per key.
join: Pi. Two schemas combine along a shared key.
The rest — filter, sort, distinct — operate within a single schema and fit into a different part of the categorical picture (toposes, if you want the word).
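Because I learn best by typing, here’s how I translated the three operations into plain pandas. The data and column names are made up, and the mapping is my paraphrase of the post rather than anything lifted from it; I’m also using the nearest pandas spellings (column selection, rename, groupby, merge) rather than the method names above.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer": ["ada", "bob", "ada", "cam"],
    "amount":   [10.0, 25.0, 7.5, 40.0],
    "internal_flag": [0, 1, 0, 0],
})
customers = pd.DataFrame({
    "customer": ["ada", "bob", "cam"],
    "region":   ["EU", "US", "US"],
})

# Delta: reshape the schema without inventing new data (select / drop / rename).
slim = (orders[["order_id", "customer", "amount"]]
        .rename(columns={"amount": "order_total"}))

# Sigma: many rows collapse to one per key (groupby + aggregation).
per_customer = slim.groupby("customer", as_index=False)["order_total"].sum()

# Pi: two schemas combine along a shared key (join / merge).
enriched = per_customer.merge(customers, on="customer")

print(enriched)
```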
I’ve used pandas for years. And I’d never thought about any of this. Reading that post didn’t make me a better pandas user immediately. But it gave me a framework for thinking about data transformations that I didn’t have before, and I’ve already caught myself applying it at work.
What else has been good
Google’s TurboQuant paper is about finding smarter ways to compress the data that AI models need to hold in memory while they’re running. Every time you use a chatbot or AI assistant, the model is keeping track of the conversation by storing a running record in memory — and that memory gets expensive fast. TurboQuant is a new technique for squeezing that data down without meaningfully degrading quality. Their result: you can compress it to roughly a third of its original size and barely notice the difference.
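To get a feel for the arithmetic, I wrote a back-of-the-envelope sketch of the general idea: take a KV-cache-shaped tensor of floats, keep only a few bits per value, and see how much signal survives. This is plain per-column uniform quantization, nothing like TurboQuant’s actual method; it’s only here to show what keeping roughly a third of the bits means mechanically.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for one slice of a KV cache: (tokens, head_dim), normally 16-bit floats.
kv = rng.standard_normal((4096, 128)).astype(np.float32)


def quantize_dequantize(x: np.ndarray, bits: int) -> np.ndarray:
    """Per-column uniform quantization: map each column to `bits`-bit codes and back."""
    levels = 2 ** bits - 1
    lo = x.min(axis=0, keepdims=True)
    scale = (x.max(axis=0, keepdims=True) - lo) / levels
    codes = np.round((x - lo) / scale)        # integer codes in [0, levels]
    return codes * scale + lo                 # the approximation the model would actually see


bits = 5                                      # 5 of the original 16 bits: roughly a third
approx = quantize_dequantize(kv, bits)
rel_err = np.linalg.norm(kv - approx) / np.linalg.norm(kv)
print(f"bits kept: {bits}/16, relative error: {rel_err:.2%}")
```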
The implication is real: AI is costly partly because of how much memory it needs, and papers like this are quietly chipping away at that problem.
And then there’s Eon Systems and the Nature paper they’re building on. A team emulated the entire central brain of Drosophila melanogaster using 139,255 biological neurons and 50 million synaptic connections, reconstructed from connectome scans. The computational model matched the biological fly’s neural responses with 91% accuracy, using nothing but connectivity data and neurotransmitter identity. No hand-coded behaviors.
The digital fly walks. It navigates. It grooms. It feeds. Pure structure producing function.
I find this result quite surprising. The same paper showed that activating sugar-sensing neurons in the model accurately predicted which biological neurons would respond to taste and drive feeding initiation (which they then validated experimentally). We are getting closer to digital twins and that is truly exciting.
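I couldn’t resist sketching the core idea in a few lines, just to convince myself that structure producing function isn’t magic. If all you have is who connects to whom and whether each neuron excites or inhibits its targets, you can already run a crude leaky integrate-and-fire style simulation. This toy is nothing like the paper’s 139,255-neuron model; every number and parameter here is mine.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200                                          # toy network, not 139,255 neurons

# "Connectivity data": a sparse random matrix of synapse counts (rows = postsynaptic).
weights = rng.poisson(0.05, size=(n, n)).astype(float)
# "Neurotransmitter identity": each presynaptic neuron is excitatory (+1) or inhibitory (-1).
sign = np.where(rng.random(n) < 0.7, 1.0, -1.0)
weights *= sign[np.newaxis, :]                   # columns = presynaptic neurons

# Leaky integrate-and-fire style update: no hand-coded behaviour, just structure plus input.
v = np.zeros(n)
threshold, leak = 1.0, 0.9
stimulus = np.zeros(n)
stimulus[:10] = 0.5                              # "activate the sugar-sensing neurons"

for step in range(100):
    spikes = (v > threshold).astype(float)
    # Reset neurons that fired, leak everyone, integrate weighted input from whoever just fired.
    v = leak * (v - spikes * v) + 0.05 * (weights @ spikes) + stimulus

print("neurons above threshold after 100 steps:", int((v > threshold).sum()))
```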
What I keep taking from this
Every one of these articles came from a different place and covered a different topic. But truthfully, who cares? Isn’t this just so fascinating? Maybe it’s because I feel as though I am seeing the future being built right before my eyes, or maybe it’s because smart people are doing smart things, but whatever it is, I am loving it.
See you in the next one! Who knows what I will be thinking about then?