Source of this website here
I'm currently at UVM doing my PhD in Complex Systems with Josh Bongard.
I like to swing dance, play piano, and train jiu jitsu when I'm not doing PhD things.
Some Links
Contact Me
Blog
03/27/2025
Hypothesis: the linearization of LLM output demands a short-term coherence that is only achievable by adopting a particular reference frame / relevance-realization strategy. In linearizing its output, I think an LLM gets pigeon-holed into a particular "world view," wherein the attention mechanisms get locked into a particular attending strategy which necessarily blinds the model to other important information hidden in its context.
Relevance realization is a way of attending to a subset of available information. As an LLM's context length grows linearly, the number of possible ways it can attend to what's there grows exponentially, making general-purpose attending strategies for long context lengths almost impossible to learn.
The commitment to a particular reference frame can sometimes be seen in the blandness of long-form content produced by LLMs. A lot of people call that the "averaging out" of all of its training data, but I tend to think it's actually the linearization of token generation which begets a commitment to a particular reference frame (which can be boring). Another way of putting it: the prompt sets the reference frame. Maybe a more insightful way of putting it: the reference frame needs to be set and reset by the prompt. This is not new, but it is important. If you've fundamentally misconstrued the frame of the problem you want to solve, or if you're trying to solve the wrong problem, you will not be helped.
This is not a fundamental limit of LLMs. My guess is it's a consequence of the reward strategies used to train them. Short-term coherence is rewarded by next-word prediction error minimization, and longer-term coherence/relevance is rewarded by RLHF. Coherence is the thing which disallows the radically divergent thinking that may be necessary to solve a problem. The discontinuities and non sequiturs of human interaction are key mechanisms for insight. Switching frames is a key mechanism for wisdom. And right now, I don't think LLMs do this.
01/21/2025
I'm working on a project wherein a software system automatically comes up with interventions that are applied robotically to biological systems to optimize some human-given criteria. The software system dreams up schedules of physical stimulus (e.g. electrical fields, chemicals, optical, vibrational, etc.) to apply to the biological system to coax it into (human-)desired behavior.
That's about all the depth I can go into for that project, but I'll point to an already-published paper to give you a clear example of the power of the approach. The paper is Robotic search for optimal cell culture in regenerative medicine and the problem is this: there are an infinity of ways you might apply external stimulus to a stem cell to get it to differentiate into a particular cell type optimally. For example: add chemical A for X minutes, chemical B for Y minutes, and then chemical C for Z minutes. A, B, and C each could take on a gigantic number of values, because they could be plain-old-chemicals or they could also be incredibly complex transcriptional factors which are themselves synthetically engineered, or they could be anything in between. And then, how does one find X, Y, and Z? Chemical A for X minutes could be perfect, but X+3 minutes might destroy the whole system (luckily, we have reason to believe in the robustness of biological systems to such changes, but in principle it's an issue...).
Anyway, the problem is the size of the search space of all possible interventions. It's basically infinite. There are so many things we can do to a cell. The interventions which give rise to the phenotype we desire might be needles in a haystack. But maybe we can find a gradient, like in the paper above. The optimization they sought was the yield of the cell culture, i.e., what percentage of the starting stem cells (iPSCs) can be successfully differentiated into retinal pigment epithelial (RPE) cells? So they defined a search space which parameterized the different chemicals & the amount of time to apply those chemicals to the culture. The order of the application of chemicals was already predetermined based on previous literature (you can imagine the permutational explosion if this wasn't known). Then, they randomly selected a batch of 48 intervention specifications in that search space, tried them all out by robotically administering these chemicals to 48 different cultures in 48 different petri dishes, and finally they automatically evaluated the RPE production efficiency for each intervention specification. That information was used to automatically generate another batch of 48 specifications predicted to produce better RPE yield. It's a batch optimization process that takes into account results from previous batches to better select the parameters for the next batch.
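Here's a minimal sketch of that batch-optimization loop. Everything in it is my own stand-in, not the paper's method: the objective function is an invented toy (the real yield is measured robotically in a wet lab), and the "learn from previous batches" rule is a naive elite-resampling heuristic, not their actual optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the robotically measured objective. The real yield
# is measured in the lab; this invented function just has a single
# optimum so the loop has something to find.
def measured_yield(params):
    target = np.array([0.3, 0.6, 0.2])  # hypothetical "ideal" durations
    return np.exp(-np.sum((params - target) ** 2))

BATCH_SIZE = 48  # one intervention per petri dish, as in the paper
N_BATCHES = 10

# First batch: random intervention specifications in a normalized space.
batch = rng.random((BATCH_SIZE, 3))
best_params, best_yield = None, -np.inf

for _ in range(N_BATCHES):
    yields = np.array([measured_yield(p) for p in batch])
    i = int(np.argmax(yields))
    if yields[i] > best_yield:
        best_yield, best_params = yields[i], batch[i]
    # Naive "learn from previous batches" rule (my assumption, not the
    # paper's optimizer): resample around the best interventions so far.
    elites = batch[np.argsort(yields)[-5:]]
    centers = elites[rng.integers(0, len(elites), BATCH_SIZE)]
    batch = np.clip(centers + rng.normal(0, 0.05, (BATCH_SIZE, 3)), 0, 1)

print(best_yield)  # climbs toward the optimum batch by batch
```

The paper uses a more principled optimizer, but the shape of the loop is the same: propose a batch, measure it, and let the measurements bias the next batch.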
Zoom back out. What's being done here? It's a completely automated process which optimizes a protocol for more efficient manufacture of RPE cells. If you generalize more, it's a blueprint for an automated process which optimizes desired properties of biological systems. If you can measure what you want to optimize, just let the process run and the process will eventually converge to what you want!
I'm over-hyping. Of course there are hairy details. Defining the search space. The search process. The economics. But the applicability is wide-spanning and general. Use it in materials science or physics or chemistry. It works. It's just expensive (at the moment). What are the low-hanging fruit? What spaces should we search?
12/23/2024
Here's what I've read this year.
2024
And here's what I've read in the past few years.
2019
2020
2021
2022
2023
Reading goals for 2025:
11/04/2024
I distract myself a lot. I mean, a lot. The constant stimulus seeps into my day like sludge, filling every temporal nook and cranny which might otherwise be used for breathing. I fasted from food for 4 days. I would like to see how it would change my life to do a serious technology fast. A serious prohibition on podcasts, web surfing, YouTube, Twitter. I would like to couple the fast with the alternative behavior of generation. That is, creating stuff. Whether it's journal entries like this, reflections on my life, working through a math proof, coding up a toy model, writing a meditative essay, texting or calling friends and family, I want to generate and participate instead of passively consume.
Parameters of the fast:
Now how long? What if I said, forever? It really would change my life for the better, I think. YouTube, though, does enrich me through lectures and other quality content... there's a part of me that believes the sheer variety and number of stimuli I see all the time (in my current mode of consumption) does something good for my brain. Like, by providing diverse data points. However, at this point, a lot of the content actually comes from the same distribution... Twitter's algorithm gives me stuff that it knows I want. There's a weird type of novelty there, but it has become more like TikTok in the sense that it gives me shocking, humorous content, and a huge swath of stuff it knows I'm intellectually interested in but, in the face of the sheer quantity, I rarely do anything about. Though - what's the value of the rare thing that I do do something about? Hard to put a price on... I have no idea how my mind and thus my life will restructure to accommodate this prohibition. I'm hoping it will be good. Oh right, how long? How about until the end of this year? That's a solid 56 days. Bring it on.
10/02/2024
Two nails stick out to the left of my door frame (my front door, and it's to the left from the inside, not the outside), and I doubt I will ever remove them. Doing so would require a hammer, wood putty, and white paint. Which isn't a lot, but why bother? They're two harmless nails! Also, a thin slice of velcro, about two inches long, is stuck to the top of the door frame, and I doubt I will ever remove it as well. It matches with a larger piece of velcro above the doorframe, on the actual wall.
On second thought, I'm going to go try to remove them. How sticky can they be?
The thin slice took me approximately 1 minute to remove. I was worried about the adhesive sticking, but most of it's gone too. The majority of the time was spent getting just enough of the slice of velcro off the wall so I could pinch (with my strong hominid fingers) a hanging piece and just vigorously and hastily peel the rest. I am a bit worried, to be honest, about the larger piece of velcro. It can't be a coincidence that the two pieces of velcro are so close in proximity to each other; they must be of the same stubborn adhesive. The thin slice's adhesive was stubborn. I mean, a whole minute it took me. Wish me luck on the larger piece.
Well, it's off. It probably took me something like 70 seconds, which is shorter than I anticipated, so I guess I ought to be grateful. But let me tell you - I feel a dissatisfaction about the whole procedure. I'll tell you how it went. Firstly, I grasped for the upper-left corner of the piece of velcro, and there was already about 5mm or so peeled off. In fact, the entire top length of the rectangle was about 5mm peeled off already; so all there was was a flaccid, windworn substance you couldn't call adhesive along this 5mm stretch. I hypothesized one of two causes: some other person had already attempted a removal and gave up; or the one who placed the velcro initially never patted the adhesive entirely to the wall. The latter is a bit too horrifying to think about, so I guess I will go with the assumption that someone else tried and failed to remove the thing. I have them to thank for giving me a head start. So - I started picking at the top left corner, because 5mm is not enough to pinch with a thumb and a pointer, and eventually I got my fingers around it. The adhesive was still so stubborn that a firm pinch was not enough - I hope I haven't given you the impression that just because I was able to pinch the corner that it was smooth sailing from there. Remember, this thing was above the door. I was about 35-38 seconds in when my neck started to hurt because I was looking almost directly upwards at the removal site, and I'd only peeled about 20% of the surface area of the velcro off the wall. So I looked forward at the door (luckily the door window blinds were closed, else my neighbors might've been a tad skeptical), closed my eyes, and relied on my finger sensors to continue the surgical process. I felt the stickiness on one side, and, oh, I never mentioned - it wasn't the abrasive half of the velcro. So it was not really painful or anything.
But as I was aggressively fighting the adhesive forces, I grew anxious about peeling off the paint along with the adhesive. I never looked up though, I just kept moving my hand nearer and nearer to the frontier where the adhesive still held firmly to the wall, the freshly peeled adhesive breathing my stale apartment air for the first time in who knows how long. Eventually it came off. I looked up, and wouldn't you know it, paint did indeed come off the wall. So now instead of a nice white rectangular piece of velcro slightly off-center above my front door, there's a noisy and fractal blotch of white drywall peeking out from behind the light beige coat of paint above my door. I doubt I will ever paint over it, because the paint - well, it's in the storage closet which is like a whole minute down the hall. Actually, I just checked, the paint is in my apartment closet a few steps away from me right now, but a paint brush is still in the aforementioned storage closet, which is definitely a voyage for another day.
09/25/2024
It's an AI summer. Shirts are off and idols are peddling outlandishly optimistic prophecies of Heaven on Earth (which, hey - this Kool-Aid might actually be quite nutritious). It turns out, the productivity gains we've seen since the dawn of computing have come from multiplying matrices - a lot of matrices. How do you keep a level head in the world of AI during an AI summer when everyone is on coke?
I've been asking myself: what is the pragmatic value of my computer science degree? I can't help but come to the conclusion: less and less each day. LLMs are good at what I do most of the time: generate tokens. They do it better than I do. I sit, hunched over a tray of buttons, billions of pixels streaming through my optical nerves, through my electrical meat network, out my bony fingers back to the tray of buttons, generating tokens in a long-winded biological sensorimotor loop. These LLMs, man, they're built for this generating tokens thing! I can't compete! (I've started to use Cursor, which has assisted me in generating tokens faster)
The thing with LLMs is they don't know what to generate. They're not that goal-oriented. Their goal, ostensibly, is to reply to an immediate prompt to best satisfy the prompter. This is a narrow goal. In contrast, human goals are nested. Why am I generating tokens today? To build a software system to control a machine. Why am I helping build the machine? To help discover new knowledge of biology. Why do I want to help discover new biological knowledge? To invent better medical interventions and unveil the intrinsic problem-solving capacity of biological systems. Why do I want to do that? To heal and extend individual humans to give rise to a more empowered civilization. Why, why, why? (ask this endlessly, you end up in a religious realm.) The more you answer the why, the more abstract the goal, and consequently, the more paths there are towards satisficing the goal.
My point: LLMs help solve immediate goals. The trajectory of LLM development is in expanding their ability to achieve larger-scale goals (by chaining LLMs together, letting specialized LLM agent networks pursue goals collectively, etc etc). This is where software engineers might be able to keep up. We migrate our knowledge towards more abstract pragmatic engineering knowledge. Now that it is easier to create subsystems, it is the gluing together (and the testing) of subsystems which is the more valuable skill. Concretely: software architecture; rapid software testing and design methods; systems engineering; etc etc. Also: domain knowledge. These software systems we build, they interact with the real world somehow. If they don't, they're easy to build! Tokens in, tokens out. You'll never compete with LLMs on those types of problems. I think there will be a premium on knowledge of the real-world systems that the token-crunching systems interact with, e.g. biological systems, engineering fields, physics...
The thing is: goals are specified in tokens. Ponder the abstract goal, "extend total human knowledge." You quickly realize there are effectively an infinite number of ways this might be 1) interpreted and 2) satisfied. And that's the thing which will always be difficult! How do intelligent systems decide to interpret and satisfy a goal? How do they decide to break the goal down into subgoals? Then the subgoals into subsubgoals? Etc. Goal satisfaction always grounds out in moment-to-moment actions (like generating a token) which, hopefully, move the state of the world towards the goal state. As AI gets better and better at achieving more abstract goals, the perverse instantiation problem becomes more and more pertinent.
I've gone off the rails. Sam Altman might've slipped something in my drink. These tech companies are juicing to win this "race." Towards what? AGI? ASI? The fountain of youth? The holy grail? The philosopher's stone? What happens on the comedown? What's the hangover like?
09/16/2024
This dialogue is in response to a paper which poses questions one ought to wrestle with if one wants to "form or express justified opinions about issues of intelligence, cognition, sentience, consciousness, or any of the related concepts."
Prompt: If one is referring to “life”, as a counterpoint to machines, does the definition of “life” apply to evolved beings that use a completely different substrate, or to ones that incorporate designed materials?
If I were to seek a salient difference between systems considered "Life" and machines, it can only come from a consideration of the system's history. Systems which constitute "Life" are part of a lineage of natural biological evolution in the physical universe (as opposed to any engineering process, even an automatic evolutionary process in a simulated world). If you were to define life in a non-historically-contingent way, you wouldn't be able to achieve a meaningful distinction between systems, because definitions based on system capacities (i.e. cognitive or physical capacities) will crumble as advancing technology achieves those capacities, whatever they are.
What about "Life" which undergoes technological interventions, like gene editing, or the addition of non-evolutionary materials (e.g. lab-synthesized molecules, piezoelectric granules, etc.), or biohybrid systems?
If I am to commit to a binary distinction between Life and machines (with the definition I have given above), I would say these systems are still Life. A biological cell whose gene has been edited using CRISPR is undergoing a transformation in a novel (but still natural) environment. From the perspective of the cell, the change is simply a new selection pressure (though unpredictable from the cell's perspective, therefore may be considered as a new type of environmental noise) it has not seen in its evolutionary history. Same with the addition of non-evolutionary materials, like lab-synthesized molecules--these systems are still in Life's lineage (and therefore considered Life), however, their environments today are radically different from their evolutionary environments thanks to human technologies. For a biohybrid system, I would make a different argument to defend the binary category. A biohybrid system contains subcomponents which are part of Life's lineage, and subcomponents which are not. The biohybrid system can be thought of as a collection of interfaces between non-Living and Living components.
You've introduced this term "Life's lineage." How do you distinguish systems which are part of this Lineage, and those which are not?
Systems part of Life's lineage are systems with unbroken self-reproduction cycles leading back to the first successful self-replicating system, LUCA (or any other spontaneously arising self-replicating system across the universe). The Living system's death is characterized by a permanent breaking of this reproduction cycle. It is this system capacity of self-reproduction in conjunction with historical contingency which constitutes my definition of Life. Machines, then, are distinctly not part of this lineage, but rather systems which have been shaped by Living systems, and can therefore be considered offshoots from Living systems. That is, all machines and advanced cognitive technologies are offshoots of Life's lineage, created by the embodied physical capacities of Living systems. Even if these technological systems can self-reproduce (we may call these systems Artificial Life), their history is most meaningfully begun with a Living system's act of creation.
Can you please define an act of creation? It seems to be the hinge on which this definition rests, because a Living system begetting a distinctly non-Living system is where the category boundary seems to operate.
Yes. A Living system's act of creation depends on the notion of a system's environment. A Living system exploits the relationship between itself and its environment (i.e. takes actions) in order to combine resources (Living or non-Living) from its environment into a new system separate from itself. This is a non-teleological and mechanical definition of an act of creation. Thus Life creates new Life by combining or exploiting resources, (e.g. through crossover, parasitism, etc.) to reproduce itself, carrying on Life's lineage. There must be a little leeway for the concept of self-reproduction to account for imperfect, lossy reproduction in Life's lineage. To preempt questions of system/environment distinctions (and definitions of Self), I must recognize the malleability of these concepts. I think these touch on philosophy of language and categorization itself, which requires a far longer dialogue. In short, I think they are best employed for a specific purpose.
Tell me, why is this a meaningful distinction? What does this historical contingency get us?
A pragmatic engineering definition requires us to talk about system capacities, which, I think, are substrate and history agnostic. So perhaps it is not meaningful there. I do think it's worth asking: do we care about Life, as defined here? Do we care about the natural evolutionary lineage, as opposed to artificial/technological Life? Perhaps not. But, perhaps so - Living systems are indeed the ostensible subject of the entire field of biology, and we ourselves being a Living system would like to know about ourselves and our pasts.
I argue the distinction might be most fruitful for compelling storytelling. Take for example the emerging field of biological robots. By placing Living systems in novel environments (outside of their "normal" evolutionary trajectory), perhaps we can learn about and exploit their existing creative capacities. We give them novel conditions and resources and watch them respond creatively by changing both themselves and their environment. We actually shape the Living lineage's progression, literally offering new affordances to its development, new technological building blocks, and new environments. Instead of the total interruption of Living systems' lineages, perhaps we instead become collaborators in the creation of new Living, non-Living, and hybrid systems.
09/01/2024
I made a syllabus for myself which I'll be working through this semester: Formalizing Robustness and Evolvability
I'll be reading three books:
1. Robustness and Evolvability in Living Systems (RELS) by Andreas Wagner. I'm not done with it yet, but RELS is hands down the best systems biology book I've ever read. Wagner presents the empirical evidence of biological robustness, and how evolution proceeds both in living and man-made systems, in a very clear way. It is a technical book.
2. Design for a Brain (D4B) by Ross Ashby
3. Introduction to Cybernetics (IC) by Ross Ashby
Ross Ashby is my favorite cybernetician, and his writing is extremely clear. His original formalizations of system stability and ultrastability delineate the problem in systems terms (that is, substrate-agnostic). I'll primarily focus on his writing regarding his Law of Requisite Variety, stability, and ultrastability.
I'll read several papers on information theoretic formalizations, like redundancy measures and (functional) degeneracy measures. Information theory can help formalize information flows in a substrate-agnostic way.
I'll familiarize myself with error-correcting codes and associated algorithms. This is a weird one to include, but hear me out: systems need to write themselves into the next timestep of the universe. The universe is noisy, and so errors can happen. The structures which propagate through time in the universe are thus implementing some kind of error correction. This is perhaps so broadly true as to be useless, but I imagine some principles generalize, and I miss discrete math. Important note: when coding theory first developed, it was for the sake of reliable communication, and therefore perfectly recovering a message ("unique decodability") was necessary. It turns out that most important problems today are incredibly high-dimensional, and "fuzzy" error correction is actually more useful, e.g. by operating in a latent space. And biological systems are similarly fuzzy, and bowtie architectures are pervasive. So I think unique decodability is overrated today, and chasing it may be fruitless in the pursuit of engineering robust and adaptive systems.
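For concreteness, here's the simplest uniquely decodable error-correcting scheme, a repetition code over a noisy binary channel (a minimal sketch; the parameters are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)

# Repetition code, the simplest error-correcting code: repeat each bit
# n times and decode by majority vote. It uniquely corrects up to
# (n - 1) // 2 flipped bits per block, at an n-fold redundancy cost.
def encode(bits, n=5):
    return np.repeat(bits, n)

def decode(coded, n=5):
    return (coded.reshape(-1, n).sum(axis=1) > n // 2).astype(int)

message = rng.integers(0, 2, 100)
coded = encode(message)

# Binary symmetric channel: each transmitted bit flips with probability p.
p = 0.05
noisy = coded ^ (rng.random(coded.size) < p).astype(coded.dtype)

recovered = decode(noisy)
print((recovered == message).mean())  # nearly all bits survive the noise
```

Majority voting is itself a kind of bowtie: many redundant signals funnel down to one decision. The "fuzzy" codes I mention above relax the demand that every block be recovered exactly.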
04/24/2024
Self. Organized. Criticality. It is the prerequisite to complexity.
The OG model, less intuitive and less approachable than the sandpile model, was a lattice of springs with a pendulum at every vertex. Choose a random pendulum in the lattice to fully wind around its vertex a single time; this is the unit of energy going into the system. By winding a pendulum around once, it puts extra tension in the springs it is connected to. Each spring has a critical point where it cannot be wound further and it unwinds, releasing energy into the other springs it is attached to. Depending on whether or not those springs can tolerate the extra energy, they either take on the energy themselves or they also unwind, propagating the energy dispersal to their neighbors, and the local dynamics repeat in a cascade. "Chain reactions" or "avalanches" can occur, where winding a single pendulum once can set off a huge dispersal of energy throughout the system. The point here is, by injecting tiny bits of energy randomly throughout this relatively simple system (by winding a pendulum once around), the system automatically converges to a "critical state," wherein small local perturbations can have global effects. The size of these energy dispersal cascades (measured by the number of times all the affected pendula spin around during the avalanche), importantly, follows a power law.
Let's pivot to the more intuitive sandpile model. Start with a flat square table. Drop a grain of sand, one at a time, on the table, at the same position. Keep going. Eventually a pile will form, and this pile shows the same dynamics as the spring-pendulum system: each grain of sand sets off an actual avalanche of other grains of sand. The size of those avalanches follows a power law: most grains you drop set off tiny avalanches, and only rarely does a grain set off a huge one.
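Those dynamics can be simulated in a few lines. This is a minimal sketch of the Bak-Tang-Wiesenfeld sandpile on a grid rather than a table; the grid size, toppling threshold of 4, and drop count are the standard textbook choices, not anything from a specific source:

```python
import numpy as np

rng = np.random.default_rng(2)

# Bak-Tang-Wiesenfeld sandpile on an N x N grid. A site holding 4 or
# more grains topples: it loses 4 grains, sending one to each neighbor
# (grains pushed off the edge are lost). The avalanche size is the
# total number of topplings triggered by one dropped grain.
N = 20
grid = np.zeros((N, N), dtype=int)

def drop_grain(grid, i, j):
    grid[i, j] += 1
    topplings = 0
    unstable = [(i, j)]
    while unstable:
        x, y = unstable.pop()
        while grid[x, y] >= 4:
            grid[x, y] -= 4
            topplings += 1
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < N and 0 <= ny < N:
                    grid[nx, ny] += 1
                    unstable.append((nx, ny))
    return topplings

# Drive the system grain by grain; it self-organizes to criticality.
sizes = np.array([drop_grain(grid, *rng.integers(0, N, 2))
                  for _ in range(10000)])

# Most drops topple nothing; a rare few cascade across the whole grid.
print(np.median(sizes), sizes.max())
```

No parameter is tuned to put the system at the critical state; the slow driving and the edge dissipation get it there on their own, which is the "self-organized" part.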
We call it self-organized because the system behaves according to local rules and energy is added to the system in a random way. The criticality bit refers to the system being at a "critical state", which is a word originally used to describe systems undergoing a phase transition (e.g. the process of ice turning to water). This critical state is characterized by some intermediate state between an ordered system (crystallized, regular ice) and chaos (fluid churning water). Self-organized criticality refers to the process by which a system maintains this critical state, and actually returns to it if the system dynamics change.
This is actually incredible. It implies subsystems of the universe can be driven towards a state wherein complexity emerges. It helps explain why complexity exists at all. Why is the universe not a boring chaotic gas or a boring orderly crystal? Well--this might be the answer. It suggests systems have a "poise" to them, wherein perturbations largely don't make a lot of difference, sometimes they make some sizeable impact, and rarely they make monstrous sweeping changes to the whole system. We see this pattern in all kinds of systems, whether economic, physical, biological, social, or otherwise.
Effect of genotypic changes on phenotype
In Robustness and Evolvability in Living Systems, Andreas Wagner presents evidence that metabolic networks have this criticality property: perturbations to the metabolic network have a power law of effect on the function of the network. Let's clarify this. A metabolic network can be represented as a directed graph of chemical reactions where organic molecules (the nodes) react (edges) to form other molecules (another node). The metabolic flux is the rate at which this network can convert some input molecule (e.g. glucose) into an energy-dense output molecule (e.g. ATP). These metabolic networks can have a huge number of molecules interacting in them (hundreds). The flux control coefficient of a given enzyme is a measure of how much the overall flux changes per unit of change in the activity of that particular enzyme. In essence, how important is that enzyme to the overall metabolic function? It turns out, if you observe many diverse metabolic networks in various species of bacteria, the flux control coefficients of a given network are distributed as a power law. That is, the activity of most enzymes has little to no impact on the overall flux. They can be removed from the network altogether and still have no impact on the overall metabolic function (as measured by flux). There are some few enzymes which have massive effect on the overall flux. This implies mutational robustness (because enzymes are encoded by genes) of metabolic networks. How does criticality relate to robustness? What about evolvability?
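To see what a power-law distribution of control looks like, here's a synthetic illustration (invented data, not Wagner's measurements): draw hypothetical flux control coefficients from a heavy-tailed Pareto distribution and compare how much control the top few enzymes hold versus the bottom half.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic illustration (not Wagner's data): 1000 hypothetical flux
# control coefficients drawn from a heavy-tailed Pareto distribution,
# normalized to sum to 1 (the summation theorem of metabolic control
# analysis). The point: control concentrates in a few enzymes.
coeffs = 1 + rng.pareto(a=1.5, size=1000)
coeffs /= coeffs.sum()
coeffs.sort()

top_share = coeffs[-100:].sum()    # flux control held by the top 10%
bottom_share = coeffs[:500].sum()  # flux control held by the bottom 50%
print(top_share, bottom_share)
```

With a heavy tail like this, the top 10% of enzymes hold far more control than the entire bottom half, which is exactly the "most enzymes barely matter, a few matter enormously" pattern.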
Cognition? Thinking?
Most thoughts have almost no effect on action. We're constantly thinking thinking thinking. Some thoughts have some impact on action: I'm hungry, I'll go eat. Relatively rarely throughout a human's existence, a thought will monstrously transform future actions. Maybe the impact of a thought on a human's motor function follows a power law. Perhaps the right framing is within the predictive processing framework. Sensory input is reconciled with an internal world model's predictive output; the magnitude of the error, if I were to guess, probably follows a power law. Most error is tiny. Some errors are medium-sized, causing corresponding medium changes in the world model. Some errors are so massive that a massive update to the world model must be performed to account for it. If huge errors occur with no corresponding model update, let's call that trauma--it's psychophysiological tension that builds in the system, much like the springs building potential energy.
04/23/2024
Today I started reading Per Bak's book, "How Nature Works" (what a title, by the way). It reminds me of the value of the field of complex systems, and what the field actually claims to do. First, the field is necessarily an abstract one. It doesn't claim to predict any of the details of any physical, natural system. I was always attracted to complexity science, but this abstractness bothered me for the longest time, because... what's the use of vague, non-specific claims? Well... there are patterns which cannot be explained using the tools of any particular discipline. Complex systems seeks to explain, in broad strokes, those patterns that recur in diverse systems. For example, why do power laws show up when we study earthquakes, language (frequency of words), or metabolic networks? Complex systems to the rescue.
In a debate between Stuart Kauffman and John Maynard Smith (an evolutionary biologist), JMS claims he is skeptical of the field of complexity science because it makes no specific claims about natural systems. Per Bak says: that is okay. There has been too much focus on the details of specific systems; the last 3-5 centuries of science have largely been a pursuit of reduction, recursively picking apart the components of systems. Don't get me wrong; reductionism has given us pragmatic technological advances and an incredibly explanatory view of reality. We have lightbulbs and cars and electricity and refrigerators and rocketships thanks to advances in physics! The fruit of reductionism in chemistry has given us all kinds of medicines and useful polymers and materials to work with. Reductionism in biology has unveiled the basic mechanisms of evolution and enabled many methods for tweaking the biology of ourselves and the organisms we share the planet with. But, Per Bak (and I) say, it is time for a discipline of integration and complexity. Knowing about a water molecule doesn't explain waves; knowing about DNA doesn't explain cells; knowing about a neuron doesn't explain consciousness; knowing about a human being doesn't explain how the global economy functions.
There is a tension between narrative and science that Per Bak explores in the first chapter of How Nature Works. Stephen Jay Gould has said many disciplines of science "resort to" storytelling and narrative over experimentation; but what is an experiment if not a story? "I did this, and this happened." If you stack up enough of those precious nuggets of anecdotal evidence, a claim becomes scientific. When Darwin presented the theory of evolution by natural selection, he filled On the Origin of Species with a mountain of anecdotal evidence that, when taken in aggregate, could not be ignored. This is why we always look for the number of individuals in a clinical trial! We want to see: does the pattern repeat itself, at different times, at different places, in similar systems?
In contrast, the study of history seems largely narrative. Things in history only happen once, and the study of history seeks to weave an accurate post hoc narrative account of the matter at hand. World War II only happened once, and what we seek there is to weave it into an accurate narrative of the characters and events that comprise it. What makes an historical account accurate are the empirical facts which undergird the overarching narrative. Of course, we can always abstract away the details and ask: what is common across all wars? Are there patterns there? Of course there are! And that might be considered a more general science, because then a statistical argument can be assembled and put forth: The Theory of War (or something). There are fields like cliodynamics and cultural evolution that purport to study these macroscale historical patterns.
The rest of the book is about self-organized criticality, which seems like the core of complexity science.
03/26/2024
In Darwin's Dangerous Idea Dan Dennett exapts the notion of a "forced move" in chess to evolution. In chess, when you are put into check, you are forced to choose from a small subset of moves if you want to keep playing the game. The idea in its evolutionary form is this: the artifacts of an evolutionary process are solutions to problems, and some problems force systems into using a particular solution. Those particular solutions are forced moves. We can conceive of all kinds of life, whether artificial or extraterrestrial, but there are certain aspects of possible life which we can reasonably assume are true about them. Two examples are obvious: all possible life must have a metabolic process to self-sustain (this is almost definitional of life), and life must have a boundary, or a membrane of some kind that distinguishes it from the rest of the world.
The latter, that boundaries are going to appear anywhere we see life, wasn't totally obvious to me at first. But it makes sense: if we assume life is something like a self-replicative process, then that process either goes on indefinitely until it turns all of its surroundings into itself, creating a homogeneous mixture, a totality (and I would say non-life), or, it dies off because its surroundings can't fuel its self-replicating process. This is all not to mention that if we are going to truly recognize anything as alive, it needs to be distinguishable in some way from the context it is embedded in.
And with those two forced moves, we basically have cells, no? A metabolic process which is separated from its world with some boundary? The chemical details may be incredibly different (or maybe not, because there may be "chemical laws" that constrain membranes largely to be phospholipid bilayers, or just bilayers, or maybe there's really only the ATP-based family of metabolic processes, who knows), but the physical necessities of energy maintenance and separation from environment are foundational "forced moves" in biology.
Forced moves add a dimension to evolutionary thinking because evolutionary adaptations can be more or less forced, more or less obvious solutions to problems organisms encounter. Do we need 5 fingers? Why not 4? 3? 2 seems suboptimal. Could we do more using a 6th? The lineage leading us to have 5 fingers could've been totally forced out of adaptive necessity or they could be a product of historical contingencies that were largely random. We know that most mutations in evolutionary history are neutral (Wagner), meaning they have negligible effect on the survival and reproduction of an organism. So organisms explore this "neutral space" without becoming more fit, and they can "drift randomly" into vastly different configurations. Are 5 fingers a result of this drifting, or is 5 somehow an optimal number, a forced move?
This idea is widely applicable. Dennett suggests mathematics is a forced move in conceptual space. If there are cogitating species elsewhere in the universe, they likely converge on/discover a mathematics isomorphic to our human mathematics here on planet Earth. This is because mathematics is simply correct, and there is no other way to do it. 2+2=4, no matter what. The details, the particular representations, of course, are almost certainly different, but the underlying structure must be the same. Mathematics is converged upon because it is useful for confronting a wide range of problems. Other pragmatic notions (e.g. cost-benefit analysis) "can be relied upon to impose themselves on all life forms anywhere." In the realm of ideas, these general principles are opposed to artifacts like Shakespeare's plays because Shakespeare's plays are contingent on being human, most specifically being the infinitely unique William Shakespeare. His plays are a unique confluence and expression of a vast number of processes operating on many different timescales, and it is vanishingly improbable that those artifacts could be produced elsewhere in time or space.
02/12/2024
What is intelligence, exactly? And, consequently, how do we build generally intelligent systems? To me, intelligence is the ability to solve problems in a variety of ways. Thus intelligence is proportional to 1) the scope of the problems you can solve and 2) the variety of ways you can solve those problems.
Let's take apart each of these. First, what's a "problem"? An agent interacts with its environment; the agent itself has a state, the environment has a state, and the relationship between agent and environment has a state (which is the dynamic coupling between agent and environment). Problems are essentially the difference between your current and desired state. In other words, a goal. Another way to articulate 1) would be "the scope of the goals you can achieve," where a goal is a desired relational state between self and environment.
How exactly would we represent a desired state generally? In the case of humans, maybe there's a phenomenological state... Or perhaps using this idea of information integration, it's like, the state of the topmost layer of information integration (which in biological systems is necessarily evolved and thus has pragmatic or evolutionarily advantageous "desired states").
And then, the scope of a desired state might be represented as the magnitude of the delta between the current state and the desired state. What is my intero- and exteroceptive state? Does it match my desired state? The size of this difference is the scope.
There's a hairy detail hiding here: the matter of an agent having an internal representation of its own desired state to guide its actions. Think about an agent without an internal representation of its desired state, and one with an internal representation. Or do they all have a "desired state" representation implicit in their dynamics? Well, it's obvious that the true end state of a system is implicit in its true dynamical relationship to the environment and the system will inevitably head towards that state, but these full dynamics and fateful outcomes are inaccessible to the agent. There are some systems (like us humans) that have abstract representations which guide actions towards some desired agent-world relationship; these representations are somewhere inside the true system dynamics, and they are abstractions (i.e. compressions) of the desired state. That right there may actually be the evolutionary step-change that happened with humans: we started to be able to internally represent our desired states on a much larger conceptual and temporal scale. Did we find a better compression algorithm? Language, perhaps?
Systems without internal representations are necessarily limited in scope. As a super simple example, a Braitenberg Vehicle's (the coward) "desired state" is actually just a state of minimal photonic activation in its sensors. This is a relational state that can be modeled mathematically and which captures the entire coupling of the vehicle's relationship to the environment. There is no internal representation: sensation directly affects action. All of this implies systems with internal representations have memory. It also implies the representation is a causal entity within the agent.
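The coward's wiring can be sketched in a few lines of code (a toy sketch, not taken from Braitenberg's book: the sensor placement, gains, and light-falloff function here are my own assumptions). Each sensor excites the wheel on its own side, so the brighter side's wheel spins faster and the body turns away from the light; sensation maps directly onto action, with no stored state anywhere in the loop:

```python
import math

def sense(px, py, lx, ly):
    """Light intensity at a point: falls off with squared distance to the source."""
    d2 = (px - lx) ** 2 + (py - ly) ** 2
    return 1.0 / (0.1 + d2)

def step(x, y, theta, lx, ly, dt=0.05):
    # two light sensors mounted ~45 degrees left and right of the heading
    a = math.pi / 4
    s_left = sense(x + 0.1 * math.cos(theta + a), y + 0.1 * math.sin(theta + a), lx, ly)
    s_right = sense(x + 0.1 * math.cos(theta - a), y + 0.1 * math.sin(theta - a), lx, ly)
    # "coward" wiring: each sensor excites the wheel on its own side,
    # so the brighter side spins faster and the vehicle turns away
    v_left, v_right = s_left, s_right
    v = (v_left + v_right) / 2          # forward speed
    omega = (v_right - v_left) / 0.2    # turning rate (axle width 0.2)
    theta += omega * dt
    return x + v * math.cos(theta) * dt, y + v * math.sin(theta) * dt, theta

lx, ly = 0.0, 0.0                   # light source
x, y, theta = 1.0, 0.0, math.pi / 2 # start one unit away, heading tangentially
d0 = math.hypot(x - lx, y - ly)
for _ in range(2000):
    x, y, theta = step(x, y, theta, lx, ly)
# with this wiring the vehicle steers away from the light and ends up
# farther from it than where it started
```

Note that `step` is a pure function of the current sensor readings; there is no variable anywhere that persists between timesteps besides the vehicle's physical pose, which is exactly what "no internal representation" means here.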
So, an agent's intelligence is proportional to the conceptual and temporal scope of desired states that can be fulfilled by the agent. This is essentially an empowerment framing for intelligence. The capacity of an agent to take actions in order to reach desired states.
11/09/2023
Communication relies on organizational structures. There is some message, which is to be interpreted by some organized entity. The message could be as simple as a bit and the organizational structure a NOT gate. The natural world communicates with us in some sense, because we are organizational structures which have evolved in order to interpret the messages of nature. When we enter the world of "interpretation," we enter the world of meaning. These messages from our environment mean something to us, i.e. they have a pragmatic implication for our actions. The "meaning" of a 1-bit to a NOT gate is simply to flip the bit, or output a 0-bit.
What is the nature of the human interpretational structure?
There are systems with and without memory. A system with memory has an ability to store information; the stored information can only be said to be part of the system if it alters the system's interpretation of messages. If information is stored "within the system" which does not actually alter the system's behavior, then the information ought not be considered part of the system. Then, interpretational structure is the same thing as the system itself.
When a message enters a system without memory, the result of the system's interpretation is only an injection of information back into the environment as a transformation of the inputted message. In the example of our NOT gate, the bit was flipped. The system "acted on" the message and re-injected information back out into its world. A system with memory has the option of updating itself internally for altered interpretation of future messages.
If time is discrete, a system without memory is any system in which the result of the interpretation of a message arriving at time $t$ is independent of all messages arriving before and after timestep $t$.
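The distinction can be made concrete with a minimal sketch (the toggling gate is my own illustrative example, not from any formal source): a memoryless interpreter's output at time $t$ depends only on the message arriving at $t$, while a system with memory carries state that alters its interpretation of future messages.

```python
# A memoryless interpreter: output at time t depends only on the message at t.
def not_gate(bit: int) -> int:
    return 1 - bit

# An interpreter with memory: the stored state alters how future
# messages are interpreted, so the state is part of the system.
class TogglingGate:
    def __init__(self):
        self.invert = False

    def interpret(self, bit: int) -> int:
        out = 1 - bit if self.invert else bit
        self.invert = not self.invert  # update self for future messages
        return out

msgs = [1, 1, 1, 1]
print([not_gate(b) for b in msgs])     # [0, 0, 0, 0] -- history-independent
g = TogglingGate()
print([g.interpret(b) for b in msgs])  # [1, 0, 1, 0] -- history-dependent
```

Feed the NOT gate the same message twice and you get the same answer twice; feed the toggling gate the same message twice and you get two different answers, because the message changed the system itself.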
Meaning is not objective. It is not exactly subjective either, because the meaning of a message or the meaning of an experience is a function of both the objective "outside" world and the structure of the interpreting system. Meaning is, to use a term John Vervaeke coined, transjective. Meaning is the implication of a message for the future of a system's behavior.
I ask again, what is our meaning-making machinery, our interpretational structure? How do we update how we interact with the world based on our experiences (which are messages from the world)? What are these higher-order meaning structures which manifest in consciousness? Concepts? Affordances? What then is language?
I've bootstrapped too quickly, I think. But the cybernetics approach to this question of meaning is worth an investigation.
09/10/2023
How do you think about the universe, fundamentally? How would you characterize the universe's essence? Different ideas of the universe float around our zeitgeist, but there's a common one amongst scientists that tends to reign as the most "true." That universe, the mechanistic atoms-in-motion universe of randomness, chaos; that universe which is characterized by pseudo-infiniteness; that universe which contains us humans within it, but really truly only as a result of chance; that universe where our ultimate destiny is heat-death. We're quite familiar with this conceptualization of the universe--it's dominant among those who wish to be taken seriously, especially in Science. But that conceptualization is just one of a vast set of competing world views, each with a millennia-long history, each backed throughout that history by an army of scientific and philosophical behemoths.
Here, as a synthesis of the book Thematic Origins of Scientific Thought: Kepler to Einstein by Gerald Holton, I want to investigate the origins of core philosophical and scientific precepts and outline the dynamics of how these concepts tend to evolve. To do so, first I'll differentiate the term "science" into, first, the public institution of science and, second, science as the deeply personal and subjective pursuit of Truth. This will lead into another differentiation of scientific claims, namely the empirical, the analytical (i.e. mathematical), and finally the thematic components of scientific claims. The thematic axis and the personal practice of science are what we'll look at: those themes which underlie scientific ideas across centuries, those deeply discontinuous moments of insight or experiences of awe which may motivate scientists, those experiential and unprovable elements of science.
There are generally two meanings of the word science: 1) the public institution (e.g. "the science says X"), and 2) the process of the individual scientist (e.g. "he's doing science"). These obviously are two fundamentally different things. First, let's characterize the public institution, Science. It is the place where the individual scientists share their results, their methods, and frequently they give an interpretation of their results; their insights. The thing which is shared in this public arena is expected to be thoroughly cleansed of their unfounded personal convictions, scrubbed of their non-deductive reasoning, their winding and jumpy path to insight masked with a polished chain of reasoning. As it should be. This scrubbing is to ensure there are no cataracts in the eyes nor malfunctions of the mind of the scientist as they've observed some phenomenon. If the scientist saw the world clearly, then another scientist should be able to observe the same phenomenon, and it is with this precise and sometimes dry communication that this can be validated. This rigorous expectation is the most valuable thing Science offers civilization: a method by which we can collectively engage to reveal and articulate aspects of the world that do not depend on who's looking at it.
Science as a process happening in an individual is a fundamentally different process than Science as the public institution. In an individual, the pursuit of scientific discoveries is messy, strangely motivated, and can be incredibly irrational. If we autopsy a scientific insight by looking at the individual, we encounter flashes of insight that cannot be said to have been logically deduced; we encounter motivations that are religious in nature; we encounter the humanity of the pursuit, the trials and the perils, the almost mythical odyssey through a conceptual world which is hardly deductive.
Early Enlightenment scientists like Kepler or Copernicus or Descartes studied mathematics in order to come to know God's creation. If we want to understand the genesis of their ideas and more generally about the individual process of science and the pursuit of Truth, can we ignore this? Granted, it's hard to find anyone of prominence in their time who didn't at least feign religiosity. But if we do want to turn our analytical mind towards science itself, the experiential component is impossible to ignore. This, however, is how things can get hairy.
One of the messiest aspects of a scientific idea is its genesis, the development, the story of it; only in hindsight can that thread be woven, and never perfectly. Some of these stories become almost mythological in nature. Take Bell Labs: today, there's a certain mystical reverence people in science and technology have for that place and time. It's cited regularly, correctly or not, as the birthplace, the locus, of many world-changing technologies and ideas. Los Alamos 1943. The Vienna Circle. Beyond collective myths, there is also the archetypal lone scientist, stowed away in their laboratory or their study, vexed by some physical or conceptual knot. Bringing quantitative methods to study this dimension of science is an open field; it's what the Computational Story Lab at UVM is attempting.
A scientific claim, especially as presented in the public arena of science, might consist of two components: first an empirical component, which is the evidence for the claim's truth brought about by direct experience, i.e. first-hand observation. Second, the analytical component, the evidence brought about by logical or mathematical deductive reasoning which ultimately boils down to tautology. If a scientist is going to make a case for some new piece of knowledge, the claim must have one or both of these components. In TOST, Holton loosely imagines scientific propositions in Cartesian space, with the analytic and empirical components as the X and Y axes respectively.
The X axis, the analytical axis, is necessary if we are to admit mathematics is effective in describing the universe. I won't belabor this point, because it's been made repeatedly elsewhere, but in essence this entire analytical axis is permitted as a valid scientific argument because mathematics has proven itself repeatedly in the description of the world.
There are limits to the efficacy of the analytical axis. The first is somewhat philosophical. That is, where do the axioms of a mathematical deductive system come from? Are they "discovered," as Platonists believe? Are axioms mere human constructs, pragmatically selected from our irrational imaginations? Are those two views even at odds? At bottom, it does seem that there can be no rational basis for the selection of an axiom except if you adopt a pragmatist epistemology. The other limits of the analytical axis stem from Gödel's insights: a (sufficiently expressive) deductive mathematical system cannot be both consistent and complete. I'm not as versed in the implications of these limits, so I'll leave them at that.
The empirical axis, then, is the direct experience of the phenomenon being claimed. Can your scientific claim give rise to predictable impressions, as Hume puts it? Impressions being direct sensory experiences, we must admit the validity of an empirical argument because of how unnatural and absurd it would be to deny one's own direct experience of the world.
In cognitive science, there's a distinction between sensation and perception; sensation is the raw input data, whereas perception is how we end up consciously interpreting this data. Percepts make up our experience, unless you're in a deep meditative state or something. Whatever biological machinery transforms sensations into percepts is evolved, which suggests what we perceive consciously is actually tuned for survival. We constantly generate percepts plagued with bias, because we're limited beings with limited experience to draw upon. We see ghosts, we're tricked by optical illusions and magicians, our brains (and thus our percepts) morph to deal with certain cultural and natural environments... all of this points to the fact that we do not have direct access to Reality.
I also want to draw attention to the magnitude of the empirical component of a scientific claim. Experience is high dimensional. Come into it with me real quick: you see these words, you hear the ambient noise of your environment, you smell your own musk, you feel your ass cheeks pressing against whatever surface, you taste the biomatter festering in your mouth crevices... you see? When Darwin proposed Natural Selection, he marshaled a mountain of empirical evidence that anyone could go out into the world and verify through experience. The more a scientific claim overlaps or engages with the many different modalities of experience, the more powerful the claim!
I hope it is obvious why analytical and empirical arguments are considered valid in the institution of Science. Finally we come to the thematic axis, which is the primary subject of Holton's book. This axis characterizes the components of a scientific idea which are not empirical and not analytical. What, then, does that leave us with?
Toddlers go through the experience of asking why? ad infinitum. Why is that light blue? Why do trees lose all their leaves sometimes? Why is that cat crouched? Why? Why? WHY?! Asking this recursively leads us to questions without answers. I always bottom out at "Why does anything exist at all?" We can venture a story to answer it, or we can say we don't know, and perhaps further we cannot know. Welcome to epistemological rock bottom, a dreary place where souls float around, vying to peer through the foggy darkness beneath them to see, to come to know, what might lie underneath. The one solid pebble you encounter in this ether was first discovered (or so the story goes) by Descartes: I think, therefore I am. This is undeniable knowledge, the knowledge of existence itself. The knowledge that something is happening. What more can we say with just as much certainty? And if certainty in this absolute sense is doomed, what do we build our scientific claims on? Perhaps we can say that whatever is happening, it is patterned. It is not uniform and without information, but it's also not pure chaos, it's not totally random. Existence is patterned. And I would say the objective of science is to describe these patterns. Out of this we can form foundations for the further pursuit of knowledge, solid ground which has a rational, if not provable, basis. These foundations are the concern of the thematic axis. They offer a starting point for analytic and empirical procedures, but they themselves cannot be proven analytically or empirically.
Let's look at examples of thema which have dominated science for centuries. First, the assumption that the world is continuous. Einstein's spacetime is predicated on the idea of the continuous universe. The complementary (or antithetical) assumption is that the world is atomized, that if you probe deeply enough the universe consists purely of discrete units.
There's the assumption of uniformity, that whatever patterns we observe here and now can be extrapolated to elsewhere in time and space in the universe.
Hell, time and space themselves are themata; they're useful frameworks for interpreting experience.
There's an assumption, maybe rather a hope, that the patterns of the world are dictated by an underlying simplicity (this assumption has been rather fruitful in physics).
There are a handful of these themata that have guided and provided the basis for scientific discovery for millennia, and we cannot dispense with them unless we want to dispense with the abundant fruitful structures built on top of them: these structures which render the world intelligible. Some of them are more or less fundamental than others. By examining this thematic component of scientific ideas we get insight into the process of science, we can delve into the deeply personal world of the scientist's mind, and we can tell a story of the conceptual development of a scientific idea which is a project that largely lies outside analytic and empirical study.
In the 1500s, Kepler posited a three-fold vision of the universe: the universe as a mechanistic machine in motion, the universe as a fundamentally beautiful and simple mathematical unity, and finally the universe as a structured theological center pervaded by God. Kepler held these three views in balance, not privileging one over another. Yet today we see the dominance of one: the universe as a machine. The current popular form of this conceptualization is the universe as a computer (like a simulation).
The world today is full of concepts (Dawkins's "memes") each with a vast lineage, and it can be useful to follow back the lineage to understand the idea more deeply, just like it can be useful to understand an organism in terms of its genetic evolution. The evolution of these scientific ideas and concepts is messy and non-deductive, manifesting in moments of "insight" in an individual scientist, and the stories we tell about their genesis and evolution can be messy and sometimes simply wrong. By disambiguating the word "science" and dissecting scientific claims along the axes presented here, we equip ourselves to tell truer stories of the evolution of ideas and thus more deeply understand the conceptual world we inhabit.
06/15/2023
I'm reading a book Thematic Origins of Scientific Thought by Gerald Holton. Yesterday, I read a passage which explored the clashing of different epistemologies in the history of science. In correspondence with a friend, Albert Einstein described fellow physicist Ernst Mach's strict positivism (where all that can be said to exist is sensations, the rest isn't real) as destructive and not creative. A good quote: "Phenomenalistic positivism in science has always been victorious, but only up to a very definite limit. It is the necessary sword for destroying old error, but it makes an inadequate plowshare for cultivating a new harvest." And of this epistemological position, Einstein writes "It cannot give birth to anything living, it can only exterminate harmful vermin."
If epistemological positions are creative or destructive forces in the battleground of conceptual entities and scientific ideas, then we need a diversity of epistemological positions if we want an evolving and lively landscape of scientific ideas.
The alternative would be a homogeneous epistemology across science, the effects of which would depend on the particular position adopted uniformly. If it were a strict phenomenalistic positivism, new hypotheses might be shot down too quickly, a culture of safety might form due to the strictness, resulting in a more sluggish institution of science. If a radical relativism dominated science, all ideas must be given credence, and the institution would crumble or go insane or both. If pragmatism ruled, we might stop short of deeper, more nuanced descriptions of reality if they aren't useful to us.
It's also worth thinking about how these epistemological dynamics might play out across scales. If an individual adopts a strict positivism in their thought, are they less likely to venture new ideas? Can one cultivate a dynamical system of epistemological positions within their own mind in order to benefit from creative and destructive forces? And at other scales, like within a research department, or across a particular field, or across the institution of Science as a whole, how do these dynamics play out, and how are they related to the culture of these groups?
06/19/2022
"Macroeconomics" gives me the creeps. Sweeping generalizations, claims about how aggregated metrics like GDP behave, emergent phenomena grounded in a questionable foundation... Some of it is probably accurate. Probably I'm just stupid. An emergent complex system like the global economy is always going to be to some degree beyond modeling, so maybe I should give it a break. That being said, Austrian school kingpin Murray Rothbard's The Mystery of Banking puts macro-scale phenomena into terms a chimp like me can understand.
"When economics students read textbooks, they learn, in the "micro" sections, how prices of specific goods are determined by supply and demand. But when they get to the "macro" chapters, lo and behold! supply and demand built on individual persons and their choices disappear, and they hear instead of such mysterious and ill-defined concepts as velocity of circulation, total transactions, and gross national product." (p29)
The laws governing the macro scale are the same ones governing the micro: supply and demand. Even the law of demand seems like a shaky and subjective foundation to me, but I've got nothing better. Here we'll start with those foundational laws of supply and demand, apply them to money itself, and then try to derive a "macroeconomic" explanation of inflation.
Money is a medium of exchange. Before money, we had to barter for things, which is obviously not scalable. We take it for granted now, but money might be the single most important human invention of all time. It lets civilization scale, because bartering relies on the coincidence that what you need is what I have and vice versa. Money lets us exchange what we have for a medium that we both value, which we can use to pay each other for what we want. Also, money is divisible; an egg or a shoe or my very fertile cow Peggy is not divisible. Money lets us peg the value of the goods in our village to a given quantity of money. This is the "price" of the good.
What decides price? Fundamentally, the laws of supply and demand decide the price of a good. The demand for a good is a measure of how much people in a market desire that good over other goods (including money). And supply is an objective measure of how much of that good is available to the market. Let's do an example, my favorite drug: coffee. The supply of coffee at any given moment is S. This is objective, even if hard to measure. The demand for coffee is harder to measure, but we can observe a general trend as demand relates to price: the higher the price, the less likely people are to buy the good. The lower the price, the more likely people are to buy the good because they'd have to give up less for it. Thus you see the "falling demand curve" so often referred to (which is a trend, not a real equation). Again, demand is subjective and human behavior is weird.
When more coffee becomes available in the market, its price falls if demand is constant. Say coffee producers start using a new technology to increase yield by 10%; the market will be flooded with surplus coffee. Producers adjust the price down so people are more inclined to buy the excess. And if there's a drought in Brazil (the world's foremost coffee producer) there will be a shortage of coffee at the current price. Again, the profit motive of the producers drives them to increase their prices. Thus the "market forces" are always driving towards some stable price equilibrium where supply can meet consumers' demand.
Now let's look at money. Money is scarce, making it subject to the same laws of supply and demand. The supply of money, M, is the total money in circulation. The demand for money is some measure of one's inclination to hold money rather than buy goods. In other words, someone's cash balances relative to their spending habits is a reflection of their demand for money. An increase in the demand for one good implies a decrease in the demand for another. When you buy coffee, you value the coffee more than the money you exchanged for it. The supply and demand for money, because it can be exchanged for any good, has an effect on the prices of all goods. Let's call that the overall "price level," P. The higher the demand for money, i.e. the more cash people are willing to hoard (instead of spending it), the lower the price level. The lower the demand for money, the more people are spending their cash balances, the higher the price level. An equilibrium is eventually struck, just like with any other good. The purchasing power of money, PPM, is inversely proportional to the overall price level P. That is, PPM = 1/P. You could think of it as the "real" value of money, because it tells you how much a unit of money can actually buy on the market. Thus the purchasing power of money is a function of both its supply and the demand for it.
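One simple way to formalize this verbal argument (my gloss, not Rothbard's own notation) is the Cambridge cash-balance equation M = kPQ, where k stands for the demand for money (desired cash balances as a fraction of spending) and Q for the quantity of goods traded; both symbols are stand-ins I'm introducing, not terms from the book:

```python
def price_level(M: float, k: float, Q: float) -> float:
    """Cambridge cash-balance equation M = k * P * Q, solved for P.
    M: money supply; k: demand for money (desired cash balances as a
    fraction of nominal spending); Q: real quantity of goods traded."""
    return M / (k * Q)

def ppm(M: float, k: float, Q: float) -> float:
    """Purchasing power of money: PPM = 1/P."""
    return 1.0 / price_level(M, k, Q)

base = price_level(M=300, k=0.25, Q=120)
print(price_level(M=600, k=0.25, Q=120) / base)  # 2.0 -- doubling M doubles P
print(price_level(M=300, k=0.50, Q=120) / base)  # 0.5 -- doubling hoarding halves P
```

The two printed ratios are exactly the relationships in the paragraph above: more money chasing the same goods raises the price level, and a higher demand to hold money lowers it, with PPM moving inversely in both cases.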
Now we can imagine what happens when M increases or decreases. When the supply of money increases, often because the government prints more of it, people find themselves with more money than they need, so they spend it. Consumers buy more stuff across the board, the demand for all goods generally goes up, and producers are induced to raise their prices. If M is fixed, the market seeks out equilibrium in the same way it does for other goods.
What changes the demand for money? According to Rothbard, there are four main drivers, some more potent than others. First, the supply of goods and services. When there are more goods (i.e. supply curves shift right or a new good enters the market), the demand for money in exchange increases, which corresponds to a fall in price levels. This took me a little while to wrap my head around, so let's look at a tiny example. Say Sally, Bob, and Joe each have $100 to spend on goods, so M = $300. There are three goods on the market: shoes for $50, beef for $20, and eggs for $10. Intuitively, the $300 can be spent in any combination of shoes, beef, and eggs. But say a new good–pants–enters the market for $60. Any purchase of pants corresponds to a lack of a purchase of eggs, shoes, or beef. Again, an increase in the demand for one good implies a decrease in the demand for another good. As a result, the other goods experience a surplus and their prices drop accordingly.
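Here's that tiny example in code. The quantities of each good and the way the $60 gets diverted are assumptions I'm adding for illustration; the prices ($50/$20/$10) come from the example above:

```python
# Fixed money supply: Sally, Bob, and Joe each hold $100.
M = 300
quantities = {"shoes": 2, "beef": 5, "eggs": 10}  # assumed units for sale

def prices(spending):
    """If each good's price adjusts so spending on it clears the market,
    price = dollars spent on the good / units of the good."""
    return {g: spending[g] / quantities[g] for g in quantities}

before = prices({"shoes": 100, "beef": 100, "eggs": 100})
# Pants enter the market and $60 of the same $300 is diverted to them,
# pulled evenly (another assumption) from the old goods:
after = prices({"shoes": 80, "beef": 80, "eggs": 80})

print(before)  # {'shoes': 50.0, 'beef': 20.0, 'eggs': 10.0}
print(after)   # every old good's price falls: 40.0, 16.0, 8.0
```

Same M, more goods for it to buy, so each dollar stretches further: prices fall.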
Second, the frequency with which people are being paid wages affects the demand for money. This is straightforward. If someone gets paid once a year, they have to keep a higher cash balance in order to pay for their expenses until next year. If someone gets paid weekly, their cash balances can be lower because they know they will get paid soon. How often people get paid doesn't change drastically and probably doesn't affect the overall price level very much.
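A quick back-of-the-envelope version of this (assuming, for simplicity, that spending is perfectly even between paydays):

```python
# Someone paid P dollars per payday holds P right after payday and ~0 right
# before the next one, so their average cash balance is about P / 2.

annual_income = 52_000

yearly_paid_avg_balance = annual_income / 2         # one payday: $26,000
weekly_paid_avg_balance = (annual_income / 52) / 2  # weekly pay: $500

print(yearly_paid_avg_balance, weekly_paid_avg_balance)
```

Same income, wildly different demand to hold cash.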
Third, clearing systems lower the demand for money. Clearing systems are financial technologies that make cash balances more flexible. A credit card is a clearing system. When I pay at a restaurant with a credit card, the transaction isn't final: my credit card company pays the restaurant right there, and I owe the company the cost of the meal plus interest at a later date. By allowing people to delay final payment, clearing systems lower the demand for money and consequently push prices upward.
Finally, there is what Rothbard claims is the most important factor deciding the demand for money: people's expectations of the future price level. People spend their money today differently depending on what they think their money will be worth tomorrow. If they have deflationary expectations, i.e. they expect prices to decrease over time, they're more inclined to buy their house or car later, when their money is more valuable. If they have inflationary expectations, i.e. they expect prices to increase over time, they'll be inclined to spend their money now, because they can purchase more with it now.
Ludwig von Mises, granddaddy of the Austrian school, sketched what a prototypical inflation cycle looks like. I admit I'm skeptical of it, mostly because it's got exactly the kind of macroeconomic BS stench I hinted at earlier. But it's in the terms we just elucidated, and it makes some sense if we keep in mind it's a rough model. In Phase I of the inflation cycle, the government prints a lot of money usually to fund some unforeseen event (see: COVID-19, war), and there is some degree of price change. But people expect prices to go back to "normal" after the circumstances are over. These deflationary expectations cause an increase in the demand for money, which blunts the impact the increased supply of money has on price levels. The government starts to think, according to Mises, that it can continue to print money without a significant increase in prices. So they keep printing. Phase II begins when people slowly start to realize prices will not fall again (deflationary expectations become inflationary expectations), so they rush to spend their money, causing aggregate demand and thus prices to rise. Phase III looks different depending on how monetary officials respond. If they keep printing money, hyperinflation will take root, which is a gnarly feedback loop. People rush to spend money before it devalues, prices rise such that people can't afford anything, and the government prints more money to give to people so they can afford things. Rinse and repeat. If instead the government stops printing money, the market will work its way back to an equilibrium price level, higher than before but stable. Again, this is just a rough sketch of an inflation cycle. I'm skeptical of the dynamics and especially Mises's mind reading of governments' thought processes.
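Just to watch the Phase III feedback loop move, here's a toy iteration (the growth rates and the P = M/k relationship are invented for illustration; this is a cartoon, not Mises):

```python
# Each step: the government prints 20% more money, and inflationary
# expectations make people want to hold 10% less real cash balance (k).
# With price level P = M / k, the two effects compound.

M, k = 100.0, 100.0
for step in range(5):
    M *= 1.2   # keep printing
    k *= 0.9   # demand to hold money falls
    P = M / k
    print(f"step {step}: M = {M:.0f}, price level = {P:.2f}")
```

After five steps the money supply has grown about 2.5x but the price level has more than quadrupled, because the falling demand for money amplifies the printing. Hold M fixed instead, and the spiral stops with it.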
So we started with the principles of supply and demand and we've managed to clarify (to some degree) the macroeconomic phenomenon of inflation. The micro/macro dichotomy may be false, and it is at least blurry. In Murray Rothbard I have found, at last, someone who also seems perturbed by "macroeconomic" concepts that lack a strong foundation in human behavior.
"Micro and macro are not mysteriously separate worlds; they are both plain economics and governed by the same laws." (p41)
05/27/2022
Evolution created humans, the most complex and generally intelligent agents we currently know of. Biological evolution creates biological intelligence; can we use artificial evolution to create artificial intelligence? More generally, can we harness evolution to design solutions to complex problems? Evolution is a great designer–again, it designed us. It’s been shown on multiple occasions that evolution can produce both novel and optimal solutions to engineering problems: NASA’s X-band antenna, Rechenberg’s pipes. There are further reasons to design with evolution: we can learn more about biological evolution, emergence, and intelligence; evolution integrates timescales by allowing behavior on shorter timescales to emerge; and in silico evolution is relatively unconstrained compared to biological evolution (it is limited mainly by computing power, which we are still improving). Primarily, though, evolution is a process that can outperform human intelligence, and we can harness it to design solutions to problems, including artificial intelligence.
Josh Bongard and Rolf Pfeifer suggest there are at least three timescales we should be able to think about when designing artificial intelligence. The “here and now” timescale describes a particular action or behavior in a moment. The “ontogenetic” timescale describes an intelligent agent’s development over the course of its lifespan. The evolutionary timescale describes change–directed by selection–over the course of many generations of agents. Change on the “here and now” and ontogenetic timescales emerges from the evolutionary timescale. The selection process propagates agents whose behavior over their lifetimes aligns with whatever is being selected for. In the case of biological evolution, organisms that can survive until reproduction are selected. Of course, the individual behaviors of an agent emerge from its ontogenetic imperative. The actions that align with what is being selected for in an individual’s lifetime will necessarily also be selected for. To be concrete, we can see clearly that copulation is a necessary “here and now” behavior for selection in biological evolution. Of course, actions leading up to reproduction (e.g. staving off predators, generating energy, etc.) are also selected for. Evolution integrates all three timescales, which is necessary to generate end-to-end intelligent systems with minimal design bias.
End-to-end systems–systems whose parameters are trained all at once–are desirable because the alternative is a modularized system. When designing a complex system, it is very difficult to get the relationships between its modules correct. It takes a certain arrogance to assume we can properly divide a system into subsystems when those subsystems are not obvious. Humans (and intelligent organisms generally) are continuous dynamical systems. It isn’t obvious where to draw clear distinctions in time between our behaviors or in space between parts of our bodies. Evolving systems ensures they are designed end-to-end, which guards against human arrogance and folly.
Human bias in the design of intelligent systems can be severely detrimental. Our proclivity to anthropomorphize pervades our cognition (which is pretty much tautological, because our cognition is unique to our humanity). Our thoughts and ideas are grounded in our human experience of the world, the substance of which is determined by our body’s interaction with the environment. So our concept of an intelligent system is almost certainly human in nature. If we’re clear about what we want a system to accomplish, choosing evolution as a designer largely eliminates the opportunity for us to embed our biased idea of what an intelligent solution looks like.
Artificial evolution has been proposed as a path to creating generally intelligent real-world AI. By using physics simulation and genetic algorithms, we can evolve robots (bodies and behaviors) to accomplish tasks. The flowchart above shows a high-level description of this process of artificial evolution. The details of this process deserve a deeper dive than this blog post is meant for (and frankly I’m not yet acquainted with the details!). I’ve articulated why evolution is a powerful engineering tool: it generates end-to-end systems, it largely eliminates human-centric design bias, it integrates all three important timescales, and it has already been shown to produce novel and complex solutions to tough engineering problems.
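For a feel of the mechanics, here's a minimal genetic algorithm from scratch. There's no physics simulation or robot here; the "fitness" is a stand-in toy objective (maximize the number of 1s in a bitstring), but the select-vary-repeat loop is the same shape:

```python
import random

def fitness(genome):
    # Toy objective: count the 1s. A robot version would instead run a
    # physics simulation and score the resulting behavior.
    return sum(genome)

def mutate(genome, rate=0.05):
    # Flip each bit with a small probability.
    return [1 - g if random.random() < rate else g for g in genome]

random.seed(0)
population = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]

for generation in range(50):
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]                      # selection
    children = [mutate(random.choice(survivors)) for _ in range(20)]
    population = survivors + children                # variation

best = max(population, key=fitness)
print(fitness(best))  # climbs toward the maximum of 20
```

Swap the bitstring for a robot's body/controller parameters and the bit-counter for a simulated task score, and you have the high-level process described above.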
05/17/2022
Large-scale language models like GPT-3, Codex, and PaLM have recently been a hot topic in the AI conversation. One question revolves around the usefulness of language models (LMs) for understanding general intelligence. To investigate, we'll compare the cognition we know intuitively--our own--with the cognition (or "computation," if that suits you) of a large language model. In the interest of a fruitful inquiry, I want to avoid dichotomous thinking (e.g. conscious vs. unconscious, intelligent vs. not intelligent) so we don't argue over definitions. There is clearly something shared between an artificial neural net in the form of a language model and an embodied biological neural net. Comparing the two should give us a sense of the limits of large language models' contribution to our understanding of intelligence in general.
The human experience of the world is rich and varied. We can see a relatively small but fruitful band of the radiation spectrum (visible light). We can process information from the molecules and vibrations of the air. We can sense all kinds of things inside and outside our body. Our knowledge and understanding of the world is thus structured by both the information we get from the sensors of our body and the limitations of our body's actuators (our ability to act on the world to gain new information). It is in this way that our cognition is inseparable from our body. With our body we are constantly interacting with the environment we find ourselves in. The input data from the environment feeds into a central control center in our brains, which actuates a response with motor neurons, which changes the input data; this is a tight, inextricable loop that makes up our experience.
When we say we understand something in the world, that "understanding" consists of a large body of knowledge grounded in our experience of that thing. Think of a dog. Watch what your mind does. You might think of furriness, or the feeling of a dog licking your cheek. You might think of the slight tilt of a dog's head that gives you a deep feeling that it understands and empathizes with you. You may be reminded of the time your dog walked all around the carpets after coming inside from a rainy day. At the very least you probably conjure an image of a canonical dog. Hardly any of this understanding of what a dog is consists of language. It consists of knowledge aggregated from real experiences of the world.
What is a language model's "experience"? Let's take PaLM. PaLM's lifespan had two stages: the training stage, where it learned all it could about the environment it found itself in, and the inference stage, where it generates output using an "understanding" of its world. In its training stage, PaLM was exposed to a monstrous corpus consisting of many languages in many contexts. The "environment" it found itself in was non-interactive; it could not act on the world and observe the consequences; it could not explore. Whatever PaLM can be said to experience or understand is purely linguistic. PaLM has no sense of shape, space, movement, or time. Its knowledge consists of words, relationships between words, syntax, and perhaps higher-order linguistic capabilities like summary or reformulation. The substrate of its world is tokens fed one-by-one through a single channel into its cognition.
Given that a language model has only been exposed to language, its understanding of the "real" world is necessarily confined. Its "world" is very different from the world we humans know. Still, it's worth mentioning the similarities between our cognition and an LM's cognition. First, we both generate predictions about what we expect to experience next. In the human case, our predictions are multi-sensory, drawing from our entire prior experience, whereas an LM's predictions are linguistic in structure and draw only from a relatively small set of possibilities (256,000 tokens in PaLM's case). During training, an LM's predictions receive a response from its world (backpropagation), updating and evolving the LM to become better at prediction. After training, the language model is isolated from feedback. It is then only given opportunities to execute its predictions when it is fed context, which we find useful and entertaining.
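To make "prediction from a world of tokens" concrete, here's a toy count-based bigram model. It's nothing like PaLM's transformer (the corpus and code are purely my own illustration), but it shows an "understanding" built entirely from which token follows which:

```python
from collections import Counter, defaultdict

corpus = "the dog runs . the dog barks . the cat runs .".split()

# Count, for each token, what follows it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict(token):
    """Most frequent next token given the single previous token."""
    return following[token].most_common(1)[0][0]

print(predict("the"))  # 'dog': follows 'the' twice in the corpus, 'cat' once
print(predict("cat"))  # 'runs': the only token ever seen after 'cat'
```

The model's whole "world" is these co-occurrence statistics; it has no idea what a dog is, only what tends to come after the token "dog".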
To say a language model of the scale of PaLM or GPT-3 lacks intelligence would be a mistake. It understands and predicts its own world extremely effectively, but we must be absolutely clear about the world that it knows about. The reality a language model can be said to know is quite different and rather abstracted away from the reality humans can be said to know.
We should expect a language model's reasoning abilities to have limitations. It does not have an experiential knowledge of the physical world and consequently cannot cognitively simulate it like we can. Without an experience of space or time, whatever understanding it can be said to have is limited to the words and relations between words. Anything like action or movement (concepts requiring an experience of space and time), then, is not understood in any deep sense. The substrate of a language model's reality is actually a reference to an entirely different reality (the human reality) that it does not have access to and thus does not truly understand. Its world is made up of pointers, and it has become exceptional at determining the relationship between these pointers, but it has no way of accessing the reality being pointed to (and in fact it doesn't even know they're pointers). Even in the most charitable view, language is an incredibly lossy compression of human reality.
The gap between human intelligence and artificial intelligence (in all its current forms) is still wide. The worlds which an AI can understand are becoming more and more complex. In principle, an artificial neural network's size is bounded only by computing power. If AI systems continue to grow, they will eventually surpass a human's understanding of the world. But superior understanding still does not confer the ability to act on and affect the world; embodiment would be essential to an AI uprising. By performing comparisons between an AI and a human intelligence as we've done here, we can roughly gauge an AI's frame of reference, feel out some of the boundaries of its cognition, and gain a sense of what it can be said to truly understand.
02/26/2022
Dialogue, specifically dialectic, is the process by which we can discover Truth. Reasoning alone is an inferior process. When you reason alone, the only way important Truths can be reached is by starting at first principles and reasoning very carefully. Every jump from one premise to the next must be scrutinized by yourself, because we are generally bad at reasoning about things.
Dialogue, on the other hand, can reach Truth much faster. Imagine a Venn diagram representing the knowledge of two people in dialogue. The shared area is where the dialogue can start, and that shared area expands as each person shares something from their area of exclusive knowledge. And when contradicting beliefs are shared, those too can be investigated and resolved through dialogue. If you look at the format of Joe Rogan's podcast, there's even a third actor in the dialogue, Jamie, who often acts as a conflict resolver and fact provider.
I forget where I read this, but one of the main contributions of Socrates was to synthesize the pursuit of Truth into a practice: dialogue. The pursuit of Truth is a two-sided coin. We're interested in positively stating Truths about the world, but we're at the same time interested in avoiding falsehood. I could just start rattling off a plethora of statements, and some will be true. But I haven't exactly embodied the pursuit of Truth by doing so, I've just stated a bunch of things, some of which happen to be true. On the other hand, I can be too fervent in my avoidance of falsehood by being skeptical of everything. "How do you really know the external world exists? We can't conclude that it does." Socrates provided a format for doing both of these simultaneously through dialogue, the "socratic method," where one person takes the side of trying to state things that are true and the other attempts to avoid falsehood. The reality of dialogue is a more back and forth exchange where the roles flip fluidly throughout the conversation, but the core idea is insanely valuable.
We have to share certain values before dialogue is possible.
First we have to agree Truth is valuable. We have to agree it is fundamentally Good to pursue an accurate understanding of how reality is constituted. A more accurate understanding of reality can help us act fruitfully, and ultimately bring us all towards a better world.
We have to assume the other person knows something that we don't. This assumption takes Humility, and we all ought to really question our own degree of Humility. It can be easy to assume someone doesn't have a clue what they're talking about. At the same time, what the hell do you know? When you get that sense, take the role of avoiding falsehood: question the other person. "What exactly do you mean by that?" "But doesn't that contradict <shared truth>?"
A vigilance towards the vices of pride, reputation, and ego is necessary when practicing dialogue. I think when first starting this practice of dialogue, when you realize you're wrong about something, the instinctual reaction can be really distracting and counter-productive. In optimal dialogue, egotistical emotions don't arise; there's a shared flow state where revelation takes place and there is mutual awe at the Truth revealing itself.
02/19/2022
There are two forms of property acquisition, according to German sociologist Franz Oppenheimer: economic means and political means. Economic means describes the acquisition of wealth through production by labor and through voluntary exchange. Political means describes wealth acquisition through extortion, violence, or force, and usually simply by threat of force. The state is the organization of political means.
In America we tend to identify ourselves, at least partly, with the government. This proclivity is a weird societal side-effect of democracy. Many people, especially those who like to involve themselves with "politics," are under the illusion that they are indeed an integral part of the government. They vote for their representative and they wield power vicariously through them. As Rothbard points out, a true representative "cannot act contrary to the interests or wishes of his principal." Obviously, this is constantly violated in our "representative democracy," which begs for a more accurate articulation of what type of government we are subject to in reality.
Alliance between the State and Intellectuals
An alliance between State and the “intellectuals” of society is natural.
The State is coercive. States arise from conquest of one group over another and there is always subsequent predation of property. Because of this, the State must have the consent, or more commonly the mere submission, of the majority.
This consent or submissiveness to State rule is crucial to the State’s existence. If the majority is upset enough to organize and take action, it can take down or disrupt the State.
So the State needs to convince the majority populace of the goodness, the wisdom, or at least the inevitability of itself.
And because the masses generally don’t form their own opinions and generally adopt the opinions of the intellectual class, an alliance between the ruling class and the intellectual class is natural.
Maybe "intellectuals" is the wrong word--maybe "Opinion Formers" or "Opinion Molders" is a better name, because people like those on MSM are not intellectuals (though perhaps they are looked at the same way as intellectuals...)
The State's Preservation Toolkit
The State has a whole toolkit of ways to preserve itself by convincing the masses of its goodness or inevitability.
Any check on State power has to be enforced by the State itself. As a result, the "check" is usually transformed into a legitimation of the State's exercise of power over individuals.
"Originally, in Western Europe, the concept of divine sovereignty held that the kings may rule only according to divine law; the kings turned the concept into a rubber stamp of divine approval for any of the kings’ actions. The concept of parliamentary democracy began as a popular check upon absolute monarchical rule; it ended with parliament being the essential part of the State and its every act totally sovereign."
As another concrete example, the Supreme Court in the US, with the advent of judicial review (Marbury v Madison), has the supreme interpretation power in the US State. Thus, when the Supreme Court rules a certain law "constitutional," that law is legitimated whether or not the consent of the ruled has been obtained. There is a state-level check, of course, where states can veto the Supreme Court's ruling, but that begs the question: why stop at the state level? Why can't a city object to the ruling? Why not a neighborhood? Why can't an individual legitimately object to the exercise of power over his own life?
A state is fundamentally threatened by 1) other states and 2) its own subjects. Thus we constantly see massive investment of a State's resources into 1) a strong military and 2) propaganda directed towards the State's subjects. Being prepared for war is a straightforward enough motivation to understand. Producing propaganda to influence public opinion is a bit trickier, and States are very creative in this defense mechanism. State propaganda is worth its own study, so I will leave it there.
When trying to characterize a system truthfully, a detached attitude is necessary. One should not harbor animosity towards, or identify with, the system. It can be difficult when one is embedded within the system. But if the aim is a truthful model, detachment is a prerequisite. "What is this thing, here? What does it do?"
03/02/2021
If we’re going to build the future, we have to have some idea of what the future ought to look like. This exploration is based mainly on Geohot’s Technology Without Industry, where he gives his take on what “good” tech looks like. A summarizing quote:
“STOP BUILDING SHITTY TECHNOLOGY.
If it centralizes power, it’s bad. If it decentralizes power, it’s good. Build technology that is inextricable from its narrative. Build technology that will give us freedom, not enslavement.
One axis to consider. Does this centralize or decentralize power? The power itself is unstoppable. How we divide it is a choice.”
Centralized power
“Build technology that will give us freedom, not enslavement.”
Technology that empowers people is good - but if there exists some gatekeeper over that technology, they have power over those that use it because they can revoke access at any point. In the hands of the wrong people, the tech can become a weapon. The more power a centralized technology gives you, the more vulnerable you are to extortion by threat of removal of the power it gives you.
An unchecked centralized technology is like a hot potato. Control over the system will change hands over time, and eventually it will end up under the control of people who will exercise the freedom-limiting power they have.
Take something like a solar panel. When you purchase a solar panel, you gain an ability that can’t be revoked. You’re now able to generate power for yourself. If you buy a battery, you can store it for yourself too. If you consider a technology like Facebook on the other hand, access to it can be revoked arbitrarily. Today we see gatekeepers exercising their extortionary powers regularly.
Closed systems
“We need to decentralize the world, not build brittle systems that leak power.”
Consider how a technology leaks power. Let’s look at our solar panel and Facebook example again.
Think of the dotted line as delineating things in and out of your control. In the solar panel system, your technology is within that boundary, and the only things outside of your control are the sun and weather patterns - easy enough to understand. I drew Facebook as a monolithic black box because “Facebook” is an organism that no one human understands. As Facebook users, we’re at the mercy of this organism, its own motivations, and the pressures it is subject to.
In other words, you do not have full control over the abilities Facebook gives you. Geohot describes a world where civilization has died off and someone is rediscovering relics of our time:
“Imagine finding all that’s left is a smartphone. Useless without a solar charger, but add one and it’s a most valuable relic. And if they downloaded wikipedia offline, the discovery of that phone will allow you to rebuild it. Can we bootstrap faster?
That’s godshatter of the information age. A handheld device that allowed you to breathe underwater. A phaser. A portable device that could construct houses. The doctor’s mobile emitter. A machine that pulls the water from the air. A handheld flying machine. An arc reactor. All ruined if you have to connect to WiFi.”
These theoretical technologies are closed systems. They give the user a new ability that other humans don’t have sway over once the transaction is over. Tech that evolves over its lifespan is certainly cool, but we should be wary of technology that evolves according to unknown and/or complex pressures.
Technology that serves us
I’m trying to get at this idea that technology should exist to serve us. To give us a new power, a new ability. Maybe it’s a new lens to view the world through - a lens that isn’t smudged. We see technologies that seem to have this backward: their users are a means to some other end. It’s easy to imagine how this can spiral out of control. As we build the future, let’s keep this in mind.
12/27/2020
What a wonderful year!
I honestly can’t recall what life felt like in January 2020. I can only remember the COVID omens that I recognize with hindsight. I remember watching a video of the CCP dousing their streets with antiviral fluids and thinking, “wow, that’s crazy, good thing we aren’t going through that.” I remember a classmate of mine coming back to the US from Malaysia, wearing a mask before it was cool (read: mandated). I remember my friend - who literally never cleans his room - spending a day wiping down every object and surface in his room, and thinking it was a bit much.
But then, within the span of a week, it all became normal. Encouraged. Everyone went home for spring break, headlines were scarier by the hour, and eventually shit got real. Travel to Europe - banned. School - online for the rest of the year. Stock market tanked for a couple days. Restaurants shut down, stay inside, don’t go anywhere.
Life kept chugging for me. I can’t say I ever felt personally at risk; the death rates by age were clear at the beginning. Old people were/are dying at way higher rates. Young, healthy people aren’t going to die. I snagged a remote software engineering internship my last week of online classes, which was nice.
At some point it became clear things weren’t going back to normal anytime soon. I made the decision to skip the fall 2020 semester at KU because the tuition prices hadn’t changed. Classes were online, most campus buildings were shut down, and restrictions were tight in Lawrence. I thought I’d regret it, but I don’t - at all. I made the right decision. I was able to learn things I wouldn’t have learned in college (for CS students: Missing Semester), follow my own curiosities by reading and writing on this blog, and make money. Remote work was okay, but near the end of the internship I felt near the top of the learning curve and was ready for the next thing.
The next thing, by the way, is more college. Yeah, I know. It’s not ideal. College is overpriced. I could easily teach myself everything I need to know online. College degrees aren’t nearly as valuable. Especially CS degrees.
But I intend to take advantage of the opportunities only college students have access to. I never exploited my access to edge-of-human-knowledge researchers that exist only at universities. Nor did I befriend other ambitious geeks who want to build the future. I never utilized my exclusive student access to places on campus like woodshops or computer hardware labs or student clubs like robotics.
My cognitive dissonance regarding college attendance has only grown stronger. The energetic environment is invigorating, and there are a few exclusive opportunities. But good lord the cost! Why is there nowhere else to learn and grow as a young adult with others doing the same thing?
Anyway, 2020 has been a trip and a half. Can’t wait for more glitches in the simulation in 2021.
11/29/2020
Peter Thiel’s definition of a bubble necessitates the existence of a widespread, false belief. Is there such a belief in the United States? Surely there is, but I should clarify the question I’ve been pondering: is there a monstrous, catastrophic bubble that exists in the United States (or perhaps even the US is too small a scope), where the corresponding false belief is one that has not been questioned in decades, perhaps centuries? Something along the lines of, “America is the best country in the world”?
Let me try to explain myself: Americans have always had a sort of patriotism, an assumption that we are indeed living in the best (see: freest, wealthiest, most stable) country in the world. There is plenty of evidence that this is the case, but there is also a seemingly growing body of evidence that this is not the case. The central dogma of Americans, according to an outsider whom I respect greatly, is that tomorrow will always be a sunny day. Is that belief still on solid ground?
I have little knowledge of other countries to compare today’s US to, so I will use what sparse knowledge I have of the US’s history to make a comparison (self-comparison is a better method anyway). These are blasphemous questions I am asking, I am aware. And there are plenty of rebuttals to these concerns - I am likely being an alarmist. I’ll write a quick response to each of my concerns one by one to help me understand the other side.
First, the crumbling of the “legacy media” can’t be overstated - there is plenty of discussion about this phenomenon, so I won’t elaborate on it, but I will say particular forms of “new media” (i.e. social media, videos, podcasting) don’t seem to be much better at delivering a sensible narrative. Conversely, the replacement of legacy media with new media is inevitable, and it will happen slowly but surely – there is no reason this transition has to be the demise of America.
Second, a college degree seems to deliver less and less value at ever higher cost – this is a bubble in itself, the corresponding false belief being that a college degree is necessary for success. I am less knowledgeable about what’s going on in higher ed., maybe because I am too close to it, but there’s definitely a noxious odor coming from it. The rebuttal to this is that the US university system is clearly still the best in the world (which I am fairly sure is true?).
The US ranks 17th on some plausibly valid Economic Freedom Index. Economic freedom is not the only type of freedom, I know; the US seems to have some of the strongest guaranteed protections of any country. But it raises the question: are we really the freest country in the world? I am skeptical of this assumption, as hard as it is to admit such a thing. The veil is coming off with this one in particular due to the heavy hand of government and the apparently unrestricted powers of the Federal Reserve throughout the COVID pandemic.
I want to mention a couple more things that have been making me feel this way, though I can speak far less knowledgeably about them. One is The Great Stagnation, communicated by Tyler Cowen, Thiel, and co. This is essentially the idea that there has been little technological progress except in the field of computing and communications since the 1970s. I am still questioning if this is true, but I thought I’d throw it out there as a possible contributor to the Great American Bubble. Another is the blatantly obvious increase in political polarization.
This is just another attempt at answering the question, ‘what the hell’s going on?’ I don’t purport to know, and I feel like a lunatic trying to connect these observations together.
Meta: I commit to having a comments section by the end of this year. I need someone to call me out on my BS!
11/16/2020
Virtualization is a concept that eluded my understanding until recently, and it deserves its own post. So this one is about virtualization in a computer system, which is achieved by a dance between the hardware and the operating system software.
First, what’s virtualization and why is it important? Virtualization is an abstraction of a computer’s hardware to make programming the computer easier. Essentially, the hardware is too complicated to program directly, so operating systems will employ virtualization of the CPU and memory in order to ease the burden of the programmer. As a programmer, it’s useful to know the nature of this abstraction layer because every computer you write programs for is going to have an operating system installed and running along with your program.
OS’s perform two major virtualizations: the CPU (the processor) and memory (i.e. RAM + storage). An OS can be thought of as a manager of these two physical components of a computer. A prerequisite concept for understanding how this is accomplished is the process. A process == a running program (a program that is not running is just a bunch of instructions stored on a drive). An OS will take a program and start running it by creating a process for it.
When you run a program (e.g. click on a .exe file), along with creating a process the OS will allocate a segment of physical memory for it. That allocation of memory is the virtualization of memory. From the program’s POV, it thinks it has the computer’s memory all to itself. From the OS’s POV, the program really only has a small segment. Here the programmer is relieved of some complexity: programs have their own memory that can be referred to without having to worry about other processes using the same memory. That segment of memory is referenced by the process’s virtual address space. Whenever you refer explicitly to some memory address in your program, that memory address is virtual. When the computer compiles or interprets the code, all of your address references go through an address translation to get the physical memory address.
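As an illustration, here’s a sketch of that translation step using the simple base-and-bounds scheme (real systems use paging and a hardware MMU; the class name and numbers here are invented):

```python
# Base-and-bounds address translation: a simplified sketch of how an
# OS/MMU maps a process's virtual addresses onto its slice of physical
# memory. Out-of-bounds accesses are rejected, which is the mechanism
# behind a segmentation fault.

class AddressSpace:
    def __init__(self, base, bounds):
        self.base = base      # where this process's segment starts in physical memory
        self.bounds = bounds  # size of the segment

    def translate(self, vaddr):
        # Every virtual address is bounds-checked, then offset by the base.
        if vaddr < 0 or vaddr >= self.bounds:
            raise MemoryError("segmentation fault: address out of bounds")
        return self.base + vaddr

proc = AddressSpace(base=0x4000, bounds=0x1000)
print(hex(proc.translate(0x10)))   # virtual 0x10 -> physical 0x4010
```

The program only ever sees addresses 0 through bounds-1; the OS picks the base, so two processes can use identical virtual addresses without colliding.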
The scheduler helps with the virtualization of the CPU. Programs have instructions they need to run one at a time, and if you have 100 running programs and only 1 processor, the scheduler will handle the execution of all 100 programs in (hopefully) a reasonable manner. Again, letting the OS handle this lifts a heavy burden off of the programmer; worrying about whether or not their program will actually execute is something a programmer probably shouldn’t have to worry about. Something that should be stressed here: scheduling is a hard problem in operating systems research. There are a bunch of different approaches to effectively scheduling potentially hundreds of processes, but here I just want to give you a sense of what’s considered “effective” scheduling.
First is the idea of fairness. Most OS architectures share a communitarian philosophy and won’t exclude any process from execution. Operating systems may employ policies to distribute the timeshare of the CPU differently based on some metric of the importance of a process, but OS schedulers will always schedule some amount of CPU time for all processes' instructions.
Another performance metric is turnaround time, which is simply the average time it takes a process to run ALL of its instructions. This isn’t as straightforward as multiplying the number of instructions by the time per instruction; when there are hundreds of processes trying to run thousands of instructions each, how do we schedule all of them? If we’re trying to optimize for turnaround time only, simply executing the program with the fewest instructions first (then the next fewest, then the next…) will provably do this. But that means the longest program won’t get to run at all until all the shorter ones have run, which is why a response time metric is also taken into account. Response time is essentially the time between a process’s first “arrival” (i.e. when it is first created and given to the scheduler) and when the process’s first instruction runs. If 100 processes were created at once, and we ran them all to completion in order, of course the response time will be atrocious for the last processes. So operating systems usually try to trade off the two in a satisfactory way by switching program execution at a set interval and employing policies to achieve fairness and a good balance of turnaround time and response time.
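To make the trade-off concrete, here’s a toy simulation (the job lengths are invented) comparing first-in-first-out order against shortest-job-first, with all jobs arriving at time 0:

```python
# Toy comparison of FIFO vs. shortest-job-first (SJF) scheduling.
# All jobs arrive at time 0; lengths are in arbitrary time units.

def simulate(jobs):
    """Run jobs to completion in the given order; return
    (avg turnaround time, avg response time)."""
    t = 0
    turnaround, response = [], []
    for length in jobs:
        response.append(t)    # time until the job first runs
        t += length
        turnaround.append(t)  # time until the job finishes
    n = len(jobs)
    return sum(turnaround) / n, sum(response) / n

jobs = [100, 10, 10]                    # one long job, two short ones
fifo_t, fifo_r = simulate(jobs)         # run in arrival order
sjf_t, sjf_r = simulate(sorted(jobs))   # shortest first
# When all jobs arrive together, SJF provably minimizes avg turnaround;
# here FIFO averages 110 while SJF averages 50.
```

The flip side the post describes shows up once jobs keep arriving: a steady stream of short jobs under pure SJF can starve the long one, which is why real schedulers time-slice.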
There are a bunch of different ways of doing all of this of course, so here are some pointers to currently used methods:
Proportional Share / Lottery Scheduling (lottery scheduling uses randomness, and though I don’t like adding randomness unless necessary, it is an interesting solution)
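As a sketch of the lottery idea (the process names and ticket counts below are made up):

```python
import random

random.seed(0)  # fixed seed so the tally below is reproducible

# Lottery scheduling sketch: every process holds some number of tickets;
# each timeslice the scheduler draws a random ticket, and whoever holds
# it runs. Over time, CPU share is proportional to ticket share.
def pick_winner(tickets):
    """tickets: dict mapping process name -> ticket count."""
    draw = random.randrange(sum(tickets.values()))
    for proc, count in tickets.items():
        if draw < count:
            return proc
        draw -= count

tickets = {"A": 75, "B": 20, "C": 5}
wins = {p: 0 for p in tickets}
for _ in range(10_000):
    wins[pick_winner(tickets)] += 1
# A should win roughly 7,500 of the 10,000 slices, B ~2,000, C ~500.
```

The appeal is that proportional share falls out of a few lines with no bookkeeping about who ran last; the cost is the randomness, which only averages out over many slices.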
If you take away anything from this, it’s that the OS is a resource manager. Its major functions are to manage critical resources that programs need - namely, CPU time and memory. It does this by creating an abstraction called a process for each running program and allocating memory & CPU time to each of these.
11/12/2020
I have heard enough anti-libertarian arguments that I want to write them down and really consider them. In brief, here are three objections to the libertarian ideal that lead me to think it isn’t much of an ideal:
1. Nice in theory
Libertarianism is nice in theory. Contracts are law, no violation of individual freedoms, property ownership is cut and dry, so there’s no reason for violence. But in practice, none of these things can ever be guaranteed. Imagine you’re walking in the state of nature, roaming an apple orchard planted by God Himself. As you’re converting nature’s resources to your own property with your labor, you hear my voice behind you: “that apple is mine.” As you turn around, you think no it’s not… I picked it myself! But when you see the glock in my hand, you’re certain: the apple you hold is not yours, it is mine. “But… that’s so immoral,” you say. “Morals shmorals,” I reply.
2. Coordination problems
"Coordination problems are cases in which everyone agrees that a certain action would be best, but the free market cannot coordinate them into taking that action."
Ideas of the form, “wouldn’t it be great if everyone just did X” are easy to come by. But unfortunately, it is often the case in a free market that if everyone did X, an incentive for not doing X arises. And X only works if everyone does it, so there will be defectors acting in their own self-interest, ruining the optimal outcome. “Rent-seeking” is an economic behavior where an entity finds a way to extract value without contributing value. If you owned a segment of a river and simply charged a fee for people to pass through, you’d be a rent-seeker. Even worse is if you inherited that land; somehow you have acquired a free lunch, even though there ain’t no such thing.
3. What is property?
I understand “property” as some mapping of physical objects to… a consciousness? Wait, no - a bat might be conscious, so - a rational consciousness? Well, humans aren’t all that rational… Wait, what is consciousness again? My problem is mostly with the process of inserting an object into the ownership map. The hash function that makes the most sense to me is some positive account of property rights, where really, power over control of the object is equivalent to ownership of the object. This is the same idea as my owning your apple because my glock says so.
So these objections lead me to think libertarianism, taken to a logical conclusion, might be a dead-end. I still think it could be an escape rope out of our current situation in the US - the government is a monstrosity in its current form. I don’t think I’ll ever not be fond of libertarian principles, because freedom and individuality are awesome. Also, if any of the above are non-issues in your eyes, please convince me, because I’m feeling a little philosophically naked without libertarianism.
09/06/2020
The notion of computation was born among mathematicians and developed significantly by them. In Alan Turing’s seminal 1936 paper On Computable Numbers, with an Application to the Entscheidungsproblem, Turing defines a new subset of real numbers with his notion of computation. To make computation easy to grasp, Turing creates what is probably one of the most famous intuition pumps in history: the “universal computing machine,” or as we know it, the Turing Machine. He uses it to resolve a famous problem of mathematical logic, the “Entscheidungsproblem,” or, the decision problem.
Charles Petzold’s The Annotated Turing is a wonderful guide through Turing’s dense, theory-heavy, 36-page paper. It can be very hard to traverse a mathematical paper like this. In fact, after reading Petzold’s book, I think this is the only way to traverse a research paper of this depth (as an outsider to the field). Petzold corrects many errors Turing managed to make in his original paper. He also provides useful examples, further explanation when none is given, and rich historical context. Reading a paper with none of these things is bound to result in misunderstanding. I wish a book like this existed for every challenging research paper. A summary of this book is really a summary of Turing’s paper plus the incredibly useful context and corrections added by Charles Petzold.
First, some historical context. What was Turing even trying to do here? Computation didn’t simply come to him in a dream, and it surely wouldn’t have been published if it came out of the blue. No, Turing’s notion of computation was an unbelievably novel solution to a mathematical problem posed by David Hilbert in the early 20th century:
“The decision problem [das Entscheidungsproblem] is solved when we know a procedure with a finite number of operations that determines the validity or satisfiability of any given expression.... The decision problem must be considered the main problem of mathematical logic.”
(From Hilbert and Ackermann’s Grundzüge der Theoretischen Logik (“The Restricted Functional Calculus”))
The question Turing was trying to address was this: given any mathematical/logical expression, is there a general way to say whether or not it is valid? Here, valid doesn’t mean well formed (that’s easy to check); it means the formula is true under every interpretation of the formal system (e.g. the predicate calculus). So is there a decision procedure we can follow to determine whether or not a given mathematical expression is valid? [spoiler] Turing shows that no, a decision procedure does not exist. “The Entscheidungsproblem cannot be solved.”
Turing lays out quite a bit of groundwork to prove this. The paper is divided into 11 sections. Sections 1-7 are the groundwork for his proof, where he introduces the idea of using a machine to calculate numbers, the definitions and notation he’ll be using, and “universal computing machines.” Sections 8, 9, and 10 lay out the logic and prove the lemmas necessary for his ultimate proof. Section 11, then, is the proof.
In section 1, “computing machines” are introduced. They consist of a tape of arbitrary length with squares containing symbols, and a scanner mechanism to “scan” the symbols on the tape. The scanner can perform essentially five operations: read the currently scanned symbol, erase the currently scanned symbol, print a symbol on a blank square, move left, and move right. “It is my contention that these operations include all those which are used in the computation of a number” (232). So that’s it: a system that has some pretty elementary operations, any finite number of symbols, and read/write capabilities. In section 2, Turing clarifies his definition of a computing machine: it is only those machines which result in a tape with 0 and 1 as the symbols that he considers computing machines. The computed number is whatever is left on the tape after computation, a binary decimal representing a number between 0 and 1, where the decimal point is implied to exist before the first binary digit. So the number .0101010101... on a tape is 1/3 in binary.
1/3 is precisely the number Turing computes with his first machine in the next section. An instruction (Turing names it a “machine-configuration”) is essentially made up of multiple operations conditional on the symbol in the currently scanned square.
Here’s what the entire machine looks like, made up of 4 instructions:
Config | Scanned symbol | Operations | Next config |
---|---|---|---|
A | Blank | P0, R | B |
B | Blank | R | C |
C | Blank | P1, R | D |
D | Blank | R | A |
We start at A. The tape starts blank, so the scanned symbol is blank, so we perform the operations. We print a zero, then move right, then go into configuration B. In configuration B we see another blank, so we go right, then go into C. In C, we see a blank, so we print 1 and go right, then go to D. D goes right and repeats the whole process.
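The walk-through above can be simulated directly. Here’s a minimal sketch (the dict encoding is my own, not Turing’s notation; every configuration here assumes a blank scanned square, as in the table):

```python
# Simulate Turing's first machine: four configurations (A, B, C, D) that
# print the binary expansion of 1/3 (0, 1, 0, 1, ...) on alternating
# squares of an initially blank tape.
machine = {
    # config: (symbol_to_print_or_None, move, next_config); +1 = right
    "A": ("0", +1, "B"),
    "B": (None, +1, "C"),
    "C": ("1", +1, "D"),
    "D": (None, +1, "A"),
}

tape = {}            # sparse tape: square index -> printed symbol
pos, config = 0, "A"
for _ in range(8):   # run 8 steps (two full A-B-C-D cycles)
    symbol, move, nxt = machine[config]
    if symbol is not None:
        tape[pos] = symbol   # print on the (blank) scanned square
    pos += move
    config = nxt

printed = "".join(tape.get(i, "_") for i in range(8))
print(printed)   # -> "0_1_0_1_" ("_" marks a still-blank square)
```

Running it longer just extends the 0_1_ pattern forever, which is exactly Turing’s point: the machine never halts, it computes ever more digits.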
Clearly a simple machine. We don’t even get to see the conditional branching that occurs when there’s a chance we see some symbol instead of a blank square. But Turing has to start simple to build an intuition for the complexity that comes further along. Here’s what a configuration with symbol-conditional branching looks like:
Config | Scanned Symbol | Operations | Next Config |
---|---|---|---|
A | Blank | P0 | A |
A | 0 | R, R, P1 | A |
A | 1 | R, R, P0 | A |
There, we’ve condensed our 1/3-computing machine to a single instruction. We’ve also increased the number of operations one instruction can perform. The classic Turing Machine is usually defined with one print and one move per step; Turing’s own tables, as above, freely chain operations. If you play out the machine above, you’ll notice the binary digits are printed on every other square of the tape. Turing gives room after each digit so the digit can be “marked” by another symbol, maybe the symbol # or something. Subroutines - a word that Turing doesn’t use but which helps understanding - are used frequently. An example of a subroutine can be abstractly described as “go left until you see the symbol #” or even “find the first digit of this number.” The notion of subroutines lets us venture into more complex machine behavior while still building from elementary operations. To be clear, every subroutine in the paper is well defined with a table like the one above, not with a sentence describing what the machine will do - every machine provably works.
An interesting step that Turing takes is proving the enumerability of the computable numbers. This deserves a whole digression into number classes and infinity, but I will leave it to the book to describe because it does so elegantly. Essentially, Turing shows that every possible Turing machine corresponds to a unique finite integer. He calls it the Description Number of the machine, and since the integers are enumerable, the computable numbers are too. Description Numbers are used to show a few other things. You can’t, for example, determine whether a machine will ever print a 1 or a 0 given the instructions of the machine (encoded in its Description Number). This probably inspired Martin Davis to later describe the halting problem, stating that there is no general way to tell whether a machine will eventually stop running (a notion slightly out of place in the context of this paper, because Turing only described machines that run forever).
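The enumeration argument can be sketched in a few lines. The encoding below is a simplified stand-in for Turing’s Standard Description (the alphabet and scheme are my own invention, but they preserve the key property: distinct tables map to distinct integers):

```python
# Simplified Description Numbers: serialize a machine's instruction table
# as a string over a fixed alphabet, then read that string as the digits
# of an integer. Distinct tables give distinct integers, so the machines
# (and hence the computable numbers) are enumerable.
ALPHABET = "ABCD01RLN,;"  # configs, symbols, moves, separators

def description_number(table):
    """table: list of (config, scanned, printed, move, next) tuples."""
    text = ";".join(",".join(row) for row in table)
    base = len(ALPHABET) + 1
    n = 0
    for ch in text:
        # digits start at 1 so no encoding has a leading zero,
        # which keeps the table -> integer mapping one-to-one
        n = n * base + ALPHABET.index(ch) + 1
    return n

example = [("A", "0", "0", "R", "B"),
           ("B", "1", "1", "R", "A")]
print(description_number(example))
```

Since every machine gets its own integer and the integers can be listed 1, 2, 3, …, the computable numbers can be listed too, even though the reals as a whole cannot.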
The grand finale of the paper consists of a rigorous and hard-to-follow proof, where Turing builds a monster of a formula consisting of 6 nested quantifiers and a surprising number of errors that needed correction by Petzold. I definitely will not try to explain it because I am far from understanding it myself, but the result is the proof that no general procedure to determine the validity of a mathematical expression exists. It’s a result that’s quite rich in hidden implications, but “the structure Turing built to support this result - the imaginary device now known as the Turing Machine - ultimately would become more interesting and fruitful than the actual proof....”
Turing created something timeless here: a tool to describe what exactly “computing” is. A required class to complete my computer science degree is Theory of Computing, where Turing machines, decidability, and the limits of computation are explored. This paper was the foundation of that field. The inquiry into computation has been a fruitful one. Computers now are capable of ridiculous things; it’s easy to forget that they are carrying out fundamentally Turing-esque operations. Computation has become a powerful epistemological tool, too. Many processes can be thought of in computational terms; many people (including myself) believe that the universe itself is a Turing machine. Which implies everything that occurs in the universe is but a computation (including consciousness).
I can’t recommend this book enough if you want to get a seriously well researched and thoroughly explained introduction to the conceptual bedrock of computing.
09/04/2020
Reputation is frail. It is others’ opinions of you, your outward-facing image, the entity people talk about when you are brought up and you are not around. People tend to care deeply about their reputation (including myself) - it’s one of those outdated software programs we all seem to have installed in our brains. During the thousands of years we spent in communities where everyone knows each other, those who spent time cultivating and maintaining their tribal reputation were probably more likely to survive. Like many of our other monkey behaviors, I believe we should be wary of our desire to build or keep a good reputation.
In Robert Greene’s 48 Laws of Power, Law 5 states:
Reputation is the cornerstone of power. Through reputation alone you can intimidate and win; once it slips, however, you are vulnerable, and will be attacked on all sides. Make your reputation unassailable. Always be alert to potential attacks and thwart them before they happen. Meanwhile, learn to destroy your enemies by opening holes in their own reputations. Then stand aside and let public opinion hang them.
I don’t think I disagree with his idea that reputation is something that can be manipulated in your favor in one way or another, but I do reject the entire worldview that is presented here. Even if we set aside the pursuit of power and the uber-competitiveness, there’s something inherently gross about tending to your reputation and wielding it as a tool.
If your reputation is anything other than a description of who you actually are, you are a faker. There’s this John Wooden quote that sums up this idea:
“Character is what you really are. Reputation is what people say you are. Reputation is often based on character – but not always.”
So here’s exactly the problem with the worldview presented by Robert Greene: “cultivating a reputation” is what you do when you’d like others to perceive you to be something you’re not. The person who cannot supply enough value to the tribe resorts to convincing others that they are providing value or exaggerating the little value they do provide.
I feel I’ve been a little harsh here; I should reiterate that this is an annoyingly innate instinct we have. I frequently catch myself worrying about my reputation (i.e. what others think/say about me), and I’m certain there are times when I never do catch myself. The desire to fit in or be accepted is among the strongest desires we are subject to. Many times the urge to embellish or defend your reputation or attack someone else’s is automatic. But to do so is fundamentally fraudulent and in direct opposition to truth.
Your reputation is frail. Your “true self,” your character, is significantly more robust. Instead of spending time convincing others that we’re this or that, we should be spending time becoming who we actually want to be.
08/13/2020
So I thought I’d write sort of a summary of my political beliefs because I intend to write about politics here. A disclaimer here: my political thoughts are not well-formed. I confess that I have many beliefs that I probably wouldn’t be able to defend. Writing about them will hopefully sharpen them.
Truth is what I’m aiming at when it comes to beliefs. Assessment of a claim’s truth (assuming it’s not some logical or mathematical claim) demands probabilistic reasoning. Hardly any approach to nuanced topics is a catch-all, except maybe Bayesian reasoning in the form of “given we know (x1, … ,xn), how likely is y?” This is the paradigm of reasoning I try to use, and I think most people ought to use. For example, “given we know the state has executed innocent people before, should we keep the death penalty around?” is a sort of Bayesian inquiry I’d like to explore.
Anyway, a super brief summary of my politics:
I lean libertarian when it comes to economic policies, and probably even more libertarian when it comes to social policies. I am generally a fan of free markets, despite some of the problems that arise when you scale them.
Laws are fundamentally moral claims. “Don’t do X, or else you receive punishment Y” is the form most laws take, which means the government has deemed those actions X to have some moral reprehensibility. To me, this means the government ought to hold a secular morality (ie separate church & state). When we talk about large-scale moral issues like COVID, I am mostly utilitarian. When our scope narrows (eg individual ethical decisions), my ethical beliefs become more deontological.
When we talk about policies on a scale the magnitude of the US, dogma will destroy any type of rational conversation. Political identity makes you stupid. Adopting beliefs in packages almost always results in trouble. When you adopt an attractive package (maybe a set of empathic-looking beliefs coming from the far left), you have a poor understanding of why you believe some of the beliefs in that package. Wrapping up your identity with a set of beliefs means if one of those beliefs is criticized, your ego gets clipped and starts to cause trouble. (I know this because I experience it firsthand; it isn’t easy to keep identity out of things)
I acknowledge I haven’t really said anything controversial here. The intent of this post was to expose my current biases and provide a summary of my beliefs for a historical record. As we near the election in November, I do intend to make bolder claims and maybe some predictions about the future so they’re written down publicly and I can look back and see where my reasoning went wrong.
08/11/2020
Productive conversations are rarely about large topics like “society” or “the economy.” My least favorite is “the system” in reference to society as a whole. It’s hard to shift conversations that have this type of orbital perspective of the world to a more pointed conversation about tangible problems or potential solutions. If we try to solve the whole world’s problems, we’re doomed to fail. I sympathize with the sentiment “the system is f***ed, man.” It totally is. There are flaws everywhere when I look around at “the system.” Channeling the dissatisfaction with the status quo toward something specific has proved useful to me; undirected dissatisfaction will consume you.
When you’re trying to change a large system, learn more about it. What makes it flawed? What are its constituent parts? I’ve recently been thinking about the public education system in America. It’s way too big a system to just go and change it. But as I dig into the lower levels I learn that some parts of the system are more malleable than others.
The lower you go in a large & robust system, the more you can change it. To make effective changes at higher levels, the more expertise about the system is required. This is how the “corporate ladder” works: employees who demonstrate knowledge and competency in the company tend to get promoted and subsequently given more agency to change the company.
So start small. Don’t declare war against the entire system. You’re right, it’s hugely flawed, but you’re picking a battle that you simply can’t and won’t win. The only options you have are complaining about it or burning it all to the ground.
08/06/2020
You walk in at 9am. You check the events board. There are 3 lectures scheduled for the day. One is with a physicist from the microfabrication company nearby who’s going to lecture about the promises of nanotechnology. One is with an evolutionary biologist from the nearby university, and she brought octopi you’ll be able to hold! And at the end of the day, a local jazz musician will be giving a concert and talking about the importance of practice.
You decide to attend the microfabrication one, but that’s in 2 hours, so you head to the library to log into The Forum, the online space for projects in the school, idea boards, discussion boards, and your favorite school book club. You’ve been working on a rebuttal to your friend Carter’s anarcho-capitalist post from yesterday, so you hop on a beanbag in the library and start writing. You decide to look around for Milton Friedman’s Why Government is the Problem to better understand Carter’s view; it’s not in the library, but that’s fine, you just order it online with your student book allotment. You have $245 left for the semester.
You wrap up writing around 10:40, almost done with the forum post - gotta get front row seats to the lecture. Karina joins you (she never misses a lecture) just before it starts. The physicist introduces himself, says the usual “I wish I came to this school when I was y’all’s age!” He gives a decent lecture about his company’s nanofabrication work (they make biosensors), and is wrapping up with how nanotechnology will change the world when a novel idea strikes you for an elegant drone design. You get the antsy feeling that you don’t get very often. “Yo, I gotta go,” you say to Karina as you politely sneak your way out before the lecture ends.
You check the chem lab to see if your avionics-savvy friend Jose is there (for the past few weeks he’s been trying to recreate Tesla’s open source battery). He is, and without explanation you walk up to the wall next to him (oh yeah, the entire school is lined with whiteboards) and start drawing the design you imagined.
“Well hey friend,” he says when he notices you.
“I had an idea in the lecture this morning,” you say, “it’s a drone.”
“Hey, that looks pretty cool...” Jose says as you draw. He frowns, squints at your drawing, then smirks. “That might just work,” he says.
You two head to the computer lab to start modeling. The lab has three-monitor setups with powerful computers running Ubuntu of course. You guys walk past the embedded systems area with Raspberry Pis strewn about, past the circuit boards and soldering station, past the compromised server people use to practice hacking, and finally to the corner you both know well. This is where the magic happens.
Since the school doesn’t close, you two stay there until around 8pm drafting the rotors and body of the drone on AutoCAD. You take your flash drive with the models over to the 3D printers in the fabrication lab and start the printing process. “See you here in the morning?” you say to Jose. “Yeah,” he says, “I’ll probably go sleep and come back around 5 or 6. Hopefully we can try to fly this thing by tomorrow night! See ya.”
You stay for another hour watching the 3D printer go back and forth as it constructs what was just an idea this morning.
———
No grades, no formal classes, everything is optional. There are few staff, mostly to mentor or counsel. They also know the ins and outs of the equipment, so they’re there for that too. But otherwise, the kids are free to roam and do what they please. I guess there’s a nurse and a counselor for health. There are kitchen staff too, 3 meals a day.
The school’s mission is to encourage exploration. Everything is there to be utilized in the pursuit of knowledge. The school is centered around doing and building. Running an experiment is the unrivaled method of gaining knowledge. Real conversations are encouraged, because ideas are meant to be discussed and tested. The lectures are there to inspire, and they’re never required. If you’re interested, show up.
This is the school of my dreams. I have no guess about the feasibility of this idea. It could turn out that in practice it’s the same as our current ideas for flying cars: mostly impractical. But it’s fun to theorize, just like it’s fun to theorize what an anarchocapitalist society might look like. It’s a thought experiment - if you could design an education system completely from the ground up, what would it look like?
07/18/2020
I’ve recently been more excited about a scientific idea than I’ve ever been before, and that idea is predictive processing. The general idea is relatively simple, but the implementation seems really complex. It seems to be a holistic theory of what the brain is doing, at a level of complexity that seems about right. Predictive processing sits somewhere between what individual neurons are doing and what the temporal lobe is for in terms of complexity, which gives us room to move up and down those layers within this framework. It’s also fundamentally computational in nature, which is attractive to me because I have a bias for the idea that everything is computation.
The basic idea of predictive processing is this: what we perceive is a computational reconciliation between a “top-down” stream of information and a “bottom-up” stream of sensory information. This is what we’re going to attempt to unpack.
Let’s start with the bottom-up stream. This stream comprises both our outer senses (i.e. vision, hearing, etc.) and our interoceptive senses, which just means information about our body’s internal state. This can include a sense of hunger, info about our own blood sugar, a sense of where our limbs are in relation to the rest of the body, etc. So our brains are constantly getting multiple “input” streams, from a variety of sources, at once.
The top-down stream of information is made up of concepts encoded in our prior experience. This stream can be thought of as a constant stream of predictions about what the bottom-up sensory stream is going to look like next.
Prediction Error Minimization
Where the streams meet, a computation occurs. According to predictive processing, a comparison is made between what was predicted by the top-down stream and the actual sensory information.
Mismatches between predictions and actual sensory input are not used passively to form percepts, but only to inform updates of representations which have already been created (thereby anticipating, to the extent possible, incoming sensory signals). The goal of these updates is to minimize the prediction error resulting from the prediction (feature #5, Prediction Error Minimization (PEM)), in such a way that updates conform to the norms of Bayesian Inference [1]
Alright, hopefully this isn’t too overwhelming. If it helps, think of the top-down stream of information as your expectations; when the top-down stream can’t match the bottom-up one, the experience is confusion. I hesitate to use such imprecise words as “confusion” and “expectation,” but I think they’re useful here. Imagine you take a drink of vodka expecting it to be water: the brief experience of confusion is intense and unpleasant, and your knowledge of the world is very quickly updated as you spit it out.
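The computation where the streams meet can be sketched in a few lines of code. This is just a toy of my own making, assuming a single Gaussian belief and a precision-weighted (Kalman-style) update; none of the names come from a real predictive-processing library.

```python
def update_belief(prior_mean, prior_precision, sensory_value, sensory_precision):
    """Precision-weighted Bayesian update of a single Gaussian belief.

    Precision is inverse variance: roughly, how much a signal is trusted.
    """
    prediction_error = sensory_value - prior_mean
    # Kalman-style gain: the relative trust placed in the senses
    gain = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision

# Expecting water (taste = 0.0) with modest confidence, but a very
# precise sensory signal reports vodka (taste = 10.0): the belief
# jumps most of the way toward the evidence.
mean, precision = update_belief(0.0, 1.0, 10.0, 4.0)  # mean -> 8.0
```

The more precise the sensory signal relative to the prior, the more the update trusts the senses; this is the same precision weighting that the theory later uses to explain attention.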
These error minimization computations happen hierarchically. At every level, the two streams meet to try to resolve any error. In the case of vodka-but-expecting-water, at the lowest level, a prediction of the taste of water streams in from above. When the prediction is compared with the actual sensory signal, that level says to the one above, “We got it way off. See what you can do.” So that level tries to resolve the error, gets it way off too, and the error keeps getting relayed up the hierarchy until some level can resolve it. I think this is a great segue into how action is incorporated into predictive processing.
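To make the hierarchy concrete, here is a deliberately simplified sketch (again my own illustration, not the actual model): each level absorbs the fraction of the incoming error it can account for and relays the remainder upward.

```python
def resolve_up_hierarchy(sensory_signal, predictions, gains):
    """Relay prediction error up a stack of levels.

    Each level updates its prediction by the fraction of the error it
    can absorb (its gain); whatever remains unexplained is passed to
    the level above. Returns the updated predictions and the residual
    error no level could resolve.
    """
    error = sensory_signal - predictions[0]
    updated = []
    for prediction, gain in zip(predictions, gains):
        updated.append(prediction + gain * error)  # this level's update
        error -= gain * error                      # remainder goes up
    return updated, error

# Every level expected water (0.0); the tongue reports vodka (10.0).
updated, residual = resolve_up_hierarchy(10.0, [0.0, 0.0, 0.0], [0.5, 0.5, 0.5])
# residual -> 1.25: most of the error was absorbed on the way up
```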
Action
Fundamentally, there are only two ways an organism can resolve prediction error. One is internally, by continually searching for a prediction that better fits the sensory signals. The other is by changing the sensory signals to fit the prediction. As agents capable of movement, we are able to alter our sensory signals through action. In the previous example, this is equivalent to resolving error by spitting out the vodka.
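The two options can be contrasted in code. One more toy of my own making: both functions shrink the same prediction error, but one moves the model and the other moves the world.

```python
def resolve_by_perception(prediction, signal, rate=0.5):
    """Minimize error by updating the internal model toward the signal."""
    return prediction + rate * (signal - prediction), signal

def resolve_by_action(prediction, signal, rate=0.5):
    """Minimize error by acting so the signal moves toward the
    prediction (e.g. spitting out the vodka)."""
    return prediction, signal + rate * (prediction - signal)

# Same mismatch, two resolutions: expected water (0.0), tasted vodka (10.0)
p1, s1 = resolve_by_perception(0.0, 10.0)  # model moves: (5.0, 10.0)
p2, s2 = resolve_by_action(0.0, 10.0)      # world moves: (0.0, 5.0)
```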
Here, “action” doesn’t have to mean interaction with the external world. We mentioned interoceptive signals earlier, and these can be altered by “action” in interesting ways, too. Let’s say your brain gets word that your blood sugar is low. Your perception is a craving for sugary things, and your brain has two options to resolve the error: 1) eat a sugary thing (interaction w/ the external world) or 2) metabolize fat stores (interaction w/ internal bodily functions) [1].
This has an interesting implication:
In short, the error between sensory signals and predictions of sensory signals (derived from internal estimates) can be minimized by changing internal estimates and by changing sensory signals (through action). What this suggests is that the same internal representations which become active in perception can also be deployed to enable action. This means that there is not only a common data-format, but also that at least some of the representations that underpin perception are numerically identical with representations that underpin action.
This is called the Ideomotor Principle. In this model, action and perception have the same neural representation. I don’t fully understand this idea yet, but I don’t think it’s necessary to dive into for this introduction.
Explanatory
I am attracted to this theory because it seems to have such explanatory power, at least regarding many things related to our conscious experience. Attention, for example, is explained as “the process of optimizing precision estimates.” [1] Makes sense. We become pretty certain pretty quickly about whatever our attention is on. There are also explanations for cognitive conditions like schizophrenia (too much trust in top-down prediction) and autism (overwhelming flows of bottom-up information). Dreaming can be explained as concepts from your body of top-down knowledge interacting in the absence of any bottom-up information to correct their errors. Placebo effects have a good explanation through this lens, too: if we expect less pain, our brain will smooth the noisy pain signals, assuming they are partly a mistake.
Models of the world are only as good as they are explanatory/predictive. Predictive processing is the most accurate model of brain computation I’ve ever learned about, so I’ll be diving deeper into it and sharing more of what I find. Descriptions of the brain in terms of action potentials or brain lobes have left me unsatisfied. When trying to answer the question, “What is the brain doing?” a computational model seems most apt.
If you want to know more, definitely check out the source I pulled most of this info from below. And if this stuff fascinates you like it does me, check out the book Surfing Uncertainty by Andy Clark.
[1] https://predictive-mind.net/papers/vanilla-pp-for-philosophers-a-primer-on-predictive-processing
06/25/2020
Let me start by saying that the placebo effect is likely exaggerated, and almost certainly clinically insignificant. A study done by Hróbjartsson and Gøtzsche (plus two followup studies) assessed 114 studies and “found little evidence in general that placebos had powerful clinical effects.” The why behind this is discussed at length by Scott Alexander in Powerless Placebos, but I wanted to address the exception that was encountered in that study: “the 27 trials involving the treatment of pain (including a total of 1602 patients) showed a significant effect of placebo as compared with no treatment.”
Why might this be the case?
A quick briefing on some key parts of the brain is needed for this proposed explanation. First, the amygdala, that outdated piece of hardware in your “monkey brain” limbic system. Fear center, aggression, anxiety, yada yada, you know the deal. The amygdala receives reliable news from an ancient part of the brainstem, the periaqueductal gray (henceforth PAG). The PAG modulates pain, so when you prick your finger, it screams to the amygdala to do something about it. The amygdala then relays the news to the dorsal anterior cingulate (dACC), which accounts for your affective experience of the pain, the unpleasantness. We’re almost certain of this; in 1961 Eldon Foltz and Lowell White removed patients’ dACCs, and afterward the patients could still report the intensity of pain but were no longer bothered by it.
Onto the cortex!
The cortex as you know is the most recently evolved part of the brain, and it’s got a lot of control over the choices we make (we’re pretty sure this is where the sense of self lives). As a complete shock to absolutely no one, a lot of our behaviors are completely irrational. Our frontal cortex has an area where it is highly connected to our limbic system (which includes the amygdala!), and it is here where those pesky emotions seep into our decision making process. This place is called the ventromedial prefrontal cortex (henceforth vmPFC, very e(m)otional PFC to remember).
On the opposite end of the spectrum, the most rational, calculating, deliberative part of the prefrontal cortex is the dorsolateral prefrontal cortex (henceforth dlPFC, deliberative to remember). When you’re really thinking hard, the dlPFC is where all of the factors you’re taking into account coalesce. It’s important to remember that the brain is highly parallel, meaning at the same time you’re rationally calculating how many skittles are in a jar, your limbic system is anticipating what it’d feel like to win that $50 gift card to Applebee’s if you guess correctly.
All of these parts of the brain are connected to each other to greater and lesser degrees. But here’s the thesis of this post: our thinking brain (i.e. the cortex) can inhibit the pain-processing parts of our brain. This is the basis of the pain exception in the placebo analysis.
In this study, Matthew Lieberman placebo’d a group of poor souls with irritable bowel syndrome. Lieberman showed increased activity in the right ventro-lateral PFC and the dlPFC and decreased activity in the amygdala and dACC alongside a (small but significant) decrease in self-reported symptoms and pain.
So that’s pretty cool! It certainly could provide some kind of insight into what’s going on in the brains of freaks like David Blaine who seem able to withstand ridiculous amounts of pain without flinching. I cede any further explanation to other fields like psychology or cognitive behavioral science, because even though my recent fascination with neuroscience has provided me a lot of insight into what’s going on under the hood, it feels like a trap to think “everything is explainable by neuroscience.” It’s just one of many cool lenses to view the world through.