Cultural Contagion: How Ideas Spread on the Grid
Consider a puzzle that has occupied anthropologists, linguists, and political scientists for decades: why, in a world of rapid global communication, do cultural differences persist?
The homogenization hypothesis predicts that culture should converge. If people influence each other — adopting each other’s practices, beliefs, and tastes — then the long-run equilibrium should be a single global culture. The mathematics of social influence models supports this: if everyone is connected to everyone, and influence is proportional to contact, the steady state is uniformity.
And yet. The linguistic map of Europe, after centuries of contact, trade, conquest, and mass media, still shows dozens of distinct languages, hundreds of dialects, and thousands of cultural microregions. The political map of the United States, after nearly 250 years of shared national life and more than a century of mass media, shows not convergence but polarization: a deepening regional differentiation that seems to intensify as communication technology improves.
Robert Axelrod, a political scientist at the University of Michigan, proposed in 1997 that the puzzle could be resolved with a simple cellular automaton. His paper, “The Dissemination of Culture: A Model with Local Convergence and Global Polarization,” published in the Journal of Conflict Resolution 41(2):203–226, showed that the mechanism responsible for cultural convergence — social influence — is the same mechanism responsible for cultural divergence. Local similarity and global fragmentation are not opposing forces. They are two faces of the same local rule.
The Axelrod Model: Precision and Elegance
The Axelrod model is precise enough to state completely in a paragraph.
Each agent occupies a node in a 10×10 (or larger) grid. Each agent’s culture is described by a vector of F cultural features — think of these as dimensions of cultural variation: language, religion, musical taste, culinary tradition, political affiliation, and so on. Each feature takes one of q possible trait values — the number of possible variants within each dimension. For a language feature with q = 5, the five values might represent five distinct languages. For a musical preference feature with q = 10, the ten values might represent ten genres.
At each step of the simulation, a focal agent and one of its four neighbors are selected at random. The two agents interact with a probability equal to their cultural overlap — the fraction of features on which they currently agree. If they agree on 3 of 5 features, they interact with probability 3/5; if they agree on 5 of 5, they interact with probability 1 (always); if they agree on 0 of 5, they interact with probability 0 (never). When an interaction occurs, the focal agent adopts the neighbor’s trait value on one of the features where they currently disagree, chosen at random.
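The rule just described can be sketched in a few lines of Python. This is a minimal illustration, not Axelrod's own code: the function names (overlap, neighbors, step, random_grid) are hypothetical, cultures are stored as lists of integers, and the grid is assumed to have hard edges, so border sites have fewer than four neighbors.

```python
import random

def overlap(a, b):
    """Fraction of features on which two culture vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def neighbors(r, c, size):
    """Up/down/left/right neighbors on a bounded, non-wrapping grid (an assumption)."""
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < size and 0 <= j < size]

def random_grid(size, F, q, rng):
    """size x size grid; each site holds F features drawn uniformly from q traits."""
    return [[[rng.randrange(q) for _ in range(F)] for _ in range(size)]
            for _ in range(size)]

def step(grid, rng):
    """One event: pick a focal site and a random neighbor, then interact with
    probability equal to their overlap. Returns True if a trait was copied."""
    size = len(grid)
    r, c = rng.randrange(size), rng.randrange(size)
    nr, nc = rng.choice(neighbors(r, c, size))
    focal, other = grid[r][c], grid[nr][nc]
    if rng.random() >= overlap(focal, other):
        return False  # no interaction (always the case when overlap is 0)
    differing = [k for k in range(len(focal)) if focal[k] != other[k]]
    if not differing:
        return False  # identical cultures: nothing left to copy
    k = rng.choice(differing)
    focal[k] = other[k]  # focal adopts the neighbor's trait on one differing feature
    return True
```

Calling step repeatedly on, for example, random_grid(10, 5, 10, random.Random(0)) plays out the dynamics one event at a time. Because the copied feature is always one on which the pair disagreed, each successful interaction raises the overlap between the focal agent and that neighbor.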
Two consequences of this rule are immediate. First: culturally similar neighbors become more culturally similar — interaction happens more, and each interaction increases overlap. This is local convergence. Second: culturally dissimilar neighbors become culturally frozen in their dissimilarity — interaction probability falls as overlap falls, eventually reaching zero for agents with no shared features. This is what produces global fragmentation.
The Result: Local Convergence, Global Polarization
Axelrod ran the model to equilibrium — the state where no further interaction is possible, meaning every pair of adjacent agents either has identical culture or zero cultural overlap — and asked: how many distinct cultural regions result?
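Both ideas in this paragraph, the frozen state and the count of cultural regions, are easy to state as code. The sketch below is illustrative only: the helper names are hypothetical, the grid is a list of rows of culture lists, and a region is taken to be a connected set of adjacent sites with identical culture.

```python
def shared(a, b):
    """Number of features on which two cultures agree."""
    return sum(x == y for x, y in zip(a, b))

def adjacent_pairs(size):
    """Yield each pair of horizontally or vertically adjacent sites once."""
    for r in range(size):
        for c in range(size):
            if r + 1 < size:
                yield (r, c), (r + 1, c)
            if c + 1 < size:
                yield (r, c), (r, c + 1)

def is_frozen(grid):
    """True when every adjacent pair has identical culture or zero overlap."""
    F = len(grid[0][0])
    return all(shared(grid[a[0]][a[1]], grid[b[0]][b[1]]) in (0, F)
               for a, b in adjacent_pairs(len(grid)))

def count_regions(grid):
    """Count connected sets of adjacent sites with identical culture (flood fill)."""
    size = len(grid)
    seen = set()
    regions = 0
    for r in range(size):
        for c in range(size):
            if (r, c) in seen:
                continue
            regions += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                i, j = stack.pop()
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if (0 <= ni < size and 0 <= nj < size
                            and (ni, nj) not in seen
                            and grid[ni][nj] == grid[i][j]):
                        seen.add((ni, nj))
                        stack.append((ni, nj))
    return regions
```

A simulation loop would apply the update rule until is_frozen returns True and then report count_regions as its outcome.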
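The answer depended on the model parameters in a specific way. The key quantity is, roughly, the ratio F/q: the number of features relative to the number of trait variants per feature. For large q, F/q approximates the chance that two randomly initialized agents share at least one feature and can therefore interact at all.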
When q is small (few variants per feature), agents start with high average cultural overlap, interaction is frequent, and the model converges to a single global culture — complete homogenization. With only two possible trait values per feature (q = 2), even randomly initialized agents are, on average, 50% similar, and the positive feedback of interaction quickly drives convergence.
When q is large (many variants per feature), agents start with low average cultural overlap, interaction is rare, and the model freezes in a state of many small cultural regions — complete fragmentation. With a hundred possible trait values per feature (q = 100), the initial probability of any two agents sharing a trait is only 1 in 100, interaction almost never occurs, and the initial configuration barely changes.
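The arithmetic behind these two regimes can be checked directly. The sketch below assumes that traits are initialized independently and uniformly at random, and uses F = 5 purely as an illustrative value; the function names are hypothetical.

```python
def expected_overlap(q):
    """Expected fraction of shared features for two random cultures: 1/q."""
    return 1 / q

def prob_can_interact(F, q):
    """Chance that two random cultures share at least one of F features,
    i.e. that their interaction probability is greater than zero."""
    return 1 - (1 - 1 / q) ** F

for q in (2, 10, 100):
    print(q, expected_overlap(q), round(prob_can_interact(5, q), 3))
```

With F = 5, the chance that two random neighbors can interact at all is about 0.97 for q = 2, about 0.41 for q = 10, and about 0.05 for q = 100. The per-feature match probability is 1/q, but what matters for freezing is whether any of the F features match.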
Between these extremes is a phase transition. At a critical value of q (which depends on F and the grid size), the model shifts from one dominant global culture to many stable regional cultures. This transition is sharp — a small increase in q near the critical value shifts the outcome from near-total homogenization to near-total fragmentation.
Axelrod’s finding was that the number of stable cultural regions decreases with more features (larger F) and increases with more trait variants per feature (larger q). More interestingly, the number of regions also decreases when geographic territory grows beyond a certain size — larger grids tend toward fewer, larger cultural regions. And most surprisingly, the number of stable regions decreases when the range of interaction is increased (if agents interact with their second and third neighbors as well as first neighbors). More contact produces fewer cultures, not more — but it does not produce one.
Why Borders Freeze
The most counterintuitive result in the Axelrod model is the stability of cultural boundaries. In the final equilibrium, neighboring agents with zero cultural overlap simply stop interacting. The boundary between them is permanent — but it was created by the same process of social influence that produced the cultural regions on either side of it.
How does a boundary form? Consider two neighbors that begin with some modest cultural overlap, say one shared feature out of five. Interaction between them is infrequent, but it happens, and when it does it can only raise their overlap, because the focal agent copies a trait on a feature where the two differ. What lowers their overlap is interaction with third parties: if either agent copies a different trait from some other neighbor on the one feature the pair shares, their overlap drops to zero and the boundary between them freezes. Every copied trait therefore pulls an agent toward one neighbor and, potentially, away from the others. The dynamics are stochastic, and the final boundary locations depend on initial conditions and random fluctuations during the simulation.
The key insight is that once every pair of neighbors has either identical culture or zero overlap, nothing can change. There is no mechanism in the model for spontaneous cultural innovation, for outside influence, or for agents to “reach across” a cultural boundary. The frozen state is absorbing.
This is where the model captures something real about cultural geography — and where it reveals its own limitations. Real cultural boundaries are not perfectly frozen. They erode through migration, trade, and asymmetric power relationships (conquered people adopt the conqueror’s language; the reverse rarely happens). The Axelrod model captures the tendency toward boundary formation but not the dynamics of boundary erosion.
Noise, Drift, and the Fragility of Cultural Diversity
Axelrod’s original model has no noise — no mechanism for random cultural change. Subsequent work added noise: at each time step, there is a small probability that an agent randomly changes one of its trait values (representing cultural innovation, individual idiosyncrasy, or outside influence).
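One simple way to implement such a perturbation is sketched below. It is an assumption-laden illustration rather than the scheme of any particular paper: the function name apply_noise and the parameter rate are hypothetical, and the new trait is drawn uniformly from the q values, so it may occasionally equal the old one.

```python
import random

def apply_noise(grid, q, rate, rng):
    """With probability `rate`, reset one random feature of one random site
    to a uniformly drawn trait value. Returns True if a perturbation fired."""
    if rng.random() >= rate:
        return False
    size = len(grid)
    r, c = rng.randrange(size), rng.randrange(size)
    k = rng.randrange(len(grid[r][c]))
    grid[r][c][k] = rng.randrange(q)  # may redraw the same value
    return True
```

In a simulation loop this would be called once per time step, alongside the ordinary interaction rule, with rate set to a small value.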
The results were striking and counterintuitive. Small amounts of noise destroy cultural diversity. In the noise-free model, many small regions can persist stably. With even a small noise probability, these regions are continuously perturbed: a random trait change can restore a little overlap across a frozen boundary and reopen interaction, so the frozen configuration is no longer absorbing. The long-run outcome with small noise is a single dominant global culture, which is to say homogenization.
With large amounts of noise, the outcome changes again: not homogenization but disorder. Traits change faster than social influence can align them, so cultural identity is too unstable for any region to maintain coherence.
Stable cultural diversity, in the sense of distinct regions that persist, occurs only at zero noise, in the idealized model without any randomness. This is a fragile theoretical result: it suggests that cultural diversity in the Axelrod model depends on the absence of perturbation, which is not biologically or socially realistic.
The practical lesson is not that cultural diversity is impossible but that it requires active maintenance — either through geographic isolation (limiting the reach of perturbations) or through institutional structures that preserve cultural distinctiveness against homogenizing pressure. The model specifies, in formal terms, what those structures need to do: they need to reduce the effective interaction range and/or increase the number of trait variants that agents protect from convergence.
Language Change as CA
Language change is one of the most extensively documented cultural processes, and it is naturally spatial. Isoglosses, the lines on a map marking the boundary of a dialect feature, appear in language atlases as geographic phenomena. Examples include the maken/machen boundary between Low German and High German dialects (the Benrath line); the cot-caught merger in American English (well established in the West, eastern New England, and western Pennsylvania, and traditionally absent from much of the South and the Inland North); and the rhoticity boundary in English, which separates r-pronouncing from r-dropping regions.
These spatial patterns can be modeled as a CA. Each cell represents a geographic region; the state encodes a set of phonological, lexical, and grammatical features. Cells update by adopting features from adjacent cells, weighted by population size and prestige. On this view, the boundaries between dialects (the isoglosses) are the counterpart of the frozen boundaries in the Axelrod model, maintained by the self-reinforcing dynamics of linguistic convergence within a region.
The CA framework makes a specific prediction about isogloss bundles: features that change together geographically (multiple isoglosses coinciding at the same boundary) should be features that entered a region at the same time, carried by the same historical migration or prestige diffusion event. This prediction is broadly supported by the historical linguistics literature — major dialect boundaries correspond to historical population boundaries, and the coinciding isoglosses represent features that diffused together.
Memetics: Dawkins’ CA in Everything
In 1976, Richard Dawkins published The Selfish Gene, which introduced the concept of the meme: a unit of cultural information that replicates, mutates, and competes for attention in a population of minds. Dawkins drew an explicit analogy between genes (the units of biological evolution) and memes (the units of cultural evolution): both are replicators that persist by copying themselves into new hosts, and both are subject to selection pressure that favors variants that replicate better.
The meme concept maps directly onto the CA framework. Agents on a social grid are cells; memes are state variables; the rules by which memes spread from cell to cell are the update rules. The fitness landscape of a meme — the conditions under which it spreads versus dies out — is determined by the local rule: how likely is a neighboring agent to adopt this meme, given the current state of their own belief system?
Axelrod’s cultural dynamics model is, in meme terms, a model of meme bundles (the F features) that spread and compete under a specific fitness rule (cultural overlap determines interaction probability). The phase transition from homogenization to fragmentation is the transition between a regime where all memes spread globally and a regime where memes are contained within regional clusters.
The CA framing of memetics suggests a productive reformulation of some classic memetics debates. The question “why do conspiracy theories spread?” becomes: what is the local rule that makes this meme-bundle spread to neighbors who are already disposed toward adjacent meme-bundles? The question “why does language X die out?” becomes: what changes in the local interaction structure — migration, prestige shifts, institutional changes — altered the CA rule in ways that favored language Y?
These are precise questions. The CA framing does not answer them, but it specifies what an answer would look like: a local rule and a parameter value.
The Insight That Connects to Life
The connection between Axelrod’s cultural dynamics model and Conway’s Game of Life is not metaphorical. It is mathematical.
Both are two-dimensional grids of discrete cells with local update rules. In Life, a cell is alive or dead; in Axelrod’s model, a cell has a cultural vector. In Life, the update rule is deterministic (B3/S23); in Axelrod’s model, it is probabilistic (interaction probability = cultural overlap). In Life, patterns either grow, stabilize, or die; in Axelrod’s model, cultural regions either grow, stabilize, or get absorbed.
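For comparison with the probabilistic rule of Axelrod's model, the deterministic B3/S23 rule fits in a few lines. This is a minimal sketch on an unbounded grid, representing the live cells as a set of (row, column) pairs; the function name life_step is hypothetical.

```python
def life_step(live):
    """One synchronous B3/S23 generation.
    `live` is a set of (row, col) coordinates of live cells."""
    counts = {}
    for (r, c) in live:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr or dc:  # skip the cell itself
                    key = (r + dr, c + dc)
                    counts[key] = counts.get(key, 0) + 1
    # Birth on exactly 3 live neighbors; survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}
```

Unlike the Axelrod update, this function has no random draw: the same input set always yields the same output set, and every cell is updated at once rather than one pair at a time.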
The deeper parallel is the one that matters: in both cases, the global pattern (the distribution of live cells, or the distribution of cultural regions) is generated entirely by local rules acting on an initial configuration, and it cannot in general be predicted from those rules without running the simulation. You cannot look at B3/S23 and deduce the existence of the glider. You cannot look at Axelrod’s F and q parameters and deduce exactly which cultural regions will survive.
This is the fundamental lesson of both models: complex global structure does not require global coordination. It requires local rules and time.
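In the model, at least, cultural diversity is not maintained primarily by international institutions or deliberate preservation efforts. It is maintained by the same local interaction dynamics that created it, and, as the noise results show, it lasts only as long as those dynamics are shielded from perturbation. The grid runs itself.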