Dispatch #108: Crossing the river by feeling the stones
A deep dive into China’s policy experimentation machine—how it learns, where it falters, and why even "feeling the stones" can sometimes lead policymakers astray.
When China began to reform its economy in the late 1970s, it faced a paradoxical challenge: how could a rigid, centrally planned state introduce market forces without losing control? Deng Xiaoping, the chief architect of China’s economic rise, captured the answer in a metaphor that has since become foundational to modern Chinese governance:
We must cross the river by feeling the stones.
This wasn’t just an ordinary phrase. It was a radically different way of thinking about reform: not as a single leap into the unknown, but as a cautious, incremental journey guided by trial and error. It implied that certainty would emerge not from ideology but from practice, and that progress would come not from blueprint-perfect models but from messy, grounded experimentation. Policies would not be imposed from above with absolute confidence. Instead, they would be tried out in a few places, studied, adjusted, and only then scaled across the country.
This is the crux of China’s policy-making approach over the past four decades. It stands in stark contrast to how many other governments—especially in the West—tend to operate, swinging between sweeping reforms and abrupt reversals depending on political winds.
Raghuram Rajan explains this here:
But Deng’s metaphor was only the beginning. To truly understand how China institutionalized this philosophy at scale, we must turn to the work of political scientist Yuen Yuen Ang.
In her influential book How China Escaped the Poverty Trap, Ang introduces a compelling concept: “directed improvisation.” It’s an oxymoron at first glance—how can something be both directed and improvised?
But that’s precisely the genius of China’s system.
In Ang’s framework, China’s rise wasn’t due to top-down planning alone, nor was it a case of spontaneous bottom-up innovation. It was a blend—a dynamic process where the central government set strategic goals (the “direction”), while local governments were empowered to experiment with how best to meet those goals (the “improvisation”).
Local officials, operating under the watchful eye of the Party, were given room to pilot new ideas, adapt policies to local conditions, and even challenge conventional practices—so long as they stayed aligned with overarching national priorities. Successes were emulated. Failures were quietly abandoned. In this system, learning through doing was not just tolerated—it was institutionalized.
This is how Ang unpacks directed improvisation:
What Deng called feeling the stones became a full-fledged governance technology—a way for a massive, diverse, and authoritarian state to adapt flexibly to complex challenges. It allowed China to sidestep the ideological rigidity that had plagued earlier socialist experiments and instead develop a governance model that rewarded learning, adaptation, and pragmatic problem-solving.
The result? A country that could launch a special economic zone in Shenzhen, test it, study it, and—once the results were in—replicate the model across the country. A country where carbon trading, rural insurance, property tax reform, and e-governance were all first incubated in select cities before going national.
But here’s the big question: does this system work? Is policy learning in China as effective as it looks from the outside? Are these stones truly helping China cross the river—or are they sometimes just slippery illusions?
To answer this, we turn to the groundbreaking work of Shaoda Wang and David Y. Yang. Their paper, “Policy Experimentation in China: The Political Economy of Policy Learning,” unpacks this experimentation machine using an enormous dataset of over 19,000 government documents and 652 policy trials conducted between 1980 and 2020.
What they find is fascinating—and a bit unsettling.
Their research shows that while China's policy experimentation is indeed vast and unique in scale, it is also deeply entangled in political incentives, elite bias, and flawed inference. As we’ll see in the sections ahead, the process of “feeling the stones” is not always as objective or effective as it may seem.
But before we dive into their findings, let’s remember what makes China’s model so distinctive: a willingness to pilot, learn, adapt, and scale—all within the boundaries of a single-party state. That model, for all its imperfections, remains one of the most important governance innovations of the modern world.
And this paper gives us a rare empirical window into how it works.
What Does Policy Experimentation in China Entail?
Between 1980 and 2020, the Chinese government conducted an extraordinary 652 distinct policy experiments. These initiatives were not fringe tests or bureaucratic footnotes. They involved major reforms—property taxation, carbon emissions trading, rural insurance, and fiscal decentralization—policies with broad national implications.
This massive experimentation infrastructure was identified and analyzed by Wang and Yang through a meticulous review of over 19,000 official government documents. Among the 652 experiments, approximately 42% eventually became national policy. The remainder faded from view—discontinued without fanfare, formal evaluation, or public discussion. Policy failures in China, as the authors note, are often absorbed quietly into the system, with lessons implicitly learned but rarely broadcast.
What sets this paper apart is not only the scope of the dataset but also the clarity with which it surfaces three core features of China’s experimentation regime—features that reveal both its strengths and limitations.
Finding 1: Elite Selection — Experiments Are Not Representative
The first major insight concerns who gets to experiment. In theory, if the central government’s goal is to understand how a policy will perform when implemented nationally, one would expect it to test the policy in a diverse and representative set of regions, encompassing rich and poor, urban and rural, coastal and inland jurisdictions.
In practice, the vast majority of experiments (87.7%) were launched in economically advanced regions. On average, experimentation sites were 44.2% richer (measured by local fiscal revenue) than non-experimentation sites. Even policies designed for poor or rural constituencies—such as agricultural insurance or poverty alleviation—were disproportionately piloted in more prosperous areas.
This skew is not entirely surprising. Wealthier jurisdictions tend to have more capable bureaucracies, stronger infrastructure, and greater implementation capacity. But Wang and Yang also point to deeper political dynamics. Local governments often compete to be chosen as pilot sites, while central ministries are incentivized to select jurisdictions where the policy is likely to perform well: what we might call "policy safe havens."
The result is a significant selection bias that undermines generalizability. Policies trialed in elite conditions often cannot be seamlessly scaled to average or under-resourced regions. The lessons learned, while useful in part, may offer a distorted view of a policy’s true effects.
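To make the selection problem concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration rather than anything taken from Wang and Yang: the ten jurisdictions, the `policy_effect` function, and the assumption that a policy's effect rises in proportion to local fiscal capacity are all invented for the example.

```python
# Hypothetical toy numbers: none of these values come from Wang and Yang.
# Assumption: a policy's effect grows in proportion to local fiscal capacity.

def policy_effect(fiscal_capacity):
    """Assumed effect of the policy at one site (arbitrary units)."""
    return 0.5 * fiscal_capacity

def mean(values):
    return sum(values) / len(values)

# Ten hypothetical jurisdictions, from poorest (1) to richest (10).
fiscal_capacity = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Elite selection: pilot only in the three richest jurisdictions.
pilot_sites = sorted(fiscal_capacity)[-3:]
pilot_estimate = mean([policy_effect(c) for c in pilot_sites])

# What a national rollout would deliver on average across all ten.
national_effect = mean([policy_effect(c) for c in fiscal_capacity])

# How much the pilot overstates the national average effect.
overstatement = pilot_estimate / national_effect - 1

print(f"pilot estimate:  {pilot_estimate:.2f}")
print(f"national effect: {national_effect:.2f}")
print(f"overstatement:   {overstatement:.0%}")
```

Under these made-up assumptions, the pilot suggests an effect well above what the average jurisdiction would see. The size of the gap depends entirely on how strongly the effect is assumed to vary with local capacity, which is exactly what a non-representative pilot cannot reveal.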
Finding 2: Policy Experiments Look Great — But Only on Stage
Let’s imagine you're a local government official in China. One day, the central government selects your city to test a new policy—say, a program to improve rural healthcare or streamline business licenses. You know something very important: if this policy succeeds and gets rolled out nationally, your name will be associated with a success story. And in China’s political system, that could mean a big promotion.
So, what do you do?
You pull out all the stops. You divert more funding toward the program. You assign your best staff to oversee it. You make sure every step is monitored, reported, and tweaked to look good. You’re not just implementing a policy—you’re staging a performance. The goal is to make the policy look as effective as possible during the trial.
This is exactly what Wang and Yang found. During policy experiments, local governments spend significantly more money on the policy area being tested, about 1.3% more of their total budget. That’s a lot when we’re talking about municipal budgets. And this effort is especially strong in places where local leaders are young and ambitious, because they have more to gain from a successful experiment.
But here’s the catch: this extra effort disappears once the policy goes national.
When the same policy is implemented in other cities or counties—places that weren’t part of the experiment—there’s no special attention, no budget bump, and no incentive for local leaders to go the extra mile. The policy becomes just another item on a long list of government tasks. And as a result, it doesn’t perform nearly as well.
The researchers call this the "voltage drop" problem: the idea that energy and effectiveness often drop off sharply when a policy moves from pilot to scale. In this case, they found that over 70% of policies performed worse after being scaled up compared to how they performed during the trial phase.
In other words, many of these impressive policy experiments were not sustainable models—they were exceptional efforts made under exceptional circumstances.
This finding raises a big question: can we trust the results of policy trials if they’re driven by temporary incentives? If a policy only works because people are trying to impress their bosses, is it a good policy?
Wang and Yang suggest that unless the central government recognizes and adjusts for this performance inflation, it risks learning the wrong lessons—mistaking effort for effectiveness, and confusing showmanship for sustainability.
Finding 3: The Central Government Sometimes Misreads the Results
So far, we’ve seen that policy experiments in China often take place in rich areas and that local officials pull out all the stops to make those experiments look like a success. But that leads to an important question:
Is the central government aware of these distortions when it evaluates whether a policy should go national? Does it carefully separate real, policy-driven success from results that were just due to lucky timing or political theater?
According to Wang and Yang, the answer is: not always.
They uncover strong evidence that the central government sometimes mistakes noise for signal—that is, it confuses outside factors that have nothing to do with the policy itself for proof that the policy is working.
Let’s break this down with two examples they explore in detail:
Example 1: The Land Revenue Windfall Illusion
In China, one of the main ways local governments make money is by selling land to developers. When the real estate market is booming and interest rates are low, these land sales shoot up, and cities suddenly find themselves flush with cash. This means they can build more infrastructure, provide better public services, and hit their economic targets more easily.
Now, imagine one of these cities is also a site for a policy experiment.
If the experiment coincides with a big land revenue windfall, completely unrelated to the policy, it may look like the policy is doing great. But in reality, it’s the extra money from land sales that’s lifting performance, not the policy itself.
Wang and Yang show that the central government often doesn’t account for this. When evaluating the experiment, it sees good results and assumes it must be because the policy worked. As a result, policies that happen to be tested in cities with lucky fiscal windfalls are more likely to be adopted nationwide, even though the windfall had nothing to do with the reform being tested.
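The arithmetic behind this illusion can be sketched in a few lines of Python. The numbers, the city names, and the simple additive model below are assumptions made purely for illustration and do not come from the paper.

```python
# Hypothetical numbers for illustration only; not taken from the paper.
# Assumption: the observed gain is the policy's true effect plus any
# unrelated boost from land revenue (a simple additive model).

TRUE_POLICY_EFFECT = 1.0  # assumed real gain from the policy itself

def observed_gain(land_windfall):
    """Performance gain an evaluator sees at a pilot site."""
    return TRUE_POLICY_EFFECT + land_windfall

# Two hypothetical pilot cities running the same policy.
land_windfall = {"city_a": 0.0, "city_b": 3.0}

naive_estimates = {}     # credit the whole observed gain to the policy
adjusted_estimates = {}  # subtract the windfall before crediting the policy

for city, windfall in land_windfall.items():
    gain = observed_gain(windfall)
    naive_estimates[city] = gain
    adjusted_estimates[city] = gain - windfall

for city in land_windfall:
    print(city, "naive:", naive_estimates[city],
          "adjusted:", adjusted_estimates[city])
```

In this toy setup the policy is identical in both cities, yet a naive reading makes it look four times as effective in the city that happened to enjoy a land boom. Only after the windfall is netted out do the two pilots tell the same story.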
Example 2: Leadership Turnover and Performance Bumps
Here’s another subtle problem.
Let’s say a policy experiment begins under one local leader, but halfway through, a new Party Secretary takes over—someone who’s younger, more ambitious, and eager to climb the political ladder. This new leader pushes hard to make the experiment succeed, not necessarily because the policy is amazing, but because they want to be noticed by Beijing.
Wang and Yang find that these mid-experiment leadership changes—especially when they bring in officials with stronger promotion incentives—often lead to spikes in local performance. Again, the policy looks successful, but the driving force is the new official’s energy and ambition, not the policy’s actual design.
And again, the central government often fails to filter this out. It may treat the experiment as a success and scale up the policy, without realizing that in most places, where there’s no leadership change or no extra ambition, the policy may fall flat.
These findings point to a critical challenge in policy learning: just because something works in one place doesn’t mean it will work everywhere. And if decision-makers don’t fully understand why it worked in the first place, they risk drawing the wrong lessons.
Wang and Yang argue that the Chinese central government sometimes lacks the analytical sophistication to properly interpret the results of its experiments. It doesn’t always separate the effects of the policy from the effects of temporary, local factors like windfalls or political turnover.
Drawing on a conceptual framework inspired by Al-Ubaydli, List, and Suskind, the authors argue that effective policy learning depends on three elements:
The policy itself
The site where it is tested
The conditions under which it is implemented
When the second and third elements are biased or non-representative, conclusions about the first are inevitably flawed.
Deng Xiaoping’s dictum—crossing the river by feeling the stones—remains one of the most evocative metaphors in modern political thought. It captures a state’s commitment to pragmatic, iterative reform, grounded in empirical learning rather than ideological rigidity.
Yet as Wang and Yang demonstrate, the stones underfoot may not always be stable. They may be carefully chosen, polished for display, or arranged to impress rather than to instruct. The riverbed itself may shift depending on who is doing the crossing, and under what incentives.
Still, China’s experimentation regime remains one of the boldest governance innovations of the 20th and 21st centuries. It institutionalizes learning at scale, embraces policy variation, and recognizes uncertainty as an inevitable feature of development. For scholars and practitioners alike, this paper offers a valuable empirical anatomy of that system—its ambition, its mechanics, and its limitations.
In short, even a powerful and centralized state like China can struggle with the same challenge that plagues governments everywhere: mistaking short-term success under special conditions for long-term, scalable policy effectiveness.
Learn more about the paper here: