
Hanson’s Futarchy

Using text-authoring markets to write encyclopedias or to determine the final content of open-source software packages is a relatively plausible application of prediction markets, though such markets may not appear within the next few decades. Let us now turn to the least plausible proposals, those that could fundamentally remake the legislative core of democratic governance but seem exceedingly unlikely to happen, at least in our lifetimes. In part, my desire to consider such possibilities is similar to the desire that leads some to write science fiction, to imagine worlds yet unknown but that perhaps might someday come to be. There is, however, a more immediate reason. If prediction markets indeed have many of the useful attributes that I have identified (efficiently aggregating information, lowering the cost of decision making, providing for the possibility of consistently moderate decision making, and avoiding excessive influence by special interests), then it should be possible to create a market-based legislature that would produce better results than do existing alternatives.

If I were to concede that prediction markets could not do that, then I would need to identify a more fundamental problem with prediction markets. While recognizing the possibility that any number of hurdles, such as the danger of manipulation (see Chapter 1), could prove insuperable, I see no technical reason why market-based legislatures would necessarily fail, if in fact prediction markets turn out to be a robust means of aggregating information. Therefore, I will make a theoretical case for using prediction markets at the core of government, temporarily setting aside problems of transition, democratic legitimacy, and democratic participation, all of which admittedly are decisive in favor of the status quo in the short term and even in the fairly long term.

Before considering my own proposal, however, I will consider another. Robin Hanson, a pioneer of prediction markets, has sketched out a vision that he calls “futarchy,” imagining that this might arise in some country after successful experiments with prediction markets for corporate and administrative agency decision making have been conducted.17 At the heart of Hanson’s proposal is the use of conditional markets to estimate the effect of proposed policies on a measure of national welfare. Initially, Hanson suggests, futarchy might depend on an existing imperfect measure of national welfare, such as gross domestic product (GDP).18

The existing legislature, however, could pass legislation to change the welfare measure, producing what he calls GDP+, a measure that “could include measures of lifespan, leisure, environmental assets, cultural prowess, and happiness.”19 Any policy that, according to a prediction market, would clearly improve GDP+ would be automatically adopted, at least unless another prediction market produced an opposite forecast. The legislature would continue to have a role in determining the ends that the state should seek, but prediction markets would determine how we would get there. Hanson thus appropriately titles his paper “Shall We Vote on Values, But Bet on Beliefs?”

Hanson’s vision is not very different from mine, and he offers cogent responses to thirty criticisms of his proposal. Many small decisions, however, will have such a minor impact on welfare that the market will never “clearly” recommend them.20 As discussed in Chapter 7, conditional prediction markets may each have enough noise that comparison of the market prices is meaningless. This may be an even greater problem than Hanson allows, because probably only a relatively small percentage of policy proposals would be big enough to have a significant impact on GDP+. Suppose, for example, that the level of precision of each conditional prediction market would be within 0.1 percent. That would mean an error of approximately eleven billion dollars based on current values, so proposals whose impact would be in the millions or low billions of dollars could not be clearly evaluated. Greater accuracy might be achieved, and the errors in the two prediction markets assessing the worlds in which the condition is and is not met might be highly correlated, but still there will be some range of policies that are simply too inconsequential to be assessed with conditional markets (see Chapter 7).
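The arithmetic behind this objection can be sketched in a few lines. This is only an illustration under stated assumptions: the baseline of roughly eleven trillion dollars is inferred from the eleven-billion-dollar figure above, the "precision" of each market is treated as the standard deviation of its error, and the function names are hypothetical. If the two conditional markets have equal error and their errors have correlation ρ, the standard deviation of the difference between their prices is σ·√(2(1 − ρ)), which is why highly correlated errors shrink the range of undetectable policies without eliminating it.

```python
import math


def detection_threshold(baseline, precision, correlation=0.0):
    """Noise (one standard deviation) in the difference between two
    conditional forecasts, each with error sigma = baseline * precision.

    Assumes equal error in both markets and the given error correlation.
    """
    sigma = baseline * precision
    return sigma * math.sqrt(2.0 * (1.0 - correlation))


def clearly_recommended(forecast_if_adopted, forecast_if_rejected, threshold):
    """True when the forecast gain from adoption exceeds the noise threshold."""
    return forecast_if_adopted - forecast_if_rejected > threshold


# Assumed baseline welfare measure: about $11 trillion (inferred, not stated).
BASELINE = 11e12
PRECISION = 0.001  # 0.1 percent, as in the text

independent = detection_threshold(BASELINE, PRECISION)
correlated = detection_threshold(BASELINE, PRECISION, correlation=0.99)

# A hypothetical policy worth $500 million stays below the noise either way.
small_gain = 5e8
print(clearly_recommended(BASELINE + small_gain, BASELINE, independent))
print(clearly_recommended(BASELINE + small_gain, BASELINE, correlated))
```

Even with an assumed correlation of 0.99, the threshold in this sketch remains above a billion dollars, so a policy worth hundreds of millions would still not be "clearly" recommended.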

Hanson offers two general solutions. First, he notes that many small changes, individually too minor to make a marked difference in welfare, might be added together into a single proposal. But one might then want some type of system for deciding which policies should be aggregated. Otherwise, some bad proposals could be bundled with good ones. Sometimes the market might reject a proposal that on balance would be good because it would recognize that rejection of the proposal would lead to a better proposal that subsequently would be adopted, but some pretty good yet far-from-perfect proposals might still be adopted because any additional improvements would not have measurable effects on welfare.
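A toy example may make the bundling concern concrete. Everything here is hypothetical: the dollar effects are invented for illustration, effects are assumed to be simply additive, and the approval rule (adopt a bundle when its summed effect exceeds a fixed noise threshold) is a simplification of what a conditional market would actually do.

```python
# Hypothetical noise threshold: effects below this cannot be "clearly" detected.
THRESHOLD = 11e9


def bundle_effect(effects):
    """Total welfare effect of a bundle, assuming effects simply add up."""
    return sum(effects)


def market_approves(effects, threshold=THRESHOLD):
    """Simplified rule: approve when the bundle's effect exceeds the threshold."""
    return bundle_effect(effects) > threshold


# Invented effects, in dollars: three good proposals and one bad one.
good_proposals = [4e9, 5e9, 6e9]
bad_proposal = [-2e9]

# No proposal is large enough to be approved on its own.
print([market_approves([e]) for e in good_proposals])

# The mixed bundle clears the threshold even though it carries a bad proposal.
mixed_bundle = good_proposals + bad_proposal
print(market_approves(mixed_bundle))

# Dropping the bad proposal would help, but by less than the threshold,
# so the improvement itself is too small to be measured.
improvement = bundle_effect(good_proposals) - bundle_effect(mixed_bundle)
print(improvement < THRESHOLD)
```

In this sketch the bundle is adopted with the bad proposal inside it, and the gain from removing that proposal falls within the noise, which is the sense in which a pretty good yet far-from-perfect bundle might survive.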

An optimistic response is that later prediction markets might reverse this problem. Someone who is eager to advance a good proposal will have an incentive to bundle it with other good proposals, including those that undo previous bad proposals. At least in some cases, however, the initial approval of the bad proposal will have had some immediate effect, such as government expenditures, and so bad policy choices will not always be easily reversible. It might take a long time before all of the good proposals are exhausted and cannot be used as cover for bad ones, and there might be many unnecessary policy shifts in the interim.

Second, Hanson suggests that his policy could be applied recursively to approve other prediction market schemes. For example, legislation might approve widespread use of prediction markets based on narrower criteria. A bill, Hanson suggests, might create a new general policy that sports stadiums should be built “whenever markets estimated that it would increase some measure of regional welfare, or of stadium profitability.”21 A general stadium bill, however, might have only a negligible impact on GDP+. Moreover, there might be little incentive to identify the appropriate more localized criterion. A general policy relying on a weak proxy might improve social welfare to some extent and thus be approved, though another hypothetical policy relying on a better proxy might improve social welfare more. For example, the localized stadium regime might not take into account all relevant effects on the environment or the community. Once again, that could be fixed in later bundles of market proposals, but a mechanism that can make small improvements to already good proposals before they are enacted might be superior to one that counts on future proposals to fix imperfections.

Having a perfect measurement of GDP+ would not guarantee optimal policies if the prediction market would endorse localized prediction markets relying on simpler measures. Futarchy might well overemphasize easily measurable variables compared to less easily measurable ones. Hanson anticipates this objection, illustrating with the example that it might not be good to reward teachers based on test scores if other important outcomes of teaching are difficult to measure.22 Hanson’s response is that there may be proxies besides test scores for other outcomes of teaching. It might not make sense to reward teachers on the basis of those proxies, because teachers (like nonteachers) are risk-averse. But decisions about whether to enact particular policies could be based on these proxies, because any risk aversion of market speculators should have only de minimis effects on predictions. Indeed, we have noted already that noisy ex post measures do not doom prediction markets (see Chapter 3). Both the measure of GDP+ and the measures to be applied to localized prediction markets could rely on these relatively noisy proxies.

But what are these noisy proxies? In the teacher reward context, perhaps one might use a measure of how many students decide to transfer from public to private school, or, to gauge generalized learning, one might assess how well students performed on tests for other courses. Such proxies, however, not only might be noisy but also might not in an expected value sense capture the underlying issue of interest, namely, the extent to which a teacher is effective in ways that cannot be captured by test scores. Ultimately, the best proxy will often be a subjective judgment. In the teacher reward context, that subjective judgment might be a measure of student satisfaction or an outside evaluation of teaching. In the policy context, a subjective judgment would take into account all of the soft variables for which objective proxies cannot easily be developed. Perhaps recognizing the benefit of subjective judgment, Hanson suggests that futarchy sometimes might adopt my proposal for predictive cost-benefit analysis in particular settings.23 In my view, Hanson has it backward. Aggregating subjective judgment should be the core task of a hypothetical prediction market government, which in many cases might then decide to create localized prediction markets to forecast objective measurements.

 
