Jeffrey Heninger, 22 November 2022
AI Impacts is a research group with seven employees. From Oct 31 – Nov 3, we had a work retreat. We decided to try using Manifold Markets to help us plan social events in the evenings. Here are some notes from this experiment.
Structure of the Experiment
Katja created a group on Manifold Markets for AI Impacts, along with an initial collection of markets. Anyone could add a market to this group, and five of us created at least one market. Each of us would rate each evening from 0 to 10 on an anonymous Google form. Most of the questions in the group were about the results of the form, often conditional on what activity we would do that evening. For example: “On the first day that at least four people begin a game of One Night Werewolf on the AI Impacts retreat, will the average evening rating be above 8?” The markets would resolve at some point the next morning, after we had submitted our forms and Katja had calculated the average evening rating.
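As a minimal sketch of the resolution rule described above, the function below resolves a conditional market from the submitted ratings. The function name, its parameters, and the choice to return `None` when the condition is not met are all assumptions made for illustration; the post does not say how markets were handled when the activity did not happen.

```python
from statistics import mean
from typing import Optional, Sequence


def resolve_conditional_market(
    ratings: Sequence[float],
    participants: int,
    threshold: float = 8.0,
    min_participants: int = 4,
) -> Optional[bool]:
    """Resolve "if at least N people do the activity, is the average rating above T?".

    Returns True (YES), False (NO), or None when the condition was not met.
    Returning None for an unmet condition is an assumption, not the post's rule.
    """
    if participants < min_participants:
        return None  # condition not met: nothing to resolve YES or NO
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be between 0 and 10")
    # "Above 8" is read as strictly greater than the threshold.
    return mean(ratings) > threshold


# Hypothetical ratings from seven people, not the retreat's actual data.
print(resolve_conditional_market([9, 8, 9, 8, 9, 8, 9], participants=4))
```

Reading “above 8” as a strict inequality means an average of exactly 8 resolves NO under this sketch.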
Disagreements about the Experiment
There were several disagreements about how the experiment was supposed to be run.
Initially, the role of the evening rating form was unclear. Was it asking for your honest assessment of the evening, or was it part of the game? “What number would you like to assign to the evening?” is different from “How good was your evening really?” We decided that we wanted honest responses. Even then, the numbers were ambiguous. What constitutes a 7 evening vs. a 9 evening? Different people’s baselines result in different scores, which could alter the average. After the first evening, we had a better estimate of the baseline. Many of the markets had used an average score of above 8, which was higher than the baseline. This made the markets feel less useful, instead of shifting the predictions to lower probabilities while remaining useful. It’s not clear why this happened, but it might have been because we didn’t want to bet against ourselves having a good time, or because the tail of an unknown distribution is harder to predict than the middle of the distribution.
One morning, Katja told us the average score before resolving her markets. Zach used this information to bet on those markets. Rick thought that it was unclear whether this should be allowed, because not everyone was there and because the earlier discussion about honest ratings suggested that we should ask before doing something that might give an advantage independent of prediction ability. We decided that this would not be allowed in the future, and that we would not tell each other the results of the markets before resolving them.
Unrealized Potential Problems
We thought of several other potential problems that didn’t end up being an issue.
One potential concern was that the interplay between the dynamics of the market and the social events might make the socialization worse. Someone who had bet against having a good evening might have less reason to want the evening to be enjoyable for himself and others. If people spent time during the evening thinking about and continually betting on the markets, it might disrupt the ongoing activities. In practice, while people did bet on the markets in the evening, it didn’t disrupt the other activities.
We had several other ideas for how to mess up the markets: filling out the anonymous form multiple times, colluding with or bribing people to change their scores, publicly filling out your form before the evening begins to manipulate the market, and purposely trying to thwart other people’s clever strategies. None of us tried any of these, but they might become relevant if the stakes were higher. There’s also the concern that conditional and counterfactual predictions are not the same: for decision making, we would like to compare various counterfactuals, but it’s easier to make markets which are conditional on us doing something. If we decide to do that thing, it’s probably because at least some of us want to do it, so the conditional prediction will be higher than the counterfactual prediction.
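The gap between conditional and counterfactual predictions can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: the enthusiasm model, the linear mapping from enthusiasm to rating, and the rule that the group only does the activity when at least four of seven people are keen. None of it is data from the retreat.

```python
import random
from statistics import mean


def simulate(n_evenings: int = 10_000, n_people: int = 7, seed: int = 0):
    """Compare the average rating on evenings the group *chose* the activity
    with the average rating if the activity were imposed on *every* evening."""
    rng = random.Random(seed)
    chosen_ratings = []  # evenings where the group decided to do the activity
    forced_ratings = []  # all evenings, as if the activity were always done
    for _ in range(n_evenings):
        # Hypothetical per-person enthusiasm for the activity tonight, in [0, 1).
        enthusiasm = [rng.random() for _ in range(n_people)]
        # Assumed rating if the activity happens: between 4 and 10, linear in enthusiasm.
        rating = mean(4 + 6 * e for e in enthusiasm)
        forced_ratings.append(rating)
        # Assumed decision rule: do the activity only if at least four people are keen.
        if sum(e > 0.6 for e in enthusiasm) >= 4:
            chosen_ratings.append(rating)
    return mean(chosen_ratings), mean(forced_ratings)


conditional, counterfactual = simulate()
print(f"average rating given we chose the activity: {conditional:.2f}")
print(f"average rating if the activity were imposed: {counterfactual:.2f}")
```

Because the activity is only chosen on evenings when people already want it, the conditional average comes out higher than the counterfactual one, even though the activity’s effect on any given evening is identical in both cases.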
What We Did in the Evening
The goal of the markets was to help us plan social events in the evenings. If the market thought that the evening’s rating was more likely to be high if we wore Halloween costumes than if we used the hot tub, then we should decide to wear Halloween costumes.
People mostly didn’t use the markets to decide what to do. On the first evening, the highest rated activity was a guitar sing-along. We didn’t end up doing that on any of the evenings. The activity that seems to have been the most fun for the most people was cooperative round-the-table ping-pong. This was done spontaneously, adding more people as they came to the table, without any market predicting the result. We spent a fair amount of time just sitting around talking to each other, which also didn’t have a market. Our decision making process was less formal than the markets assumed: someone would suggest an activity or say that they would personally do the activity, and other people would join. Having someone look at the markets and announce which activity rated the highest would have added more steps and organization compared to what we did.
We also tried varying the structure of the markets to see if that made them more useful. For example, the market “Will we use the hot tub and have fun tonight?” had four choices for the combinations of whether or not at least four people would use the hot tub and whether the average evening rating would be above or below 7. Katja did use this market to argue that people should use the hot tub.
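A four-outcome market like this one can be turned into two conditional probabilities, which is what an argument for or against the hot tub needs. The sketch below assumes the four market prices can be read as joint probabilities that sum to 1; the function name and the example prices are hypothetical, not taken from the actual market.

```python
def conditional_fun_probabilities(
    p_tub_fun: float,
    p_tub_not_fun: float,
    p_no_tub_fun: float,
    p_no_tub_not_fun: float,
) -> tuple[float, float]:
    """Return (P(fun | hot tub), P(fun | no hot tub)) from four joint probabilities."""
    total = p_tub_fun + p_tub_not_fun + p_no_tub_fun + p_no_tub_not_fun
    if abs(total - 1.0) > 1e-6:
        raise ValueError("the four outcome probabilities must sum to 1")
    p_tub = p_tub_fun + p_tub_not_fun
    p_no_tub = p_no_tub_fun + p_no_tub_not_fun
    if p_tub == 0 or p_no_tub == 0:
        raise ValueError("both hot tub outcomes need nonzero probability")
    # Conditional probability = joint probability / probability of the condition.
    return p_tub_fun / p_tub, p_no_tub_fun / p_no_tub


# Hypothetical prices for the four choices.
given_tub, given_no_tub = conditional_fun_probabilities(0.30, 0.10, 0.30, 0.30)
print(f"P(rating above 7 | hot tub used)     = {given_tub:.2f}")
print(f"P(rating above 7 | hot tub not used) = {given_no_tub:.2f}")
```

With these made-up prices, the market would imply a higher chance of a good evening if the hot tub is used. As noted earlier in the post, that is a conditional comparison and not necessarily a counterfactual one.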
There seem to have been several things that kept the markets from being more useful: (1) Most of us didn’t know what kinds of social activities most of the rest of us preferred, so it was hard for anyone to make an informed bet. It wasn’t clear how the market provided more information than a voting system would have. (2) The connection between four people doing an activity and the average evening rating was too weak for much of a signal to get through. The ratings ended up being noisy, and not specific enough to particular activities. (3) The act of checking the markets and announcing a decision was more formal than our actual decision making process. The market only included a short list of possibilities and didn’t suggest spontaneity.
Conclusion
Having prediction markets for the evening social activities was a fun addition to the AI Impacts retreat. There were about 20 markets about the retreat, which most of the people at the retreat bet on. But the markets didn’t end up having a significant impact on what we did during the evening.
Most of us didn’t have experience using prediction markets before the retreat. We decided not to use the markets to make important decisions, because we didn’t know what problems they would cause. The markets would likely have been more impactful if we had been more experienced and if the questions had been about more important decisions. If we did use the markets for important decisions, we would have to make sure that the markets are harder to exploit and have more rules and fewer norms governing how we would bet on the markets.
Since the retreat, Katja has used a market to help plan an AI Impacts dinner. We plan to continue experimenting with using prediction markets to make predictions in the future.