Abstract: |
Punishment as a mechanism for promoting cooperation has been studied
extensively for more than two decades, but its effectiveness remains a matter
of dispute. Here, we examine how punishment's impact varies across cooperative
settings through a large-scale integrative experiment. We vary 14 parameters
that characterize public goods games, sampling 360 experimental conditions and
collecting 147,618 decisions from 7,100 participants. Our results reveal
striking heterogeneity in punishment effectiveness: while punishment
consistently increases contributions, its impact on payoffs (i.e., efficiency)
ranges from dramatically enhancing welfare (up to 43% improvement) to severely
undermining it (up to 44% reduction) depending on the cooperative context. To
characterize these patterns, we developed models that outperformed human
forecasters (laypeople and domain experts) in predicting punishment outcomes
in new experiments. Communication emerged as the most predictive feature,
followed by contribution framing (opt-out vs. opt-in), contribution type
(variable vs. all-or-nothing), game length (number of rounds), peer outcome
visibility (whether participants can see others' earnings), and the
availability of a reward mechanism. Interestingly, however, most of these
features interact to influence punishment effectiveness rather than operating
independently. For example, the extent to which longer games increase the
effectiveness of punishment depends on whether groups can communicate.
Together, our results refocus the debate over punishment from whether or not
it "works" to the specific conditions under which it does and does not work.
More broadly, our study demonstrates how integrative experiments can be
combined with machine learning to uncover generalizable patterns, potentially
involving interactions between multiple features, and help generate novel
explanations of complex social phenomena. |