Counterfactual regret minimization

This is an article about the first algorithm to solve a game of poker. Games such as chess and checkers are easier to attack because everything is in the open: both players can see the whole board. (Checkers has in fact been solved; chess has not.) Poker carries far more uncertainty, since the cards held by the other players are hidden. And the game that has been solved is only the two-person head-to-head version, heads-up limit Texas hold'em, not the game where half a dozen players are sitting around the table. What interested me, though, was the conceptual mechanism used, which is how I think most people behave in most genuinely risky situations.

The algorithm, named CFR+ by its creators, uses an improved version of a technique called counterfactual regret minimization (CFR). Past CFR algorithms have tried to solve poker by using several steps at each decision-point: coming up with counterfactual values representing different game outcomes; applying a regret minimization approach to figure out the strategy leading to the best outcome; and averaging the latest strategy with all past strategies.
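
To make those three steps concrete, here is a minimal sketch in Python. It is not CFR+ and it is not poker: it applies plain regret matching, the building block of CFR, to rock-paper-scissors, a game with a single decision point. The function names, the lopsided starting regrets and the iteration count are my own illustrative choices, not anything taken from the researchers' code.

```python
# Payoff to the player choosing the row action against the column action.
# Action order: rock, paper, scissors. The game is symmetric, so both
# players can read their payoffs from the same table.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
N_ACTIONS = 3


def regret_matching(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret anywhere: fall back to playing uniformly.
    return [1.0 / N_ACTIONS] * N_ACTIONS


def action_values(opponent_strategy):
    """Expected payoff of each action against the opponent's current mix."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(N_ACTIONS))
        for a in range(N_ACTIONS)
    ]


def train(iterations=50000):
    # Assumption: lopsided starting regrets, chosen arbitrarily so that
    # self-play does not begin at the equilibrium and stay there.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * N_ACTIONS for _ in range(2)]
    for _ in range(iterations):
        # Both players pick their current strategy before either updates.
        strategies = [regret_matching(r) for r in regrets]
        for player in range(2):
            # Step 1: what each action would have earned ("counterfactual" value).
            values = action_values(strategies[1 - player])
            expected = sum(s * v for s, v in zip(strategies[player], values))
            for a in range(N_ACTIONS):
                # Step 2: accumulate regret for not having played action a.
                regrets[player][a] += values[a] - expected
                # Step 3: keep a running sum so the strategies can be averaged.
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train()):
        print(player, [round(p, 3) for p in average])
```

Under these assumptions the averaged strategies drift toward playing each action about a third of the time, which is the equilibrium for rock-paper-scissors. The strategy in use at any single iteration can sit well away from that; it is the average over all iterations that settles down.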

Yet in the real world, although one can say that this is the strategy that works over time, it does not necessarily work every time, and other strategies can succeed in the short run for a minority of players. Hence recessions. This is particularly the case in the financial sector, which is why everyone there should be forced to play with their own money, and to personally absorb the losses when things go wrong.
