If you've ever wondered how tools like PioSolver, GTO+, or our own analyzer can tell you the "optimal" play in any poker situation, you're about to get answers. This article breaks down the math and algorithms that power modern poker solvers—no PhD required.
Game Theory Optimal (GTO) poker is a strategy that cannot be exploited by opponents. If you play perfectly GTO, the best your opponent can do is break even against you (in a heads-up scenario), no matter what adjustments they make.
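In game-theory terms (standard two-player zero-sum notation, nothing specific to any particular solver), "unexploitable" means an equilibrium strategy guarantees at least the value of the game no matter what the opponent does:

```latex
% \sigma_1^* is the GTO strategy, u_1 is hero's expected payoff,
% and v is the game value. For heads-up poker with alternating
% positions and no rake, v = 0: the opponent's best response
% can do no better than break even.
\min_{\sigma_2} \, u_1(\sigma_1^*, \sigma_2) \;=\; v
```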
This doesn't mean GTO is always the most profitable strategy—exploitative play against weak opponents makes more money. But GTO provides a strong baseline strategy and protects you from being exploited by skilled players.
Most modern poker solvers use a technique called Counterfactual Regret Minimization (CFR), developed by researchers at the University of Alberta. Here's how it works conceptually:
Imagine you're at a decision point in poker—say, you have AK on the button after a raise. You have three options: fold, call, or 3-bet.
The solver plays out millions of possible scenarios for each action and tracks the regret of not taking each option. Regret is essentially: "How much better would I have done if I'd chosen action X instead?"
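In the single-decision form used by the toy code below, accumulated regret is just a running total of how much each action's value beat the value of the strategy you actually played (full CFR weights these terms by reach probabilities, which we skip here):

```latex
% v^t(a): value of action a on iteration t
% \sigma^t(b): probability of playing action b on iteration t
R^T(a) \;=\; \sum_{t=1}^{T} \Bigl( v^t(a) \;-\; \sum_{b} \sigma^t(b)\, v^t(b) \Bigr)
```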
The algorithm plays against itself repeatedly: on each iteration it computes a strategy from the accumulated regrets, plays out the hand, measures how much better each alternative action would have done, and adds those differences to the running regret totals.
After enough iterations, the strategy converges toward Nash equilibrium—the GTO solution.
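The rule that turns regrets into a strategy is regret matching: play each action in proportion to its positive accumulated regret (this is exactly what get_strategy implements in the code below):

```latex
\sigma^{T+1}(a) \;=\;
\begin{cases}
\dfrac{\max\!\bigl(R^T(a),\,0\bigr)}{\sum_{b} \max\!\bigl(R^T(b),\,0\bigr)} & \text{if } \sum_{b} \max\!\bigl(R^T(b),\,0\bigr) > 0,\\[1.5ex]
\dfrac{1}{|A|} & \text{otherwise.}
\end{cases}
```

One subtlety: it's the average strategy across all iterations, not the final iteration's strategy, that converges to the equilibrium, which is why the code below tracks strategy_sum separately.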
Let's look at a toy game to understand regret:
```python
# Simplified poker decision: Button vs BB postflop
# Button has top pair, BB has flush draw
# Pot: $100, Effective stack: $100

class SimplifiedSolver:
    def __init__(self):
        self.regret_sum = {'bet': 0, 'check': 0}
        self.strategy_sum = {'bet': 0, 'check': 0}

    def get_strategy(self):
        # Calculate strategy based on positive regrets
        normalizing_sum = sum(max(r, 0) for r in self.regret_sum.values())
        if normalizing_sum > 0:
            strategy = {
                action: max(self.regret_sum[action], 0) / normalizing_sum
                for action in self.regret_sum
            }
        else:
            # Default to uniform random
            strategy = {'bet': 0.5, 'check': 0.5}
        # Add to strategy sum for averaging
        for action in strategy:
            self.strategy_sum[action] += strategy[action]
        return strategy

    def get_average_strategy(self):
        normalizing_sum = sum(self.strategy_sum.values())
        return {
            action: self.strategy_sum[action] / normalizing_sum
            for action in self.strategy_sum
        }

    def train(self, iterations=10000):
        for i in range(iterations):
            strategy = self.get_strategy()
            # Simulate outcomes for each action
            # (simplified - real solver would traverse full game tree)
            bet_ev = self.calculate_bet_ev(strategy)
            check_ev = self.calculate_check_ev(strategy)
            # Calculate regret for each action
            # Regret = (action_value - strategy_value)
            strategy_value = (strategy['bet'] * bet_ev +
                              strategy['check'] * check_ev)
            self.regret_sum['bet'] += (bet_ev - strategy_value)
            self.regret_sum['check'] += (check_ev - strategy_value)

    def calculate_bet_ev(self, strategy):
        # Simplified EV calculation
        # Opponent folds some %, calls with better/worse hands
        fold_equity = 0.30
        called_ev = 0.45 * 200 - 0.55 * 100  # Win 45%, lose 55% when called
        return fold_equity * 100 + (1 - fold_equity) * called_ev

    def calculate_check_ev(self, strategy):
        # EV when checking
        # Sometimes opponent bets, sometimes we get to showdown
        return 0.60 * 100 + 0.40 * 50  # Simplified calculation
```
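To watch the loop converge, train the toy solver and read off its average strategy. With the illustrative EV numbers hard-coded above, checking has the higher expected value, so the average strategy collapses toward always checking:

```python
solver = SimplifiedSolver()
solver.train(iterations=10000)
print(solver.get_average_strategy())
# With the EVs above, this prints roughly {'bet': 0.0000..., 'check': 0.9999...}
```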
This toy example shows the core loop: calculate regrets, update strategy, repeat. Real solvers like PioSolver apply the same principle to full game trees, with complete ranges for both players and multiple bet sizes at every decision point.
The problem with CFR is scale. Even a simple postflop scenario involves thousands of possible turn and river runouts, hundreds of hand combinations in each player's range, and several bet sizes at every betting node.
This creates billions of decision nodes. Traditional solvers like PioSolver cope by restricting the bet sizes they consider, leaning on powerful hardware, and accepting solve times measured in minutes to hours.
We've built our solver with a fundamentally different approach focused on speed and accessibility:
Instead of solving every spot from scratch, we've pre-computed solutions for the most common scenarios.
This means many queries return results instantly from our database rather than requiring minutes of computation.
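Conceptually, the lookup is just a keyed read against that library. The key fields and names below are illustrative, not our actual schema:

```python
from typing import Optional

# Illustrative pre-computed library: maps a spot key to a solved strategy.
PRECOMPUTED_STRATEGIES: dict[tuple, dict[str, float]] = {}

def lookup_strategy(positions: str, line: str, board: str,
                    stack_bb: int) -> Optional[dict[str, float]]:
    """Return a pre-computed strategy for this spot, or None if it's novel."""
    key = (positions, line, board, stack_bb)
    return PRECOMPUTED_STRATEGIES.get(key)

# A None result means the spot falls through to the neural network described next.
```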
For novel spots not in our database, we use a neural network trained on millions of solved poker scenarios. Given a game state, the network predicts a full GTO-approximate strategy in well under a second.
```python
# Simplified conceptual model of our neural network approach
# (feature-extraction helpers and model loading are defined elsewhere)

class GTOApproximator:
    def __init__(self):
        self.model = self.load_trained_model()

    def encode_game_state(self, hand, board, position, action_history, stack_depth):
        """Convert poker situation into neural network input features"""
        features = []
        # Hand strength features
        features.extend(self.hand_strength_features(hand, board))
        # Board texture features
        features.extend(self.board_texture_features(board))
        # Strategic features
        features.extend([
            self.position_encoding(position),
            stack_depth / 100,           # Normalize stack depth
            len(action_history) / 10,    # Action count
        ])
        # Action history encoding
        features.extend(self.encode_action_history(action_history))
        return features

    def predict_strategy(self, game_state):
        """Predict GTO strategy for this game state"""
        features = self.encode_game_state(*game_state)
        # Neural network outputs probabilities for each action
        action_probs = self.model.predict(features)
        return {
            'fold': action_probs[0],
            'call': action_probs[1],
            'raise_small': action_probs[2],
            'raise_medium': action_probs[3],
            'raise_large': action_probs[4],
        }
```
Our model is trained on over 50 million solved poker situations, learning the patterns that make strategies GTO without needing to traverse the full game tree every time.
Pure GTO is just the starting point. On top of that baseline, we layer exploitative adjustments.
This hybrid approach gives you an unexploitable baseline plus the extra profit that comes from attacking real opponents' mistakes.
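As a purely illustrative sketch of what "layering on an adjustment" can look like (the function, the blend weight, and the 0.5 baseline are made up for this example, not our production logic):

```python
def exploit_overfolding(gto_strategy: dict[str, float],
                        opponent_fold_freq: float,
                        blend: float = 0.25) -> dict[str, float]:
    """Shift a GTO bet/check mix toward more betting against an over-folder."""
    adjusted = dict(gto_strategy)
    if opponent_fold_freq > 0.5:  # folds more often than a balanced player would
        shift = blend * (opponent_fold_freq - 0.5)
        adjusted['bet'] = min(1.0, adjusted.get('bet', 0.0) + shift)
        adjusted['check'] = 1.0 - adjusted['bet']
    return adjusted
```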
Building our solver meant pre-computing a large library of solutions for common scenarios and training a neural network on more than 50 million solved situations.
We're transparent about our tradeoffs:
| Solver | Accuracy | Speed | Ease of Use |
|--------|----------|-------|-------------|
| PioSolver | 99%+ | Hours | Expert |
| GTO+ | 98%+ | Minutes | Intermediate |
| Exploit Coach (DB) | 99%+ | Instant | Beginner |
| Exploit Coach (Neural) | 95%+ | <500ms | Beginner |
For studying standard spots deeply, traditional solvers win on accuracy. For getting quick feedback during review sessions or in-game decisions, our approach is significantly faster while maintaining high accuracy.
Use Database Mode (99%+ accuracy) when you're studying common, standard spots that are covered by our pre-computed library.

Use Neural Network Mode (95%+ accuracy) when you want near-instant feedback on novel spots that aren't in the database.

Use Exploitative Mode when you have reliable reads on an opponent and want to maximize profit against their specific mistakes.
We believe the future of poker study tools is fast, accessible, and practical.
Traditional solvers will always have their place for deep research, but everyday players need something faster and more practical.
Want to see the difference? Sign up for our beta and analyze your first hand for free. Compare our results to your current solver and judge for yourself.
Questions about our approach or the math behind GTO? Join our Discord community where we discuss poker theory, share solver insights, and help each other improve.
This article represents our current technical implementation as of October 2024. We're constantly training new models and adding pre-computed solutions to improve both speed and accuracy.