Here are some ideas for a new sort of game theory based on Item Response Theory (http://en.wikipedia.org/wiki/Item_response_theory). I would like to implement them in a Python library. Is anybody on this list interested in being part of that? I will probably just post it on GitHub.
* Mathematical Idea
We imagine that we have actors playing a simple game. The game has only one turn; each player has a list of possible moves and, independently for each move they are allowed to play, randomly chooses whether or not to play it.
We assume that every possible move is played by every player with probability
P(move M | player A) = sigmoid( a_1 * g_1 + a_2 * g_2 + ... + a_k * g_k)
where the a_i are the private and public variables of the actor A, and the g_i are functions of M specified by the rules of the game.
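For concreteness, here is one way that probability might be computed in Python (the names move_probability, actor_params, and move_features are just placeholders I am introducing here, not part of any existing library):

    import numpy as np

    def move_probability(actor_params, move_features):
        """P(move M | player A) = sigmoid(a . g), where a holds the actor's
        private and public variables and g holds the game-defined functions of M."""
        z = np.dot(actor_params, move_features)   # a_1*g_1 + a_2*g_2 + ... + a_k*g_k
        return 1.0 / (1.0 + np.exp(-z))           # logistic sigmoid

    # Example: an actor with a = (0.5, -1.0, 2.0) facing a move with g = (1.0, 0.3, 0.8)
    p = move_probability(np.array([0.5, -1.0, 2.0]), np.array([1.0, 0.3, 0.8]))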
This is actually very, very clever! We are reusing the mathematical trick (duality) that appears in the definition of a Nash equilibrium, but thinking of it in a Bayesian framework rather than a linear-programming optimization framework. Fixed points of the computational process described below should correspond to social equilibria of a more general class of games than current game theory treats.
* Model Components and Class Hierarchy (see the sketch after this outline)
- Actor
  - plans: a distribution over strategies
  - private: some set of parameters
  - public: some set of parameters
- Strategy: distribution over roles
- Role: distribution over moves
- Game: model of actors and networks
  - moves: generator of all possible moves
  - rules: functions that specify how each move interacts with the private and public parameters of the players involved (if we assume perfect information, then the private parameters can be thought of as all other actors' collective model of the actor in question)
  - player model: IRT-style model of how players choose a move
- Move: atomic interactions in the game
- Network: an instance of a played game
  - players: actors
  - turns: each is a list of moves and potential moves (the potential moves correspond to unobserved data)
- IRT Model
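A rough Python sketch of how these components might be laid out, assuming plain dataclasses; all names and field types here are illustrative rather than a settled API:

    from dataclasses import dataclass
    from typing import Callable, Dict, List

    # In this sketch, strategies and roles are just discrete distributions.
    Strategy = Dict[str, float]   # role name -> probability
    Role = Dict[str, float]       # move name -> probability

    @dataclass
    class Move:
        """Atomic interaction in the game."""
        name: str
        features: Dict[str, float]                    # values of the g_i for this move

    @dataclass
    class Actor:
        plans: Dict[str, float]                       # strategy name -> probability
        private: Dict[str, float]                     # hidden parameters
        public: Dict[str, float]                      # observable parameters

    @dataclass
    class Game:
        moves: List[Move]                             # all possible moves
        rules: List[Callable[[Move, Actor], float]]   # the g_i, as functions of a move
        player_model: Callable[[Actor, Move], float]  # IRT-style P(move | actor)

    @dataclass
    class Network:
        """An instance of a played game."""
        players: List[Actor]
        turns: List[List[Move]]                       # moves played / potential moves per turn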
* Simulation and training
We assume that we can only observe part of the network and part of the actor structure: we can observe some moves, but not all, and some actor variables, but not all.
We alternate between sleep, waking, and training.
We start with an initialization phase. For each move or non-move in the training network, we randomly decide, with some fixed probability, whether to "hide" it from the algorithm. This gives us an independent check of the model, since we can evaluate it by its ability to correctly guess the hidden data. (We will call the hidden data "holdout data".)
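A minimal sketch of that masking step, assuming each observation is a simple record and using an arbitrary 10% hiding rate:

    import random

    def split_holdout(observations, hide_prob=0.1, seed=0):
        """Randomly hide a fraction of (player, move, played?) records.
        The hidden records become the holdout set used to score the model."""
        rng = random.Random(seed)
        visible, holdout = [], []
        for obs in observations:
            (holdout if rng.random() < hide_prob else visible).append(obs)
        return visible, holdout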
During sleep, we re-infer the variables of each actor based on the moves played and not played during the previous cycle. This is just IRT/logistic regression on the variables a_i, given the observed moves.
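Since the g_i are held fixed during sleep, each actor's variables can be re-fit as an ordinary logistic regression. A sketch using scikit-learn, assuming the observed moves for a single actor are stacked into a feature matrix (and that the actor both played and skipped at least one move):

    from sklearn.linear_model import LogisticRegression

    def reinfer_actor(move_features, played):
        """Re-infer one actor's variables a_i from moves they did and did not play.

        move_features: array of shape (n_moves, k), one row of g_i values per move
        played:        binary array of shape (n_moves,), 1 if the move was played
        """
        model = LogisticRegression(fit_intercept=False)  # the a_i are the coefficients
        model.fit(move_features, played)
        return model.coef_.ravel()                       # updated estimate of a_i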
During waking, we try to reconstruct the network from scratch by sampling from the strategies of the actors. We assume that all moves are played simultaneously, and that the probability of each move being played is independent of all other moves. This is just sampling from the distribution P(M|A).
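Under that independence assumption, waking is just Bernoulli sampling of every (actor, move) pair; a possible sketch:

    import numpy as np

    def sample_network(actor_params, move_features, rng=None):
        """Sample which moves each actor plays, independently, from P(M | A).

        actor_params:  array of shape (n_actors, k)
        move_features: array of shape (n_moves, k)
        Returns a binary (n_actors, n_moves) matrix of sampled plays.
        """
        rng = rng or np.random.default_rng()
        logits = actor_params @ move_features.T          # pairwise a . g
        probs = 1.0 / (1.0 + np.exp(-logits))
        return (rng.random(probs.shape) < probs).astype(int)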
During training, we update the rules and models of the game to reinforce good move guesses (places where the waking cycle agreed with the training data) and to eliminate bad move guesses (places where the waking cycle disagreed with the training data). This is just IRT/logistic regression on the variables g_i. We also use this step to update the score on the held-out data.
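Training mirrors sleep with the roles of a_i and g_i swapped, and the held-out records supply the evaluation score. A sketch, with log-loss chosen here only as one reasonable scoring rule (it assumes the holdout set contains both played and unplayed moves):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import log_loss

    def refit_move(actor_params, played):
        """Re-fit the g_i for one move: the actors' a_i are the features,
        and whether each actor played the move is the label."""
        model = LogisticRegression(fit_intercept=False)
        model.fit(actor_params, played)
        return model.coef_.ravel()                       # updated g_i for this move

    def holdout_score(actor_params, move_features, holdout):
        """Score the model on hidden (actor_idx, move_idx, played) records."""
        probs, labels = [], []
        for a_idx, m_idx, played in holdout:
            z = actor_params[a_idx] @ move_features[m_idx]
            probs.append(1.0 / (1.0 + np.exp(-z)))
            labels.append(played)
        return log_loss(labels, probs)                   # lower is better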