Stochastic Games

Charles-Amable Lenoir (1860-1926) was a French painter.

The topics are

In many real-life situations, payoffs to people or business are uncertain and they have to make decisions facing this uncertainty
(Timmer et al. \cite[p.595]{tim03}).

\begin{equation}{\label{a}}\tag{A}\mbox{}\end{equation}

Cooperative Games with Random Payoffs.

We consider two firms who want to work together in a temporary R&D project. The profit realization of this project is yet uncertain and will not be known before the project is finished. To decide whether or not to join this project, each firm wants to know beforehand what it gains, i.e., which share of the project’s profit it receives. These shares can be determined by means of negotitations among the firms, whose outcome is highly influenced by the firms’ attitudes toward risk. The agreed-upon profit shares will be written down in a contract so that each firm knows exactly what share of the total, yet unknown, profit it can count on. An alternative approach would be to make plans for profit sharing based on all possible realizations of the profit but this would be rather time-consuming, inefficient and perhaps even impossible (Timmer et al. \cite[p.596]{tim03}).

A game with random payoffs is represented by a tuple \((N,\{R(S)\}_{S\in {\cal S}},\{\succeq_{i}\}_{i\in N})\), where \(N\) is
the finite player set. The nonnegative random payoff to coalition \(S\) is denoted by \(R(S)\) and \({\cal S}\) is the set of coalitions with a nonzero payoff. The preference relation of player \(i\) is represented by \(\succeq_{i}\). Let \({\cal L}^{+}\) be the set of all nonnegative random variables with finite expectation. The payoff zero for sure is denoted by \(0\). Notice that \(0\in {\cal L}^{+}\). The payoff \(R(S)\) to coalition \(S\) is assumed to be an element of \({\cal L}^{+}\). Therefore \({\cal S}=\{S\subseteq N:R(S)\neq 0\mbox{ and }S\neq\emptyset\}\). An allocation of the payoff \(R(S)\) to the members of \(S\) is the scalar multiplication \(R(S)\cdot{\bf p}\), where \({\bf p}\in\mathbb{R}^{S}\) and player \(i\in S\) receives \(p_{i}\cdot R(S)\). Such an allocation is efficient if and only if \(\sum_{i\in S} p_{i}=1\). For ease of notation, we define \(\Delta (S)=\left\{{\bf p}\in\mathbb{R}^{S}:\sum_{i\in S}p_{i}=1\right\}\). The set \(A=\{p_{i}\cdot R(S):S\in {\cal S},p_{i}\in\mathbb{R}\}\) contains all the payoffs that a player may receive in this game. All nonzero payoffs in \(A\) are denoted by \(A_{-0}=\{p_{i}\cdot R(S)\in A:p_{i}\neq 0\}\). The preference relation \(\succeq_{i}\) of player \(i\) on \(A\) means that if \(X\succeq_{i}Y\), then player \(i\) weakly prefers \(X\) to \(Y\). If player \(i\) is indifferent between them, denoted by \(X\sim_{i}Y\), then \(X\succeq_{i}Y\) and \(Y\succeq_{i}X\). If player \(i\) strictly prefers \(X\) to \(Y\), denoted by \(X\succ_{i}Y\), then \(X\succeq_{i}Y\) and not \(X\sim_{i}Y\). We need some assumptions.

Assumption I. If \(R(T)=0\) for some coalition \(T\), then \(R(S)=0\) for all coalitions \(S\) such that \(S\subseteq T\).

Assumption II. For all \(i\in N\), there exists a surjective, coordinatewise strictly increasing and continuous function $latex f_{i}:\mathbb{R}
\rightarrow\mathbb{R}^{\cal S}$ such that the following conditions are satisfied:

  • \(f_{i}^{S}(t)\cdot R(S)\succeq_{i}f_{i}^{T}(t’)\cdot R(T)\) if and only if \(t\geq t’\), for all \(S,T\in {\cal S}\) and \(t,t’\in\mathbb{R}\);
  • \(f_{i}^{S}(0)=0\) for all \(S\in {\cal S}\).

An example of preference relation can be defined as \(X\succeq_{i}Y\) if and only if \(\mathbb{E}(X)\geq \mathbb{E}(Y)\), where \(\mathbb{E}(X)\) denotes the expectation of \(X\). Then, the corresponding functions \(f_{i}\) are \(f_{i}^{S}(t)=t/\mathbb{E}(R(S))\) for all \(S\in {\cal S}\) (Timmer et al. \cite[p.598]{tim03}).

Let the function \(\alpha_{i}:A\times A_{-0}\rightarrow\mathbb{R}\) pick the unique number in \(\mathbb{R}\) such that \(X\sim_{i}\alpha_{i}(X,Y)\cdot Y\). The number \(\alpha_{i}(X,Y)\) may be thought of as the \(Y\)-equivalent of \(X\) according to player \(i\). If \(X=p_{i}\cdot R(S)\) and \(Y=q_{i}\cdot R(T)\), then
\[\alpha_{i}(X,Y)=f_{i}^{T}((f_{i}^{S})^{-1}(p_{i}))/q_{i}.\]
This implies \(\alpha_{i}(p_{i}\cdot R(S),R(S))=p_{i}\). We restrict the use of \(\alpha_{i}(p_{i}\cdot R(S),q_{i}\cdot R(T))\) to \(i\in S\cap T\), because a player only has to consider payoffs from coalitions of which he/she is member. Further, we define \(\alpha_{i}(0,0)=1\). We do not define \(\alpha_{i}(X,0)\) for \(X\in A_{-0}\), because it can be derived from Assumption II that either \(X\succ_{i}0\) or \(X\prec_{i}0\). Hence, there exists no number \(\alpha_{i}\in\mathbb{R}\) such that \(X\sim_{i}\alpha_{i}\cdot0=0\) (Timmer et al. \cite[p.599]{tim03}).

Let \(\mbox{SG}(N)\) be the class of all games \((N,\{R(S)\}_{S\in {\cal S}},\{\succeq_{i}\}_{i\in N})\) with random payoffs. A solutuon concept for this class is a function \(\boldsymbol{\psi}\) on \(\mbox{SG}(N)\) such that \(\boldsymbol{\psi}(\Gamma )\) is an allocation \(R(N)\cdot{\bf p}\) for the game \(\Gamma\in \mbox{SG}(N)\). We denote by \(\mbox{SGL}(N)\) the subclass of \(\mbox{SG}(N)\) for which all the functions \(f_{i}^{S}\) are linear. Equivalently, \(\mbox{SGL}(N)\) is the class of games for which
\[\alpha_{i}(p_{i}\cdot R(S),q_{i}\cdot R(T))=(p_{i}/q_{i})\cdot\alpha_{i}(R(S),R(T))\]
for all players \(i\in N\) (Timmer et al. \cite[p.599]{tim03}).

Given a game \((N,v)\), we recall the marginal vector and Shapley value in (\ref{ga33}) and (\ref{ga58}), which are
re-stated below:
\[m_{i}(v,\pi )=v\left (P(\pi ,i)\cup\{i\}\right )-v\left (P(\pi ,i)\right )\]
and
\[\boldsymbol{\psi}(v)=\frac{1}{|N|!}\sum_{\pi\in\Pi (N)}{\bf m}(v,\pi ),\]
where \(P(\pi ,i)\) is the set of all predecessors of player \(i\). Now, we consider another presentation of the Shapley value. We denote by \(S_{i}^{\pi}=\{\pi (1),\cdots ,\pi (i)\}\) the first \(i\) players according to the permutation \(\pi\), and we define \(S_{0}^{\pi}=\emptyset\). The marginal vector \({\bf m}^{\pi}(v)\) is a vector in \(\mathbb{R}^{N}\) in which the component \(m_{\pi (i)}^{\pi}\) is the marginal contribution of player \(\pi (i)\) to coalition \(S_{i-1}^{\pi}\) and is given by
\[m_{\pi (i)}^{\pi}(v)=v(S_{i}^{\pi})-v(S_{i-1}^{\pi})=v(S_{i}^{\pi})-\sum_{k=1}^{i-1}m_{\pi (k)}^{\pi}(v)\]
for \(i\in N\). The Shapley value \(\boldsymbol{\phi}(v)\) is equal to the average of the marginal vectors, i.e.,
\[\phi_{i}(v)=\frac{1}{|N|!}\sum_{\pi\in\Pi (N)}m_{i}^{\pi}(v)\]
for all \(i\in N\). Based on the above motivation, we are going to define the marginal contribution for the games with random payoffs. Let \(\pi\in\Pi (N)\) and \(\Gamma =(N,\{R(S)\}_{S\in {\cal S}},\{\succeq_{i}\}_{i\in N})\). We denote by \(Y_{\pi (i)}^{\pi}\) the marginal contribution of player \(\pi (i)\) to coalition \(S_{i-1}^{\pi}\), which is defined by
\[Y_{\pi (i)}^{\pi}=\left [1-\sum_{k=1}^{i-1}\alpha_{\pi (k)}(Y_{\pi (k)}^{\pi},R(S_{i}^{\pi}))\right ]\cdot R(S_{i}^{\pi})\]
for \(i=2,\cdots ,n\). Since player \(\pi (1)\) receives his/her individual payoff, we have \(Y_{\pi (1)}^{\pi}=R(\{\pi (1)\})\). This contribution \(Y_{\pi (i)}^{\pi}\) is the remainder of \(R(S_{i}^{\pi})\) after the players in \(S_{i-1}^{\pi}\) received parts that they find
equivalent to their marginal contributions. Assumption I is necessary to avoid situations, where \(\alpha_{\pi (k)}(Y_{\pi (k)}^{\pi}, R(S_{i}^{\pi}))\) is not defined, i.e., where \(Y_{\pi (k)}^{\pi}\neq 0\) and \(R(S_{i}^{\pi})=0\). The marginal vector \(\bar{\bf m}^{\pi}\)
corresponding to the permutation \(\pi\) is the allocation of \(R(N)\), where player \(i\) receives a scalar multiplication of \(R(N)\) that is
equivalent to \(Y_{i}^{\pi}\) for him/her, i.e.,
\[\bar{m}_{i}^{\pi}(\Gamma )=m_{i}^{\pi}(\Gamma )\cdot R(N)\mbox{ with }m_{i}^{\pi}(\Gamma )=\alpha_{i}(Y_{i}^{\pi},R(N)).\]
In a straightforward way, we define the marginal value $\boldsymbol{\phi}^{m}$ for \(\Gamma\) as the average of the marginal vectors.
\[\phi_{i}^{m}(\Gamma )=\left [\frac{1}{|N|!}\cdot\sum_{\pi\in\Pi (N)}m_{i}^{\pi}(\Gamma )\right ]\cdot R(N).\]
(Timmer et al. \cite[p.599-600]{tim03}).

For the second formulation, given a game \((N,v)\), we consider the dividends \(d_{S}(v)\) defined in a recursive way by
\[d_{S}(v)=\left\{\begin{array}{ll}
v(S) & \mbox{if \(|S|=1\)}\\
\frac{1}{|S|}\cdot(v(S)-\sum_{T\subseteq S}|T|d_{T}(v) & \mbox{if \(|S|>1\)}.
\end{array}\right .\]
Now the Shapley value of \((N,v)\) can be written as
\[\phi_{i}(v)=\sum_{\{S:i\in S\}}d_{S}(v)\]
for all \(i\in N\) (Timmer et al. \cite[p.600]{tim03}).

For a cooperative game with random payoofs \(\Gamma\), we define the dividen \(d_{S}(\Gamma )\) of coalition \(S\) as follows:
\[d_{S}(\Gamma )=\left\{\begin{array}{ll}
R(S) & \mbox{if \(|S|=1\)}\\
\frac{1}{|S|}\cdot\left [1-\sum_{T\subset S,T\neq S}\sum_{j\in T}
\alpha_{j}(d_{T}(\Gamma ),R(S))\right ]\cdot R(S) & \mbox{if \(|S|>1\)}.
\end{array}\right .\]
The dividend of a one-person coalition is equal to its payoff. If \(S\) contains more than one player, then we start with its payoff \(R(S)\). Given a subset \(T\) of \(S\) with \(T\neq S\), we give each player \(j\in T\) the dividend \(d_{T}(\Gamma )\) expressed as a multiple of \(R(S)\). After we have done so for all sets \(T\subset S\) with \(T\neq S\), we divide the remainder of \(R(S)\) by \(|S|\) to obtain the dividend (Timmer et al. \cite[p.600]{tim03}).

The {\bf dividend value} \(\boldsymbol{\phi}^{d}\) of game \(\Gamma\) is defined by
\[\phi_{i}^{d}(\Gamma )=\left [\sum_{\{S:i\in S\}}\alpha_{i}(d_{S}(\Gamma ),R(N))\right ]\cdot R(N)\]
for all \(i\in N\). Player \(i\) receives the dividends expressed in multiples of \(R(N)\) for all coalitions to which he/she belongs. The example below shows that the marginal value and the dividen value of a game \(\Gamma\) need not be equal even if \(\Gamma\in \mbox{SGL}(N)\).

Example. We consider the game \(\Gamma =(N,\{R(S)\}_{S\in {\cal S}},\{\succeq_{i}\}_{i\in N})\), where \(N=\{1,2,3\}\) and the payoffs are \(R(\{1\})=R(\{2\})=0\), \(R(\{3\})=1\), \(R(\{1,2\})=2\), \(R(\{1,3\})=3\), \(R(\{2,3\})=1\) and \(R(N)\) is uniformly distributed over the interval \([3,7]\). We see that \({\cal S}=\{\{3\},\{1,2\},\{1,3\},\{2,3\},N\}\) and \(A=\{p\cdot R(S):S\in {\cal S},p\in\mathbb{R}\}\) by definition. Let \(\beta_{1}=\beta_{3}=1/2\), \(\beta_{2}=1/4\) and let \(u_{\beta_{i}}^{X}=\sup\{t\in\mathbb{R}:\mathbb{P}(X\leq t)\leq\beta_{i}\}\) be the \(\beta_{i}\)-quantile of the random variable \(X\). The utility functions \(U_{i}\) are defined by \(U_{i}(X)=u_{\beta_{i}}^{X}\) if \(X\geq 0\) and \(U_{i}(X)=u_{1-\beta_{i}}^{X}\) otherwise. The preferences of player \(i\) are such that \(X\succeq_{i}Y\) if and only if \(U_{i}(X)\geq U_{i}(Y)\). Then
\[\alpha_{i}(p\cdot R(S),q\cdot R(T))=\frac{p\cdot u_{\beta_{i}}^{R(S)}}{q\cdot u_{\beta_{i}}^{R(T)}}\]
corresponds to these preferences. For this game \(\boldsymbol{\phi}^{m}=(2/5,13/60,23/60)R(N)\) and \(\boldsymbol{\phi}^{d}=(23/60,7/30,23/60)R(N)\). The two values are unequal though very close. \(\sharp\)

For the third formulation, the dividend of \(S\) is defined by \(D_{S}(v)=|S|d_{S}(v)\). The function \(\gamma :2^{N}\setminus\{\emptyset\}\rightarrow N\) with \(\gamma (S)\in S\) for all coalitions \(S\) is called a selector function. The family of selector functions for games with player set \(N\) is denoted by \(\Lambda (N)\) and \(|\Lambda (N)|=\prod_{k=2}^{n}k^{a_{k}}\), where
$a_{k}=\left (\begin{array}{l} n\\ k\end{array}\right )$. The selector vector \({\bf m}^{\gamma}(v)\) corresponding to \(\gamma\) is
defined by
\[m_{i}^{\gamma}(v)=\sum_{\{S:\gamma (S)=i\}}D_{S}(v)\]
for all \(i\in N\). Then the Shapley value of \((N,v)\) is given by
\[\phi_{i}(v)=\frac{1}{|\Lambda (N)|}\cdot\sum_{\gamma\in\Lambda (N)}m_{i}^{\gamma}(v).\]

For a cooperative game with random payoffs \(\Gamma\), we define the dividend of coalition \(S\), \(D_{S}(v)\), by
\[D_{S}(\Gamma )=\left\{\begin{array}{ll}
R(S) & \mbox{if \(|S|=1\)}\\
\left [1-\sum_{T\subset S,T\neq S}\sum_{j\in T}\alpha_{j}(D_{T}(\Gamma )/|T|,
R(S))\right ]\cdot R(S) & \mbox{if \(|S|>1\)}.
\end{array}\right .\]
The dividend \(D_{S}(\Gamma )\) of a one-person coalition is equal to its dividend, namely \(R(S)\). For coalitions \(S\) which has more than one player, we take a subset \(T\) of \(S\) with \(T\neq S\). The dividend \(D_{T}(\Gamma )\) is divided equally among the players in \(T\). Player \(j\in T\) receives the amount \(\alpha_{j}(D_{T}(\Gamma )/|T|,R(S))\cdot R(S)\), which is equivalent for him/her to \(D_{T}(\Gamma )/|T|\). The dividend of coalition \(S\) is all that remains of \(R(S)\) after the dividends of the subcoalitions \(T\) have been divided.

Proposition. \(D_{S}(\Gamma )=|S|d_{S}(\Gamma )\) for all games \(\Gamma =(N,\{R(S)\}_{S\in {\cal S}},\{\succeq_{i}\}_{i\in N})\) and any coalition \(S\). \(\sharp\)

The selector vector \(\bar{\bf m}^{\gamma}(\Gamma )\) is defined by \(\bar{m}_{i}^{\gamma}(\Gamma )=m_{i}^{\gamma}(\Gamma )\cdot R(N)\) for \(i\in N\) and \(\gamma\in\Lambda (N)\), where
\[m_{i}^{\gamma}(\Gamma )=\sum_{\{S:i=\gamma (S)\}}\alpha_{i}(D_{S}(\Gamma ),R(N)).\]
The selector value \(\boldsymbol{\phi}^{s}\) is the average of these selector vectors,
\[\phi_{i}^{s}(\Gamma )=\left [\frac{1}{|\Lambda (N)|}\sum_{\{\gamma\in\Lambda (N)}m_{i}^{\gamma}(\Gamma )\right ]R(N)\]
for all \(i\in N\).

Remark. All three values introduced above reduce to the standard Shapley value for the degenerate case in which the game is deterministic. \(\sharp\)

Let \(\boldsymbol{\psi}\) be a solution on \(\mbox{SG}(N)\). We introduce the following axioms.

  • Axiom TBT1: (Efficiency). For all \(\Gamma\in \mbox{SG}(N)\), \(\boldsymbol{\psi}(\Gamma )=R(N)\cdot{\bf p}\) for some \({\bf p}\in \Delta (N)\), i.e., the whole of \(R(N)\) is distributed among the players;
  • Axiom TBT2: (Symmetry). For all \(\Gamma\in \mbox{SG}(N)\) and all \(i,j\in N\) such that \(\alpha_{i}=\alpha_{j}\) and \(R(S\cup\{i\})=R(S\cup\{j\})\) for all \(S\subseteq N\setminus\{i,j\}\), we have \(\psi_{i}(\Gamma )=\psi_{j}(\Gamma )\), i.e., the symmetric players receive the same;
  • Axiom TBT3: (Null Player Property). For all \(\Gamma\in \mbox{SG}(N)\) and all \(i\in N\) such that \(R(i)=0\) and \(R(S)=R(S\setminus\{i\})\) for all coalitions \(S\neq\{i\}\), we have \(\psi_{i}(\Gamma )=0\).

Proposition. The marginal value \(\boldsymbol{\phi}^{m}\) and the dividend value \(\boldsymbol{\phi}^{d}\) satisfy the Axioms TBT1-TBT3. The selector value \(\boldsymbol{\phi}^{s}\) satisfies Axioms TBT2 and TBT3.$\sharp$

We introduce another axiom.

  • Axiom TBT4: (Strong Monotonicity). If, for all \(i\in N\) and all games \(\Gamma ,\Gamma^{\prime}\in \mbox{SG}(N)\) such that \(\bar{m}_{i}^{\pi}(\Gamma )\succeq_{i}\bar{m}_{i}^{\pi}(\Gamma^{prime})\) for all \(\pi\in\Pi (N)\), we have \(\psi_{i}(\Gamma )\succeq_{i}\psi_{i}(\Gamma^{\prime})\).

Hence, if a player prefers his/her marginal contributions in the game \(\Gamma\) to those in the game \(\Gamma^{\prime}\) that he/she should also prefer his/her share in the solution of the game \(\Gamma\) to that of the game \(\Gamma^{\prime}\).

Proposition. The marginal value \(\boldsymbol{\phi}^{m}\) satisfes Axiom TBT4 on the class of games \(\mbox{SGL}(N)\). \(\sharp\)

Theorem. The selector value and the dividend value coincide on \(\mbox{SGL}(N)\). \(\sharp\)

We denote by \(SGLI^{N}\) those games in \(\mbox{SGL}(N)\), where all the players have identical preferences \(\alpha_{i}=\alpha_{j}\) for all \(i,j\in N\).

Theorem. For all \(\Gamma\in SGLI^{N}\), we have $latex \boldsymbol{\phi}^{m}=\boldsymbol{\phi}^{d}=
\boldsymbol{\phi}^{s}$. \(\sharp\)

Theorem. The marginal value \(\boldsymbol{\phi}^{m}\) is the unique solution on \(SGLI^{N}\) that satisfies Axioms TBT1, TBT2, TBT4. \(\sharp\)

We will now turn our attention to games with random payoffs that need not have linear functions \(f_{i}\) for \(i\in N\).

Proposition. If \(\Gamma\) is a two-person game, then \(\boldsymbol{\phi}^{m}=\boldsymbol{\phi}^{d}=\boldsymbol{\phi}^{s}\). \(\sharp\)

An example shows that the three solutions need not coincide for three-person games.

 

Hsien-Chung Wu
Hsien-Chung Wu
文章: 199

發佈留言

發佈留言必須填寫的電子郵件地址不會公開。 必填欄位標示為 *