BZOJ2318. Spoj4060 game with probability Problem (probability DP)

Problem Description
Alice and Bob play the following game. First, they collect N small stones and put them together in one pile. After that, they throw a coin one by one. Alice starts first. If a player throws heads then he takes exactly one stone from the pile. In case of tails he doesn't do anything. The one who takes the last stone wins. For each player, his skill of throwing a coin is known (to everyone, including himself and his opponent). More precisely, if Alice wants to throw some specific side of the coin, she always succeeds with probability P. The same probability for Bob is Q. You are to find the probability that Alice will win the game if both players play optimally.

Input Format
Input starts with a line containing one integer T - a number of test cases (1 <= T <= 50). Then T test cases follow. Each of them is one line with three numbers N, P, and Q separated with a space (1 <= N <= 99999999, 0.5 <= P, Q <= 0.99999999). P and Q have not more than 8 digits after decimal point.

Output Format
For each test case output one line with a probability that Alice will win the game. Your answer must be precise up to 10^-6.

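Problem Summary
Alice and Bob are playing a game.

There are n stones. Alice and Bob take turns tossing a coin: on heads, the player removes one stone from the pile; otherwise nothing happens. Whoever takes the last stone wins. When Alice tosses, she gets the side she wants with probability p; Bob gets the side he wants with probability q.

Alice tosses first. Assuming both want to win the game, what is the probability that Alice wins?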

Sample Input and Output
Input #1
1
1 0.5 0.5
Output #1
0.666666667

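Approach:
Let $f[i]$ be A's win probability when $i$ stones remain and A moves first.
Let $g[i]$ be A's win probability when $i$ stones remain and A moves second (that is, it is B's turn).

Then $f[0] = 0$ and $g[0] = 1$: with no stones left, whoever is about to move has just watched the opponent take the last stone.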

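This gives the recurrences (by convention, probabilities are computed forward and expectations backward):
$f[i] = p \cdot g[i-1] + (1-p) \cdot g[i]$
$g[i] = q \cdot f[i-1] + (1-q) \cdot f[i]$
(Why does a transition out of the second-mover state land in a first-mover state? I could not pin down the exact process at first. My reading: treat the first mover as A and the second mover as B; the game is just the two of them alternating tosses, so after B's toss it is A's turn again.)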

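Equations in probability (expectation) DP tend to share this trait: they are self-referential, each one containing the other. To get a recurrence that can actually be evaluated, they have to be rearranged.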

If both the first mover and the second mover want to take a stone:
$f[i] = \dfrac{p \cdot g[i-1] + (1-p) \cdot q \cdot f[i-1]}{1 - (1-p)(1-q)}$
$g[i] = \dfrac{q \cdot f[i-1] + (1-q) \cdot p \cdot g[i-1]}{1 - (1-p)(1-q)}$
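
To spell out the rearrangement, here is a short derivation under the same "both want to take" assumption: substitute the expression for $g[i]$ into the one for $f[i]$ and collect the $f[i]$ terms.

```latex
\begin{aligned}
f[i] &= p\,g[i-1] + (1-p)\,g[i] \\
     &= p\,g[i-1] + (1-p)\bigl(q\,f[i-1] + (1-q)\,f[i]\bigr) \\
\bigl(1-(1-p)(1-q)\bigr)\,f[i] &= p\,g[i-1] + (1-p)\,q\,f[i-1] \\
f[i] &= \frac{p\,g[i-1] + (1-p)\,q\,f[i-1]}{1-(1-p)(1-q)}
\end{aligned}
```

The formula for $g[i]$ follows the same way, with the roles of $p$ and $q$, and of $f$ and $g$, swapped.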

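If $f[i-1] > g[i-1]$, neither the first mover nor the second mover wants to take a stone, so each aims for tails and $p$ and $q$ are replaced by $1 - p$ and $1 - q$ in the formulas;
otherwise both want to take.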

For this formula, the brute-force approach is to rearrange it and look at the result.

···
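To build intuition (I admittedly have not fully understood it), think of the first mover as A and the second mover as B.

Take i = 1, where g[0] = 1 > f[0] = 0.
The first mover certainly wants to take: taking leads to g[0], which is a win. The second mover certainly wants to take as well; otherwise the stone is simply left for the first mover to win.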
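
As a quick sanity check of the i = 1 case against the sample, here is the closed form with f[0] = 0 and g[0] = 1 plugged in:

```latex
f[1] = \frac{p \cdot g[0] + (1-p)\,q \cdot f[0]}{1-(1-p)(1-q)}
     = \frac{p}{1-(1-p)(1-q)},
\qquad
p = q = 0.5 \;\Rightarrow\; f[1] = \frac{0.5}{0.75} = \frac{2}{3} \approx 0.666667
```

This agrees with the sample output for the input `1 0.5 0.5`.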

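When f[i - 1] > g[i - 1]:
If A takes, the state becomes g[i - 1]. A would rather hold back, let B take first, and be left with the better state f[i - 1].
If B takes, then whatever that means from B's side (what the state becomes after the second mover takes is honestly hard to think through), A ends up at f[i - 1]. That is exactly what A was scheming for. And if A wins, B loses. So B does not want to take either.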

The case f[i - 1] < g[i - 1] is handled the same way.

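The takeaway: what the state becomes after the first mover takes is easy to reason about, because it is a direct move that can simply be simulated. What it becomes after the second mover takes is harder to reason about (the first mover's outcome affects it, and I find it awkward to simulate). But the second mover has one guiding consideration: do not let the first mover win. With that, the second mover's choice can be derived from the first mover's.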

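The part that feels like black magic is n = min(1000, n). For large n the answer has practically stopped changing (when p = q it settles at 0.5), so iterating beyond 1000 stones is unnecessary at the required precision of 10^-6.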
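
The cap can be checked empirically. The sketch below is an assumption-laden illustration rather than part of the original solution: the names `winProbability` and `capIsSafe` are hypothetical, rolling variables replace the arrays, and it only tests whatever inputs it is given. The reason it works out is that subtracting the two closed forms gives f[i] - g[i] = -(ab/D)(f[i-1] - g[i-1]), where a and b are the effective take probabilities and D = 1 - (1-a)(1-b), so the gap between f and g shrinks geometrically.

```cpp
#include <cmath>

// A's win probability with n stones, computed without any cap on n.
// `winProbability` is a hypothetical name used only for this check.
double winProbability(int n, double p, double q)
{
    double f = 0.0, g = 1.0; // f[0], g[0]
    for (int i = 1; i <= n; i++)
    {
        // a, b: probability that A / B actually removes a stone on one toss.
        double a = p, b = q;
        if (f > g) { a = 1.0 - p; b = 1.0 - q; }
        double d = 1.0 - (1.0 - a) * (1.0 - b);
        double nf = (a * g + (1.0 - a) * b * f) / d;
        double ng = (b * f + (1.0 - b) * a * g) / d;
        f = nf;
        g = ng;
    }
    return f;
}

// True if stopping at `cap` stones changes the answer for n by less than eps.
bool capIsSafe(int n, int cap, double p, double q, double eps)
{
    return std::fabs(winProbability(n, p, q) - winProbability(cap, p, q)) < eps;
}
```

For example, `capIsSafe(5000, 1000, 0.9, 0.6, 1e-9)` compares the answer at 5000 stones with the answer at the cap of 1000.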

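#include <cstdio>
#include <algorithm>

using namespace std;

// f[i]: A's win probability with i stones left and A to toss.
// g[i]: A's win probability with i stones left and B to toss.
double f[1005], g[1005];

int main()
{
    int T;
    scanf("%d", &T);
    while (T--)
    {
        int n;
        double p, q;
        scanf("%d %lf %lf", &n, &p, &q);
        n = min(n, 1000); // cap n: for large n the answer no longer changes noticeably

        f[0] = 0.0; g[0] = 1.0;
        for (int i = 1; i <= n; i++)
        {
            // a, b: probability that A / B actually takes a stone on one toss.
            // Local copies keep p and q untouched instead of flipping them back.
            double a = p, b = q;
            if (f[i - 1] > g[i - 1])
            {
                // Neither player wants to take, so each aims for tails.
                a = 1.0 - p; b = 1.0 - q;
            }
            double d = 1.0 - (1.0 - a) * (1.0 - b);
            f[i] = (a * g[i - 1] + (1.0 - a) * b * f[i - 1]) / d;
            g[i] = (b * f[i - 1] + (1.0 - b) * a * g[i - 1]) / d;
        }
        printf("%.6f\n", f[n]);
    }
    return 0;
}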


Reposted from blog.csdn.net/tomjobs/article/details/104346258