Statistical Machine Learning: Bernoulli and Binomial Distributions

Moment generating function:
$$M_x(t)=E(e^{tx})=\begin{cases}\displaystyle\sum_x e^{tx}f(x) & \text{(discrete)}\\[2ex] \displaystyle\int e^{tx}f(x)\,dx & \text{(continuous)}\end{cases}$$

Bernoulli trial:

Model: toss a coin; it lands heads with probability $p$ and tails with probability $q=1-p$.
$$x\in\{1,0\}$$
(Also called the 0-1 distribution or the two-point distribution.)
Probability mass function:
$$Bern(x\mid p)=p^x(1-p)^{1-x}$$
Expectation and variance:
$$E(X)=p\ ;\ Var(X)=pq=p(1-p)$$
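
Both moments follow from applying the definition of expectation to the two outcomes. A short derivation, using only the symbols already defined:

$$E(X)=1\cdot p+0\cdot q=p$$
$$E(X^2)=1^2\cdot p+0^2\cdot q=p$$
$$Var(X)=E(X^2)-[E(X)]^2=p-p^2=p(1-p)=pq$$
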
Likelihood function (for $n$ independent observations $x_1,\dots,x_n$):
$$p(x_1,x_2,\dots,x_n\mid p)=\prod_{i=1}^n p^{x_i}(1-p)^{1-x_i}$$
$$\log p(x_1,x_2,\dots,x_n\mid p)=\sum_{i=1}^n \log\big(p^{x_i}(1-p)^{1-x_i}\big)$$
$$=\sum_{i=1}^n x_i\log p+\sum_{i=1}^n (1-x_i)\log(1-p)$$
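
Setting the derivative of this log-likelihood with respect to $p$ to zero gives the maximum-likelihood estimate $\hat p=\frac{1}{n}\sum_i x_i$, the sample mean. The sketch below checks that numerically by comparing the sample mean with a brute-force maximiser on a grid. The function name `bernoulli_log_likelihood`, the seed, the true parameter 0.3 and the grid resolution are illustrative assumptions, not part of the original post.

```python
import numpy as np


def bernoulli_log_likelihood(p, xs):
    """Log-likelihood sum_i [x_i log p + (1 - x_i) log(1 - p)], for 0 < p < 1."""
    xs = np.asarray(xs)
    return np.sum(xs * np.log(p) + (1 - xs) * np.log(1 - p))


rng = np.random.default_rng(0)        # fixed seed (assumption) for repeatability
xs = rng.binomial(1, 0.3, size=1000)  # 1000 Bernoulli draws with assumed p = 0.3

# Closed-form MLE: the sample mean.
p_hat = xs.mean()

# Brute-force check: evaluate the log-likelihood on a grid of candidate values.
grid = np.linspace(0.001, 0.999, 999)
p_grid = grid[np.argmax([bernoulli_log_likelihood(p, xs) for p in grid])]

print(f"sample mean = {p_hat:.3f}, grid maximiser = {p_grid:.3f}")
```

The two printed values should agree up to the grid spacing of 0.001.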

Binomial distribution:
Model: $x$ successes in $n$ independent Bernoulli trials.
Probability mass function:
$$Bin(x\mid n,p)=C_n^x p^x(1-p)^{n-x}=C_n^x p^x q^{n-x}$$
By the binomial theorem, the probabilities sum to 1:
$$\sum_{x=0}^n C_n^x p^x q^{n-x}=\sum_{x=0}^n \binom{n}{x}p^x q^{n-x}=(p+q)^n=1$$
Note: $C_n^x=\binom{n}{x}$.
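
The normalisation can also be checked by direct summation. The sketch below does this with only the standard library, and also computes the mean and variance from the probabilities so they can be compared with the closed forms $np$ and $npq$ stated for this distribution. The helper name `binom_pmf` and the values $n=10$, $p=0.3$ are assumptions chosen for illustration.

```python
from math import comb


def binom_pmf(x, n, p):
    """Bin(x | n, p) = C(n, x) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.3  # illustrative values (assumption)
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                           # expect (p + q)**n = 1
mean = sum(x * f for x, f in enumerate(pmf))               # expect n * p
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))  # expect n * p * (1 - p)

print(total, mean, var)
```
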
Moment generating function of the binomial distribution:
$$M_x(t)=\sum_{x=0}^n e^{tx}\binom{n}{x}p^x q^{n-x}=\sum_{x=0}^n\binom{n}{x}(e^t p)^x q^{n-x}=(pe^t+q)^n$$
Expectation and variance:
$$E(X)=np\ ;\ Var(X)=npq=np(1-p)$$
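
Both moments can be read off the moment generating function, using the standard fact that $E(X^k)=M_x^{(k)}(0)$ together with $p+q=1$. A sketch of the derivation:

$$M_x'(t)=npe^t(pe^t+q)^{n-1}\ \Rightarrow\ E(X)=M_x'(0)=np$$
$$M_x''(t)=npe^t(pe^t+q)^{n-1}+n(n-1)p^2e^{2t}(pe^t+q)^{n-2}\ \Rightarrow\ E(X^2)=M_x''(0)=np+n(n-1)p^2$$
$$Var(X)=E(X^2)-[E(X)]^2=np+n(n-1)p^2-n^2p^2=np-np^2=np(1-p)=npq$$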


Verifying a property of the binomial distribution visually with matplotlib:
$$x\sim Bin(n_1,p),\quad y\sim Bin(n_2,p)$$
If $x$ and $y$ are independent, then $x+y\sim Bin(n_1+n_2,p)$.

import numpy as np
import matplotlib.pyplot as plt

# n1 = 10, n2 = 100, n3 = n1 + n2 = 110, all with p = 0.5
sample1 = np.random.binomial(10, 0.5, size=1000)
sample2 = np.random.binomial(100, 0.5, size=1000)
sample3 = np.random.binomial(110, 0.5, size=1000)

pillar = 100  # number of unit-width bins covering [0, 100]

plt.xlim((0, 100))
plt.ylim((0, 0.27))

# For each sample: draw a normalized histogram, then trace the bar heights
# at the left bin edges so the shapes are easier to compare.
series = [
    (sample1, "n1", "blue"),
    (sample2, "n2", "orange"),
    (sample3, "n3", "g"),
    (sample1 + sample2, "n1+n2", "r"),  # should resemble the n3 series
]
for sample, label, color in series:
    heights, edges, _ = plt.hist(sample, rwidth=0.9, alpha=0.6, density=True,
                                 label=label, bins=pillar, range=[0, pillar])
    plt.plot(edges[0:pillar], heights, color=color)

plt.legend()
plt.show()
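
The histograms compare the two distributions only up to sampling noise. As a complementary check, the PMF of a sum of two independent variables is the discrete convolution of their PMFs, so the property can also be tested exactly, without sampling. A minimal sketch follows; `binom_pmf_vector` is a helper name introduced here for illustration, and the parameters are the ones used in the simulation above.

```python
from math import comb

import numpy as np


def binom_pmf_vector(n, p):
    """Vector of Bin(x | n, p) for x = 0..n."""
    return np.array([comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)])


n1, n2, p = 10, 100, 0.5  # same parameters as the simulation above

# PMF of x + y for independent x and y: convolution of the two PMFs.
pmf_sum = np.convolve(binom_pmf_vector(n1, p), binom_pmf_vector(n2, p))
pmf_n3 = binom_pmf_vector(n1 + n2, p)

max_abs_diff = np.max(np.abs(pmf_sum - pmf_n3))
print(f"max |difference| = {max_abs_diff:.2e}")
```

A difference at the level of floating-point rounding is consistent with $x+y\sim Bin(n_1+n_2,p)$.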



pyecharts:

import collections

import numpy as np
from pyecharts import options as opts
from pyecharts.charts import Line
from pyecharts.globals import ThemeType
from pyecharts.render import make_snapshot
from snapshot_selenium import snapshot

sample1 = np.random.binomial(10, 0.5, size=1000)
sample2 = np.random.binomial(100, 0.5, size=1000)
sample3 = np.random.binomial(110, 0.5, size=1000)

# Shared x axis covering every possible outcome (0..110), so that all four
# series are plotted against the same categories.
xs = list(range(111))


def relative_freq(sample):
    """Relative frequency of each value in xs (0.0 where a value never occurs)."""
    counts = collections.Counter(sample.tolist())
    return [counts[x] / len(sample) for x in xs]


line = (
    Line(init_opts=opts.InitOpts(theme=ThemeType.LIGHT))
        .add_xaxis([str(x) for x in xs])
        .add_yaxis("n1+n2", relative_freq(sample1 + sample2), is_smooth=True)
        .add_yaxis("n1", relative_freq(sample1), is_smooth=True)
        .add_yaxis("n2", relative_freq(sample2), is_smooth=True)
        .add_yaxis("n3", relative_freq(sample3), is_smooth=True)
        .set_global_opts(legend_opts=opts.LegendOpts(orient="vertical", pos_right='10%'),
                         title_opts=opts.TitleOpts(title="Bernoulli", pos_left="center"))
        .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
)
# Render the chart to HTML, then snapshot it to a PNG via selenium.
make_snapshot(snapshot, line.render(), "Bernoulli.png")



Reposted from blog.csdn.net/qq_43613793/article/details/104750824