PMM.ipynb 7.48 KB
Newer Older
20200318029 committed
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 泊松混合模型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 泊松分布\n",
    "\n",
    "泊松分布(Poisson distribution)的概率分布函数为:\n",
    "\n",
    "$$P(X = k) = \\frac{\\lambda^{k}}{k !} e^{- \\lambda}, k = 0, 1, \\cdots \\tag {1}$$\n",
    "\n",
    "泊松分布的参数$\\lambda$表示单位时间(单位面积)内随机事件的平均发生次数。泊松分布适合于描述单位时间内随机事件发生的次数。\n",
    "\n",
    "泊松分布的期望$\\lambda$:,方差:$\\lambda$,特征函数:$\\phi(t) = \\exp(\\lambda (e^{i t} - 1))$\n",
    "\n",
    "### 泊松分布与二项分布\n",
    "\n",
    "当二项分布的$n$很大而$p$很小时,二项分布可近似为泊松分布,其中$\\lambda = np$。通常当$n \\geq 20, p \\leq 0.05$时,就可以用泊松分布近似。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 泊松混合模型\n",
    "\n",
    "泊松混合模型(Poisson mixture model)的概率密度函数为:\n",
    "\n",
    "$$p(y; \\theta) = \\sum_{k}^{K} \\alpha_{k} \\frac{\\lambda^{k}}{k !} e^{- \\lambda} \\tag {2}$$\n",
    "\n",
    "其中,$\\alpha_{k} \\geq 0$,$\\sum_{k}^{K} \\alpha_{k} = 1$。\n",
    "\n",
    "给定由$K$个Poisson分布生成的$N$条独立数据构成数据集$D = {y_{j} | j = 1, 2, \\cdots, N}$,估计泊松混合模型的参数$\\theta_{k} = (\\alpha_{k}, \\lambda_{k} | k = 1, 2, \\cdots, K)$。\n",
    "\n",
    "1. 隐变量\n",
    "\n",
    "观测数据$y_{j}$已知,反映观测数据$y_{j}$来自第$k$个分模型的数据未知,以隐变量$\\gamma_{k}$表示:\n",
    "\n",
    "$$\\gamma_{jk} = \\begin{cases}\n",
    "1, & \\text{第}j\\text{个观测来自第}k\\text{个分模型} \\\\\n",
    "0, & 否则\n",
    "\\end{cases}\n",
    "j = 1, 2, \\cdots, N; k = 1, 2, \\cdots, K \\tag {3}$$\n",
    "\n",
    "未观测数据$\\gamma_{jk}$为`0-1`随机变量,则完全数据为\n",
    "\n",
    "$$(y_{j}, \\gamma_{j1}, \\cdots, \\gamma_{jK}), j = 1, 2, \\cdots, N$$\n",
    "\n",
    "完全数据的似然函数:\n",
    "\n",
    "$$\\begin{aligned}\n",
    "P(y, \\gamma; \\theta) & = \\prod_{j = 1}^{N} P(y_{j}, \\gamma_{j1}, \\cdots, \\gamma_{jK}; \\theta) \\\\\n",
    "& = \\prod_{j = 1}^{N} \\prod_{k = 1}^{K} [\\alpha_{k} P(y_{j}; \\theta_{k})]^{\\gamma_{jk}} \\\\\n",
    "& = \\prod_{k = 1}^{K} \\alpha_{k}^{n_{k}} \\prod_{j = 1}^{N} [P(y_{j}; \\theta_{k})]^{\\gamma_{jk}} \\\\\n",
    "& = \\prod_{k = 1}^{K} \\alpha_{k}^{n_{k}} \\prod_{j = 1}^{N} \\left[\n",
    "    \\frac{\\lambda_{k}^{y_{j}}}{y_{j} !} e^{- \\lambda_{k}}\n",
    "\\right]^{\\gamma_{jk}}\n",
    "\\end{aligned}$$\n",
    "\n",
    "其中,$n_{k} = \\sum_{j = 1}^{N} \\gamma_{jk}$,$N = \\sum_{k = 1}^{K} n_{k}$。完全数据的对数似然函数为\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\log P(y, \\gamma; \\theta) = \\sum_{k = 1}^{K} \\left( n_{k} \\log \\alpha_{k} + \\sum_{j = 1}^{N} \\gamma_{jk} \\left[\n",
    "    y_{j} \\log \\lambda_{k} - \\log y_{j}! - \\lambda_{k}\n",
    "\\right] \\right) \\\\\n",
    "\\end{aligned} \\tag {4}$$\n",
    "\n",
    "2. E步,$Q$函数\n",
    "\n",
    "$$\\begin{aligned}\n",
    "Q(\\theta, \\theta^{(i)})\n",
    "& = \\text{E}_{\\gamma} \\left[ \\log P(y, \\gamma; \\theta) | y; \\theta^{(i)} \\right] \\\\\n",
    "& = \\sum_{k = 1}^{K} \\left(\n",
    "    \\sum_{j = 1}^{N} \\text{E}[\\gamma_{jk} | y_{j}; \\theta^{(i)}] \\log \\alpha_{k} + \\sum_{j = 1}^{N} \\text{E}[\\gamma_{jk} | y_{j}; \\theta^{(i)}] \\left[\n",
    "    y_{j} \\log \\lambda_{k} - \\log y_{j}! - \\lambda_{k}\n",
    "\\right] \\right) \\\\\n",
    "\\end{aligned}$$\n",
    "\n",
    "令$\\hat{\\gamma}_{jk} = \\text{E}[\\gamma_{jk} | y_{j}; \\theta^{(i)}]$,表示当前模型参数下第$j$个观测数据来自第$k$个分模型的概率,称为分模型$k$对观测数据$y_{j}$的响应度。\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\hat{\\gamma}_{jk} & = \\text{E}[\\gamma_{jk} | y_{j}; \\theta^{(i)}] \\\\\n",
    "& = \\frac{\n",
    "    P(y_{j} | \\gamma_{jk} = 1; \\theta^{(i)}) P(\\gamma_{jk} = 1; \\theta^{(i)})\n",
    "}{\n",
    "    \\sum_{k = 1}^{K} P(y_{j} | \\gamma_{jk} = 1; \\theta^{(i)}) P(\\gamma_{jk} = 1; \\theta^{(i)})\n",
    "} \\\\\n",
    "& = \\frac{\n",
20200318029 committed
97
    "    \\alpha_{k}^{(i)} \\frac{\\left( \\lambda_{k}^{(i)} \\right)^{y_{j}}}{y_{j} !} e^{- \\lambda_{k}^{(i)}}\n",
20200318029 committed
98
    "}{\n",
20200318029 committed
99
    "    \\sum_{k = 1}^{K} \\alpha_{k}^{(i)} \\frac{ \\left( \\lambda_{k}^{(i)} \\right)^{y_{j}}}{y_{j} !} e^{- \\lambda_{k}^{(i)}}\n",
20200318029 committed
100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138
    "} \\\\\n",
    "\\end{aligned} \\tag {5}$$\n",
    "\n",
    "则$n_{k} = \\sum_{j = 1}^{N} E[\\hat{\\gamma}_{jk}]$\n",
    "\n",
    "$$\\begin{aligned}\n",
    "Q(\\theta, \\theta^{(i)}) = \\sum_{k = 1}^{K} \\left(\n",
    "    \\sum_{j = 1}^{N} \\hat{\\gamma}_{jk} \\log \\alpha_{k} + \\sum_{j = 1}^{N} \\hat{\\gamma}_{jk} \\left[\n",
    "    y_{j} \\log \\lambda_{k} - \\log y_{j}! - \\lambda_{k}\n",
    "\\right] \\right) \\\\\n",
    "\\end{aligned} \\tag {6}$$\n",
    "\n",
    "3. M步\n",
    "\n",
    "求使函数$Q(\\theta, \\theta^{(i)})$取极大值时的参数$\\theta^{(i + 1)} = (\\hat{\\lambda}_{k}, \\hat{\\alpha}_{k} | k = 1, 2, \\cdots, K)$:\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\theta^{(i + 1)} & = \\argmax_{\\theta} Q(\\theta, \\theta^{(i)}) \\\\\n",
    "& \\text{s.t.} \\sum_{k = 1}^{K} \\alpha_{k} = 1, \\ \\alpha_{k} \\geq 1\n",
    "\\end{aligned}$$\n",
    "\n",
    "通过拉格朗日乘数法可得:\n",
    "\n",
    "$$\\hat{\\lambda}_{k} = \\frac{\\sum_{j = 1}^{N} \\hat{\\gamma}_{jk} y_{j}}{\\sum_{j = 1}^{N} \\hat{\\gamma}_{jk}} \\tag {7}$$\n",
    "\n",
    "$$\\hat{\\alpha}_{k} = \\frac{n_k}{N} = \\frac{\\sum_{j = 1}^{N} \\hat{\\gamma}_{jk}}{N} \\tag {8}$$\n",
    "\n",
    "### 泊松混合模型EM参数估计\n",
    "\n",
    "输入:观测数据$y_{1}, y_{2}, \\cdots,y_{N}$,泊松混合模型;\n",
    "\n",
    "输出:泊松混合模型参数。\n",
    "\n",
    "(1)参数初始化\n",
    "\n",
    "(2)E步:依据当前模型参数,计算分模型$k$对观测数据$y_{j}$的响应度\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\hat{\\gamma}_{jk} = \\frac{\n",
20200318029 committed
139
    "    \\alpha_{k}^{(i)} \\frac{\\left( \\lambda_{k}^{(i)} \\right)^{y_{j}}}{y_{j} !} e^{- \\lambda_{k}^{(i)}}\n",
20200318029 committed
140
    "}{\n",
20200318029 committed
141
    "    \\sum_{k = 1}^{K} \\alpha_{k}^{(i)} \\frac{ \\left( \\lambda_{k}^{(i)} \\right)^{y_{j}}}{y_{j} !} e^{- \\lambda_{k}^{(i)}}\n",
20200318029 committed
142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178
    "} \\\\\n",
    "\\end{aligned}$$\n",
    "\n",
    "(3)M步:计算新一轮迭代的模型参数\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\hat{\\lambda}_{k} & = \\frac{\\sum_{j = 1}^{N} \\hat{\\gamma}_{jk} y_{j}}{\\sum_{j = 1}^{N} \\hat{\\gamma}_{jk}} \\\\\n",
    "\\hat{\\alpha}_{k} & = \\frac{n_k}{N} = \\frac{\\sum_{j = 1}^{N} \\hat{\\gamma}_{jk}}{N} \\\\\n",
    "k & = 1, 2, \\cdots, K\n",
    "\\end{aligned}$$\n",
    "\n",
    "(4)重复(2)、(3),直至收敛(对数似然函数值不再有明显的变化)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}