ml2_MiniAssignments · Commit dd74ebf8 · authored Mar 31, 2020 by 20200318029 · 20200331
Showing 1 changed file with 541 additions and 0 deletions
homework1/MiniHomework#1.ipynb · 0 → 100644 (+541, -0)
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optimization in Machine Learning"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 1 (30 points)\n",
"Suppose we have training data $D=\\{(\\mathbf{x}_1,y_1),...,(\\mathbf{x}_n,y_n)\\}$, where each $(\\mathbf{x}_i,y_i)$ is a sample, $\\mathbf{x}_i$ is the feature vector with $\\mathbf{x}_i\\in \\mathcal{R}^D$, and $y_i$ is the sample's label, taking the value $0$ or $1$. In logistic regression the model parameters are $(\\mathbf{w},b)$; vectors are written in bold. To simplify the derivations below, we can absorb $b$ into $\\mathbf{w}$: the parameter becomes $w=(w_0, w_1, .., w_D)$, i.e. with an extra leading term $w_0$ that plays the role of $b$, and each $x_i$ is correspondingly rewritten as $x_i = [1, x_i]$, with a leading $1$. A moment's thought should show why this is valid.\n",
"\n",
"Answer the questions below, using the LaTeX support built into Markdown."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### (a) ```Write down the objective function of logistic regression```\n",
"Write out the objective function, i.e. the quantity we need to \"minimize\" (also called the loss function); regularization is not required. Express it in minimization form, and convert the $\\prod_{}^{}$ into $\\log \\sum_{}^{}$ form.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\\mathcal{L} (\\mathbf{w}) = - \n",
"\\sum_{i = 1}^{n} \\left[\n",
" y_{i} \\log \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) +\n",
" (1 - y_{i}) \\log [1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})]\n",
"\\right]$$"
]
},
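{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick numeric sanity check of the objective above (added for illustration; the tiny dataset and helper names below are made up): evaluate $\\mathcal{L}(\\mathbf{w})$ directly from the formula."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: evaluate the logistic-regression loss on a tiny synthetic dataset.\n",
"import numpy as np\n",
"\n",
"def sigmoid(z):\n",
"    return 1.0 / (1.0 + np.exp(-z))\n",
"\n",
"def logistic_loss(w, X, y):\n",
"    # L(w) = -sum_i [ y_i log s_i + (1 - y_i) log(1 - s_i) ], with s_i = sigma(w^T x_i)\n",
"    s = sigmoid(X @ w)\n",
"    return -np.sum(y * np.log(s) + (1 - y) * np.log(1 - s))\n",
"\n",
"X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])  # leading 1 absorbs the bias b\n",
"y = np.array([1.0, 0.0, 1.0])\n",
"print(logistic_loss(np.zeros(2), X, y))  # at w = 0 every s_i = 0.5, so the loss is 3 log 2"
]
},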
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### (b) ```Derive the first-order gradient with respect to w```\n",
"To perform gradient descent we need to differentiate with respect to the parameter $w$; compute the gradient of $L(w)$ with respect to $w$:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\\begin{aligned}\n",
"\\frac{\\partial \\mathcal{L} (\\mathbf{w})}{\\partial \\mathbf{w}}\n",
"& = - \\frac{\n",
" \\partial \\sum_{i = 1}^{n} \\left[\n",
" y_{i} \\log \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) + \n",
" (1 - y_{i}) \\log [1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})]\n",
" \\right]\n",
"}\n",
"{\n",
" \\partial \\mathbf{w}\n",
"} \\\\\n",
"& = - \\sum_{i = 1}^{n} \\left[\n",
" y_{i} \\frac{\n",
" \\partial \\log \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})\n",
" }\n",
" {\n",
" \\partial \\mathbf{w}\n",
" } + \n",
" (1 - y_{i}) \\frac{\n",
" \\partial \\log [1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})]\n",
" }\n",
" {\n",
" \\partial \\mathbf{w}\n",
" }\n",
"\\right] \\\\\n",
"& = - \\sum_{i = 1}^{n} \\left[\n",
" y_{i} \\left( 1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i} ) \\right) \\mathbf{x}_{i} - \n",
" (1 - y_{i}) \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i} ) \\mathbf{x}_{i}\n",
"\\right] \\\\\n",
"& = \\sum_{i = 1}^{n} \\left[\n",
" \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) - y_{i}\n",
"\\right] \\mathbf{x}_{i} \\\\\n",
"\\end{aligned}$$"
]
},
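{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sanity check (added for illustration, not required by the assignment), the closed-form gradient $\\sum_{i} [\\sigma(\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) - y_{i}] \\mathbf{x}_{i}$ can be compared against a central finite difference of the loss."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: verify the gradient formula against central finite differences.\n",
"import numpy as np\n",
"\n",
"def sigmoid(z):\n",
"    return 1.0 / (1.0 + np.exp(-z))\n",
"\n",
"def logistic_loss(w, X, y):\n",
"    s = sigmoid(X @ w)\n",
"    return -np.sum(y * np.log(s) + (1 - y) * np.log(1 - s))\n",
"\n",
"def logistic_grad(w, X, y):\n",
"    # gradient: sum_i (sigma(w^T x_i) - y_i) x_i, written as X^T (sigma(X w) - y)\n",
"    return X.T @ (sigmoid(X @ w) - y)\n",
"\n",
"rng = np.random.default_rng(0)\n",
"X = rng.normal(size=(5, 3))\n",
"y = np.array([0.0, 1.0, 1.0, 0.0, 1.0])\n",
"w = rng.normal(size=3)\n",
"eps = 1e-5\n",
"num = np.array([(logistic_loss(w + eps * e, X, y) - logistic_loss(w - eps * e, X, y)) / (2 * eps) for e in np.eye(3)])\n",
"print(np.max(np.abs(num - logistic_grad(w, X, y))))  # should be tiny"
]
},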
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### (c) ```Derive the second-order derivative with respect to w```\n",
"Starting from the result above, differentiate with respect to $w$ once more. This step calls for a bit of linear algebra. References: the Matrix Cookbook, https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf, and the Hessian matrix."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\\begin{aligned}\n",
"\\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}}\n",
"& = \\frac{\n",
" \\partial \\sum_{i = 1}^{n} \\left[\n",
" \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) - y_{i}\n",
" \\right] \\mathbf{x}_{i}\n",
"}\n",
"{\n",
" \\partial \\mathbf{w}\n",
"} \\\\\n",
"& = \\sum_{i = 1}^{n} \\frac{\n",
" \\partial \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})\n",
"}\n",
"{\n",
" \\partial \\mathbf{w}\n",
"} \\mathbf{x}_{i}^{\\text{T}} \\\\\n",
"& = \\sum_{i = 1}^{n}\n",
"\\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})\n",
"\\left( 1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) \\right)\n",
"\\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}} \\\\\n",
"\\end{aligned}$$"
]
},
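{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Hessian formula above can likewise be checked numerically (an illustrative sketch; the data is synthetic): compare $\\sum_{i} \\sigma_{i} (1 - \\sigma_{i}) \\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}}$ against finite differences of the gradient."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: check the Hessian sum_i s_i (1 - s_i) x_i x_i^T via finite differences of the gradient.\n",
"import numpy as np\n",
"\n",
"def sigmoid(z):\n",
"    return 1.0 / (1.0 + np.exp(-z))\n",
"\n",
"def logistic_grad(w, X, y):\n",
"    return X.T @ (sigmoid(X @ w) - y)\n",
"\n",
"def logistic_hess(w, X):\n",
"    # note the Hessian does not depend on y\n",
"    s = sigmoid(X @ w)\n",
"    return (X * (s * (1 - s))[:, None]).T @ X\n",
"\n",
"rng = np.random.default_rng(0)\n",
"X = rng.normal(size=(6, 3))\n",
"y = (rng.uniform(size=6) < 0.5).astype(float)\n",
"w = rng.normal(size=3)\n",
"eps = 1e-5\n",
"H_num = np.column_stack([(logistic_grad(w + eps * e, X, y) - logistic_grad(w - eps * e, X, y)) / (2 * eps) for e in np.eye(3)])\n",
"print(np.max(np.abs(H_num - logistic_hess(w, X))))  # should be tiny"
]
},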
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### (d) ```Prove that the logistic-regression objective is convex```\n",
"Try to prove that the logistic-regression objective is a convex function. If a function is convex, any local optimum is also a global optimum, so whenever we find an optimum by, say, stochastic gradient descent, we can be sure it is the global one. There are many ways to prove convexity; here we use one based on the second derivative being nonnegative. For example, given $f(x)=x^2-3x+3$, differentiating twice gives $f''(x)=2 > 0$, so the function is convex. The same idea carries over to functions of several variables: it suffices to show that the second derivative is positive semidefinite. The result of part (c) is a matrix. To show that this matrix (call it $H$) is positive semidefinite, we must show that $v^{T}Hv \\geq 0$ for every nonzero vector $v\\in \\mathcal{R}^{D+1}$.\n",
"Write out the detailed derivation:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"// derivation\n",
"\n",
"$$\\begin{aligned}\n",
"\\mathbf{v}^{\\text{T}} \\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}} \\mathbf{v}\n",
"& = \\sum_{i = 1}^{n}\n",
"\\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})\n",
"\\left( 1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) \\right)\n",
"\\mathbf{v}^{\\text{T}} \\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}} \\mathbf{v}\\\\\n",
"\\end{aligned}$$\n",
"\n",
"Since $0 \\lt \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) \\lt 1$, it suffices to show that $\\mathbf{v}^{\\text{T}} \\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}} \\mathbf{v} \\geq 0$:\n",
"\n",
"$$\\mathbf{v}^{\\text{T}} \\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}} \\mathbf{v}\n",
"= \\left( \\mathbf{v}^{\\text{T}} \\mathbf{x}_{i} \\right)^{2}\n",
"\\geq 0\n",
"\\Rightarrow\n",
"\\mathbf{v}^{\\text{T}} \\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}} \\mathbf{v} \\geq 0\n",
"\\Rightarrow\n",
"\\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}}\n",
"\\succeq 0\n",
"$$\n",
"\n",
"Hence $\\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}}$ is positive semidefinite, and the logistic-regression objective is convex, as required."
]
},
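{
"cell_type": "markdown",
"metadata": {},
"source": [
"The positive-semidefiniteness argument can also be illustrated numerically (an illustrative sketch on synthetic data): the smallest eigenvalue of $H$ is nonnegative, and $\\mathbf{v}^{\\text{T}} H \\mathbf{v} \\geq 0$ for a random $\\mathbf{v}$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: numerically confirm that H = sum_i s_i (1 - s_i) x_i x_i^T is PSD.\n",
"import numpy as np\n",
"\n",
"def sigmoid(z):\n",
"    return 1.0 / (1.0 + np.exp(-z))\n",
"\n",
"rng = np.random.default_rng(0)\n",
"X = rng.normal(size=(10, 4))\n",
"w = rng.normal(size=4)\n",
"s = sigmoid(X @ w)\n",
"H = (X * (s * (1 - s))[:, None]).T @ X\n",
"print(np.min(np.linalg.eigvalsh(H)))  # smallest eigenvalue, nonnegative up to roundoff\n",
"v = rng.normal(size=4)\n",
"print(v @ H @ v)  # v^T H v, nonnegative"
]
},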
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 2 (20 points)\n",
"Prove that the p-norm is a convex function. The p-norm is defined as:\n",
"$||x||_p=(\\sum_{i=1}^{n}|x_i|^p)^{1/p}$\n",
"\n",
"hint: Minkowski's Inequality"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"// TODO: your proof\n",
"\n",
"Minkowski's inequality: if $1 \\leq p \\lt \\infty$, then for all $\\mathbf{x}, \\mathbf{y} \\in \\mathcal{V}_{F}$, we have\n",
"\n",
"$$\\begin{aligned}\n",
"\\left\\| \\mathbf{x} + \\mathbf{y} \\right\\|_{p} \\leq \\left\\| \\mathbf{x} \\right\\|_{p} + \\left\\| \\mathbf{y} \\right\\|_{p}\n",
"\\end{aligned}$$\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Given two arbitrary points $\\mathbf{x}$ and $\\mathbf{y}$, and $0 \\leq \\alpha \\leq 1$, let $\\mathbf{z} = \\alpha \\mathbf{x} + (1 - \\alpha)\\mathbf{y}$. By Minkowski's inequality and absolute homogeneity ($\\| c \\mathbf{x} \\|_{p} = |c| \\, \\| \\mathbf{x} \\|_{p}$),\n",
"\n",
"$$\\begin{aligned}\n",
"\\| \\mathbf{z} \\|_{p} =\n",
"\\| \\alpha \\mathbf{x} + (1 - \\alpha) \\mathbf{y} \\|_{p} \\leq\n",
"\\| \\alpha \\mathbf{x} \\|_{p} + \\| (1 - \\alpha) \\mathbf{y} \\|_{p} =\n",
"\\alpha \\| \\mathbf{x} \\|_{p} + (1 - \\alpha) \\| \\mathbf{y} \\|_{p}\n",
"\\end{aligned}$$\n",
"\n",
"i.e. the p-norm is convex"
]
},
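{
"cell_type": "markdown",
"metadata": {},
"source": [
"The convexity inequality above can be spot-checked numerically (an illustrative sketch using `np.linalg.norm` on random vectors and several values of $p$)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: check ||a x + (1-a) y||_p <= a ||x||_p + (1-a) ||y||_p on random inputs.\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(1)\n",
"ok = True\n",
"for p in (1, 1.5, 2, 3):\n",
"    for _ in range(1000):\n",
"        x, y = rng.normal(size=4), rng.normal(size=4)\n",
"        a = rng.uniform()\n",
"        lhs = np.linalg.norm(a * x + (1 - a) * y, ord=p)\n",
"        rhs = a * np.linalg.norm(x, ord=p) + (1 - a) * np.linalg.norm(y, ord=p)\n",
"        ok = ok and lhs <= rhs + 1e-9\n",
"print(ok)"
]
},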
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 3 (20 points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the first lecture we introduced a technique for testing convexity: second-order convexity. When a function f(x) is twice differentiable at every point, we have the following property: f(x) is convex if and only if the second derivative of f(x) is a PSD (positive semidefinite) matrix. Try to prove this result below."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"// proof of second-order convexity\n",
"\n",
"1. sufficiency, to prove $\\nabla^{2} f \\succeq 0 \\Rightarrow f(\\mathbf{y}) \\geq f(\\mathbf{x}) + {\\nabla f(\\mathbf{x})}^{\\text{T}} (\\mathbf{y} - \\mathbf{x})$\n",
"\n",
"using the Taylor expansion at $\\mathbf{x}$ with the Lagrange remainder, let $\\mathbf{y} = \\mathbf{x} + \\Delta \\mathbf{x}$ and $0 \\lt \\theta \\lt 1$,\n",
"\n",
"$$\\begin{aligned}\n",
"f(\\mathbf{y})\n",
"& = f(\\mathbf{x} + \\Delta \\mathbf{x}) \\\\\n",
"& = f(\\mathbf{x}) + {\\nabla f(\\mathbf{x})}^{\\text{T}} \\Delta \\mathbf{x}\n",
"+ \\frac{1}{2} {\\Delta \\mathbf{x}}^{\\text{T}} {\\nabla^{2} f(\\mathbf{x} + \\theta \\Delta \\mathbf{x})} \\Delta \\mathbf{x} \\\\\n",
"& \\because \\nabla^{2} f(\\mathbf{z}) \\succeq 0 \\text{ for every } \\mathbf{z} \\\\\n",
"& \\therefore {\\Delta \\mathbf{x}}^{\\text{T}} {\\nabla^{2} f(\\mathbf{x} + \\theta \\Delta \\mathbf{x})} \\Delta \\mathbf{x} \\geq 0 \\\\\n",
"f(\\mathbf{y})\n",
"& \\geq f(\\mathbf{x}) + {\\nabla f(\\mathbf{x})}^{\\text{T}} \\Delta \\mathbf{x}\n",
"\\end{aligned}$$\n",
"\n",
"2. necessity, to prove $f(\\mathbf{y}) \\geq f(\\mathbf{x}) + {\\nabla f(\\mathbf{x})}^{\\text{T}} (\\mathbf{y} - \\mathbf{x}) \\Rightarrow \\nabla^{2} f(\\mathbf{x}) \\succeq 0$\n",
"\n",
"for any direction $\\mathbf{v}$ and $t \\gt 0$, apply the first-order inequality at $\\mathbf{y} = \\mathbf{x} + t \\mathbf{v}$ and at $\\mathbf{y} = \\mathbf{x} - t \\mathbf{v}$:\n",
"\n",
"$$\\begin{aligned}\n",
"f(\\mathbf{x} + t \\mathbf{v}) & \\geq f(\\mathbf{x}) + t {\\nabla f(\\mathbf{x})}^{\\text{T}} \\mathbf{v} \\\\\n",
"f(\\mathbf{x} - t \\mathbf{v}) & \\geq f(\\mathbf{x}) - t {\\nabla f(\\mathbf{x})}^{\\text{T}} \\mathbf{v}\n",
"\\end{aligned}$$\n",
"\n",
"adding the two inequalities gives $f(\\mathbf{x} + t \\mathbf{v}) + f(\\mathbf{x} - t \\mathbf{v}) - 2 f(\\mathbf{x}) \\geq 0$, hence\n",
"\n",
"$$\\begin{aligned}\n",
"\\mathbf{v}^{\\text{T}} \\nabla^{2} f(\\mathbf{x}) \\mathbf{v}\n",
"& = \\lim_{t \\rightarrow 0} \\frac{\n",
"    f(\\mathbf{x} + t \\mathbf{v}) + f(\\mathbf{x} - t \\mathbf{v}) - 2 f(\\mathbf{x})\n",
"}{\n",
"    t^{2}\n",
"} \\geq 0 \\\\\n",
"\\end{aligned}$$\n",
"\n",
"i.e. $\\nabla^{2} f(\\mathbf{x}) \\succeq 0$"
]
},
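{
"cell_type": "markdown",
"metadata": {},
"source": [
"The necessity argument can be illustrated numerically (an illustrative sketch): for a convex function, the symmetric second difference $f(\\mathbf{x} + t \\mathbf{v}) + f(\\mathbf{x} - t \\mathbf{v}) - 2 f(\\mathbf{x})$ is nonnegative. Here $f$ is taken to be log-sum-exp, a standard convex function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: symmetric second differences of a convex function are nonnegative.\n",
"import numpy as np\n",
"\n",
"def f(x):\n",
"    # log-sum-exp, a convex function of x\n",
"    return np.log(np.sum(np.exp(x)))\n",
"\n",
"rng = np.random.default_rng(2)\n",
"vals = []\n",
"for _ in range(1000):\n",
"    x, v = rng.normal(size=3), rng.normal(size=3)\n",
"    t = rng.uniform(0.01, 1.0)\n",
"    vals.append(f(x + t * v) + f(x - t * v) - 2 * f(x))\n",
"print(min(vals))  # nonnegative up to roundoff"
]
},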
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problem 4 (30 points)\n",
"In class we covered the transportation problem. Restating it: two cities, Beijing and Shanghai, hold 300 and 500 items of clothing respectively, while three cities, numbered 1, 2, and 3, need 200, 300, and 250 items respectively. The clothing must be shipped from Beijing and Shanghai to cities 1, 2, and 3. We assume shipping each item incurs a cost, namely:\n",
"- Beijing -> 1: 5\n",
"- Beijing -> 2: 6\n",
"- Beijing -> 3: 4\n",
"- Shanghai -> 1: 6\n",
"- Shanghai -> 2: 3\n",
"- Shanghai -> 3: 7\n",
"\n",
"The values above are unit costs.\n",
"\n",
"Question: we want to minimize the total cost."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```(a)``` Write out the linear programming formulation, using the standard form; matrix and vector notation is recommended."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"// your formulation\n",
"\n",
"$\\mathbf{x} = (x_{1}, x_{2}, \\dots, x_{6})$, where $x_{1}, x_{2}, x_{3}$ are the amounts shipped from Beijing to cities 1, 2, 3 and $x_{4}, x_{5}, x_{6}$ those from Shanghai; $\\mathbf{w} = (5, 6, 4, 6, 3, 7)$\n",
"\n",
"$$\\begin{aligned}\n",
"& \\min_{\\mathbf{x}} \\mathbf{w}^{\\text{T}} \\mathbf{x} \\\\\n",
"& s.t. \n",
"\\begin{cases}\n",
" x_{1} + x_{2} + x_{3} \\leq 300 \\\\\n",
" x_{4} + x_{5} + x_{6} \\leq 500 \\\\\n",
"    x_{1} + x_{4} = 200 \\\\\n",
"    x_{2} + x_{5} = 300 \\\\\n",
"    x_{3} + x_{6} = 250 \\\\\n",
" x_{i} \\geq 0\n",
"\\end{cases}\n",
"\\end{aligned}$$\n",
"\n",
"i.e.\n",
"\n",
"$$\\begin{aligned}\n",
"& \\min_{\\mathbf{x}} \\mathbf{w}^{\\text{T}} \\mathbf{x} \\\\\n",
"& s.t. \n",
"\\begin{cases}\n",
" \\mathbf{A}_{\\text{ub}} \\mathbf{x} - \\mathbf{b}_{\\text{ub}} \\leq 0 \\\\\n",
" \\mathbf{A}_{\\text{eq}} \\mathbf{x} - \\mathbf{b}_{\\text{eq}} = 0 \\\\\n",
" - \\mathbf{x} \\leq 0\n",
"\\end{cases}\n",
"\\end{aligned}$$\n",
"\n",
"with\n",
"\n",
"$$\\mathbf{A}_{\\text{ub}} = \\begin{bmatrix}\n",
"    1 & 1 & 1 & 0 & 0 & 0 \\\\\n",
"    0 & 0 & 0 & 1 & 1 & 1\n",
"\\end{bmatrix}, \\quad\n",
"\\mathbf{b}_{\\text{ub}} = \\begin{bmatrix} 300 \\\\ 500 \\end{bmatrix}, \\quad\n",
"\\mathbf{A}_{\\text{eq}} = \\begin{bmatrix}\n",
"    1 & 0 & 0 & 1 & 0 & 0 \\\\\n",
"    0 & 1 & 0 & 0 & 1 & 0 \\\\\n",
"    0 & 0 & 1 & 0 & 0 & 1\n",
"\\end{bmatrix}, \\quad\n",
"\\mathbf{b}_{\\text{eq}} = \\begin{bmatrix} 200 \\\\ 300 \\\\ 250 \\end{bmatrix}$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```(b)``` Solve for the optimum with an LP solver. References:\n",
"https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linprog.html\n",
" or: http://cvxopt.org/"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"# your implementation\n",
"\n",
"from scipy.optimize import linprog\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cost: 3049.999996 with x = [5.00000000e+01 3.80939619e-08 2.49999999e+02 1.50000000e+02\n",
" 2.99999999e+02 1.15679510e-07]\n"
]
}
],
"source": [
"w = np.array([5, 6, 4, 6, 3, 7])\n",
"\n",
"A_ub = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]])\n",
"b_ub = np.array([300, 500])\n",
"\n",
"A_eq = np.array([[1, 0, 0, 1, 0, 0], [0, 1, 0, 0, 1, 0], [0, 0, 1, 0, 0, 1]])\n",
"b_eq = np.array([200, 300, 250])\n",
"\n",
"ret = linprog(c=w, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)\n",
"\n",
"if ret.success:\n",
"    x_opt = ret.x\n",
"    cost = ret.fun\n",
"    print(\"cost: {:f} with x = {}\".format(cost, x_opt))\n",
"else:\n",
"    print(\"linprog failed: {}\".format(ret.message))"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cost: 3050 with x = [50, 0, 250, 150, 300, 0]\n"
]
}
],
"source": [
"# rounding to integer\n",
"\n",
"x_opt = [int(round(num)) for num in x_opt]\n",
"cost = int(np.sum(w * np.array(x_opt)))\n",
"print(\"cost: {:d} with x = {}\".format(cost, x_opt))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```(c)```: Try to convert the LP above into its dual formulation, and write out the dual form."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"// your formulation\n",
"\n",
"* primal problem:\n",
"\n",
"$$\\begin{aligned}\n",
"& \\min_{\\mathbf{x}} \\mathbf{w}^{\\text{T}} \\mathbf{x} \\\\\n",
"& s.t. \n",
"\\begin{cases}\n",
" \\mathbf{A}_{\\text{ub}} \\mathbf{x} - \\mathbf{b}_{\\text{ub}} \\leq 0 \\\\\n",
" \\mathbf{A}_{\\text{eq}} \\mathbf{x} - \\mathbf{b}_{\\text{eq}} = 0 \\\\\n",
" - \\mathbf{x} \\leq 0\n",
"\\end{cases}\n",
"\\end{aligned}$$\n",
"\n",
"The Lagrangian is\n",
"\n",
"$$\\begin{aligned}\n",
"\\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n",
"& = \\mathbf{w}^{\\text{T}} \\mathbf{x}\n",
"+ \\mathbf{\\alpha}^{\\text{T}} (\\mathbf{A}_{\\text{ub}} \\mathbf{x} - \\mathbf{b}_{\\text{ub}})\n",
"+ \\mathbf{\\beta}^{\\text{T}} (\\mathbf{A}_{\\text{eq}} \\mathbf{x} - \\mathbf{b}_{\\text{eq}})\n",
"- \\mathbf{\\gamma}^{\\text{T}} \\mathbf{x}\n",
"\\end{aligned}$$\n",
"\n",
"with\n",
"\n",
"$$\\mathbf{\\alpha} \\geq 0, \\mathbf{\\gamma} \\geq 0$$\n",
"\n",
"define\n",
"\n",
"$$\\begin{aligned}\n",
"\\mathcal{P}(\\mathbf{x})\n",
"& = \\sup_{\\mathbf{\\alpha} \\geq 0, \\mathbf{\\beta}, \\mathbf{\\gamma} \\geq 0} \\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma}) \\\\\n",
"& = \\begin{cases}\n",
" \\mathbf{w}^{\\text{T}} \\mathbf{x}, & \\text{if } \\mathbf{x} {\\text{ is feasible}} \\\\\n",
" + \\infty, & \\text{otherwise} \\\\\n",
"\\end{cases}\n",
"\\end{aligned}$$\n",
"\n",
"therefore\n",
"\n",
"$$\\begin{aligned}\n",
"\\min_{\\mathbf{x}} \\mathbf{w}^{\\text{T}} \\mathbf{x}\n",
"& \\Leftrightarrow \\inf_{\\mathbf{x}} \\mathcal{P}(\\mathbf{x})\n",
"= \\inf_{\\mathbf{x}} \\sup_{\\mathbf{\\alpha} \\geq 0, \\mathbf{\\beta}, \\mathbf{\\gamma} \\geq 0} \\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n",
"\\end{aligned}$$\n",
"\n",
"* dual problem:\n",
"\n",
"define\n",
"\n",
"$$\\begin{aligned}\n",
"\\mathcal{D}(\\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n",
"& = \\inf_{\\mathbf{x}} \\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma}) \\\\\n",
"& = \\inf_{\\mathbf{x}} \\left[\n",
"(\\mathbf{w}^{\\text{T}} + \\mathbf{\\alpha}^{\\text{T}} \\mathbf{A}_{\\text{ub}} + \\mathbf{\\beta}^{\\text{T}} \\mathbf{A}_{\\text{eq}} - \\mathbf{\\gamma}^{\\text{T}}) \\mathbf{x}\n",
"- \\mathbf{\\alpha}^{\\text{T}} \\mathbf{b}_{\\text{ub}}\n",
"- \\mathbf{\\beta}^{\\text{T}} \\mathbf{b}_{\\text{eq}} \\right] \\\\\n",
"& = \\begin{cases}\n",
" - \\mathbf{\\alpha}^{\\text{T}} \\mathbf{b}_{\\text{ub}}\n",
" - \\mathbf{\\beta}^{\\text{T}} \\mathbf{b}_{\\text{eq}},\n",
" & \\text{if } \\mathbf{w}^{\\text{T}} + \\mathbf{\\alpha}^{\\text{T}} \\mathbf{A}_{\\text{ub}} + \\mathbf{\\beta}^{\\text{T}} \\mathbf{A}_{\\text{eq}} - \\mathbf{\\gamma}^{\\text{T}} = 0 \\\\\n",
" - \\infty, & \\text{otherwise} \\\\\n",
"\\end{cases}\n",
"\\end{aligned}$$\n",
"\n",
"therefore, the dual problem\n",
"\n",
"$$\\begin{aligned}\n",
"\\sup_{\\mathbf{\\alpha} \\geq 0, \\mathbf{\\beta}, \\mathbf{\\gamma} \\geq 0} \\mathcal{D}(\\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n",
"& = \\sup_{\\mathbf{\\alpha} \\geq 0, \\mathbf{\\beta}, \\mathbf{\\gamma} \\geq 0} \\inf_{\\mathbf{x}} \\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n",
"\\end{aligned}$$\n",
"\n",
"can be written as\n",
"\n",
"$$\\begin{aligned}\n",
"& \\max_{\\mathbf{\\alpha}, \\mathbf{\\beta}}(- \\mathbf{\\alpha}^{\\text{T}} \\mathbf{b}_{\\text{ub}} - \\mathbf{\\beta}^{\\text{T}} \\mathbf{b}_{\\text{eq}}) \\\\\n",
"& s.t. \n",
"\\begin{cases}\n",
" \\mathbf{w}^{\\text{T}} + \\mathbf{\\alpha}^{\\text{T}} \\mathbf{A}_{\\text{ub}} + \\mathbf{\\beta}^{\\text{T}} \\mathbf{A}_{\\text{eq}} \\geq 0 \\\\\n",
" \\mathbf{\\alpha} \\geq 0 \\\\\n",
"\\end{cases}\n",
"\\end{aligned}$$\n",
"\n",
"i.e.\n",
"\n",
"$$\\begin{aligned}\n",
"& \\min_{\\mathbf{\\alpha}, \\mathbf{\\beta}}(\n",
" \\mathbf{b}_{\\text{ub}}^{\\text{T}} \\mathbf{\\alpha}\n",
" + \\mathbf{b}_{\\text{eq}}^{\\text{T}} \\mathbf{\\beta}\n",
") \\\\\n",
"& s.t. \n",
"\\begin{cases}\n",
" \\mathbf{w}^{\\text{T}} + \\mathbf{\\alpha}^{\\text{T}} \\mathbf{A}_{\\text{ub}} + \\mathbf{\\beta}^{\\text{T}} \\mathbf{A}_{\\text{eq}} \\geq 0 \\\\\n",
" \\mathbf{\\alpha} \\geq 0 \\\\\n",
"\\end{cases}\n",
"\\end{aligned}$$"
]
}
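,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a check (added for illustration), the dual above can also be solved numerically with `scipy.optimize.linprog`; by strong duality its optimum should match the primal optimum found in part (b)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: solve the dual LP and check strong duality.\n",
"# Variables z = [alpha (2), beta (3)]; minimize b_ub^T alpha + b_eq^T beta\n",
"# subject to -A_ub^T alpha - A_eq^T beta <= w, alpha >= 0, beta free.\n",
"import numpy as np\n",
"from scipy.optimize import linprog\n",
"\n",
"w = np.array([5, 6, 4, 6, 3, 7])\n",
"A_ub = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]])\n",
"b_ub = np.array([300, 500])\n",
"A_eq = np.array([[1, 0, 0, 1, 0, 0], [0, 1, 0, 0, 1, 0], [0, 0, 1, 0, 0, 1]])\n",
"b_eq = np.array([200, 300, 250])\n",
"\n",
"c_dual = np.concatenate([b_ub, b_eq])\n",
"A_dual = -np.hstack([A_ub.T, A_eq.T])  # 6 constraints (one per route), 5 dual variables\n",
"bounds = [(0, None)] * 2 + [(None, None)] * 3  # alpha >= 0, beta free\n",
"ret = linprog(c=c_dual, A_ub=A_dual, b_ub=w, bounds=bounds)\n",
"print(-ret.fun)  # dual optimum; equals the primal optimum 3050 by strong duality"
]
}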
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}