{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 机器学习中的优化 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 问题1 (30分)\n", "假设我们有训练数据$D=\\{(\\mathbf{x}_1,y_1),...,(\\mathbf{x}_n,y_n)\\}$, 其中$(\\mathbf{x}_i,y_i)$为每一个样本,而且$\\mathbf{x}_i$是样本的特征并且$\\mathbf{x}_i\\in \\mathcal{R}^D$, $y_i$代表样本数据的标签(label), 取值为$0$或者$1$. 在逻辑回归中,模型的参数为$(\\mathbf{w},b)$。对于向量,我们一般用粗体来表达。 为了后续推导的方便,可以把b融入到参数w中。 这是参数$w$就变成 $w=(w_0, w_1, .., w_D)$,也就是前面多出了一个项$w_0$, 可以看作是b,这时候每一个$x_i$也需要稍作改变可以写成 $x_i = [1, x_i]$,前面加了一个1。稍做思考应该能看出为什么可以这么写。\n", "\n", "请回答以下问题。请用Markdown自带的Latex来编写。\n", "\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### (a) ```编写逻辑回归的目标函数```\n", "请写出目标函数(objective function), 也就是我们需要\"最小化\"的目标(也称之为损失函数或者loss function),不需要考虑正则。 把目标函数表示成最小化的形态,另外把$\\prod_{}^{}$转换成$\\log \\sum_{}^{}$\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\mathcal{L} (\\mathbf{w}) = - \n", "\\sum_{i = 1}^{n} \\left[\n", " y_{i} \\log \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) +\n", " (1 - y_{i}) \\log [1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})]\n", "\\right]$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### (b) ```求解对w的一阶导数```\n", "为了做梯度下降法,我们需要对参数$w$求导,请把$L(w)$对$w$的梯度计算一下:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\begin{aligned}\n", "\\frac{\\partial \\mathcal{L} (\\mathbf{w})}{\\partial \\mathbf{w}}\n", "& = - \\frac{\n", " \\partial \\sum_{i = 1}^{n} \\left[\n", " y_{i} \\log \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) + \n", " (1 - y_{i}) \\log [1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})]\n", " \\right]\n", "}\n", "{\n", " \\partial \\mathbf{w}\n", "} \\\\\n", "& = - \\sum_{i = 1}^{n} \\left[\n", " y_{i} \\frac{\n", " \\partial \\log \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})\n", " }\n", " {\n", " \\partial \\mathbf{w}\n", " } + \n", " (1 - y_{i}) \\frac{\n", " \\partial \\log [1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})]\n", " }\n", " {\n", " \\partial \\mathbf{w}\n", " }\n", "\\right] \\\\\n", "& = - \\sum_{i = 1}^{n} \\left[\n", " y_{i} \\left( 1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i} ) \\right) \\mathbf{x}_{i} - \n", " (1 - y_{i}) \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i} ) \\mathbf{x}_{i}\n", "\\right] \\\\\n", "& = \\sum_{i = 1}^{n} \\left[\n", " \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) - y_{i}\n", "\\right] \\mathbf{x}_{i} \\\\\n", "\\end{aligned}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### (c) ```求解对w的二阶导数```\n", "在上面结果的基础上对$w$求解二阶导数,也就是再求一次导数。 这个过程需要回忆一下线性代数的部分 ^^。 参考: matrix cookbook: https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf, 还有 Hessian Matrix。 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\begin{aligned}\n", "\\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}}\n", "& = \\frac{\n", " \\partial \\sum_{i = 1}^{n} \\left[\n", " \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) - y_{i}\n", " \\right] \\mathbf{x}_{i}\n", "}\n", "{\n", " \\partial \\mathbf{w}\n", "} \\\\\n", "& = \\sum_{i = 1}^{n} \\frac{\n", " \\partial \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})\n", "}\n", "{\n", " \\partial \\mathbf{w}\n", "} \\mathbf{x}_{i}^{\\text{T}} \\\\\n", "& = \\sum_{i = 1}^{n}\n", "\\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})\n", "\\left( 1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) \\right)\n", "\\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}} \\\\\n", "\\end{aligned}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### (d) ```证明逻辑回归目标函数是凸函数```\n", "试着证明逻辑回归函数是凸函数。假设一个函数是凸函数,我们则可以得出局部最优解即为全局最优解,所以假设我们通过随机梯度下降法等手段找到最优解时我们就可以确认这个解就是全局最优解。证明凸函数的方法有很多种,在这里我们介绍一种方法,就是基于二次求导大于等于0。比如给定一个函数$f(x)=x^2-3x+3$,做两次\n", "求导之后即可以得出$f''(x)=2 > 0$,所以这个函数就是凸函数。类似的,这种理论也应用于多元变量中的函数上。在多元函数上,只要证明二阶导数是posititive semidefinite即可以。 问题(c)的结果是一个矩阵。 为了证明这个矩阵(假设为H)为Positive Semidefinite,需要证明对于任意一个非零向量$v\\in \\mathcal{R}$, 需要得出$v^{T}Hv >=0$\n", "请写出详细的推导过程:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "// TODO 请写下推导过程\n", "\n", "$$\\begin{aligned}\n", "\\mathbf{v}^{\\text{T}} \\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}} \\mathbf{v}\n", "& = \\sum_{i = 1}^{n}\n", "\\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i})\n", "\\left( 1 - \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) \\right)\n", "\\mathbf{v}^{\\text{T}} \\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}} \\mathbf{v}\\\\\n", "\\end{aligned}$$\n", "\n", "由于$0 \\lt \\sigma (\\mathbf{w}^{\\text{T}} \\mathbf{x}_{i}) \\lt 1$,故只需证明$\\mathbf{v}^{\\text{T}} \\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}} \\mathbf{v} \\geq 0$\n", "\n", "$$\\mathbf{v}^{\\text{T}} \\mathbf{x}_{i} \\mathbf{x}_{i}^{\\text{T}} \\mathbf{v}\n", "= \\left( \\mathbf{v}^{\\text{T}} \\mathbf{x}_{i} \\right)^{2}\n", "\\geq 0\n", "\\Rightarrow\n", "\\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}}\n", "\\geq 0\n", "\\Rightarrow\n", "\\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}}\n", "\\succeq 0\n", "$$\n", "\n", "即$\\frac{\\partial^{2} \\mathcal{L}(\\mathbf{w})}{\\partial^{2} \\mathbf{w}}$为半正定矩阵,逻辑回归函数是凸函数得证。\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 问题2 (20分)\n", "证明p-norm是凸函数, p-norm的定义为:\n", "$||x||_p=(\\sum_{i=1}^{n}|x_i|^p)^{1/p}$\n", "\n", "hint: Minkowski’s Inequality" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "// TODO: your proof\n", "\n", "minkowski’s inequality, if $1 \\leq p \\lt \\infty$, then whenever $\\mathbf{x}, \\mathbf{y} \\in \\mathcal{V}_{F}$, we have\n", "\n", "$$\\begin{aligned}\n", "\\left\\| \\mathbf{x} + \\mathbf{y} \\right\\|_{p} \\leq \\left\\| \\mathbf{x} \\right\\|_{p} + \\left\\| \\mathbf{y} \\right\\|_{p}\n", "\\end{aligned}$$\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "given two arbitrary points, $\\mathbf{x}$ and $\\mathbf{y}$, and $0 \\leq \\alpha \\leq 1$, let $\\mathbf{z} = \\alpha \\mathbf{x} + (1 - \\alpha)\\mathbf{y}$\n", "\n", "$$\\begin{aligned}\n", "\\| \\mathbf{z} \\| =\n", "\\| \\alpha \\mathbf{x} + (1 - \\alpha) \\mathbf{y} \\|_{p} \\leq\n", "\\| \\alpha \\mathbf{x}_{p} \\| + \\| (1 - \\alpha) \\mathbf{y} \\|_{p} =\n", "\\alpha \\| \\mathbf{x}_{p} \\| + (1 - \\alpha) \\| \\mathbf{y} \\|_{p}\n", "\\end{aligned}$$\n", "\n", "i.e. p-norm is convex" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 问题3 (20分)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在第一次课程中,我们讲了在判定凸函数的时候用到一项技术:second-order convexity, 也就是当函数f(x)在每一个点上twice differentiable, 这时候我们就有个性质: f(x)是凸函数,当且仅当 f(x)的二阶为PSD矩阵(半正定矩阵)。 请在下方试着证明一下此理论。 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "// your proof of second order convexity\n", "\n", "1. sufficiency, to prove $\\nabla^{2} f(\\mathbf{x}) \\succeq 0 \\Rightarrow f(\\mathbf{y}) \\geq f(\\mathbf{x}) + {\\nabla f(\\mathbf{x})}^{\\text{T}} (\\mathbf{y} - \\mathbf{x})$\n", "\n", "using tayler expansion at $\\mathbf{x}$, let $\\mathbf{y} = \\mathbf{x} + \\Delta \\mathbf{x}$ and $0 \\lt \\theta \\lt 1$,\n", "\n", "$$\\begin{aligned}\n", "f(\\mathbf{y})\n", "& = f(\\mathbf{x} + \\Delta \\mathbf{x}) \\\\\n", "& = f(\\mathbf{x}) + {\\nabla f(\\mathbf{x})}^{\\text{T}} \\Delta \\mathbf{x}\n", "+ \\frac{1}{2} {\\Delta \\mathbf{x}}^{\\text{T}} {\\nabla^{2} f(\\mathbf{x} + \\theta \\Delta \\mathbf{x})} \\Delta \\mathbf{x} \\\\\n", "& \\because \\nabla^{2} f(\\mathbf{x}) \\succeq 0 \\\\\n", "& \\therefore {\\Delta \\mathbf{x}}^{\\text{T}} {\\nabla^{2} f(\\mathbf{x} + \\theta \\Delta \\mathbf{x})} \\Delta \\mathbf{x} \\geq 0 \\\\\n", "f(\\mathbf{y})\n", "& \\geq f(\\mathbf{x}) + {\\nabla f(\\mathbf{x})}^{\\text{T}} \\Delta \\mathbf{x}\n", "\\end{aligned}$$\n", "\n", "2. necessity, to prove $f(\\mathbf{y}) \\geq f(\\mathbf{x}) + {\\nabla f(\\mathbf{x})}^{\\text{T}} (\\mathbf{y} - \\mathbf{x}) \\Rightarrow \\nabla^{2} f(\\mathbf{x}) \\succeq 0 $\n", "\n", "consider tayler expansion, let $0 \\leq \\theta \\leq 1$\n", "\n", "$$\\begin{aligned}\n", "f(\\mathbf{x} + \\Delta \\mathbf{x})\n", "& = f(\\mathbf{x}) + \\nabla f(\\mathbf{x}) \\Delta \\mathbf{x}\n", "+ \\frac{1}{2} {\\Delta \\mathbf{x}}^{\\text{T}} \\nabla^{2} f(\\mathbf{x} + \\theta \\Delta \\mathbf{x}) {\\Delta \\mathbf{x}}\n", "\\end{aligned}$$\n", "\n", "for $f$ is convex,\n", "\n", "$$\\begin{aligned}\n", "f(\\mathbf{x} + \\Delta \\mathbf{x}) \\geq f(\\mathbf{x}) + \\nabla f(\\mathbf{x}) \\Delta \\mathbf{x}\n", "\\end{aligned}$$\n", "\n", "then\n", "\n", "$$\\begin{aligned}\n", "{\\Delta \\mathbf{x}}^{\\text{T}} \\nabla^{2} f(\\mathbf{x} + \\theta \\Delta \\mathbf{x}) {\\Delta \\mathbf{x}}\n", "& = 2 (f(\\mathbf{x} + \\Delta \\mathbf{x})\n", "- f(\\mathbf{x}) - \\nabla f(\\mathbf{x}) \\Delta \\mathbf{x})\n", "\\geq 0\n", "\\end{aligned}$$\n", "\n", "for $\\Delta \\mathbf{x}$ is an arbitrary vector, according to the definition of positive semi-definite,\n", "\n", "$$\\begin{aligned}\n", "\\nabla^{2} f(\\mathbf{x} + \\theta \\Delta \\mathbf{x}) \\succeq 0\n", "\\end{aligned}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 问题4 (30分)\n", "在课堂里我们讲过Transportation problem. 重新描述问题: 有两个城市北京和上海,分别拥有300件衣服和500件衣服,另外有三个城市分别是1,2,3分别需要200,300,250件衣服。现在需要把衣服从北京和上海运送到城市1,2,3。 我们假定每运输一件衣服会产生一些代价,比如:\n", "- 北京 -> 1: 5\n", "- 北京 -> 2: 6\n", "- 北京 -> 3: 4\n", "- 上海 -> 1: 6\n", "- 上海 -> 2: 3\n", "- 上海 -> 3: 7\n", "\n", "最后的值是单位cost. \n", "\n", "问题:我们希望最小化成本。 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```(a)``` 请写出linear programming formulation。 利用标准的写法(Stanford form),建议使用矩阵、向量的表示法。 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "// your formulation\n", "\n", "$\\mathbf{x} = (x_{1}, x_{2}, \\dots, x_{6})$, $\\mathbf{w} = (5, 6, 4, 6, 3, 7)$\n", "\n", "$$\\begin{aligned}\n", "& \\min_{\\mathbf{x}} \\mathbf{w}^{\\text{T}} \\mathbf{x} \\\\\n", "& s.t. \n", "\\begin{cases}\n", " x_{1} + x_{2} + x_{3} \\leq 300 \\\\\n", " x_{4} + x_{5} + x_{6} \\leq 500 \\\\\n", " x_{1} + x{3} = 200 \\\\\n", " x_{2} + x{4} = 300 \\\\\n", " x_{3} + x{6} = 250 \\\\\n", " x_{i} \\geq 0\n", "\\end{cases}\n", "\\end{aligned}$$\n", "\n", "i.e.\n", "\n", "$$\\begin{aligned}\n", "& \\min_{\\mathbf{x}} \\mathbf{w}^{\\text{T}} \\mathbf{x} \\\\\n", "& s.t. \n", "\\begin{cases}\n", " \\mathbf{A}_{\\text{ub}} \\mathbf{x} - \\mathbf{b}_{\\text{ub}} \\leq 0 \\\\\n", " \\mathbf{A}_{\\text{eq}} \\mathbf{x} - \\mathbf{b}_{\\text{eq}} = 0 \\\\\n", " - \\mathbf{x} \\leq 0\n", "\\end{cases}\n", "\\end{aligned}$$\n", "\n", "with\n", "\n", "$$\\mathbf{A}_{\\text{ub}} = \\begin{bmatrix}\n", " 1 & 1 & 1 & 0 & 0 & 0 \\\\\n", " 0 & 0 & 0 & 1 & 1 & 1\n", "\\end{bmatrix}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```(b)``` 利用lp solver求解最优解。 参考:\n", "https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linprog.html\n", " 或者: http://cvxopt.org/" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "# your implementation\n", "\n", "from scipy.optimize import linprog\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cost: 3049.999996 with x = [5.00000000e+01 3.80939619e-08 2.49999999e+02 1.50000000e+02\n", " 2.99999999e+02 1.15679510e-07]\n" ] } ], "source": [ "w = np.array([5, 6, 4, 6, 3, 7])\n", "\n", "A_ub = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]])\n", "b_ub = np.array([300, 500])\n", "\n", "A_eq = np.array([[1, 0, 0, 1, 0, 0], [0, 1, 0, 0, 1, 0], [0, 0, 1, 0, 0, 1]])\n", "b_eq = np.array([200, 300, 250])\n", "\n", "ret = linprog(c=w, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)\n", "\n", "if ret.success:\n", " x_opt = ret.x\n", " cost = ret.fun\n", "else:\n", " x_opt = None\n", " cost = None\n", " \n", "print(\"cost: {:f} with x = {}\".format(cost, x_opt))" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cost: 3050 with x = [50, 0, 250, 150, 300, 0]\n" ] } ], "source": [ "# rounding to integer\n", "\n", "x_opt = [int(round(num)) for num in x_opt]\n", "cost = np.sum(c * x)\n", "print(\"cost: {:d} with x = {}\".format(cost, x_opt))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```(c)```: 试着把上述LP转化成Dual formulation,请写出dual form. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "// your formulation\n", "\n", "* primal problem:\n", "\n", "$$\\begin{aligned}\n", "& \\min_{\\mathbf{x}} \\mathbf{w}^{\\text{T}} \\mathbf{x} \\\\\n", "& s.t. \n", "\\begin{cases}\n", " \\mathbf{A}_{\\text{ub}} \\mathbf{x} - \\mathbf{b}_{\\text{ub}} \\leq 0 \\\\\n", " \\mathbf{A}_{\\text{eq}} \\mathbf{x} - \\mathbf{b}_{\\text{eq}} = 0 \\\\\n", " - \\mathbf{x} \\leq 0\n", "\\end{cases}\n", "\\end{aligned}$$\n", "\n", "lagrangian formula\n", "\n", "$$\\begin{aligned}\n", "\\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n", "& = \\mathbf{w}^{\\text{T}} \\mathbf{x}\n", "+ \\mathbf{\\alpha}^{\\text{T}} (\\mathbf{A}_{\\text{ub}} \\mathbf{x} - \\mathbf{b}_{\\text{ub}})\n", "+ \\mathbf{\\beta}^{\\text{T}} (\\mathbf{A}_{\\text{eq}} \\mathbf{x} - \\mathbf{b}_{\\text{eq}})\n", "- \\mathbf{\\gamma}^{\\text{T}} \\mathbf{x}\n", "\\end{aligned}$$\n", "\n", "with\n", "\n", "$$\\mathbf{\\alpha} \\geq 0, \\mathbf{\\gamma} \\geq 0$$\n", "\n", "define\n", "\n", "$$\\begin{aligned}\n", "\\mathcal{P}(\\mathbf{x})\n", "& = \\sup_{\\mathbf{\\alpha} \\geq 0, \\mathbf{\\beta}, \\mathbf{\\gamma} \\geq 0} \\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma}) \\\\\n", "& = \\begin{cases}\n", " \\mathbf{w}^{\\text{T}} \\mathbf{x}, & \\text{if } \\mathbf{x} {\\text{ is feasible}} \\\\\n", " + \\infty, & \\text{otherwise} \\\\\n", "\\end{cases}\n", "\\end{aligned}$$\n", "\n", "therefore\n", "\n", "$$\\begin{aligned}\n", "\\min_{\\mathbf{x}} \\mathbf{w}^{\\text{T}} \\mathbf{x}\n", "& \\Leftrightarrow \\inf_{\\mathbf{x}} \\mathcal{P}(\\mathbf{x})\n", "= \\inf_{\\mathbf{x}} \\sup_{\\mathbf{\\alpha} \\geq 0, \\mathbf{\\beta}, \\mathbf{\\gamma} \\geq 0} \\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n", "\\end{aligned}$$\n", "\n", "* dual problem:\n", "\n", "define\n", "\n", "$$\\begin{aligned}\n", "\\mathcal{D}(\\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n", "& = \\inf_{\\mathbf{x}} \\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma}) \\\\\n", "& = \\mathbf{w}^{\\text{T}} \\mathbf{x}\n", "+ \\mathbf{\\alpha}^{\\text{T}} (\\mathbf{A}_{\\text{ub}} \\mathbf{x} - \\mathbf{b}_{\\text{ub}})\n", "+ \\mathbf{\\beta}^{\\text{T}} (\\mathbf{A}_{\\text{eq}} \\mathbf{x} - \\mathbf{b}_{\\text{eq}})\n", "- \\mathbf{\\gamma}^{\\text{T}} \\mathbf{x} \\\\\n", "& = (\\mathbf{w}^{\\text{T}} + \\mathbf{\\alpha}^{\\text{T}} \\mathbf{A}_{\\text{ub}} + \\mathbf{\\beta}^{\\text{T}} \\mathbf{A}_{\\text{eq}} - \\mathbf{\\gamma}^{\\text{T}}) \\mathbf{x}\n", "- \\mathbf{\\alpha}^{\\text{T}} \\mathbf{b}_{\\text{ub}}\n", "- \\mathbf{\\beta}^{\\text{T}} \\mathbf{b}_{\\text{eq}} \\\\\n", "& = \\begin{cases}\n", " - \\mathbf{\\alpha}^{\\text{T}} \\mathbf{b}_{\\text{ub}}\n", " - \\mathbf{\\beta}^{\\text{T}} \\mathbf{b}_{\\text{eq}},\n", " & \\text{if } \\mathbf{w}^{\\text{T}} + \\mathbf{\\alpha}^{\\text{T}} \\mathbf{A}_{\\text{ub}} + \\mathbf{\\beta}^{\\text{T}} \\mathbf{A}_{\\text{eq}} - \\mathbf{\\gamma}^{\\text{T}} = 0 \\\\\n", " - \\infty, & \\text{otherwise} \\\\\n", "\\end{cases}\n", "\\end{aligned}$$\n", "\n", "therefore, the dual problem\n", "\n", "$$\\begin{aligned}\n", "\\sup_{\\mathbf{\\alpha} \\geq 0, \\mathbf{\\beta}, \\mathbf{\\gamma} \\geq 0} \\mathcal{D}(\\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n", "& = \\sup_{\\mathbf{\\alpha} \\geq 0, \\mathbf{\\beta}, \\mathbf{\\gamma} \\geq 0} \\inf_{\\mathbf{x}} \\mathcal{L}(\\mathbf{x}, \\mathbf{\\alpha}, \\mathbf{\\beta}, \\mathbf{\\gamma})\n", "\\end{aligned}$$\n", "\n", "can be written as\n", "\n", "$$\\begin{aligned}\n", "& \\max_{\\mathbf{\\alpha}, \\mathbf{\\beta}}(- \\mathbf{\\alpha}^{\\text{T}} \\mathbf{b}_{\\text{ub}} - \\mathbf{\\beta}^{\\text{T}} \\mathbf{b}_{\\text{eq}}) \\\\\n", "& s.t. \n", "\\begin{cases}\n", " \\mathbf{w}^{\\text{T}} + \\mathbf{\\alpha}^{\\text{T}} \\mathbf{A}_{\\text{ub}} + \\mathbf{\\beta}^{\\text{T}} \\mathbf{A}_{\\text{eq}} \\geq 0 \\\\\n", " \\mathbf{\\alpha} \\geq 0 \\\\\n", "\\end{cases}\n", "\\end{aligned}$$\n", "\n", "i.e.\n", "\n", "$$\\begin{aligned}\n", "& \\min_{\\mathbf{\\alpha}, \\mathbf{\\beta}}(\n", " \\mathbf{b}_{\\text{ub}}^{\\text{T}} \\mathbf{\\alpha}\n", " + \\mathbf{b}_{\\text{eq}}^{\\text{T}} \\mathbf{\\beta}\n", ") \\\\\n", "& s.t. \n", "\\begin{cases}\n", " \\mathbf{w}^{\\text{T}} + \\mathbf{\\alpha}^{\\text{T}} \\mathbf{A}_{\\text{ub}} + \\mathbf{\\beta}^{\\text{T}} \\mathbf{A}_{\\text{eq}} \\geq 0 \\\\\n", " \\mathbf{\\alpha} \\geq 0 \\\\\n", "\\end{cases}\n", "\\end{aligned}$$" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 2 }