0620

Commit 3c46eefb authored Jun 20, 2020 by 20200519029
27 changed files
--- a/Mini-homework.ipynb
+++ b/Mini-homework.ipynb
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Problem 1. Fibonacci Sequence\n",
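-    "In the course we discussed how to find the N-th Fibonacci number. Here we derive its closed-form solution.\n",
-    "\n",
-    "The Fibonacci sequence is 1, 1, 2, 3, 5, 8, 13, 21, ...: the first number is 1, the second number is 1, and each later number is the sum of the two before it.\n",
-    "We write f(n) for the n-th number in the sequence, e.g. f(3) = 2 when n = 3.\n",
-    "\n",
-    "Prove the closed form of the Fibonacci sequence, given below:\n",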
-    "\n",
-    "$f(n)=\\frac{1}{\\sqrt{5}}(\\frac{1+\\sqrt{5}}{2})^n-\\frac{1}{\\sqrt{5}}(\\frac{1-\\sqrt{5}}{2})^n$\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "// your proof is here ....\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Let $F_{n}$ denote the $n$-th Fibonacci number, and set $F_{0}=0$ (so $F_{1}=1$).\n",
-    "\n",
-    "Suppose there is a matrix $M=\\left(\\begin{array}{ll}A & C \\\\ B & D\\end{array}\\right)$ such that $\\left(\\begin{array}{c}F_{n} \\\\ F_{n+1}\\end{array}\\right)=M\\left(\\begin{array}{c}F_{n-1} \\\\ F_{n}\\end{array}\\right)$, that is, $\\left(\\begin{array}{c}F_{n} \\\\ F_{n+1}\\end{array}\\right)=\\left(\\begin{array}{c}A F_{n-1}+C F_{n} \\\\ B F_{n-1}+D F_{n}\\end{array}\\right)$. Matching this against $F_{n+1}=F_{n-1}+F_{n}$ gives $M=\\left(\\begin{array}{ll}0 & 1 \\\\ 1 & 1\\end{array}\\right)$.\n",
-    "\n",
-    "Iterating the recurrence gives the general term: $\\left(\\begin{array}{c}F_{n} \\\\ F_{n+1}\\end{array}\\right)=M^{n}\\left(\\begin{array}{c}F_{0} \\\\ F_{1}\\end{array}\\right)=P \\Lambda^{n} P^{-1}\\left(\\begin{array}{c}0 \\\\ 1\\end{array}\\right)$, where $M=P \\Lambda P^{-1}$ is the eigendecomposition of $M$.\n",
-    "\n",
-    "Solving the characteristic equation $\\operatorname{det}(M-\\lambda I)=\\lambda^{2}-\\lambda-1=0$ gives $\\lambda=\\frac{1 \\pm \\sqrt{5}}{2}$, with eigenvectors $\\left(\\begin{array}{c}1 \\\\ \\frac{1 \\pm \\sqrt{5}}{2}\\end{array}\\right)$. Taking the eigenvectors as the columns of $P$ and the eigenvalues as the diagonal of $\\Lambda$, we get\n",
-    "\n",
-    "$P^{-1}=\\left(\\begin{array}{cc}\\frac{\\sqrt{5}-1}{2 \\sqrt{5}} & \\frac{1}{\\sqrt{5}} \\\\ \\frac{\\sqrt{5}+1}{2 \\sqrt{5}} & -\\frac{1}{\\sqrt{5}}\\end{array}\\right)$\n",
-    "\n",
-    "$F_{n}=(1 \\quad 0)\\left(\\begin{array}{cc}1 & 1 \\\\ \\frac{1+\\sqrt{5}}{2} & \\frac{1-\\sqrt{5}}{2}\\end{array}\\right)\\left(\\begin{array}{cc}\\left(\\frac{1+\\sqrt{5}}{2}\\right)^{n} & 0 \\\\ 0 & \\left(\\frac{1-\\sqrt{5}}{2}\\right)^{n}\\end{array}\\right)\\left(\\begin{array}{cc}\\frac{\\sqrt{5}-1}{2 \\sqrt{5}} & \\frac{1}{\\sqrt{5}} \\\\ \\frac{\\sqrt{5}+1}{2 \\sqrt{5}} & -\\frac{1}{\\sqrt{5}}\\end{array}\\right)\\left(\\begin{array}{c}0 \\\\ 1\\end{array}\\right)$\n",
-    "\n",
-    "$=(1 \\quad 0)\\left(\\begin{array}{cc}1 & 1 \\\\ \\frac{1+\\sqrt{5}}{2} & \\frac{1-\\sqrt{5}}{2}\\end{array}\\right)\\left(\\begin{array}{cc}\\left(\\frac{1+\\sqrt{5}}{2}\\right)^{n} & 0 \\\\ 0 & \\left(\\frac{1-\\sqrt{5}}{2}\\right)^{n}\\end{array}\\right)\\left(\\begin{array}{c}\\frac{1}{\\sqrt{5}} \\\\ -\\frac{1}{\\sqrt{5}}\\end{array}\\right)$\n",
-    "\n",
-    "$=\\frac{2^{-n}}{\\sqrt{5}}(1 \\quad 0)\\left(\\begin{array}{cc}1 & 1 \\\\ \\frac{1+\\sqrt{5}}{2} & \\frac{1-\\sqrt{5}}{2}\\end{array}\\right)\\left(\\begin{array}{c}(1+\\sqrt{5})^{n} \\\\ -(1-\\sqrt{5})^{n}\\end{array}\\right)=\\frac{\\left(\\frac{1+\\sqrt{5}}{2}\\right)^{n}-\\left(\\frac{1-\\sqrt{5}}{2}\\right)^{n}}{\\sqrt{5}}$\n",
-    "\n",
-    "Therefore\n",
-    "$F_{n}=\\frac{\\left(\\frac{1+\\sqrt{5}}{2}\\right)^{n}-\\left(\\frac{1-\\sqrt{5}}{2}\\right)^{n}}{\\sqrt{5}}$"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Problem2. Algorithmic Complexity\n",
-    "Sort the following complexities from smallest to largest:\n",
-    "\n",
-    "$O(N),  O(N^2),  O(2^N),  O(N\\log N),  O(N!),  O(1),  O(\\log N),  O(3^N),  O(N^2\\log N), O(N^{2.1})$\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "// your answer....\n",
-    "\n",
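-    "$O(1), O(\\log N), O(N), O(N\\log N), O(N^2), O(N^2\\log N), O(N^{2.1}), O(2^N), O(3^N), O(N!)$\n",
-    "\n",
-    "($O(N^2\\log N)$ comes before $O(N^{2.1})$ because $\\log N$ grows more slowly than $N^{0.1}$.)\n",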
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Problem 3  Dynamic Programming Problem"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
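-    "#### Edit Distance\n",
-    "Edit distance measures the minimum number of edits needed to turn one string into another. Three different operations are allowed: add, delete and replace. We assume each operation costs 1 unit.\n",
-    "\n",
-    "Examples: the edit distance between \"apple\" and \"appl\" is 1 (one delete operation is needed)\n",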
-    "\"machine\", \"macaide\" dist = 2\n",
-    "\"mach\", \"aaach\"  dist=2"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 121,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "The Edit Distance of 'sunday' and 'saturday' is 3.\n"
-     ]
-    }
-   ],
-   "source": [
-    "# s1, s2 are two strings\n",
-    "def editDistDP(s1, s2):\n",
-    "\n",
-    "    m, n = len(s1), len(s2)\n",
-    "    dp = [[0 for _ in range(n+1)] for _ in range(m+1)]\n",
-    "\n",
-    "    for i in range(m + 1):\n",
-    "        for j in range(n + 1):\n",
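-    "            # i == 0: the s1 prefix is empty, so the cheapest edit is to insert all j characters\n",
-    "            if i == 0:\n",
-    "                dp[i][j] = j\n",
-    "            # j == 0: the s2 prefix is empty, so the cheapest edit is to remove all i characters\n",
-    "            elif j == 0:\n",
-    "                dp[i][j] = i\n",
-    "            # the current characters match, so no edit is needed\n",
-    "            elif s1[i-1] == s2[j-1]:\n",
-    "                dp[i][j] = dp[i-1][j-1]\n",
-    "            # the characters differ: take the cheapest of the three operations and add 1\n",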
-    "            else:\n",
-    "                dp[i][j] = 1 + min(dp[i][j-1],  # Insert\n",
-    "                                  dp[i-1][j],  # Remove\n",
-    "                                  dp[i-1][j-1])  # Replace  \n",
-    "    return dp[m][n]\n",
-    "\n",
-    "s1 = \"sunday\"\n",
-    "s2 = \"saturday\"\n",
-    "edit_distance = editDistDP(s1, s2)\n",
-    "print(\"The Edit Distance of '%s' and '%s' is %d.\"%(s1, s2, edit_distance))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
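-    "### Problem 4 Non-technical Questions\n",
-    "The purpose of this problem is to learn more about your background; later course content will be adjusted according to the topics you are interested in.\n",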
-    "\n",
-    "\n",
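-    "Q1: Which AI/NLP projects have you worked on, previously or currently? Briefly describe the solution you used; if you have not yet found a suitable solution, describe your rough idea. Please list a few points.\n",
-    "Earlier, following this training camp, I worked on projects such as ad click-through-rate prediction and a chatbot; I am currently trying LSTM+CRF for part-of-speech tagging.\n",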
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
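-    "Q2: Which industry do you want to move into, or what kind of projects do you want to work on? Please list a few points.\n",
-    "Healthcare, finance, recommender systems and similar areas; I would like to work on information retrieval, information extraction, text generation, machine translation and question answering projects.\n",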
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
-    "\n",
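-    "Q3: What do you most want to gain from the training camp? You may list a few points.\n",
-    "Mainly I want to change careers: to gain an in-depth understanding of AI/NLP, be able to actually solve some NLP tasks, develop my own thinking about and understanding of the algorithms involved, and reach the level of an entry-level NLP engineer.\n",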
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.7.4"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
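The closed form proved in Problem 1 of the notebook above can be sanity-checked numerically. The sketch below is not part of the commit; the function names `fib_iter`, `fib_matrix` and `fib_closed` are made up for illustration. It compares the recurrence, the matrix power $M^n$ with $M = [[0, 1], [1, 1]]$, and the closed form, assuming $F_0 = 0$ and $F_1 = 1$ as in the proof.

```python
from math import sqrt


def fib_iter(n):
    """F_n from the recurrence, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def mat_mul(x, y):
    """Multiply two 2x2 integer matrices given as nested tuples."""
    return (
        (x[0][0] * y[0][0] + x[0][1] * y[1][0], x[0][0] * y[0][1] + x[0][1] * y[1][1]),
        (x[1][0] * y[0][0] + x[1][1] * y[1][0], x[1][0] * y[0][1] + x[1][1] * y[1][1]),
    )


def fib_matrix(n):
    """F_n read off M^n (F_0, F_1)^T with M = [[0, 1], [1, 1]]."""
    power = ((1, 0), (0, 1))  # identity matrix, i.e. M^0
    m = ((0, 1), (1, 1))
    for _ in range(n):
        power = mat_mul(power, m)
    # First row of M^n times (0, 1)^T is the top-right entry.
    return power[0][1]


def fib_closed(n):
    """F_n from the closed form; floating point, so only reliable for moderate n."""
    phi = (1 + sqrt(5)) / 2
    psi = (1 - sqrt(5)) / 2
    return round((phi ** n - psi ** n) / sqrt(5))


# All three definitions should agree on small n.
for n in range(40):
    assert fib_iter(n) == fib_matrix(n) == fib_closed(n)

print([fib_closed(n) for n in range(1, 9)])
```

Because `fib_closed` uses double-precision floats, the rounding step is expected to break down for large `n` (somewhere around `n = 70`), so it is a check of the formula rather than a replacement for the integer recurrence.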
--- a/README.md
+++ b/README.md
--- a/课件/0530第一篇论文xgboost.pptx
+++ b/课件/0530第一篇论文xgboost.pptx
--- a/课件/0531Decision Tree，random forest，xgboost.pptx
+++ b/课件/0531Decision Tree，random forest，xgboost.pptx
--- a/课件/0531TBD动态规划.pptx
+++ b/课件/0531TBD动态规划.pptx
--- a/课件/0531哈希表，搜索树，堆（优先堆）.pptx
+++ b/课件/0531哈希表，搜索树，堆（优先堆）.pptx
--- a/课件/0606 Ensemble 模型实战--资料/代码/.gitkeep
+++ b/课件/0606 Ensemble 模型实战--资料/代码/.gitkeep
--- a/课件/0606 Ensemble 模型实战--资料/代码/Chap08集成学习-周志华.pdf
+++ b/课件/0606 Ensemble 模型实战--资料/代码/Chap08集成学习-周志华.pdf
--- a/课件/0606 Ensemble 模型实战--资料/代码/code.zip
+++ b/课件/0606 Ensemble 模型实战--资料/代码/code.zip
--- a/课件/0606 Ensemble 模型实战 [阿勇].pptx
+++ b/课件/0606 Ensemble 模型实战 [阿勇].pptx
--- a/课件/0606第二篇论文From Word Embeddings To Document Distances.pptx
+++ b/课件/0606第二篇论文From Word Embeddings To Document Distances.pptx
--- a/课件/0607生活中的优化问题/.gitkeep
+++ b/课件/0607生活中的优化问题/.gitkeep
--- a/课件/0607生活中的优化问题/0607 优化问题.pptx
+++ b/课件/0607生活中的优化问题/0607 优化问题.pptx
--- a/课件/0607生活中的优化问题/optimization-checkpoint.ipynb
+++ b/课件/0607生活中的优化问题/optimization-checkpoint.ipynb
--- a/课件/0613 LP QP以及它们的Dual [阿勇].pptx
+++ b/课件/0613 LP QP以及它们的Dual [阿勇].pptx
--- a/课件/0613Simplex Method与LP实战/.gitkeep
+++ b/课件/0613Simplex Method与LP实战/.gitkeep
--- a/课件/0613Simplex Method与LP实战/0613Simplex Method与LP实战.pptx
+++ b/课件/0613Simplex Method与LP实战/0613Simplex Method与LP实战.pptx
--- a/课件/0613Simplex Method与LP实战/lpsolve.py
+++ b/课件/0613Simplex Method与LP实战/lpsolve.py
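+'''
+Original problem:
+With a budget of 2000 yuan, buy tables at 50 yuan each and chairs at 20 yuan each. The total number of tables and chairs should be as large as possible, but the number of chairs must be at least the number of tables and at most 1.5 times the number of tables. What purchase plan should you choose?
+Solution: buy x1 tables and x2 chairs
+
+max z= x1 + x2
+s.t. x1 - x2 <= 0
+1.5x1 >= x2
+50x1 + 20x2 <= 2000
+x1, x2 >=0
+
+In Python, linear programs of this kind can be solved with an LP solver: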
+scipy.optimize._linprog def linprog(c: int,
+            A_ub: Optional[int] = None,
+            b_ub: Optional[int] = None,
+            A_eq: Optional[int] = None,
+            b_eq: Optional[int] = None,
+            bounds: Optional[Iterable] = None,
+            method: Optional[str] = 'simplex',
+            callback: Optional[Callable] = None,
+            options: Optional[dict] = None) -> OptimizeResult
+
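+A (A_ub, A_eq): coefficient matrix of the constraints (the left-hand side)
+b (b_ub, b_eq): vector of constraint values (the right-hand side)
+c: coefficient vector of the objective function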
+'''
+
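+from scipy import optimize as opt
+import numpy as np
+# Parameters
+# c holds the coefficients of the variables in the objective function
+c=np.array([1,1])
+# a holds the variable coefficients of the inequality constraints
+a=np.array([[1,-1],[-1.5,1],[50,20]])
+# b holds the constant terms of the inequality constraints
+b=np.array([0,0,2000])
+# a1, b1 would be the coefficients and constants of equality constraints; this example has none, so they are not used
+#a1=np.array([[1,1,1]])
+#b1=np.array([7])
+# Bounds
+lim1=(0,None) # (0, None) -> [0, +infinity)
+lim2=(0,None)
+# Call the solver; linprog minimizes, so pass -c to maximize x1 + x2
+ans=opt.linprog(-c,a,b,bounds=(lim1,lim2))
+# Print the result
+print(ans)
+
+# Note: the LP optimum is x1 = 25, x2 = 37.5, but in this application we cannot buy 0.5 of a chair, so the final plan is 25 tables and 37 chairs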
+
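Rounding an LP solution down is not guaranteed to give the best integer plan in general, so the final note in `lpsolve.py` is worth checking directly. The sketch below is an illustration that is not part of the commit; the function name `best_integer_plan` and its parameters are hypothetical. It enumerates every integer purchase plan that satisfies the constraints from the docstring and keeps the one with the most items.

```python
def best_integer_plan(budget=2000, table_price=50, chair_price=20):
    """Brute-force search over integer (tables, chairs) plans.

    Constraints, as in the docstring of lpsolve.py:
      chairs >= tables, chairs <= 1.5 * tables, total cost <= budget.
    Returns (total_items, tables, chairs) for the best plan found.
    """
    best = (0, 0, 0)
    for tables in range(budget // table_price + 1):
        for chairs in range(budget // chair_price + 1):
            within_budget = table_price * tables + chair_price * chairs <= budget
            # 2 * chairs <= 3 * tables is chairs <= 1.5 * tables without floats.
            ratio_ok = tables <= chairs and 2 * chairs <= 3 * tables
            if within_budget and ratio_ok and tables + chairs > best[0]:
                best = (tables + chairs, tables, chairs)
    return best


print(best_integer_plan())
```

For this instance the search agrees with the rounded LP answer: 25 tables and 37 chairs, 62 items in total, at a cost of 1990 yuan.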
--- a/课件/0613Simplex Method与LP实战/simplex.py
+++ b/课件/0613Simplex Method与LP实战/simplex.py
+
+
+import numpy as np
+
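+class Simplex(object):
+    def __init__(self, obj, max_mode=False):
+        self.max_mode = max_mode  # minimize by default; for a max problem the objective is multiplied by -1
+        self.mat = np.array([[0] + obj]) * (-1 if max_mode else 1)      # the tableau starts with the objective row
+
+    def add_constraint(self, a, b):
+        self.mat = np.vstack([self.mat, [b] + a])      # append the constraint a . x <= b as a new tableau row
+
+    def solve(self):
+        m, n = self.mat.shape  # the tableau has 1 objective row and m - 1 constraint rows, so m - 1 slack variables are added
+        temp, B = np.vstack([np.zeros((1, m - 1)), np.eye(m - 1)]), list(range(n - 1, n + m - 1))  # temp is a zero row stacked on an identity matrix (the slack columns); B is an increasing list of basic-variable column indices
+        mat = self.mat = np.hstack([self.mat, temp])  # horizontal concatenation
+        while mat[0, 1:].min() < 0:   # loop while the objective row still has a negative coefficient
+            col = np.where(mat[0, 1:] < 0)[0][0] + 1  # first variable with a negative objective coefficient: the entering variable
+            row = np.array([mat[i][0] / mat[i][col] if mat[i][col] > 0 else 0x7fffffff for i in
+                            range(1, mat.shape[0])]).argmin() + 1  # row with the tightest constraint (minimum ratio): the leaving variable
+            if mat[row][col] <= 0: return None  # no leaving variable exists, so the problem is unbounded
+            mat[row] /= mat[row][col]    # normalize the pivot row so the entering variable replaces the leaving one
+            ids = np.arange(mat.shape[0]) != row
+            mat[ids] -= mat[row] * mat[ids, col:col + 1]  # eliminate the entering variable from every row i != row
+            B[row] = col  # record the entering variable in B
+        return mat[0][0] * (1 if self.max_mode else -1), {B[i]: mat[i, 0] for i in range(1, m) if B[i] < n} # return the objective value and the solution: each basic variable takes the value b_i of its row, non-basic variables are 0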
+
+
+"""
+       minimize -x1 - 14x2 - 6x3
+       st
+        x1 + x2 + x3 <=4
+        x1 <= 2
+        x3 <= 3
+        3x2 + x3 <= 6
+        x1 ,x2 ,x3 >= 0
+       answer :-32
+    """
+t = Simplex([-1, -14, -6])
+t.add_constraint([1, 1, 1], 4)
+t.add_constraint([1, 0, 0], 2)
+t.add_constraint([0, 0, 1], 3)
+t.add_constraint([0, 3, 1], 6)
+print(t.solve())
+print(t.mat)
\ No newline at end of file
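The expected answer of -32 for the test problem in `simplex.py` can be cross-checked without the `Simplex` class. The sketch below is an illustration that is not part of the commit, and it is not the simplex method: it uses brute-force vertex enumeration, relying on the fact that a bounded feasible LP attains its optimum at a vertex. The helper names `solve3` and `best_vertex` are hypothetical, and the code assumes exactly three variables, as in the test problem.

```python
from fractions import Fraction
from itertools import combinations


def solve3(rows, rhs):
    """Solve a 3x3 linear system exactly by Gauss-Jordan elimination.

    Returns the solution as a list of Fractions, or None if the system is singular.
    """
    n = 3
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def best_vertex(c, A, b):
    """Minimize c . x subject to A x <= b by checking every candidate vertex."""
    best = None
    for idx in combinations(range(len(A)), 3):
        # A vertex makes three constraints tight at once.
        x = solve3([A[i] for i in idx], [b[i] for i in idx])
        if x is None:
            continue
        feasible = all(
            sum(a * v for a, v in zip(row, x)) <= bound for row, bound in zip(A, b)
        )
        if feasible:
            value = sum(ci * xi for ci, xi in zip(c, x))
            if best is None or value < best[0]:
                best = (value, x)
    return best


# Test problem from simplex.py; the last three rows encode x1, x2, x3 >= 0.
c = [-1, -14, -6]
A = [[1, 1, 1], [1, 0, 0], [0, 0, 1], [0, 3, 1], [-1, 0, 0], [0, -1, 0], [0, 0, -1]]
b = [4, 2, 3, 6, 0, 0, 0]

value, x = best_vertex(c, A, b)
print(value, [str(v) for v in x])
```

The enumeration reports an objective of -32 at x = (0, 1, 3), matching the answer stated in the test docstring. Enumeration grows combinatorially with the number of constraints, so it is only useful as a check on small problems.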
--- a/课件/0614 Inventory Optimization with Stochastic Programming[阿勇].pptx
+++ b/课件/0614 Inventory Optimization with Stochastic Programming[阿勇].pptx
--- a/课件/0614凸优化（2）.pptx
+++ b/课件/0614凸优化（2）.pptx
--- a/课件/2.From Word Embeddings To Document Distances.pdf
+++ b/课件/2.From Word Embeddings To Document Distances.pdf
--- a/课件/XGBoost- A Scalable Tree Boosting System.pdf
+++ b/课件/XGBoost- A Scalable Tree Boosting System.pdf
--- a/课件/project1/.gitkeep
+++ b/课件/project1/.gitkeep
--- a/课件/project1/Project1项目.zip
+++ b/课件/project1/Project1项目.zip
--- a/课件/starter_code.ipynb
+++ b/课件/starter_code.ipynb
--- a/课件/凸优化（1）.pptx
+++ b/课件/凸优化（1）.pptx