Add new file

Commit 83c2df61 authored Aug 11, 2020 by Yuan

Showing with 23 additions and 0 deletions

review
+23 -0

--- a/review
+++ b/review
+Summary:
+Please refine the submission using the suggestions below.
+Detailed Comments:
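+1. In the read_corpus step, invalid question-answer pairs need to be filtered out; after filtering, only a little over 80,000 records should remain.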
+
+2. The data preprocessing step still needs more work.
+
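+3. Avoid this line: X_tfidf = tfidf.toarray()  # store the result in matrix X
+On a large dataset, .toarray() builds a dense matrix and can easily exhaust memory; keep the sparse matrix instead.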
+
+4. For q_emb = getTFIDF(query,tfidf_vocab,idf), simply call transform on the model already fitted above.
+
+5. The input query also needs the same preprocessing as the corpus; otherwise the two are not processed consistently.
+
+6. Please submit assignments on time.
+
+Overall Score: 82
+--------------------------------------------------------------------------------------------------------------------------
+
+Thanks for your efforts.
+
+-Your instructor
\ No newline at end of file
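
Comments 3 to 5 of the review can be illustrated with a minimal sketch. It assumes the tfidf object in the submission came from scikit-learn's TfidfVectorizer, which the review does not state. The preprocess function, the toy questions list, and the helpers embed_query and most_similar are hypothetical names introduced only for this illustration; they are not the student's code.

```python
import re

from scipy.sparse import issparse
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def preprocess(text):
    """Hypothetical cleaning step: lowercase and keep alphanumeric tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


# Toy stand-in for the questions that read_corpus would return.
questions = [
    "How do I reset my password?",
    "What is the refund policy?",
    "How can I change my email address?",
]

# Fit once on the preprocessed corpus. fit_transform returns a SciPy sparse
# matrix, so there is no need to call .toarray() (comment 3).
vectorizer = TfidfVectorizer()
X_tfidf = vectorizer.fit_transform([preprocess(q) for q in questions])


def embed_query(query):
    """Apply the same preprocessing (comment 5), then reuse the fitted
    vectorizer through transform instead of recomputing TF-IDF (comment 4)."""
    return vectorizer.transform([preprocess(query)])


def most_similar(query):
    """Return the index of the corpus question closest to the query."""
    # cosine_similarity accepts sparse inputs directly.
    scores = cosine_similarity(embed_query(query), X_tfidf)
    return int(scores.argmax())


if __name__ == "__main__":
    print(issparse(X_tfidf), most_similar("RESET password!"))
```

Because transform reuses the vocabulary and IDF weights learned during fitting, the query vector has the same number of columns as X_tfidf, and both stay sparse through the similarity computation.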