20200203048 / project1
Commit 83c2df61, authored Aug 11, 2020 by Yuan
parent 7a4b913d
Showing 1 changed file with 23 additions and 0 deletions
review (new file, 0 → 100644)
Summary:
Please refer to the following suggestions to improve your work.
Detailed Comments:
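1. In the read_corpus step, invalid question-answer pairs need to be filtered out; after filtering, only a little over 80,000 records should remain.
2. The data preprocessing step still needs further work.
3. Using X_tfidf = tfidf.toarray()  # store the result in matrix X is not recommended.
With a large dataset, .toarray() can easily exhaust memory.
4. For q_emb = getTFIDF(query, tfidf_vocab, idf), simply call transform on the model already fitted above.
5. The input query must also go through the same data preprocessing; this is the only way to keep it consistent with the corpus.
6. Please submit your assignments on time.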
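
Points 3 to 5 could be sketched roughly as follows. This is a minimal illustration, assuming the submission uses scikit-learn's TfidfVectorizer; the `preprocess` function, the toy `questions` list, and the `top_match` helper are hypothetical placeholders and not part of the submitted code. The real preprocessing should be whatever the assignment requires.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer


def preprocess(text):
    """Hypothetical stand-in for the assignment's preprocessing step."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace punctuation with spaces
    return " ".join(text.split())              # collapse repeated whitespace


# Toy question list standing in for the filtered corpus (assumption).
questions = [
    "What is TF-IDF?",
    "How do I read a JSON file in Python?",
    "Why does toarray() run out of memory?",
]

# Passing preprocess to the vectorizer means the corpus and every later
# query go through exactly the same cleaning (point 5).
vectorizer = TfidfVectorizer(preprocessor=preprocess)

# fit_transform returns a SciPy sparse matrix; keep it sparse instead of
# calling .toarray() on the whole corpus (point 3).
X_tfidf = vectorizer.fit_transform(questions)


def top_match(query):
    """Return the index of the corpus question most similar to the query."""
    # transform() reuses the fitted vocabulary and IDF weights (point 4).
    q_emb = vectorizer.transform([query])
    # Rows are L2-normalised by default, so the dot product is the cosine
    # similarity. Only the small n x 1 score column is made dense.
    scores = (X_tfidf @ q_emb.T).toarray().ravel()
    return int(scores.argmax())
```

Calling `top_match("toarray() MEMORY!")` on this toy data selects the third question, because the query is lowercased and stripped of punctuation in the same way as the corpus before it is vectorized.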
Overall Score: 82
--------------------------------------------------------------------------------------------------------------------------
Thanks for your efforts.
-Your instructor