20200203048 / project1
Commit 83c2df61, authored 4 years ago by Yuan
parent 7a4b913d
Showing 1 changed file with 23 additions and 0 deletions
review (0 → 100644)
Summary:
Please use the suggestions below to improve the submission.
Detailed Comments:
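1. In the read_corpus step, invalid question-answer pairs need to be filtered out; after filtering, only a little over 80,000 records should remain.
2. The data preprocessing step still needs to be improved.
3. Using X_tfidf = tfidf.toarray() (which stores the result in the matrix X) is not recommended. When the dataset is large, .toarray() can easily exhaust memory.
4. For q_emb = getTFIDF(query,tfidf_vocab,idf), simply call transform on the model trained above.
5. The input query also needs to go through data preprocessing; only then is it handled consistently with the corpus.
6. Please submit assignments on time.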
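Comments 3 to 5 can be illustrated with a minimal sketch. It assumes the TF-IDF model is scikit-learn's TfidfVectorizer, which the submission does not confirm; the preprocess helper, the most_similar function, and the toy questions are hypothetical stand-ins rather than the assignment's actual code or data.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel


def preprocess(text):
    """Hypothetical cleaning step: lowercase, drop punctuation, collapse spaces."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


# Toy stand-in for the questions returned by read_corpus (assumed data).
questions = [
    "How do I reset my password?",
    "Where can I download the invoice?",
    "What is the refund policy?",
]

# Fit once on the preprocessed corpus. fit_transform returns a SciPy sparse
# matrix, which is kept sparse here instead of calling .toarray().
vectorizer = TfidfVectorizer()
X_tfidf = vectorizer.fit_transform([preprocess(q) for q in questions])


def most_similar(query):
    """Return the index of the corpus question closest to the query."""
    # Apply the same preprocessing as the corpus, then reuse the fitted model.
    q_emb = vectorizer.transform([preprocess(query)])
    # Rows are L2-normalised by default, so the dot product is cosine similarity.
    scores = linear_kernel(q_emb, X_tfidf).ravel()
    return int(scores.argmax())


best = most_similar("RESET my password!!!")
print(questions[best])
```

Because the query passes through the same preprocess function and the same fitted vectorizer as the corpus, its vector shares the corpus vocabulary and IDF weights, and no dense copy of the corpus matrix is ever created.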
Overall Score: 82
--------------------------------------------------------------------------------------------------------------------------
Thanks for your efforts.
-Your instructor