Purpose of doc_dict.txt and sum_dict.txt #30

Open
hannesb0 opened this issue Jun 21, 2018 · 2 comments

@hannesb0

Hey,

I am quite new to machine learning and I would like to use your code to train my own text summarization model. What I am currently wondering is: what exactly do you need the doc_dict.txt and sum_dict.txt files for?

How would I have to adapt them if I wanted to generate my own training data?

Thanks for any help!

@Pamulapati13

They contain all the unique words in the documents (doc_dict.txt) and in the summaries (sum_dict.txt).
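As a rough illustration of what "all the unique words" could mean for your own data, here is a minimal sketch that builds one vocabulary from the documents and another from the summaries. The helper names `build_vocab` and `write_vocab`, the whitespace tokenization, the frequency ordering, and the one-word-per-line output are all assumptions made for this example; check the format of the existing doc_dict.txt and sum_dict.txt and the repository's own loading code before relying on it.

```python
from collections import Counter


def build_vocab(lines, min_count=1):
    """Return the unique whitespace-separated tokens in `lines`.

    Tokens are ordered by descending frequency, then alphabetically,
    so the result is deterministic. (Hypothetical helper, not part of
    the repository.)
    """
    counts = Counter(token for line in lines for token in line.split())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [token for token, count in ordered if count >= min_count]


def write_vocab(vocab, path):
    """Write one token per line (assumed format, verify against the repo)."""
    with open(path, "w", encoding="utf-8") as f:
        for token in vocab:
            f.write(token + "\n")


# Toy data standing in for your own training pairs.
docs = ["the prime minister met the press", "the press asked questions"]
sums = ["pm meets press", "press asks questions"]

doc_vocab = build_vocab(docs)
sum_vocab = build_vocab(sums)

# To produce the files for your own corpus you could then call, e.g.:
# write_vocab(doc_vocab, "doc_dict.txt")
# write_vocab(sum_vocab, "sum_dict.txt")
```

In this toy data the summary vocabulary contains "pm" while the document vocabulary does not, which shows how the two word lists can differ when they are built from separate texts.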

@hannesb0
Author

Thanks for your answer!

Am I right in assuming that the sum_dict.txt file also contains some abbreviations, e.g. pm for prime minister?

If not, why would I need two separate dictionaries instead of using one and the same for both the documents and the summaries?
