
Reinforcement_learning_NLP

Implementing Reinforcement Learning to find the best dialogue strategy for a conversation agent (chatbot) by searching for the maximum reward.

To record a conversation, do:

  1. git clone https://github.com/MollyZhang/Reinforcement_learning_NLP.git
  2. cd Reinforcement_learning_NLP
  3. cd RL
  4. python run.py

After running the script, choose an option:

  • f — train and populate the reward table from the 300 recorded conversations
  • s — try a new dialogue
  • e — view the accuracy of the evaluation model
  • r — view the reward table
  • q — view the Q_table
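The menu options above can be sketched as a simple dispatch table (a hypothetical illustration; the function name and messages are assumptions, not the actual code in run.py):

```python
# Hypothetical sketch of the command menu in run.py.
def handle_command(choice):
    actions = {
        "f": "train and populate the reward table from recorded conversations",
        "s": "start a new dialogue",
        "e": "show the accuracy of the evaluation model",
        "r": "show the reward table",
        "q": "show the Q_table",
    }
    # Fall back gracefully on unrecognized input.
    return actions.get(choice, "unknown command")

print(handle_command("f"))
```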

Future improvements

  • Dealing with users saying gibberish like "dfkjlskdfj"
  • Dealing with users repeating themselves
  • Dealing with user insults
  • Having a strategy to mimic user input, e.g. when the user says "yay!!!", the bot says "wow!!!!"
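For the first item above, a minimal heuristic sketch (not part of the repo) could flag gibberish like "dfkjlskdfj" by checking the vowel ratio and the longest consonant run; the thresholds are assumptions:

```python
import re

def looks_like_gibberish(text):
    # Keep only letters for the heuristic.
    letters = re.sub(r"[^a-z]", "", text.lower())
    if not letters:
        return False
    vowel_ratio = sum(c in "aeiou" for c in letters) / len(letters)
    # Length of the longest run of consecutive consonants.
    longest_run = max((len(r) for r in re.findall(r"[^aeiou]+", letters)), default=0)
    # Thresholds chosen for illustration only.
    return vowel_ratio < 0.25 or longest_run >= 5

print(looks_like_gibberish("dfkjlskdfj"))   # True
print(looks_like_gibberish("hello there"))  # False
```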

A brief overview of the code

We have currently learned from the roughly 300 recorded conversations and populated the reward table based on the user evaluation metrics. The first block initializes the variables, the Q_table, and the R_table. We have 6 strategies and 18 states derived from 4 state metrics (whether the user utterance is a question, the length of the utterance, the sentiment of the utterance, and whether the utterance is at the beginning, i.e. the first utterance of the user); combining these metrics gives the 18 states.
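Given the sizes stated above, the two tables can be initialized as 18 x 6 arrays (a sketch; the variable names and use of NumPy are assumptions, not the notebook's exact code):

```python
import numpy as np

# 18 states (from the 4 state metrics) x 6 dialogue strategies.
N_STATES, N_STRATEGIES = 18, 6
Q_table = np.zeros((N_STATES, N_STRATEGIES))  # learned action values
R_table = np.zeros((N_STATES, N_STRATEGIES))  # rewards from user evaluations

print(Q_table.shape)  # (18, 6)
```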

The second block contains all the utility functions used and called by the later blocks, the most prominent being training(), where we train and populate the Q_table; the Q-learning logic is implemented here.
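Conceptually, the Q-learning step inside training() follows the standard tabular update rule (a minimal sketch; the learning rate and discount values here are assumptions):

```python
import numpy as np

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # Standard Q-learning: move Q[s, a] toward r + gamma * max_a' Q[s', a'].
    target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (target - Q[state, action])
    return Q

Q = np.zeros((18, 6))
q_update(Q, state=0, action=2, reward=1.0, next_state=3)
print(Q[0, 2])  # 0.1
```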

The third block populates the reward table according to whether the utterance is at the beginning of the conversation: in that case the reward is 0.8*start + 0.2*overall, while for the remaining utterances it is 0.4*engaging + 0.4*interrupt + 0.2*overall.
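The weighting above can be written as a small helper (the function name is an assumption; the weights come directly from the text):

```python
def utterance_reward(is_beginning, start, engaging, interrupt, overall):
    # First utterance: weight the "start" rating heavily.
    if is_beginning:
        return 0.8 * start + 0.2 * overall
    # Later utterances: balance engagement and interruption ratings.
    return 0.4 * engaging + 0.4 * interrupt + 0.2 * overall

print(utterance_reward(True, start=1.0, engaging=0, interrupt=0, overall=0.5))  # 0.9
```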

The fourth block is used for training where it calls the training() method.

The fifth block records new conversations, selects strategies based on the Q_table, and updates the Q_table. Work on evaluation is still in progress.
