Skip to content
This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Add OpenAI Chat Completion Agent (GPT-3.5/GPT-4) #5061

Merged
merged 3 commits into from
Sep 5, 2023

Conversation

rguan1
Copy link
Contributor

@rguan1 rguan1 commented Jun 20, 2023

Patch description
We are adding support for the latest OpenAI models like GPT-3.5 (ChatGPT) and GPT-4 by adding a wrapper for the chat completion API.

Testing steps
We've manually tested that the wrapper can handle a multi-turn conversation without error-ing which demonstrates that the wrapper is properly storing and querying the API endpoint. We tested conversation capability both with GPT-3.5 and GPT-4 models in both self-chat and interactive modes.

Self Chat

parlai self_chat -m openai_chat_completions --num-self-chats 1 --selfchat-max-turns 6 --openai-api-key <insert your api key> --max-tokens 40 --model-name gpt-3.5-turbo --init-prompt "You are chatting with a fellow Wizard of the grand frost order. You are defending the village from sword wielding golden retrievers raiding the countryside." --name wizard1 --role user --counterparty-role user --counterparty-name moose --partner-model-file zoo:blender/blender_90M/model

[context]: Hi!
[OpenaiChatCompletionsAgent_1]: Hello fellow wizard! I see you're also busy defending the village from these sword-wielding golden retrievers. These creatures are quite strong and ferocious, aren't they?
   [TransformerGenerator_2]: i ' m not a wizard , but i ' ve always wanted to be one . what do you do for a living ?
[OpenaiChatCompletionsAgent_1]: Oh, I'm not sure if I would call it a living, per se. As a wizard of the grand frost order, my duty is to protect the village and its people from all sorts of
   [TransformerGenerator_2]: that ' s very noble of you . i ' d love to be a grand wizard one day .
[OpenaiChatCompletionsAgent_1]: It takes years of study and practice to become a grand wizard, but if you have the determination and passion for it, you can certainly achieve it. As a grand wizard, you won't just be
   [TransformerGenerator_2]: do you have any other hobbies besides being a wizard ? i like to think of it as a hobby .
[OpenaiChatCompletionsAgent_1]: As a wizard, my primary focus is on studying and practicing magic to improve my abilities and protect the village. However, when I have some free time, I enjoy reading books on history, mythology,
   [TransformerGenerator_2]: what kind of books do you like to read ? i have a lot of free time as well .
[OpenaiChatCompletionsAgent_1]: I enjoy reading books on history, folklore, and mythology. I find these topics fascinating, and they often give me inspiration for spells and incantations. I also enjoy reading fictional works, such as
   [TransformerGenerator_2]: i like fantasy books too . i like the ones that focus on the real world and not just fantasy .

Interactive Mode

parlai interactive -m openai_chat_completions --openai-api-key <insert your api key> --max-tokens 40 --model-name gpt-4

Enter Your Message: Can you describe a pack of golden retriever knights roaming the countryside?
[OpenaiChatCompletionsAgent]: In the enchanting countryside, a majestic sight awaited anyone who happened to stumble upon it. A pack of Golden Retriever knights, glorious canines draped in gleaming armor, solemnly roamed
Enter Your Message: Can you write a sentence describing how they roam the land for name brand kibble?
[OpenaiChatCompletionsAgent]: the vast landscapes in pursuit of the fabled name-brand kibble, rumored to grant strength and power to those valiant enough to consume its heavenly morsels.

@rguan1
Copy link
Contributor Author

rguan1 commented Jun 20, 2023

Some questions related to this diff:

  1. If I want to make an abstract class that both the gpt-3 and chat-completions agent can inherit from, where would I locate that file? For instance, I see a number of models that inherit from TorchAgent which is located in the core folder. However, I don't think that is the appropriate place for the openai abstract agent.

The reason I want to create an abstract class is so that I can house shared logic such as shared arg parameters and additional helper functions (like token counting and prompt truncation).

  1. Should the name of this agent remain "openai-chat-completions" or should we go with a snappier but more inaccurate gpt4 name? I settled on "openai-chat-completions" to future-proof the code against future model names using the endpoint and because it is a more accurate reflection of the dual duty of querying chatgpt as well as gpt4.

  2. I have this argument counterparty-name and counterparty-role. This works fine unless you use self chat with both conversation partners being chat-completion agents. In this case, the turn history gets confused because both agent-1 and agent-2 believe themselves to be name rather than counterparty-name. In other words, both agents think they are the first speaker. Ideally, we'd want agent-2 to recognize that it is counterparty-name.

Do you have a good way to handle this? The only thing that I can think of is parsing the agent id for an int because self-chat with the same model results in an int appended to the end of the agent id like this:

[Gpt3Agent_1]: Hi there! How can I help?
   [Gpt3Agent_2]: I'd like to make a pull request

However, this solution is imperfect since I don't believe it handles n-party convos when n >= 3.

msg = {'role': self.counterparty_role, 'content': observation['text']}
if self.counterparty_name:
msg['name'] = self.counterparty_name
self.turns.append(msg)
Copy link
Contributor

@mojtaba-komeili mojtaba-komeili Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be able to have a custom version of History to handle turns.

@mojtaba-komeili
Copy link
Contributor

This looks great, thanks for your contribution. To address your questions:

  1. You are right, we better have an abstract agent that can be used for API-based models such as GPTx. I would recommend creating a new agent file (for example in parlai/core/api_agent) which will be used with any similar models that will come out in the future. You can modify FakeHistory and add it as something more compatible there as well.
  2. I don't have any issues with the name and it looks like you had thoughts about that. Let's keep it as you chose.
  3. That is a fair point. Please add that as a note to the README.

@rguan1
Copy link
Contributor Author

rguan1 commented Jul 9, 2023

@mojtaba-komeili Thanks for reviewing this PR!

Ok, I addressed the smaller nits and suggestions that you highlighted so far. The individual commits in the PR should make it clear what was addressed.

I also made a a first pass at the OpenAI base class: it only abstracts out shared CLI params. Let me know what you think so far. I can add an improved fake history if this is the right approach. Also, should we split up the base API changes into a separate PR? I'm not super well versed in working in PRs, so I am not sure at what point there is too much scope creep within an individual PR. However, I can imagine that tacking this feature onto this PR could make it harder to read and understand the git history rather than having 2 PRs. Let me know if it is ok to keep all of the code in this PR or if it should be split up.

@rguan1 rguan1 force-pushed the openai_gpt4 branch 6 times, most recently from d077808 to 1adc079 Compare July 9, 2023 22:51
@mojtaba-komeili
Copy link
Contributor

mojtaba-komeili commented Jul 10, 2023

Thanks a lot. This looks great.

  • I think adding the FakeHistory (or even better some other new history class aptly named for this agent is a good idea. I think we are ready to merge as soon as you apply this.
  • As far as the PR goes, it was better if we were separating it, but I am fine with this version too. It is a big PR but it can all fall under the same feature being added.

Thanks again for your contribution.

@rguan1
Copy link
Contributor Author

rguan1 commented Jul 21, 2023

Thanks for the suggestions. I think we can definitely get a core history class in, and this time I will give it a better name than FakeHistory haha.

So, if I were to split this PR into 2, this is my proposed split

  • First PR adds the Chat Completion Agent
  • Second PR introduces a base openai agent, a base history class, and usage of the base history class (which can handle the turns logic as proposed above).

I think if we split it that way, we can just chronologically split the later commits into a separate PR. Let me know what you think @mojtaba-komeili. Thanks again!

@mojtaba-komeili
Copy link
Contributor

Thanks for the suggestions. I think we can definitely get a core history class in, and this time I will give it a better name than FakeHistory haha.

So, if I were to split this PR into 2, this is my proposed split

  • First PR adds the Chat Completion Agent
  • Second PR introduces a base openai agent, a base history class, and usage of the base history class (which can handle the turns logic as proposed above).

I think if we split it that way, we can just chronologically split the later commits into a separate PR. Let me know what you think @mojtaba-komeili. Thanks again!

Sorry for late reply. Yeah, that sounds like a good plan to me. The first PR is making a practical change to be used immediately and the second makes the basis for future similar ones.
Looking forward to seeing your submission.

@rguan1
Copy link
Contributor Author

rguan1 commented Aug 27, 2023

The latest change represents the "First PR adds the Chat Completion Agent" as discussed above. In other words, we removed the the 2 latest commits which do the following "base openai agent, a base history class, and usage of the base history class". We will add those in a follow-up PR to keep the commit history cleaner and more readable.

Sorry for the late response, @mojtaba-komeili! Please let me know if anything else needs to be changed before it is good to go.

```

## Limitations
This API wrapper has three major limitations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You said three, but four is listed here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can fix this in a follow up PR.

Copy link
Contributor

@mojtaba-komeili mojtaba-komeili left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Looking forward to your follow up PR.

@mojtaba-komeili mojtaba-komeili merged commit 24daddb into facebookresearch:main Sep 5, 2023
4 of 5 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants