Merge pull request #1839 from Agenta-AI/postgres
Postgres
aakrem committed Jul 3, 2024
2 parents 2e3ee66 + 1688567 commit f518b23
Showing 20 changed files with 790 additions and 245 deletions.
58 changes: 31 additions & 27 deletions README.md
@@ -42,7 +42,6 @@
</br>
</p>


<p align="center">
<a href="https://join.slack.com/t/agenta-hq/shared_invite/zt-1zsafop5i-Y7~ZySbhRZvKVPV5DO_7IA">
<img src="https://img.shields.io/badge/JOIN US ON SLACK-4A154B?style=for-the-badge&logo=slack&logoColor=white" />
@@ -55,7 +54,6 @@
</a>
</p>


</br>

<a href="https://cloud.agenta.ai">
@@ -64,7 +62,6 @@
</picture>
</a>


<br>
<br />
<br />
@@ -96,54 +93,58 @@

# ⭐️ Why Agenta?

Agenta is an end-to-end LLM developer platform. It provides the tools for **prompt engineering and management**, ⚖️ **evaluation**, **human annotation**, and :rocket: **deployment**, all without imposing any restrictions on your choice of framework, library, or model.

Agenta allows developers and product teams to collaborate in building production-grade LLM-powered applications in less time.

### With Agenta, you can:

- [🧪 **Experiment** and **compare** prompts](https://docs.agenta.ai/basic_guides/prompt_engineering) on [any LLM workflow](https://docs.agenta.ai/advanced_guides/custom_applications) (chain-of-prompts, Retrieval Augmented Generation (RAG), LLM agents...)
- ✍️ Collect and [**annotate golden test sets**](https://docs.agenta.ai/basic_guides/test_sets) for evaluation
- 📈 [**Evaluate** your application](https://docs.agenta.ai/basic_guides/automatic_evaluation) with pre-existing or [**custom evaluators**](https://docs.agenta.ai/advanced_guides/using_custom_evaluators)
- [🔍 **Annotate** and **A/B test**](https://docs.agenta.ai/basic_guides/human_evaluation) your applications with **human feedback**
- [🤝 **Collaborate with product teams**](https://docs.agenta.ai/basic_guides/team_management) for prompt engineering and evaluation
- [🚀 **Deploy your application**](https://docs.agenta.ai/basic_guides/deployment) in one click in the UI, through the CLI, or through GitHub workflows.
- [🧪 **Experiment** and **compare** prompts](https://docs.agenta.ai/prompt_management/prompt_engineering) on [any LLM workflow](https://docs.agenta.ai/prompt_management/custom_applications) (chain-of-prompts, Retrieval Augmented Generation (RAG), LLM agents...)
- ✍️ Collect and [**annotate golden test sets**](https://docs.agenta.ai/evaluation/test_sets) for evaluation
- 📈 [**Evaluate** your application](https://docs.agenta.ai/evaluation/automatic_evaluation) with pre-existing or [**custom evaluators**](https://docs.agenta.ai/evaluation/custom_evaluator)
- [🔍 **Annotate** and **A/B test**](https://docs.agenta.ai/evaluation/human_evaluation) your applications with **human feedback**
- [🤝 **Collaborate with product teams**](https://docs.agenta.ai/misc/team_management) for prompt engineering and evaluation
- [🚀 **Deploy your application**](https://docs.agenta.ai/prompt_management/deployment) in one click in the UI, through the CLI, or through GitHub workflows.

### Works with any LLM app workflow

Agenta enables prompt engineering and evaluation on any LLM app architecture:

- Chain of prompts
- RAG
- Agents
- ...

It works with any framework such as [Langchain](https://langchain.com), [LlamaIndex](https://www.llamaindex.ai/) and any LLM provider (OpenAI, Cohere, Mistral).

[Jump here to see how to use your own custom application with agenta](/advanced_guides/custom_applications)

# Quick Start

### [Get started for free](https://cloud.agenta.ai?utm_source=github&utm_medium=readme&utm_campaign=github)
### [Explore the Docs](https://docs.agenta.ai)
### [Create your first application in one minute](https://docs.agenta.ai/quickstart/getting-started-ui)
### [Create an application using Langchain](https://docs.agenta.ai/tutorials/first-app-with-langchain)

### [Explore the Docs](https://docs.agenta.ai/getting_started/introduction)

### [Create your first application in one minute](https://docs.agenta.ai/getting_started/quick-start)

### [Create an application using Langchain](https://docs.agenta.ai/guides/tutorials/first-app-with-langchain)

### [Self-host agenta](https://docs.agenta.ai/self-host/host-locally)
### [Check the Cookbook](https://docs.agenta.ai/cookbook)

### [Check the Cookbook](https://docs.agenta.ai/guides/evaluation_from_sdk)

# Features

| Playground | Evaluation |
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Compare and version prompts for any LLM app, from single prompt to agents. <br/> <video src="https://github.com/Agenta-AI/agenta/assets/4510758/8b736d2b-7c61-414c-b534-d95efc69134c" controls="controls" style="max-width:100%;"> | Define test sets, then evaluate manually or programmatically your different variants.<br/> <video src="https://github.com/Agenta-AI/agenta/assets/4510758/8c6997c6-da87-46ad-a81f-e15e277263d2" controls="controls" style="max-width:100%;"> |
| Human annotation | Deployment |
| Use human annotation to A/B test and score your LLM apps. <br/> <img width="750" alt="Screenshot 2024-01-28 at 12 57 46" src="https://github.com/Agenta-AI/agenta/assets/4510758/bf62a697-bf19-4ba9-850e-742fbfb75424"> | When you are ready, deploy your LLM applications as APIs in one click.<br/>![](https://github.com/Agenta-AI/agenta/blob/main/docs/images/endpoint.gif) |

# Enterprise Support

Contact us here for enterprise support and early access to agenta self-managed enterprise with Kubernetes support. <br/><br/>
<a href="https://cal.com/mahmoud-mabrouk-ogzgey/demo"><img src="https://cal.com/book-with-cal-dark.svg" alt="Book us"></a>

# Disabling Anonymized Tracking

By default, Agenta automatically reports anonymized basic usage statistics. This helps us understand how Agenta is used and track its overall usage and growth. This data does not include any sensitive information.

To disable anonymized telemetry, follow these steps:
@@ -154,6 +155,7 @@ To disable anonymized telemetry, follow these steps:
After making this change, restart Agenta Compose.

# ⭐️ Join Our Team

- [Founding Lead Software Engineer Backend](https://agentaai.notion.site/Founding-Lead-Software-Engineer-Backend-d70bfefed6d543778bc4aa38b543a678)
- [Founding Product Engineer Frontend](https://agentaai.notion.site/Founding-Product-Engineer-Frontend-b6d26a3e9b254be6b6c2bfffbf0b53c5)
- [Founding Product Designer](https://agentaai.notion.site/Founding-Product-Designer-96b1e760ff0241fd96632578d533a778)
@@ -164,12 +166,14 @@ We warmly welcome contributions to Agenta. Feel free to submit issues, fork the

We are usually hanging out in our Slack. Feel free to [join our Slack and ask us anything](https://join.slack.com/t/agenta-hq/shared_invite/zt-1zsafop5i-Y7~ZySbhRZvKVPV5DO_7IA).

Check out our [Contributing Guide](https://docs.agenta.ai/contributing/getting-started) for more information.
Check out our [Contributing Guide](https://docs.agenta.ai/misc/contributing/getting-started) for more information.

## Contributors ✨

<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->

[![All Contributors](https://img.shields.io/badge/all_contributors-46-orange.svg?style=flat-square)](#contributors-)

<!-- ALL-CONTRIBUTORS-BADGE:END -->

Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
@@ -190,7 +194,7 @@ Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/d
</tr>
<tr>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/Pajko97"><img src="https://avatars.githubusercontent.com/u/25198892?v=4?s=100" width="100px;" alt="Pavle Janjusevic"/><br /><sub><b>Pavle Janjusevic</b></sub></a><br /><a href="#infra-Pajko97" title="Infrastructure (Hosting, Build-Tools, etc)">🚇</a></td>
<td align="center" valign="top" width="14.28%"><a href="http://kaosiso-ezealigo.netlify.app"><img src="https://avatars.githubusercontent.com/u/99529776?v=4?s=100" width="100px;" alt="Kaosi Ezealigo"/><br /><sub><b>Kaosi Ezealigo</b></sub></a><br /><a href="https://github.com/Agenta-AI/agenta/issues?q=author%3Abekossy" title="Bug reports">🐛</a> <a href="https://github.com/Agenta-AI/agenta/commits?author=bekossy" title="Code">💻</a></td>
<td align="center" valign="top" width="14.28%"><a href="http://kaosiso-ezealigo.netlify.app"><img src="https://avatars.githubusercontent.com/u/99529776?v=4?s=100" width="100px;" alt="Kaosiso Ezealigo"/><br /><sub><b>Kaosiso Ezealigo</b></sub></a><br /><a href="https://github.com/Agenta-AI/agenta/issues?q=author%3Abekossy" title="Bug reports">🐛</a> <a href="https://github.com/Agenta-AI/agenta/commits?author=bekossy" title="Code">💻</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/albnunes"><img src="https://avatars.githubusercontent.com/u/46302915?v=4?s=100" width="100px;" alt="Alberto Nunes"/><br /><sub><b>Alberto Nunes</b></sub></a><br /><a href="https://github.com/Agenta-AI/agenta/issues?q=author%3Aalbnunes" title="Bug reports">🐛</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://www.linkedin.com/in/mohammed-maaz-6290b0116/"><img src="https://avatars.githubusercontent.com/u/17180132?v=4?s=100" width="100px;" alt="Maaz Bin Khawar"/><br /><sub><b>Maaz Bin Khawar</b></sub></a><br /><a href="https://github.com/Agenta-AI/agenta/commits?author=MohammedMaaz" title="Code">💻</a> <a href="https://github.com/Agenta-AI/agenta/pulls?q=is%3Apr+reviewed-by%3AMohammedMaaz" title="Reviewed Pull Requests">👀</a> <a href="#mentoring-MohammedMaaz" title="Mentoring">🧑‍🏫</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/devgenix"><img src="https://avatars.githubusercontent.com/u/56418363?v=4?s=100" width="100px;" alt="Nehemiah Onyekachukwu Emmanuel"/><br /><sub><b>Nehemiah Onyekachukwu Emmanuel</b></sub></a><br /><a href="https://github.com/Agenta-AI/agenta/commits?author=devgenix" title="Code">💻</a> <a href="#example-devgenix" title="Examples">💡</a> <a href="https://github.com/Agenta-AI/agenta/commits?author=devgenix" title="Documentation">📖</a></td>
1 change: 1 addition & 0 deletions agenta-backend/agenta_backend/services/results_service.py
@@ -122,4 +122,5 @@ async def fetch_results_for_single_model_test(evaluation_id: str):
continue # Skip if the string is not a valid integer

scores_and_counts[score] = scores_and_counts.get(score, 0) + 1

return scores_and_counts
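
For readers skimming the diff: this function tallies how many evaluation results received each integer score, skipping values that are not valid integers. A minimal self-contained sketch of that pattern, with a hypothetical `raw_scores` list standing in for the values the service actually pulls from the database:

```python
def tally_scores(raw_scores: list) -> dict:
    """Count how often each integer score appears, ignoring invalid entries."""
    scores_and_counts = {}
    for value in raw_scores:
        try:
            score = int(value)
        except (ValueError, TypeError):
            continue  # Skip if the string is not a valid integer
        scores_and_counts[score] = scores_and_counts.get(score, 0) + 1
    return scores_and_counts


# Example: tally_scores(["5", "3", "5", "n/a"]) == {5: 2, 3: 1}
```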
2 changes: 1 addition & 1 deletion agenta-backend/pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "agenta_backend"
version = "0.17.4"
version = "0.17.5"
description = ""
authors = ["Mahmoud Mabrouk <[email protected]>"]
readme = "README.md"
24 changes: 18 additions & 6 deletions agenta-cli/agenta/sdk/tracing/callbacks.py
@@ -55,7 +55,9 @@ def log_stream_event(self, kwargs, response_obj, start_time, end_time):
"message": kwargs.get(
"complete_streaming_response"
), # the complete streamed response (only set if `completion(..stream=True)`)
"usage": response_obj.usage.dict(), # litellm calculates usage
"usage": response_obj.usage.dict()
if hasattr(response_obj, "usage")
else None, # litellm calculates usage
"cost": kwargs.get(
"response_cost"
), # litellm calculates response cost
@@ -69,7 +71,9 @@ def log_success_event(
self._trace.end_span(
outputs={
"message": response_obj.choices[0].message.content,
"usage": response_obj.usage.dict(), # litellm calculates usage
"usage": response_obj.usage.dict()
if hasattr(response_obj, "usage")
else None, # litellm calculates usage
"cost": kwargs.get(
"response_cost"
), # litellm calculates response cost
@@ -93,7 +97,9 @@ def log_failure_event(
self._trace.end_span(
outputs={
"message": kwargs["exception"], # the Exception raised
"usage": response_obj.usage.dict(), # litellm calculates usage
"usage": response_obj.usage.dict()
if hasattr(response_obj, "usage")
else None, # litellm calculates usage
"cost": kwargs.get(
"response_cost"
), # litellm calculates response cost
@@ -109,7 +115,9 @@ async def async_log_stream_event(
"message": kwargs.get(
"complete_streaming_response"
), # the complete streamed response (only set if `completion(..stream=True)`)
"usage": response_obj.usage.dict(), # litellm calculates usage
"usage": response_obj.usage.dict()
if hasattr(response_obj, "usage")
else None, # litellm calculates usage
"cost": kwargs.get(
"response_cost"
), # litellm calculates response cost
@@ -123,7 +131,9 @@ async def async_log_success_event(
self._trace.end_span(
outputs={
"message": response_obj.choices[0].message.content,
"usage": response_obj.usage.dict(), # litellm calculates usage
"usage": response_obj.usage.dict()
if hasattr(response_obj, "usage")
else None, # litellm calculates usage
"cost": kwargs.get(
"response_cost"
), # litellm calculates response cost
@@ -147,7 +157,9 @@ async def async_log_failure_event(
self._trace.end_span(
outputs={
"message": kwargs["exception"], # the Exception raised
"usage": response_obj.usage.dict(), # litellm calculates usage
"usage": response_obj.usage.dict()
if hasattr(response_obj, "usage")
else None, # litellm calculates usage
"cost": kwargs.get(
"response_cost"
), # litellm calculates response cost
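
The same defensive edit recurs across all six litellm callbacks in this file: `response_obj.usage.dict()` is now only called when the response object actually has a `usage` attribute, and `None` is recorded otherwise (failure events in particular may carry no usage data). A minimal sketch of the guard, using a hypothetical `_safe_usage` helper that is not part of the SDK:

```python
from typing import Any, Optional


def _safe_usage(response_obj: Any) -> Optional[dict]:
    """Return litellm token usage as a dict, or None if the response has none."""
    if hasattr(response_obj, "usage"):
        return response_obj.usage.dict()
    return None


# Each callback's outputs payload then reduces to:
# outputs = {
#     "message": ...,
#     "usage": _safe_usage(response_obj),  # None instead of an AttributeError
#     "cost": kwargs.get("response_cost"),  # litellm calculates response cost
# }
```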
