Commit c98239c (parent: ec9b733). Showing 5 changed files with 120 additions and 162 deletions.
@@ -1,165 +1,51 @@
# LlamaParse

[](https://pypi.org/project/llama-parse/)
[](https://github.com/run-llama/llama_parse/graphs/contributors)
[](https://pypi.org/project/llama-cloud-services/)
[](https://github.com/run-llama/llama_cloud_services/graphs/contributors)
[](https://discord.gg/dGcwcsnxhU)

LlamaParse is a **GenAI-native document parser** that can parse complex document data for any downstream LLM use case (RAG, agents).

It is really good at the following:

- ✅ **Broad file type support**: Parsing a variety of unstructured file types (.pdf, .pptx, .docx, .xlsx, .html) with text, tables, visual elements, weird layouts, and more.
- ✅ **Table recognition**: Parsing embedded tables accurately into text and semi-structured representations.
- ✅ **Multimodal parsing and chunking**: Extracting visual elements (images/diagrams) into structured formats and returning image chunks using the latest multimodal models.
- ✅ **Custom parsing**: Input custom prompt instructions to customize the output the way you want it.

# Llama Cloud Services

LlamaParse directly integrates with [LlamaIndex](https://github.com/run-llama/llama_index).
This repository contains the code for hand-written SDKs and clients for interacting with LlamaCloud.

The free plan covers up to 1,000 pages a day. The paid plan includes 7,000 free pages per week, plus 0.3 cents per additional page by default. There is a sandbox available to test the API at [**https://cloud.llamaindex.ai/parse ↗**](https://cloud.llamaindex.ai/parse).
This includes:

Read below for some quickstart information, or see the [full documentation](https://docs.cloud.llamaindex.ai/).

If you're a company interested in enterprise RAG solutions, and/or high-volume/on-prem usage of LlamaParse, come [talk to us](https://www.llamaindex.ai/contact).

- [LlamaParse](./parse.md) - A GenAI-native document parser that can parse complex document data for any downstream LLM use case (agents, RAG, data processing, etc.).
- [LlamaReport (beta/invite-only)](./report.md) - A prebuilt agentic report builder that can be used to build reports from a variety of data sources.
- [LlamaExtract (beta/invite-only)](./extract.md) - A prebuilt agentic data extractor that can be used to transform data into a structured JSON representation.

## Getting Started

First, log in and get an API key from [**https://cloud.llamaindex.ai/api-key ↗**](https://cloud.llamaindex.ai/api-key).

Then, make sure you have the latest LlamaIndex version installed.

**NOTE:** If you are upgrading from v0.9.X, we recommend following our [migration guide](https://pretty-sodium-5e0.notion.site/v0-10-0-Migration-Guide-6ede431dcb8841b09ea171e7f133bd77), as well as uninstalling your previous version first.

```
pip uninstall llama-index  # run this if upgrading from v0.9.x or older
pip install -U llama-index --upgrade --no-cache-dir --force-reinstall
```

Lastly, install the package:

`pip install llama-parse`

Now you can parse your first PDF file using the command line interface. Use the command `llama-parse [file_paths]`. See the help text with `llama-parse --help`.
Install the package:

```bash
export LLAMA_CLOUD_API_KEY='llx-...'

# output as text
llama-parse my_file.pdf --result-type text --output-file output.txt

# output as markdown
llama-parse my_file.pdf --result-type markdown --output-file output.md

# output as raw json
llama-parse my_file.pdf --output-raw-json --output-file output.json
pip install llama-cloud-services
```

You can also create simple scripts:

```python
import nest_asyncio

nest_asyncio.apply()

from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    num_workers=4,  # if multiple files passed, split in `num_workers` API calls
    verbose=True,
    language="en",  # optionally you can define a language, default=en
)

# sync
documents = parser.load_data("./my_file.pdf")

# sync batch
documents = parser.load_data(["./my_file1.pdf", "./my_file2.pdf"])
Then, get your API key from [LlamaCloud](https://cloud.llamaindex.ai/).

# async
documents = await parser.aload_data("./my_file.pdf")

# async batch
documents = await parser.aload_data(["./my_file1.pdf", "./my_file2.pdf"])
```

## Using with file object

You can parse a file object directly:
Then, you can use the services in your code:

```python
import nest_asyncio

nest_asyncio.apply()

from llama_parse import LlamaParse
from llama_cloud_services import LlamaParse, LlamaReport, LlamaExtract

parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    num_workers=4,  # if multiple files passed, split in `num_workers` API calls
    verbose=True,
    language="en",  # optionally you can define a language, default=en
)

file_name = "my_file1.pdf"
extra_info = {"file_name": file_name}

with open(f"./{file_name}", "rb") as f:
    # must provide extra_info with a file_name key when passing a file object
    documents = parser.load_data(f, extra_info=extra_info)

# you can also pass file bytes directly
with open(f"./{file_name}", "rb") as f:
    file_bytes = f.read()
    # must provide extra_info with a file_name key when passing file bytes
    documents = parser.load_data(file_bytes, extra_info=extra_info)
parser = LlamaParse(api_key="YOUR_API_KEY")
report = LlamaReport(api_key="YOUR_API_KEY")
extractor = LlamaExtract(api_key="YOUR_API_KEY")
```

## Using with `SimpleDirectoryReader`
See the quickstart guides for each service for more information:

You can also integrate the parser as the default PDF loader in `SimpleDirectoryReader`:

```python
import nest_asyncio

nest_asyncio.apply()

from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    verbose=True,
)

file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
    "./data", file_extractor=file_extractor
).load_data()
```

Full documentation for `SimpleDirectoryReader` can be found in the [LlamaIndex documentation](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader.html).

## Examples

Several end-to-end indexing examples can be found in the examples folder:

- [Getting Started](examples/demo_basic.ipynb)
- [Advanced RAG Example](examples/demo_advanced.ipynb)
- [Raw API Usage](examples/demo_api.ipynb)
- [LlamaParse](./parse.md)
- [LlamaReport (beta/invite-only)](./report.md)
- [LlamaExtract (beta/invite-only)](./extract.md)

## Documentation

[https://docs.cloud.llamaindex.ai/](https://docs.cloud.llamaindex.ai/)
You can see complete SDK and API documentation for each service on [our official docs](https://docs.cloud.llamaindex.ai/).

## Terms of Service

See the [Terms of Service here](./TOS.pdf).

## Get in Touch (LlamaCloud)

LlamaParse is part of LlamaCloud, our e2e enterprise RAG platform that provides out-of-the-box, production-ready connectors, indexing, and retrieval over your complex data sources. We offer SaaS and VPC options.

LlamaCloud is currently available via waitlist (join by [creating an account](https://cloud.llamaindex.ai/)). If you're interested in state-of-the-art quality and in centralizing your RAG efforts, come [get in touch with us](https://www.llamaindex.ai/contact).
You can get in touch with us by following our [contact link](https://www.llamaindex.ai/contact).
@@ -1,4 +1,4 @@
# LlamaExtract
# LlamaExtract (beta/invite-only)

> **⚠️ EXPERIMENTAL**
> This library is under active development with frequent breaking changes. APIs and functionality may change significantly between versions. If you're interested in being an early adopter, please contact us at [[email protected]](mailto:[email protected]) or join our [Discord](https://discord.com/invite/eN6D2HQ4aX).

@@ -7,6 +7,10 @@ LlamaExtract provides a simple API for extracting structured data from unstructu

## Quick Start

```bash
pip install llama-cloud-services
```

```python
from llama_extract import LlamaExtract
from pydantic import BaseModel, Field

@@ -154,12 +158,6 @@ agent = extractor.get_agent(name="resume-parser")
extractor.delete_agent(agent.id)
```
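The Quick Start above pairs LlamaExtract with a Pydantic schema. As a minimal local sketch of what that schema might look like for the `resume-parser` agent referenced above (the `Resume` fields here are illustrative assumptions, not from the original):

```python
from pydantic import BaseModel, Field


# Hypothetical extraction schema for a "resume-parser" agent;
# the field names are illustrative, not the library's own.
class Resume(BaseModel):
    name: str = Field(description="Candidate's full name")
    years_experience: int = Field(description="Total years of professional experience")
    skills: list[str] = Field(description="List of technical skills")


# Validate a sample of the structured JSON an extraction would return.
sample = Resume(name="Jane Doe", years_experience=5, skills=["Python", "SQL"])
print(sample.skills)  # → ['Python', 'SQL']
```

A schema like this would be registered when creating an agent, and the extractor returns data conforming to it.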

## Installation

```bash
pip install llama-extract==0.1.0
```

## Tips & Best Practices

1. **Schema Design**:

@@ -182,5 +180,5 @@ pip install llama-extract==0.1.0

## Additional Resources

- [Example Notebook](examples/resume_screening.ipynb) - Detailed walkthrough of resume parsing
- [Example Notebook](examples/extract/resume_screening.ipynb) - Detailed walkthrough of resume parsing
- [Discord Community](https://discord.com/invite/eN6D2HQ4aX) - Get help and share feedback
Empty file.

@@ -0,0 +1,88 @@
# LlamaReport (beta/invite-only)

LlamaReport is a prebuilt agentic report builder that can be used to build reports from a variety of data sources.

This is the Python SDK for interacting with the LlamaReport API. The SDK provides two main classes:

- `LlamaReport`: For managing reports (create, list, delete)
- `ReportClient`: For working with a specific report (editing, approving, etc.)

## Quickstart

```bash
pip install llama-cloud-services
```

```python
from llama_report import LlamaReport

# Initialize the client
client = LlamaReport(
    api_key="your-api-key",
    # Optional: specify project_id, organization_id, async_httpx_client
)

# Create a new report
report = client.create_report(
    "My Report",
    # must have one of template_text or template_instructions
    template_text="Your template text",
    template_instructions="Instructions for the template",
    # must have one of input_files or retriever_id
    input_files=["data1.pdf", "data2.pdf"],
    retriever_id="retriever-id",
)
```

## Working with Reports

The typical workflow for a report involves:

1. Creating the report
2. Waiting for and approving the plan
3. Waiting for report generation
4. Making edits to the report

Here's a complete example:

```python
# Create a report
report = client.create_report(
    "Quarterly Analysis", input_files=["q1_data.pdf", "q2_data.pdf"]
)

# Wait for the plan to be ready
plan = report.wait_for_plan()

# Option 1: Directly approve the plan
report.update_plan(action="approve")

# Option 2: Suggest and review edits to the plan
suggestions = report.suggest_edits(
    "Can you add a section about market trends?"
)
for suggestion in suggestions:
    print(suggestion)

    # Accept or reject the suggestion
    if input("Accept? (y/n): ").lower() == "y":
        report.accept_edit(suggestion)
    else:
        report.reject_edit(suggestion)

# Wait for the report to complete
report = report.wait_for_completion()

# Make edits to the final report
suggestions = report.suggest_edits("Make the executive summary more concise")

# Review and accept/reject suggestions as above
...
```

## Additional Features

- **Async Support**: All methods have async counterparts: `create_report` -> `acreate_report`, `wait_for_plan` -> `await_for_plan`, etc.
- **Automatic Chat History**: The SDK automatically keeps track of chat history for each suggestion, unless you specify `auto_history=False` in `suggest_edits`.
- **Custom HTTP Client**: You can provide your own `httpx.AsyncClient` to the `LlamaReport` class.
- **Project and Organization IDs**: You can specify `project_id` and `organization_id` to use a specific project or organization.