
api v1alpha1 #17

Merged: 25 commits, Feb 3, 2025
Changes from 1 commit
Commits (25)
528ca3a
api v1alpha1
guimou Dec 10, 2024
1f17348
use actual types in request models and refactor
dolfim-ibm Jan 24, 2025
32f358a
make gradio optional and update README
dolfim-ibm Jan 24, 2025
04e2457
Run workflow jobs sequentially to avoid disk space outage (#19)
vishnoianil Jan 22, 2025
930d3fd
Add github job to build image (and not publish) on PR creation (#20)
vishnoianil Jan 23, 2025
c3836ed
add start_server script for local dev
dolfim-ibm Jan 27, 2025
fda5862
fix 3.12-only syntax
dolfim-ibm Jan 27, 2025
26c6ac4
fix more py3.10-11 compatibility
dolfim-ibm Jan 27, 2025
5bedade
rework output format and background tasks
dolfim-ibm Jan 27, 2025
26765ac
specify return schemas for openapi
dolfim-ibm Jan 27, 2025
6a5aa98
add processing time and update README
dolfim-ibm Jan 27, 2025
407d827
lint markdown
dolfim-ibm Jan 27, 2025
8a09a10
add MD033 to config
dolfim-ibm Jan 27, 2025
13e281e
Merge remote-tracking branch 'origin/main' into api-upgrade
dolfim-ibm Jan 28, 2025
bae6b71
use port 5000
dolfim-ibm Jan 28, 2025
de49a13
use port 5001 as default
dolfim-ibm Jan 28, 2025
1bcfe7f
update deps
dolfim-ibm Jan 28, 2025
2758bf6
refactor input request
dolfim-ibm Jan 28, 2025
ca47ef8
return docling document
dolfim-ibm Jan 28, 2025
c567a82
update new payload in README
dolfim-ibm Jan 28, 2025
95f448d
add base64 example
dolfim-ibm Jan 28, 2025
c7f2601
wrap example in <details>
dolfim-ibm Jan 28, 2025
76d08a9
rename /url to /source
dolfim-ibm Feb 2, 2025
daf959e
Merge remote-tracking branch 'origin/main' into api-upgrade
dolfim-ibm Feb 2, 2025
574d190
move main execution to __main__
dolfim-ibm Feb 2, 2025
use port 5001 as default
Signed-off-by: Michele Dolfi <[email protected]>
dolfim-ibm committed Jan 28, 2025

Verified: This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
commit de49a13ac8535ae1ec7674d8bfb5c3dcbe6c17d6
2 changes: 1 addition & 1 deletion Containerfile
@@ -56,6 +56,6 @@ RUN pip install --no-cache-dir poetry && \

COPY --chown=1001:0 --chmod=664 ./docling_serve ./docling_serve

EXPOSE 5000
Collaborator

Is there any specific need to run this service on port 8080? I am wondering if we can revert it to 5000 or some other non-80** port? @dolfim-ibm

Contributor Author

It was to run on the same "standard" port as other serving runtimes in OpenShift AI. But it's an env var, so it can be anything in the end.

EXPOSE 5001

CMD ["python", "docling_serve/app.py"]
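
As the review thread above notes, the listening port comes from the `PORT` environment variable, so `EXPOSE 5001` only documents the default rather than fixing it. A minimal sketch of overriding it at container run time, assuming a locally built image with a hypothetical `docling-serve` tag (podman shown, docker is equivalent):

```sh
# Hypothetical image tag; PORT overrides the 5001 default from the Containerfile.
podman run -e PORT=8080 -p 8080:8080 docling-serve
```
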
8 changes: 4 additions & 4 deletions README.md
@@ -56,7 +56,7 @@ Payload example:

```sh
curl -X 'POST' \
'http://localhost:5000/v1alpha/convert/url' \
'http://localhost:5001/v1alpha/convert/url' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
@@ -101,7 +101,7 @@ curl -X 'POST' \
import httpx

async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5000/v1alpha/convert/url"
url = "http://localhost:5001/v1alpha/convert/url"
payload = {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
@@ -133,7 +133,7 @@ The endpoint is: `/v1alpha/convert/file`, listening for POST requests of Form pa

```sh
curl -X 'POST' \
'http://127.0.0.1:5000/v1alpha/convert/file' \
'http://127.0.0.1:5001/v1alpha/convert/file' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'ocr_engine=easyocr' \
@@ -159,7 +159,7 @@ curl -X 'POST' \
import httpx

async_client = httpx.AsyncClient(timeout=60.0)
url = "http://localhost:5000/v1alpha/convert/file"
url = "http://localhost:5001/v1alpha/convert/file"
parameters = {
"from_formats": ["docx", "pptx", "html", "image", "pdf", "asciidoc", "md", "xlsx"],
"to_formats": ["md", "json", "html", "text", "doctags"],
2 changes: 1 addition & 1 deletion docling_serve/app.py
@@ -232,7 +232,7 @@ async def process_file(
if __name__ == "__main__":
from uvicorn import run

port = int(os.getenv("PORT", "5000"))
port = int(os.getenv("PORT", "5001"))
workers = int(os.getenv("UVICORN_WORKERS", "1"))
reload = _str_to_bool(os.getenv("RELOAD", "False"))
run(
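
The `__main__` block reads `PORT`, `UVICORN_WORKERS`, and `RELOAD` from the environment, so a local run can override any of them without code changes. A small sketch with illustrative values (whether a given string enables reload depends on `_str_to_bool`):

```sh
# All three variables are read in app.py's __main__ block; the values here are illustrative.
PORT=5001 UVICORN_WORKERS=2 RELOAD=True python docling_serve/app.py
```
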
6 changes: 3 additions & 3 deletions docling_serve/gradio_ui.py
@@ -81,7 +81,7 @@


def health_check():
response = requests.get(f"http://localhost:{int(os.getenv('PORT', '5000'))}/health")
response = requests.get(f"http://localhost:{int(os.getenv('PORT', '5001'))}/health")
if response.status_code == 200:
return "Healthy"
return "Unhealthy"
@@ -191,7 +191,7 @@ def process_url(
raise gr.Error("No input sources provided.", print_exception=False)
try:
response = requests.post(
f"http://localhost:{int(os.getenv('PORT', '5000'))}/v1alpha/convert/url",
f"http://localhost:{int(os.getenv('PORT', '5001'))}/v1alpha/convert/url",
json=parameters,
)
except Exception as e:
@@ -239,7 +239,7 @@ def process_file(

try:
response = requests.post(
f"http://localhost:{int(os.getenv('PORT', '5000'))}/v1alpha/convert/file",
f"http://localhost:{int(os.getenv('PORT', '5001'))}/v1alpha/convert/file",
files=files_data,
data=parameters,
)
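
The Gradio UI reaches the API through the same `PORT` variable, so the UI and the server stay in agreement as long as both processes see the same value. A quick way to confirm the server answers on the new default port is the `/health` endpoint used by `health_check()` above; a minimal sketch, assuming the server is already running locally:

```sh
# Expects a docling-serve instance listening on the new default port 5001.
curl -s http://localhost:5001/health
```
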
2 changes: 1 addition & 1 deletion start_server.sh
@@ -2,7 +2,7 @@
set -Eeuo pipefail

# Network settings
export PORT="${PORT:-5000}"
export PORT="${PORT:-5001}"
export HOST="${HOST:-"0.0.0.0"}"

# Performance settings
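
Since `start_server.sh` uses `${PORT:-5001}` and `${HOST:-"0.0.0.0"}`, both settings can be overridden from the calling shell without editing the script. A minimal sketch:

```sh
# Overrides only the port; HOST keeps its 0.0.0.0 default from the script.
PORT=8080 ./start_server.sh
```
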
2 changes: 1 addition & 1 deletion tests/test_1-file-all-outputs.py
@@ -16,7 +16,7 @@ async def async_client():
@pytest.mark.asyncio
async def test_convert_file(async_client):
"""Test convert single file to all outputs"""
url = "http://localhost:5000/v1alpha/convert/file"
url = "http://localhost:5001/v1alpha/convert/file"
parameters = {
"from_formats": [
"docx",
2 changes: 1 addition & 1 deletion tests/test_1-url-all-outputs.py
@@ -15,7 +15,7 @@ async def async_client():
@pytest.mark.asyncio
async def test_convert_url(async_client):
"""Test convert URL to all outputs"""
url = "http://localhost:5000/v1alpha/convert/url"
url = "http://localhost:5001/v1alpha/convert/url"
payload = {
"from_formats": [
"docx",
2 changes: 1 addition & 1 deletion tests/test_2-files-all-outputs.py
@@ -16,7 +16,7 @@ async def async_client():
@pytest.mark.asyncio
async def test_convert_file(async_client):
"""Test convert single file to all outputs"""
url = "http://localhost:5000/v1alpha/convert/file"
url = "http://localhost:5001/v1alpha/convert/file"
parameters = {
"from_formats": [
"docx",
2 changes: 1 addition & 1 deletion tests/test_2-urls-all-outputs.py
@@ -13,7 +13,7 @@ async def async_client():
@pytest.mark.asyncio
async def test_convert_url(async_client):
"""Test convert URL to all outputs"""
url = "http://localhost:5000/v1alpha/convert/url"
url = "http://localhost:5001/v1alpha/convert/url"
payload = {
"from_formats": [
"docx",