[BUG] 1-Quickstart.ipynb indexing does not complete #157
Comments
Hi @GarrettE - it isn't able to find your AOAI resource. Double-check your deploy.parameters.json, particularly the GRAPHRAG_API_BASE, GRAPHRAG_API_VERSION, and model/deployment name variables.
@timothymeyers As far as I can tell, there appears to have been a change in the OpenAI REST APIs (or some problem with API versioning?) that is causing the issue. I rewrote test.py (/graphrag-accelerator/backend/src/api/test.py) to use the openai Python SDK (see attached py). With the rewritten script I get the following:

graphrag-solution-accelerator-py3.10vscode@docker-desktop:/graphrag-accelerator/backend/src/api$ python test3.py

and with the regular test.py:

graphrag-solution-accelerator-py3.10vscode@docker-desktop:/graphrag-accelerator/backend/src/api$ python test.py

If I want to continue to use the SDK in my environment, where do I make the change? Do I need to redeploy to implement it?
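For comparison, hitting the AOAI resource directly over REST (rather than through the SDK) looks roughly like this. This is a minimal sketch: the URL-building helper, `live_check`, and the environment variable names (`AOAI_API_KEY` in particular) are my assumptions, not code from the accelerator:

```python
import os

def chat_completions_url(api_base: str, deployment: str, api_version: str) -> str:
    # Azure OpenAI chat-completions REST route for a given deployment
    return (
        f"{api_base.rstrip('/')}/openai/deployments/{deployment}"
        f"/chat/completions?api-version={api_version}"
    )

def live_check() -> None:
    """Fire one tiny completion against the AOAI resource (needs network)."""
    import requests  # imported here so the helper above stays dependency-free
    url = chat_completions_url(
        os.environ["GRAPHRAG_API_BASE"],
        os.environ["GRAPHRAG_LLM_DEPLOYMENT_NAME"],
        os.environ["GRAPHRAG_API_VERSION"],
    )
    resp = requests.post(
        url,
        headers={"api-key": os.environ["AOAI_API_KEY"]},
        json={"messages": [{"role": "user", "content": "ping"}], "max_tokens": 5},
    )
    # a 404 here usually points at a bad base URL, deployment name, or api-version
    print(resp.status_code, resp.text[:200])
```

Calling `live_check()` with those variables exported reproduces the same request the indexing pipeline makes, which helps separate a config problem from a pipeline problem.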
@GarrettE - I think you may be using the wrong endpoint in the Quickstart notebook. Just to clarify: the accelerator deploys its own API behind APIM, so you wouldn't be giving the AOAI endpoint in the Quickstart notebook - you'd be giving the APIM endpoint (and subsequently providing the APIM subscription access key).
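To illustrate the distinction, a small sketch (the heuristic helper and the `APIM_SUBSCRIPTION_KEY` variable name are mine, not part of the accelerator): requests to the accelerator's API go to the APIM gateway and authenticate with the `Ocp-Apim-Subscription-Key` header, not with an AOAI api-key.

```python
import os

def looks_like_apim(endpoint: str) -> bool:
    """Heuristic: APIM gateways live under *.azure-api.net,
    while AOAI resources live under *.openai.azure.com."""
    return ".azure-api.net" in endpoint and ".openai.azure.com" not in endpoint

def check_index_status(endpoint: str, index_name: str):
    # The accelerator's API sits behind APIM, so the request carries the
    # APIM subscription key header rather than an AOAI api-key.
    import requests
    headers = {"Ocp-Apim-Subscription-Key": os.environ["APIM_SUBSCRIPTION_KEY"]}
    return requests.get(f"{endpoint}/index/status/{index_name}", headers=headers)
```

If `looks_like_apim(endpoint)` is False for the `endpoint` you pasted into the notebook, that is likely the problem described above.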
Hi @GarrettE, I got similar behavior when my AOAI quotas were insufficient and when I had a mistake in my API version. I had success using 2024-05-01-preview with model gpt-4o in a swedencentral deployment.
I'm using
Thanks @timothymeyers @biggernaaz - I was able to resolve the issue by re-deploying using Docker in WSL rather than on Windows 11. I also created a gpt-4o-mini deployment (eastus), as it had much more quota :)
I'm facing a similar issue. Indexing reaches step 3/16 and then stops abruptly because of a 400 error. After that, it reverts back to step 1/16 and doesn't progress beyond that. I'm running the package in a dev container in VS Code, using the example Wiki articles. Here are my deployment parameters:

"APIM_NAME": "api-gRAG",

I've tried adding the API keys to the pipeline files and tried different API versions, but nothing seems to be working. Attached are screenshots of the errors from the indexing pod logs.
Hello,

"GRAPHRAG_API_VERSION": "2024-08-01-preview",
For me it also got stuck at 6.25 percent for hours. I've used the following settings:
I think we need to split the |
Splitting the
Describe the bug
index_status stuck at "Workflow create_base_extracted_entities started."
log error:
{
'type': 'on_workflow_start',
'data': 'Index: wikiTestIndexv2 -- Workflow (1/16): create_base_text_units started.',
'details': {
'workflow_name': 'create_base_text_units',
'index_name': 'wikiTestIndexv2',
},
}
{
'type': 'on_workflow_end',
'data': 'Index: wikiTestIndexv2 -- Workflow (1/16): create_base_text_units complete.',
'details': {
'workflow_name': 'create_base_text_units',
'index_name': 'wikiTestIndexv2',
},
}
{
'type': 'on_workflow_start',
'data': 'Index: wikiTestIndexv2 -- Workflow (2/16): create_base_extracted_entities started.',
'details': {
'workflow_name': 'create_base_extracted_entities',
'index_name': 'wikiTestIndexv2',
},
}
{
'type': 'error',
'data': 'Error Invoking LLM',
'cause': "Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}",
'stack': (
'Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/base/base_llm.py", line 53, in _invoke\n'
' output = await self._execute_llm(input, **kwargs)\n'
' File "/usr/local/lib/python3.10/site-packages/graphrag/llm/openai/openai_chat_llm.py", line 55, in _execu'
'te_llm\n'
' completion = await self.client.chat.completions.create(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 1289, in create'
'\n'
' return await self._post(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1826, in post\n'
' return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1519, in request\n'
' return await self._request(\n'
' File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1620, in _request\n'
' raise self._make_status_error_from_response(err.response) from None\n'
"openai.NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}\n"
To Reproduce
1-Quickstart.ipynb:
def index_status(index_name: str) -> requests.Response:
url = endpoint + f"/index/status/{index_name}"
return requests.get(url, headers=headers)
response = index_status(index_name)
pprint(response.json())
{
'status_code': 200,
'index_name': 'wikiTestIndexv2',
'storage_name': 'wikiTest',
'status': 'running',
'percent_complete': 6.25,
'progress': 'Workflow create_base_extracted_entities started.',
}
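The status response above can be polled until indexing reaches a terminal state instead of re-running the notebook cell by hand. A minimal sketch, assuming the notebook's `index_status` function is passed in as `status_fn` and assuming the API reports statuses such as "running", "complete", and "failed" (the exact status values are my guess, not documented behavior):

```python
import time

def is_terminal(status: str) -> bool:
    # assumed terminal states; verify against the accelerator's API
    return status in {"complete", "failed"}

def wait_for_index(status_fn, index_name: str, poll_seconds: int = 30, max_polls: int = 240):
    """Poll status_fn (e.g. the notebook's index_status) until done or timeout."""
    for _ in range(max_polls):
        body = status_fn(index_name).json()
        print(body.get("percent_complete"), body.get("progress"))
        if is_terminal(body.get("status", "")):
            return body
        time.sleep(poll_seconds)
    raise TimeoutError(f"index {index_name!r} still not finished after {max_polls} polls")
```

A run that stays at 6.25 percent across many polls, as in this report, indicates the second workflow never completes rather than the status endpoint misreporting.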
Expected behavior
Indexing should complete within a few hours. Instead, there is no activity in AOAI and no indexes appear in AI Search.
I tried adding the aoai api_key to /graphrag-accelerator/backend/src/api/pipeline-settings.yaml
llm:
type: azure_openai_chat
api_base: $GRAPHRAG_API_BASE
api_version: $GRAPHRAG_API_VERSION
api_key: XXXXXXXXXXXXXXXXXXXXXXX
model: $GRAPHRAG_LLM_MODEL
deployment_name: $GRAPHRAG_LLM_DEPLOYMENT_NAME
cognitive_services_endpoint: $GRAPHRAG_COGNITIVE_SERVICES_ENDPOINT
model_supports_json: True
tokens_per_minute: 80000
requests_per_minute: 480
thread_count: 50
concurrent_requests: 25
embeddings:
async_mode: threaded
llm:
type: azure_openai_embedding
api_base: $GRAPHRAG_API_BASE
api_version: $GRAPHRAG_API_VERSION
api_key: XXXXXXXXXXXXXXXXXXXXXXXXXX
batch_size: 16
model: $GRAPHRAG_EMBEDDING_MODEL
deployment_name: $GRAPHRAG_EMBEDDING_DEPLOYMENT_NAME
cognitive_services_endpoint: $GRAPHRAG_COGNITIVE_SERVICES_ENDPOINT
tokens_per_minute: 350000
concurrent_requests: 25
requests_per_minute: 2100
thread_count: 50
max_retries: 50
parallelization:
stagger: 0.25
num_threads: 10
vector_store:
type: azure_ai_search
collection_name: PLACEHOLDER
title_column: name
overwrite: True
url: $AI_SEARCH_URL
audience: $AI_SEARCH_AUDIENCE
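Since every $VARIABLE in this settings file must resolve at pipeline time, one quick sanity check before starting a job is to scan the text for placeholders that are not set in the environment. A minimal sketch (the scanning helper is mine, not part of the accelerator):

```python
import os
import re

def unset_vars(settings_text: str) -> list:
    """Return $NAME placeholders in the settings that are not set
    in the current environment (hypothetical helper)."""
    refs = set(re.findall(r"\$([A-Z_][A-Z0-9_]*)", settings_text))
    return sorted(v for v in refs if v not in os.environ)

snippet = """
llm:
  api_base: $GRAPHRAG_API_BASE
  api_version: $GRAPHRAG_API_VERSION
"""
print("unset:", unset_vars(snippet))
```

An unresolved GRAPHRAG_API_BASE or GRAPHRAG_API_VERSION would produce exactly the kind of 404 "Resource not found" seen in the log above.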