### Do you need to file an issue?
- I have searched the existing issues and this bug is not already filed.
- My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.
### Describe the bug
I used the configuration below, and I can't figure out why it doesn't work.
```yaml
### This config file contains required core defaults that must be set, along with a handful of common optional settings.
### For a full list of available settings, see https://microsoft.github.io/graphrag/config/yaml/

### LLM settings ###
## There are a number of settings to tune the threading and token limits for LLM calls - check the docs.

models:
  default_chat_model:
    type: chat
    model_provider: openai
    auth_type: api_key # or azure_managed_identity
    api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file, or remove if managed identity
    model: qwen3-32b-awq
    api_base: http://10.151.131.161:9997/v1
    # api_version: 2024-05-01-preview
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25
    async_mode: threaded # or asyncio
    retry_strategy: exponential_backoff
    max_retries: 10
    tokens_per_minute: null
    requests_per_minute: null
  default_embedding_model:
    type: embedding
    model_provider: openai
    auth_type: api_key
    api_key: ${GRAPHRAG_API_KEY}
    model: Qwen3-Embedding-4B
    api_base: http://10.151.131.161:9997/v1
    # api_version: 2024-05-01-preview
    concurrent_requests: 25
    async_mode: threaded # or asyncio
    retry_strategy: exponential_backoff
    max_retries: 10
    tokens_per_minute: null
    requests_per_minute: null

### Input settings ###

input:
  storage:
    type: file # or blob
    base_dir: "input"
  file_type: text # [csv, text, json]

chunks:
  size: 1200
  overlap: 100
  group_by_columns: [id]

### Output/storage settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

output:
  type: file # [file, blob, cosmosdb]
  base_dir: "output"

cache:
  type: file # [file, blob, cosmosdb]
  base_dir: "cache"

reporting:
  type: file # [file, blob]
  base_dir: "logs"

vector_store:
  default_vector_store:
    type: lancedb
    db_uri: output\lancedb
    container_name: default

### Workflow settings ###

embed_text:
  model_id: default_embedding_model
  vector_store_id: default_vector_store

extract_graph:
  model_id: default_chat_model
  prompt: "prompts/extract_graph.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

summarize_descriptions:
  model_id: default_chat_model
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

extract_graph_nlp:
  text_analyzer:
    extractor_type: regex_english # [regex_english, syntactic_parser, cfg]
  async_mode: threaded # or asyncio

cluster_graph:
  max_cluster_size: 10

extract_claims:
  enabled: false
  model_id: default_chat_model
  prompt: "prompts/extract_claims.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  model_id: default_chat_model
  graph_prompt: "prompts/community_report_graph.txt"
  text_prompt: "prompts/community_report_text.txt"
  max_length: 2000
  max_input_length: 8000

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes (embed_graph must also be enabled)

snapshots:
  graphml: false
  embeddings: false

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/local_search_system_prompt.txt"

global_search:
  chat_model_id: default_chat_model
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"

drift_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/drift_search_system_prompt.txt"
  reduce_prompt: "prompts/drift_search_reduce_prompt.txt"

basic_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/basic_search_system_prompt.txt"
```
### Steps to reproduce

Running the indexing pipeline produces the following log:

```
2025-12-30 18:05:56.0933 - INFO - graphrag.index.validate_config - LLM Config Params Validated
2025-12-30 18:05:57.0336 - INFO - graphrag.index.validate_config - Embedding LLM Config Params Validated
2025-12-30 18:05:57.0336 - INFO - graphrag.cli.index - Starting pipeline run. False
2025-12-30 18:05:57.0337 - INFO - graphrag.cli.index - Using default configuration: {
"root_dir": "E:\work\ai\graphrag-llm\ragtest",
"models": {
"default_chat_model": {
"api_key": "==== REDACTED ====",
"auth_type": "api_key",
"type": "chat",
"model_provider": "openai",
"model": "qwen3-32b-awq",
"encoding_model": "",
"api_base": "http://10.151.131.161:9997/v1",
"api_version": null,
"deployment_name": null,
"proxy": null,
"audience": null,
"model_supports_json": true,
"request_timeout": 180.0,
"tokens_per_minute": null,
"requests_per_minute": null,
"rate_limit_strategy": "static",
"retry_strategy": "exponential_backoff",
"max_retries": 10,
"max_retry_wait": 10.0,
"concurrent_requests": 25,
"async_mode": "threaded",
"responses": null,
"max_tokens": null,
"temperature": 0,
"max_completion_tokens": null,
"reasoning_effort": null,
"top_p": 1,
"n": 1,
"frequency_penalty": 0.0,
"presence_penalty": 0.0
},
"default_embedding_model": {
"api_key": "==== REDACTED ====",
"auth_type": "api_key",
"type": "embedding",
"model_provider": "openai",
"model": "Qwen3-Embedding-4B",
"encoding_model": "",
"api_base": "http://10.151.131.161:9997/v1",
"api_version": null,
"deployment_name": null,
"proxy": null,
"audience": null,
"model_supports_json": null,
"request_timeout": 180.0,
"tokens_per_minute": null,
"requests_per_minute": null,
"rate_limit_strategy": "static",
"retry_strategy": "exponential_backoff",
"max_retries": 10,
"max_retry_wait": 10.0,
"concurrent_requests": 25,
"async_mode": "threaded",
"responses": null,
"max_tokens": null,
"temperature": 0,
"max_completion_tokens": null,
"reasoning_effort": null,
"top_p": 1,
"n": 1,
"frequency_penalty": 0.0,
"presence_penalty": 0.0
}
},
"input": {
"storage": {
"type": "file",
"base_dir": "E:\work\ai\graphrag-llm\ragtest\input",
"storage_account_blob_url": null,
"cosmosdb_account_url": null
},
"file_type": "text",
"encoding": "utf-8",
"file_pattern": ".\.txt$",
"file_filter": null,
"text_column": "text",
"title_column": null,
"metadata": null
},
"chunks": {
"size": 1200,
"overlap": 100,
"group_by_columns": [
"id"
],
"strategy": "tokens",
"encoding_model": "cl100k_base",
"prepend_metadata": false,
"chunk_size_includes_metadata": false
},
"output": {
"type": "file",
"base_dir": "E:\work\ai\graphrag-llm\ragtest\output",
"storage_account_blob_url": null,
"cosmosdb_account_url": null
},
"outputs": null,
"update_index_output": {
"type": "file",
"base_dir": "E:\work\ai\graphrag-llm\ragtest\update_output",
"storage_account_blob_url": null,
"cosmosdb_account_url": null
},
"cache": {
"type": "file",
"base_dir": "cache",
"storage_account_blob_url": null,
"cosmosdb_account_url": null
},
"reporting": {
"type": "file",
"base_dir": "E:\work\ai\graphrag-llm\ragtest\logs",
"storage_account_blob_url": null
},
"vector_store": {
"default_vector_store": {
"type": "lancedb",
"db_uri": "E:\work\ai\graphrag-llm\ragtest\output\lancedb",
"url": null,
"audience": null,
"container_name": "==== REDACTED ====",
"database_name": null,
"overwrite": true,
"embeddings_schema": {}
}
},
"workflows": null,
"embed_text": {
"model_id": "default_embedding_model",
"vector_store_id": "default_vector_store",
"batch_size": 16,
"batch_max_tokens": 8191,
"names": [
"entity.description",
"community.full_content",
"text_unit.text"
],
"strategy": null
},
"extract_graph": {
"model_id": "default_chat_model",
"prompt": "prompts/extract_graph.txt",
"entity_types": [
"organization",
"person",
"geo",
"event"
],
"max_gleanings": 1,
"strategy": null
},
"summarize_descriptions": {
"model_id": "default_chat_model",
"prompt": "prompts/summarize_descriptions.txt",
"max_length": 500,
"max_input_tokens": 4000,
"strategy": null
},
"extract_graph_nlp": {
"normalize_edge_weights": true,
"text_analyzer": {
"extractor_type": "regex_english",
"model_name": "en_core_web_md",
"max_word_length": 15,
"word_delimiter": " ",
"include_named_entities": true,
"exclude_nouns": [
"stuff",
"thing",
"things",
"bunch",
"bit",
"bits",
"people",
"person",
"okay",
"hey",
"hi",
"hello",
"laughter",
"oh"
],
"exclude_entity_tags": [
"DATE"
],
"exclude_pos_tags": [
"DET",
"PRON",
"INTJ",
"X"
],
"noun_phrase_tags": [
"PROPN",
"NOUNS"
],
"noun_phrase_grammars": {
"PROPN,PROPN": "PROPN",
"NOUN,NOUN": "NOUNS",
"NOUNS,NOUN": "NOUNS",
"ADJ,ADJ": "ADJ",
"ADJ,NOUN": "NOUNS"
}
},
"concurrent_requests": 25,
"async_mode": "threaded"
},
"prune_graph": {
"min_node_freq": 2,
"max_node_freq_std": null,
"min_node_degree": 1,
"max_node_degree_std": null,
"min_edge_weight_pct": 40.0,
"remove_ego_nodes": true,
"lcc_only": false
},
"cluster_graph": {
"max_cluster_size": 10,
"use_lcc": true,
"seed": 3735928559
},
"extract_claims": {
"enabled": false,
"model_id": "default_chat_model",
"prompt": "prompts/extract_claims.txt",
"description": "Any claims or facts that could be relevant to information discovery.",
"max_gleanings": 1,
"strategy": null
},
"community_reports": {
"model_id": "default_chat_model",
"graph_prompt": "prompts/community_report_graph.txt",
"text_prompt": "prompts/community_report_text.txt",
"max_length": 2000,
"max_input_length": 8000,
"strategy": null
},
"embed_graph": {
"enabled": false,
"dimensions": 1536,
"num_walks": 10,
"walk_length": 40,
"window_size": 2,
"iterations": 3,
"random_seed": 597832,
"use_lcc": true
},
"umap": {
"enabled": false
},
"snapshots": {
"embeddings": false,
"graphml": false,
"raw_graph": false
},
"local_search": {
"prompt": "prompts/local_search_system_prompt.txt",
"chat_model_id": "default_chat_model",
"embedding_model_id": "default_embedding_model",
"text_unit_prop": 0.5,
"community_prop": 0.15,
"conversation_history_max_turns": 5,
"top_k_entities": 10,
"top_k_relationships": 10,
"max_context_tokens": 12000
},
"global_search": {
"map_prompt": "prompts/global_search_map_system_prompt.txt",
"reduce_prompt": "prompts/global_search_reduce_system_prompt.txt",
"chat_model_id": "default_chat_model",
"knowledge_prompt": "prompts/global_search_knowledge_system_prompt.txt",
"max_context_tokens": 12000,
"data_max_tokens": 12000,
"map_max_length": 1000,
"reduce_max_length": 2000,
"dynamic_search_threshold": 1,
"dynamic_search_keep_parent": false,
"dynamic_search_num_repeats": 1,
"dynamic_search_use_summary": false,
"dynamic_search_max_level": 2
},
"drift_search": {
"prompt": "prompts/drift_search_system_prompt.txt",
"reduce_prompt": "prompts/drift_search_reduce_prompt.txt",
"chat_model_id": "default_chat_model",
"embedding_model_id": "default_embedding_model",
"data_max_tokens": 12000,
"reduce_max_tokens": null,
"reduce_temperature": 0,
"reduce_max_completion_tokens": null,
"concurrency": 32,
"drift_k_followups": 20,
"primer_folds": 5,
"primer_llm_max_tokens": 12000,
"n_depth": 3,
"local_search_text_unit_prop": 0.9,
"local_search_community_prop": 0.1,
"local_search_top_k_mapped_entities": 10,
"local_search_top_k_relationships": 10,
"local_search_max_data_tokens": 12000,
"local_search_temperature": 0,
"local_search_top_p": 1,
"local_search_n": 1,
"local_search_llm_max_gen_tokens": null,
"local_search_llm_max_gen_completion_tokens": null
},
"basic_search": {
"prompt": "prompts/basic_search_system_prompt.txt",
"chat_model_id": "default_chat_model",
"embedding_model_id": "default_embedding_model",
"k": 10,
"max_context_tokens": 12000
}
}
```
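Note `"encoding_model": "cl100k_base"` in the resolved `chunks` block above: that is the tokenizer the chunking workflow loads next, and on a cold cache `tiktoken` fetches the encoding file over the network. The failing step can be reproduced in isolation (a sketch, assuming no cached encodings and no outbound internet on the machine):

```python
# Minimal repro of the chunker's failing call: on a cold cache, the first
# get_encoding() downloads cl100k_base.tiktoken from Azure blob storage,
# so it times out on a machine with no internet access.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
print(enc.encode("hello world"))
```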
```
2025-12-30 18:05:57.0338 - INFO - graphrag.api.index - Initializing indexing pipeline...
2025-12-30 18:05:57.0338 - INFO - graphrag.index.workflows.factory - Creating pipeline with workflows: ['load_input_documents', 'create_base_text_units', 'create_final_documents', 'extract_graph', 'finalize_graph', 'extract_covariates', 'create_communities', 'create_final_text_units', 'create_community_reports', 'generate_text_embeddings']
2025-12-30 18:05:57.0340 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at E:\work\ai\graphrag-llm\ragtest\input
2025-12-30 18:05:57.0341 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at E:\work\ai\graphrag-llm\ragtest\output
2025-12-30 18:05:57.0341 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at E:\work\ai\graphrag-llm\ragtest
2025-12-30 18:05:57.0341 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at E:\work\ai\graphrag-llm\ragtest\cache
2025-12-30 18:05:57.0348 - INFO - graphrag.index.run.run_pipeline - Running standard indexing.
2025-12-30 18:05:57.0348 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at
2025-12-30 18:05:57.0350 - INFO - graphrag.index.run.run_pipeline - Executing pipeline...
2025-12-30 18:05:57.0350 - INFO - graphrag.index.input.factory - loading input from root_dir=E:\work\ai\graphrag-llm\ragtest\input
2025-12-30 18:05:57.0350 - INFO - graphrag.index.input.factory - Loading Input InputFileType.text
2025-12-30 18:05:57.0351 - INFO - graphrag.storage.file_pipeline_storage - search E:\work\ai\graphrag-llm\ragtest\input for files matching .*\.txt$
2025-12-30 18:05:57.0354 - INFO - graphrag.index.input.util - Found 1 InputFileType.text files, loading 1
2025-12-30 18:05:57.0354 - INFO - graphrag.index.input.util - Total number of unfiltered InputFileType.text rows: 1
2025-12-30 18:05:57.0354 - INFO - graphrag.index.workflows.load_input_documents - Final # of rows loaded: 1
2025-12-30 18:05:57.0363 - INFO - graphrag.api.index - Workflow load_input_documents completed successfully
2025-12-30 18:05:57.0370 - INFO - graphrag.index.workflows.create_base_text_units - Workflow started: create_base_text_units
2025-12-30 18:05:57.0371 - INFO - graphrag.utils.storage - reading table from storage: documents.parquet
2025-12-30 18:05:57.0381 - INFO - graphrag.index.workflows.create_base_text_units - Starting chunking process for 1 documents
2025-12-30 18:06:18.0454 - ERROR - graphrag.index.run.run_pipeline - error running workflow create_base_text_units
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connection.py", line 204, in _new_conn
sock = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
raise err
File "C:\Program Files\Python311\Lib\site-packages\urllib3\util\connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 488, in _make_request
raise new_e
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 464, in _make_request
self._validate_conn(conn)
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 1093, in _validate_conn
conn.connect()
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connection.py", line 759, in connect
self.sock = sock = self._new_conn()
^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connection.py", line 213, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<HTTPSConnection(host='openaipublic.blob.core.windows.net', port=443) at 0x1bc37471c50>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\requests\adapters.py", line 644, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 841, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\util\retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<HTTPSConnection(host='openaipublic.blob.core.windows.net', port=443) at 0x1bc37471c50>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\run\run_pipeline.py", line 121, in _run_pipeline
result = await workflow_function(config, context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 35, in run_workflow
output = create_base_text_units(
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 140, in create_base_text_units
aggregated = aggregated.apply(
^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\frame.py", line 10401, in apply
return op.apply().finalize(self, method="apply")
^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 916, in apply
return self.apply_standard()
^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 1063, in apply_standard
results, res_index = self.apply_series_generator()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 1081, in apply_series_generator
results[i] = self.func(v, *self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 141, in
lambda row: chunker_with_logging(row, row.name), axis=1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 136, in chunker_with_logging
result = chunker(row)
^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 108, in chunker
chunked = chunk_text(
^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\chunk_text.py", line 67, in chunk_text
input.apply(
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\frame.py", line 10401, in apply
return op.apply().finalize(self, method="apply")
^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 916, in apply
return self.apply_standard()
^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 1063, in apply_standard
results, res_index = self.apply_series_generator()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 1081, in apply_series_generator
results[i] = self.func(v, *self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\chunk_text.py", line 70, in
lambda x: run_strategy(
^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\chunk_text.py", line 97, in run_strategy
strategy_results = strategy_exec(texts, config, tick)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\strategies.py", line 45, in run_tokens
encode, decode = get_encoding_fn(encoding_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\strategies.py", line 22, in get_encoding_fn
enc = tiktoken.get_encoding(encoding_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken\registry.py", line 86, in get_encoding
enc = Encoding(**constructor())
^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken_ext\openai_public.py", line 76, in cl100k_base
mergeable_ranks = load_tiktoken_bpe(
^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken\load.py", line 162, in load_tiktoken_bpe
contents = read_file_cached(tiktoken_bpe_file, expected_hash)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken\load.py", line 67, in read_file_cached
contents = read_file(blobpath)
^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken\load.py", line 17, in read_file
resp = requests.get(blobpath)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\api.py", line 73, in get
return request("get", url, params=params, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\adapters.py", line 665, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<HTTPSConnection(host='openaipublic.blob.core.windows.net', port=443) at 0x1bc37471c50>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))
2025-12-30 18:06:18.0461 - ERROR - graphrag.api.index - Workflow create_base_text_units completed with errors
2025-12-30 18:06:18.0465 - ERROR - graphrag.cli.index - Errors occurred during the pipeline run, see logs for more details.
```
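The traceback points away from the local model endpoint: the timeout is `tiktoken` trying to download `cl100k_base.tiktoken` from `openaipublic.blob.core.windows.net`, which this machine apparently cannot reach. One workaround is to stage the encoding file into tiktoken's cache from a machine that does have internet access. The sketch below assumes tiktoken's documented `TIKTOKEN_CACHE_DIR` lookup and its convention of naming cache files after the SHA-1 hash of the blob URL; the path `E:\tiktoken_cache` is a hypothetical example:

```python
# Run on a machine WITH internet access: download the BPE file and save it
# under the hash-named filename tiktoken's cache lookup expects.
import hashlib
import os
import requests

BLOB_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"
CACHE_DIR = r"E:\tiktoken_cache"  # hypothetical; copy this folder to the offline box

os.makedirs(CACHE_DIR, exist_ok=True)
cache_key = hashlib.sha1(BLOB_URL.encode()).hexdigest()  # tiktoken's cache filename
with open(os.path.join(CACHE_DIR, cache_key), "wb") as f:
    f.write(requests.get(BLOB_URL, timeout=60).content)
```

Then, on the offline machine, set `TIKTOKEN_CACHE_DIR=E:\tiktoken_cache` before running `graphrag index`; `tiktoken.get_encoding("cl100k_base")` should then resolve locally and the `create_base_text_units` workflow can get past chunking.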
### Expected Behavior
No response
### GraphRAG Config Used

```yaml
# Paste your config here
```
### Logs and screenshots
No response
### Additional Information
- GraphRAG Version:
- Operating System:
- Python Version:
- Related Issues: