### Do you need to file an issue?
- I have searched the existing issues and this bug is not already filed.
- My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.
### Describe the bug
I used the configuration below, and I can't figure out why it doesn't work.
```yaml
### This config file contains required core defaults that must be set, along with a handful of common optional settings.
### For a full list of available settings, see https://microsoft.github.io/graphrag/config/yaml/

### LLM settings ###
## There are a number of settings to tune the threading and token limits for LLM calls - check the docs.

models:
  default_chat_model:
    type: chat
    model_provider: openai
    auth_type: api_key # or azure_managed_identity
    api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file, or remove if managed identity
    model: qwen3-32b-awq
    api_base: http://10.151.131.161:9997/v1
    # api_version: 2024-05-01-preview
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25
    async_mode: threaded # or asyncio
    retry_strategy: exponential_backoff
    max_retries: 10
    tokens_per_minute: null
    requests_per_minute: null
  default_embedding_model:
    type: embedding
    model_provider: openai
    auth_type: api_key
    api_key: ${GRAPHRAG_API_KEY}
    model: Qwen3-Embedding-4B
    api_base: http://10.151.131.161:9997/v1
    # api_version: 2024-05-01-preview
    concurrent_requests: 25
    async_mode: threaded # or asyncio
    retry_strategy: exponential_backoff
    max_retries: 10
    tokens_per_minute: null
    requests_per_minute: null

### Input settings ###

input:
  storage:
    type: file # or blob
    base_dir: "input"
  file_type: text # [csv, text, json]

chunks:
  size: 1200
  overlap: 100
  group_by_columns: [id]

### Output/storage settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

output:
  type: file # [file, blob, cosmosdb]
  base_dir: "output"

cache:
  type: file # [file, blob, cosmosdb]
  base_dir: "cache"

reporting:
  type: file # [file, blob]
  base_dir: "logs"

vector_store:
  default_vector_store:
    type: lancedb
    db_uri: output\lancedb
    container_name: default

### Workflow settings ###

embed_text:
  model_id: default_embedding_model
  vector_store_id: default_vector_store

extract_graph:
  model_id: default_chat_model
  prompt: "prompts/extract_graph.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

summarize_descriptions:
  model_id: default_chat_model
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

extract_graph_nlp:
  text_analyzer:
    extractor_type: regex_english # [regex_english, syntactic_parser, cfg]
  async_mode: threaded # or asyncio

cluster_graph:
  max_cluster_size: 10

extract_claims:
  enabled: false
  model_id: default_chat_model
  prompt: "prompts/extract_claims.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  model_id: default_chat_model
  graph_prompt: "prompts/community_report_graph.txt"
  text_prompt: "prompts/community_report_text.txt"
  max_length: 2000
  max_input_length: 8000

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes (embed_graph must also be enabled)

snapshots:
  graphml: false
  embeddings: false

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/local_search_system_prompt.txt"

global_search:
  chat_model_id: default_chat_model
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"

drift_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/drift_search_system_prompt.txt"
  reduce_prompt: "prompts/drift_search_reduce_prompt.txt"

basic_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/basic_search_system_prompt.txt"
```
### Steps to reproduce

Running the indexing pipeline produces the following log:

```
2025-12-30 18:05:56.0933 - INFO - graphrag.index.validate_config - LLM Config Params Validated
2025-12-30 18:05:57.0336 - INFO - graphrag.index.validate_config - Embedding LLM Config Params Validated
2025-12-30 18:05:57.0336 - INFO - graphrag.cli.index - Starting pipeline run. False
2025-12-30 18:05:57.0337 - INFO - graphrag.cli.index - Using default configuration: {
"root_dir": "E:\work\ai\graphrag-llm\ragtest",
"models": {
"default_chat_model": {
"api_key": "==== REDACTED ====",
"auth_type": "api_key",
"type": "chat",
"model_provider": "openai",
"model": "qwen3-32b-awq",
"encoding_model": "",
"api_base": "http://10.151.131.161:9997/v1",
"api_version": null,
"deployment_name": null,
"proxy": null,
"audience": null,
"model_supports_json": true,
"request_timeout": 180.0,
"tokens_per_minute": null,
"requests_per_minute": null,
"rate_limit_strategy": "static",
"retry_strategy": "exponential_backoff",
"max_retries": 10,
"max_retry_wait": 10.0,
"concurrent_requests": 25,
"async_mode": "threaded",
"responses": null,
"max_tokens": null,
"temperature": 0,
"max_completion_tokens": null,
"reasoning_effort": null,
"top_p": 1,
"n": 1,
"frequency_penalty": 0.0,
"presence_penalty": 0.0
},
"default_embedding_model": {
"api_key": "==== REDACTED ====",
"auth_type": "api_key",
"type": "embedding",
"model_provider": "openai",
"model": "Qwen3-Embedding-4B",
"encoding_model": "",
"api_base": "http://10.151.131.161:9997/v1",
"api_version": null,
"deployment_name": null,
"proxy": null,
"audience": null,
"model_supports_json": null,
"request_timeout": 180.0,
"tokens_per_minute": null,
"requests_per_minute": null,
"rate_limit_strategy": "static",
"retry_strategy": "exponential_backoff",
"max_retries": 10,
"max_retry_wait": 10.0,
"concurrent_requests": 25,
"async_mode": "threaded",
"responses": null,
"max_tokens": null,
"temperature": 0,
"max_completion_tokens": null,
"reasoning_effort": null,
"top_p": 1,
"n": 1,
"frequency_penalty": 0.0,
"presence_penalty": 0.0
}
},
"input": {
"storage": {
"type": "file",
"base_dir": "E:\work\ai\graphrag-llm\ragtest\input",
"storage_account_blob_url": null,
"cosmosdb_account_url": null
},
"file_type": "text",
"encoding": "utf-8",
"file_pattern": ".\.txt$",
"file_filter": null,
"text_column": "text",
"title_column": null,
"metadata": null
},
"chunks": {
"size": 1200,
"overlap": 100,
"group_by_columns": [
"id"
],
"strategy": "tokens",
"encoding_model": "cl100k_base",
"prepend_metadata": false,
"chunk_size_includes_metadata": false
},
"output": {
"type": "file",
"base_dir": "E:\work\ai\graphrag-llm\ragtest\output",
"storage_account_blob_url": null,
"cosmosdb_account_url": null
},
"outputs": null,
"update_index_output": {
"type": "file",
"base_dir": "E:\work\ai\graphrag-llm\ragtest\update_output",
"storage_account_blob_url": null,
"cosmosdb_account_url": null
},
"cache": {
"type": "file",
"base_dir": "cache",
"storage_account_blob_url": null,
"cosmosdb_account_url": null
},
"reporting": {
"type": "file",
"base_dir": "E:\work\ai\graphrag-llm\ragtest\logs",
"storage_account_blob_url": null
},
"vector_store": {
"default_vector_store": {
"type": "lancedb",
"db_uri": "E:\work\ai\graphrag-llm\ragtest\output\lancedb",
"url": null,
"audience": null,
"container_name": "==== REDACTED ====",
"database_name": null,
"overwrite": true,
"embeddings_schema": {}
}
},
"workflows": null,
"embed_text": {
"model_id": "default_embedding_model",
"vector_store_id": "default_vector_store",
"batch_size": 16,
"batch_max_tokens": 8191,
"names": [
"entity.description",
"community.full_content",
"text_unit.text"
],
"strategy": null
},
"extract_graph": {
"model_id": "default_chat_model",
"prompt": "prompts/extract_graph.txt",
"entity_types": [
"organization",
"person",
"geo",
"event"
],
"max_gleanings": 1,
"strategy": null
},
"summarize_descriptions": {
"model_id": "default_chat_model",
"prompt": "prompts/summarize_descriptions.txt",
"max_length": 500,
"max_input_tokens": 4000,
"strategy": null
},
"extract_graph_nlp": {
"normalize_edge_weights": true,
"text_analyzer": {
"extractor_type": "regex_english",
"model_name": "en_core_web_md",
"max_word_length": 15,
"word_delimiter": " ",
"include_named_entities": true,
"exclude_nouns": [
"stuff",
"thing",
"things",
"bunch",
"bit",
"bits",
"people",
"person",
"okay",
"hey",
"hi",
"hello",
"laughter",
"oh"
],
"exclude_entity_tags": [
"DATE"
],
"exclude_pos_tags": [
"DET",
"PRON",
"INTJ",
"X"
],
"noun_phrase_tags": [
"PROPN",
"NOUNS"
],
"noun_phrase_grammars": {
"PROPN,PROPN": "PROPN",
"NOUN,NOUN": "NOUNS",
"NOUNS,NOUN": "NOUNS",
"ADJ,ADJ": "ADJ",
"ADJ,NOUN": "NOUNS"
}
},
"concurrent_requests": 25,
"async_mode": "threaded"
},
"prune_graph": {
"min_node_freq": 2,
"max_node_freq_std": null,
"min_node_degree": 1,
"max_node_degree_std": null,
"min_edge_weight_pct": 40.0,
"remove_ego_nodes": true,
"lcc_only": false
},
"cluster_graph": {
"max_cluster_size": 10,
"use_lcc": true,
"seed": 3735928559
},
"extract_claims": {
"enabled": false,
"model_id": "default_chat_model",
"prompt": "prompts/extract_claims.txt",
"description": "Any claims or facts that could be relevant to information discovery.",
"max_gleanings": 1,
"strategy": null
},
"community_reports": {
"model_id": "default_chat_model",
"graph_prompt": "prompts/community_report_graph.txt",
"text_prompt": "prompts/community_report_text.txt",
"max_length": 2000,
"max_input_length": 8000,
"strategy": null
},
"embed_graph": {
"enabled": false,
"dimensions": 1536,
"num_walks": 10,
"walk_length": 40,
"window_size": 2,
"iterations": 3,
"random_seed": 597832,
"use_lcc": true
},
"umap": {
"enabled": false
},
"snapshots": {
"embeddings": false,
"graphml": false,
"raw_graph": false
},
"local_search": {
"prompt": "prompts/local_search_system_prompt.txt",
"chat_model_id": "default_chat_model",
"embedding_model_id": "default_embedding_model",
"text_unit_prop": 0.5,
"community_prop": 0.15,
"conversation_history_max_turns": 5,
"top_k_entities": 10,
"top_k_relationships": 10,
"max_context_tokens": 12000
},
"global_search": {
"map_prompt": "prompts/global_search_map_system_prompt.txt",
"reduce_prompt": "prompts/global_search_reduce_system_prompt.txt",
"chat_model_id": "default_chat_model",
"knowledge_prompt": "prompts/global_search_knowledge_system_prompt.txt",
"max_context_tokens": 12000,
"data_max_tokens": 12000,
"map_max_length": 1000,
"reduce_max_length": 2000,
"dynamic_search_threshold": 1,
"dynamic_search_keep_parent": false,
"dynamic_search_num_repeats": 1,
"dynamic_search_use_summary": false,
"dynamic_search_max_level": 2
},
"drift_search": {
"prompt": "prompts/drift_search_system_prompt.txt",
"reduce_prompt": "prompts/drift_search_reduce_prompt.txt",
"chat_model_id": "default_chat_model",
"embedding_model_id": "default_embedding_model",
"data_max_tokens": 12000,
"reduce_max_tokens": null,
"reduce_temperature": 0,
"reduce_max_completion_tokens": null,
"concurrency": 32,
"drift_k_followups": 20,
"primer_folds": 5,
"primer_llm_max_tokens": 12000,
"n_depth": 3,
"local_search_text_unit_prop": 0.9,
"local_search_community_prop": 0.1,
"local_search_top_k_mapped_entities": 10,
"local_search_top_k_relationships": 10,
"local_search_max_data_tokens": 12000,
"local_search_temperature": 0,
"local_search_top_p": 1,
"local_search_n": 1,
"local_search_llm_max_gen_tokens": null,
"local_search_llm_max_gen_completion_tokens": null
},
"basic_search": {
"prompt": "prompts/basic_search_system_prompt.txt",
"chat_model_id": "default_chat_model",
"embedding_model_id": "default_embedding_model",
"k": 10,
"max_context_tokens": 12000
}
}
```
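Note `"encoding_model": "cl100k_base"` in the resolved `chunks` block above: that is the tokenizer the chunking workflow loads next, and on a cold cache `tiktoken` fetches the encoding file over the network. The failing step can be reproduced in isolation (a sketch, assuming no cached encodings and no outbound internet on the machine):

```python
# Minimal repro of the chunker's failing call: on a cold cache, the first
# get_encoding() downloads cl100k_base.tiktoken from Azure blob storage,
# so it times out on a machine with no internet access.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
print(enc.encode("hello world"))
```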
```
2025-12-30 18:05:57.0338 - INFO - graphrag.api.index - Initializing indexing pipeline...
2025-12-30 18:05:57.0338 - INFO - graphrag.index.workflows.factory - Creating pipeline with workflows: ['load_input_documents', 'create_base_text_units', 'create_final_documents', 'extract_graph', 'finalize_graph', 'extract_covariates', 'create_communities', 'create_final_text_units', 'create_community_reports', 'generate_text_embeddings']
2025-12-30 18:05:57.0340 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at E:\work\ai\graphrag-llm\ragtest\input
2025-12-30 18:05:57.0341 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at E:\work\ai\graphrag-llm\ragtest\output
2025-12-30 18:05:57.0341 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at E:\work\ai\graphrag-llm\ragtest
2025-12-30 18:05:57.0341 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at E:\work\ai\graphrag-llm\ragtest\cache
2025-12-30 18:05:57.0348 - INFO - graphrag.index.run.run_pipeline - Running standard indexing.
2025-12-30 18:05:57.0348 - INFO - graphrag.storage.file_pipeline_storage - Creating file storage at
2025-12-30 18:05:57.0350 - INFO - graphrag.index.run.run_pipeline - Executing pipeline...
2025-12-30 18:05:57.0350 - INFO - graphrag.index.input.factory - loading input from root_dir=E:\work\ai\graphrag-llm\ragtest\input
2025-12-30 18:05:57.0350 - INFO - graphrag.index.input.factory - Loading Input InputFileType.text
2025-12-30 18:05:57.0351 - INFO - graphrag.storage.file_pipeline_storage - search E:\work\ai\graphrag-llm\ragtest\input for files matching .*\.txt$
2025-12-30 18:05:57.0354 - INFO - graphrag.index.input.util - Found 1 InputFileType.text files, loading 1
2025-12-30 18:05:57.0354 - INFO - graphrag.index.input.util - Total number of unfiltered InputFileType.text rows: 1
2025-12-30 18:05:57.0354 - INFO - graphrag.index.workflows.load_input_documents - Final # of rows loaded: 1
2025-12-30 18:05:57.0363 - INFO - graphrag.api.index - Workflow load_input_documents completed successfully
2025-12-30 18:05:57.0370 - INFO - graphrag.index.workflows.create_base_text_units - Workflow started: create_base_text_units
2025-12-30 18:05:57.0371 - INFO - graphrag.utils.storage - reading table from storage: documents.parquet
2025-12-30 18:05:57.0381 - INFO - graphrag.index.workflows.create_base_text_units - Starting chunking process for 1 documents
2025-12-30 18:06:18.0454 - ERROR - graphrag.index.run.run_pipeline - error running workflow create_base_text_units
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connection.py", line 204, in _new_conn
sock = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
raise err
File "C:\Program Files\Python311\Lib\site-packages\urllib3\util\connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 488, in _make_request
raise new_e
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 464, in _make_request
self._validate_conn(conn)
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 1093, in _validate_conn
conn.connect()
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connection.py", line 759, in connect
self.sock = sock = self._new_conn()
^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connection.py", line 213, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<HTTPSConnection(host='openaipublic.blob.core.windows.net', port=443) at 0x1bc37471c50>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\requests\adapters.py", line 644, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\connectionpool.py", line 841, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\urllib3\util\retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<HTTPSConnection(host='openaipublic.blob.core.windows.net', port=443) at 0x1bc37471c50>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\run\run_pipeline.py", line 121, in _run_pipeline
result = await workflow_function(config, context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 35, in run_workflow
output = create_base_text_units(
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 140, in create_base_text_units
aggregated = aggregated.apply(
^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\frame.py", line 10401, in apply
return op.apply().finalize(self, method="apply")
^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 916, in apply
return self.apply_standard()
^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 1063, in apply_standard
results, res_index = self.apply_series_generator()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 1081, in apply_series_generator
results[i] = self.func(v, *self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 141, in
lambda row: chunker_with_logging(row, row.name), axis=1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 136, in chunker_with_logging
result = chunker(row)
^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\workflows\create_base_text_units.py", line 108, in chunker
chunked = chunk_text(
^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\chunk_text.py", line 67, in chunk_text
input.apply(
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\frame.py", line 10401, in apply
return op.apply().finalize(self, method="apply")
^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 916, in apply
return self.apply_standard()
^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 1063, in apply_standard
results, res_index = self.apply_series_generator()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\pandas\core\apply.py", line 1081, in apply_series_generator
results[i] = self.func(v, *self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\chunk_text.py", line 70, in
lambda x: run_strategy(
^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\chunk_text.py", line 97, in run_strategy
strategy_results = strategy_exec(texts, config, tick)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\strategies.py", line 45, in run_tokens
encode, decode = get_encoding_fn(encoding_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\graphrag\index\operations\chunk_text\strategies.py", line 22, in get_encoding_fn
enc = tiktoken.get_encoding(encoding_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken\registry.py", line 86, in get_encoding
enc = Encoding(**constructor())
^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken_ext\openai_public.py", line 76, in cl100k_base
mergeable_ranks = load_tiktoken_bpe(
^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken\load.py", line 162, in load_tiktoken_bpe
contents = read_file_cached(tiktoken_bpe_file, expected_hash)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken\load.py", line 67, in read_file_cached
contents = read_file(blobpath)
^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\tiktoken\load.py", line 17, in read_file
resp = requests.get(blobpath)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\api.py", line 73, in get
return request("get", url, params=params, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Program Files\Python311\Lib\site-packages\requests\adapters.py", line 665, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<HTTPSConnection(host='openaipublic.blob.core.windows.net', port=443) at 0x1bc37471c50>, 'Connection to openaipublic.blob.core.windows.net timed out. (connect timeout=None)'))
2025-12-30 18:06:18.0461 - ERROR - graphrag.api.index - Workflow create_base_text_units completed with errors
2025-12-30 18:06:18.0465 - ERROR - graphrag.cli.index - Errors occurred during the pipeline run, see logs for more details.
```
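The traceback points away from the local model endpoint: the timeout is `tiktoken` trying to download `cl100k_base.tiktoken` from `openaipublic.blob.core.windows.net`, which this machine apparently cannot reach. One workaround is to stage the encoding file into tiktoken's cache from a machine that does have internet access. The sketch below assumes tiktoken's documented `TIKTOKEN_CACHE_DIR` lookup and its convention of naming cache files after the SHA-1 hash of the blob URL; the path `E:\tiktoken_cache` is a hypothetical example:

```python
# Run on a machine WITH internet access: download the BPE file and save it
# under the hash-named filename tiktoken's cache lookup expects.
import hashlib
import os
import requests

BLOB_URL = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"
CACHE_DIR = r"E:\tiktoken_cache"  # hypothetical; copy this folder to the offline box

os.makedirs(CACHE_DIR, exist_ok=True)
cache_key = hashlib.sha1(BLOB_URL.encode()).hexdigest()  # tiktoken's cache filename
with open(os.path.join(CACHE_DIR, cache_key), "wb") as f:
    f.write(requests.get(BLOB_URL, timeout=60).content)
```

Then, on the offline machine, set `TIKTOKEN_CACHE_DIR=E:\tiktoken_cache` before running `graphrag index`; `tiktoken.get_encoding("cl100k_base")` should then resolve locally and the `create_base_text_units` workflow can get past chunking.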
### Expected Behavior
No response
### GraphRAG Config Used

```yaml
# Paste your config here
```
### Logs and screenshots
No response
### Additional Information
- GraphRAG Version:
- Operating System:
- Python Version:
- Related Issues: