AccessViolationException occurs during multiple calls #23713

Closed
RockNHawk opened this issue Feb 15, 2025 · 2 comments
Comments


RockNHawk commented Feb 15, 2025

AccessViolationException occurs during multiple calls (the crash happens after Profiler::EndProfiling while writing profiler data to a file).

The issue is not caused by EnableProfiling itself; I only enabled EnableProfiling because the crashes were already happening and I wanted logs to inspect.

When I call InferenceSession.Run in a multi-threaded environment, this function crashes after running several times.

If I create a new InferenceSession instance every time, it does not crash and works well.

I tried adding a lock to prevent concurrent calls, but that did not solve the issue. My understanding is that InferenceSession.Run is thread-safe.
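For context, the shared-session pattern relied on above (one long-lived instance whose Run method is called from many threads) can be sketched with a self-contained stand-in. FakeSession below is hypothetical and not part of ONNX Runtime; with a real InferenceSession the shape is the same: create the session once and issue Run calls concurrently, without an external lock, as long as nothing disposes the session while calls are in flight.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a shared InferenceSession: its Run-style method
// is safe to call from many threads at once while the instance stays alive.
sealed class FakeSession
{
    public long Run(long input)
    {
        Thread.Sleep(5);        // pretend native inference work
        return input * 2;
    }
}

class Program
{
    static void Main()
    {
        var session = new FakeSession();   // created once, shared by all threads
        var results = new long[8];
        Parallel.For(0, 8, i => results[i] = session.Run(i));
        Console.WriteLine(results[3]);     // prints 6
    }
}
```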

Here are the logs.

Crash log1:

lfOwnBufferHelper] For ort_value with index: 266, block in memory pattern size is: 194560 but the actual size is: 27904, fall back to default allocation behavior
......
2025-02-15 17:57:52.0556390 [V:onnxruntime:, execution_frame.cc:563 onnxruntime::ExecutionFrame::AllocateMLValueTensorSelfOwnBufferHelper] For ort_value with index: 339, block in memory pattern size is: 276480 but the actual size is: 104448, fall back to default allocation behavior
2025-02-15 17:57:52.0618346 [V:onnxruntime:, execution_frame.cc:563 onnxruntime::ExecutionFrame::AllocateMLValueTensorSelfOwnBufferHelper] For ort_value with index: 340, block in memory pattern size is: 276480 but the actual size is: 104448, fall back to default allocation behavior

2025-02-15 17:57:52.1096720 [I:onnxruntime:, profiler.cc:115 onnxruntime::profiling::Profiler::EndProfiling] Writing profiler data to file onnxruntime_profile__2025-02-15_17-55-37.json

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

Crash log2:

2025-02-15 18:06:59.7739997 [V:onnxruntime:, execution_frame.cc:563 onnxruntime::ExecutionFrame::AllocateMLValueTensorSelfOwnBufferHelper] For ort_value with index: 339, block in memory pattern size is: 122880 but the actual size is: 251904, fall back to default allocation behavior
2025-02-15 18:06:59.7802401 [V:onnxruntime:, execution_frame.cc:563 onnxruntime::ExecutionFrame::AllocateMLValueTensorSelfOwnBufferHelper] For ort_value with index: 340, block in memory pattern size is: 122880 but the actual size is: 251904, fall back to default allocation behavior

2025-02-15 18:07:00.4918686 [I:onnxruntime:, profiler.cc:115 onnxruntime::profiling::Profiler::EndProfiling] Writing profiler data to file onnxruntime_profile__2025-02-15_18-04-46.json

2025-02-15 18:07:00.8677663 [V:onnxruntime:, inference_session.cc:2904 onnxruntime::InferenceSession::EndProfiling] Profiler is disabled.

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

Code:


public SentenceEncoder(Tokenizer tokenizer, TextModelCommonInfo modelInfo, SessionOptions? sessionOptions = null)
{
    _sessionOptions = sessionOptions ?? NewSessionOptions(null);
    _sessionOptions.LogSeverityLevel = OrtLoggingLevel.ORT_LOGGING_LEVEL_VERBOSE;
    _sessionOptions.EnableProfiling = true;
    _session = new InferenceSession(modelInfo.ModelPath, _sessionOptions);
    _tokenizer = tokenizer;
    _modelInfo = modelInfo;
    _outputNames = _session.OutputMetadata.Keys.ToArray();
    _inputNames = new[] { "input_ids", "attention_mask" };
}


public static SessionOptions NewSessionOptions(int? gpuDeviceId)
{
    var sessionOptions = new SessionOptions()
    {
        LogSeverityLevel = OrtLoggingLevel.ORT_LOGGING_LEVEL_WARNING
    };

    Exception? gpuException = null;
    if (gpuDeviceId != null)
    {
        try
        {
            sessionOptions.AppendExecutionProvider_DML(gpuDeviceId.Value);
        }
        catch (Exception ex)
        {
            gpuException = ex;
        }
    }

    sessionOptions.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;
    return sessionOptions;
}


async Task<PooledList<TextEmbeddingResult>> EncodeCore(IEnumerable<TextEmbeddingItem> items)
{
    // inputs
    using var batchInput = EmbeddingGeneratorUtility.GetBatchInput(items, _tokenizer);
    // equals numSentences
    var batchCount = batchInput.BatchCount;
    var maxTokenLength = batchInput.MaxTokenLength;
    var numSentences = batchCount;
    var tokenCount = maxTokenLength;
    Assertion.Assert(tokenCount == maxTokenLength);
    var allInputLength = maxTokenLength * batchCount;
    long[] flattenIDs = new long[allInputLength];
    long[] flattenAttentionMask = new long[allInputLength];
    long[]? flattenTokenTypeIds = ModelType == ModelTypes.MiniLM ? new long[allInputLength] : null;


    var inputList = batchInput.Inputs.ToList();
    Assertion.Debug(inputList.Count == batchCount);
    Assertion.Debug(inputList.Count * maxTokenLength == allInputLength);
    for (int i = 0; i < inputList.Count; i++)
    {
        var offset = i * maxTokenLength;
        var item = inputList[i];
        Assertion.Debug(item.InputIds.Length == maxTokenLength);
        // this operation is required
        item.InputIds.CopyTo(flattenIDs, offset);
        item.AttentionMask.CopyTo(flattenAttentionMask, offset);
    }

    using var runOptions = new RunOptions();
    using var inputIdsOrtValue = OrtValue.CreateTensorValueFromMemory(flattenIDs, [batchCount, maxTokenLength]);
    using var attentionMaskOrtValue = OrtValue.CreateTensorValueFromMemory(flattenAttentionMask, [batchCount, maxTokenLength]);

    var inputs = new List<OrtValue>
    {
        inputIdsOrtValue,
        attentionMaskOrtValue,
    };

    using var outputValue = OrtValue.CreateAllocatedTensorValue(OrtAllocator.DefaultInstance, TensorElementType.Float, [batchCount, maxTokenLength, Dim]);
    await _session.RunAsync(runOptions, _inputNames, inputs, _outputNames, [outputValue]);
    // Assertion.Debug(outputTensor.Dimensions[2] == Dim);
    var outputVectors = outputValue.GetTensorDataAsSpan<float>();
    Assertion.Assert(outputVectors.Length == batchCount * maxTokenLength * Dim);

    ReadOnlySpan<int> optDims = stackalloc int[]
    {
        numSentences, tokenCount, Dim
    };
    DenseTensor<float> outputTensor = new DenseTensor<float>(outputVectors.ToArray(), optDims);

    var output_pooled = MeanPooling(outputTensor, inputList);
    var output_pooled_normalized = Normalize(output_pooled);
    Assertion.Debug(output_pooled.Length == output_pooled_normalized.Length);
    Assertion.Debug(output_pooled.Length == batchCount * Dim);

    var results = PolledCollectionUtility.List<TextEmbeddingResult>(batchCount);
    for (int s = 0; s < batchCount; s++)
    {
        var emb = new float[Dim];
        for (int i = 0; i < Dim; i++)
        {
            emb[i] = output_pooled_normalized[s, i];
        }

        results.Add(new(inputList[s].InputSource, emb));
    }

    return results;
}
@RockNHawk RockNHawk changed the title AccessViolationException occurs during multiple calls (the crash happens after Profiler::EndProfiling while writing profiler data to a file). AccessViolationException occurs during multiple calls Feb 15, 2025

RockNHawk commented Feb 15, 2025

I also tried setting EnableCpuMemArena and EnableMemoryPattern to false, but that did not help. Then I added an _sessionOptions.IsInvalid check and re-created the InferenceSession as a workaround; it still crashes, but the application can run more times before it does.

public SentenceEncoder(Tokenizer tokenizer, TextModelCommonInfo modelInfo, SessionOptions? sessionOptions = null)
{
    _sessionOptions = sessionOptions ?? NewSessionOptions(null);
    ConfigureSessionOptions(_sessionOptions);
    _session = new InferenceSession(modelInfo.ModelPath, _sessionOptions);
     .....
}


private void ConfigureSessionOptions(SessionOptions sessionOptions)
{
    sessionOptions.EnableCpuMemArena = false; 
    sessionOptions.EnableMemoryPattern = false; 
}


public async Task<PooledList<TextEmbeddingResult>> EncodeCore(IEnumerable<TextEmbeddingItem> items)
{
    if (_sessionOptions.IsInvalid)
    {
        Console.WriteLine("[Embedding] Invalid session");
        Console.Error.WriteLine("[Embedding] Invalid session");

        // the session is broken, do not access it even not call `.Dispose()`, it will make AccessViolationException:
        // _session.Dispose();

        _sessionOptions = NewSessionOptions(null);
        ConfigureSessionOptions(_sessionOptions);
        _session = new InferenceSession(_modelInfo.ModelPath, _sessionOptions);
    }

    return await EncodeCoreImpl(items);
}
Crash log:

[Embedding] Invalid session
Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

@RockNHawk

Sorry, this was my fault: I was disposing the session after each batch of work.
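For anyone hitting the same crash: disposing a session that other threads may still be calling is exactly the kind of use-after-free that surfaces as an AccessViolationException. One way to guard against it is a wrapper whose Dispose waits for all in-flight calls to drain before tearing anything down. This is a minimal sketch, not ONNX Runtime API; GuardedSession is a hypothetical name, and the Func<T> stands in for the real session.Run call.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper: Dispose blocks until all in-flight Run calls finish,
// so a disposed native session is never touched by a late caller.
sealed class GuardedSession : IDisposable
{
    private readonly ReaderWriterLockSlim _gate = new ReaderWriterLockSlim();
    private bool _disposed;

    // Stand-in for InferenceSession.Run; many callers may hold the read lock at once.
    public T Run<T>(Func<T> inference)
    {
        _gate.EnterReadLock();
        try
        {
            if (_disposed) throw new ObjectDisposedException(nameof(GuardedSession));
            return inference();
        }
        finally { _gate.ExitReadLock(); }
    }

    public void Dispose()
    {
        _gate.EnterWriteLock();   // waits until no Run call is in flight
        try { _disposed = true; } // real code would dispose the InferenceSession here
        finally { _gate.ExitWriteLock(); }
    }
}

class Program
{
    static void Main()
    {
        var guarded = new GuardedSession();
        Console.WriteLine(guarded.Run(() => 41 + 1));   // prints 42
        guarded.Dispose();
        try { guarded.Run(() => 0); }
        catch (ObjectDisposedException) { Console.WriteLine("disposed"); }
    }
}
```

After Dispose, late callers get a clean ObjectDisposedException instead of corrupting native memory.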
