Application crashes on exit with v0.5.0 #1056

Open
f2bo opened this issue Nov 9, 2024 · 7 comments
Comments

f2bo commented Nov 9, 2024

The latest release (v0.5.0) introduces additional leak-detection checks that will crash an application on exit unless explicit cleanup changes are made. This can easily be reproduced with the sample code included in the README file of the Microsoft.ML.OnnxRuntimeGenAI package.

Executing this code as-is results in a crash when the application exits, after the following message is displayed:

Error: Shutdown must be called before process exit, please check the documentation for the proper API to call to ensure clean shutdown.

Proper disposal of the OgaHandle instance fixes the issue. Replace this:

OgaHandle ogaHandle = new OgaHandle();

with this:

using OgaHandle ogaHandle = new OgaHandle();
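
For context, here is the README-style sample with every native resource disposed (a sketch based on my reading of the 0.5.x C# API; the model path and prompt are placeholders, not taken from the package README):

using Microsoft.ML.OnnxRuntimeGenAI;

// Disposing the handle runs the library shutdown that v0.5.0 now requires.
using OgaHandle ogaHandle = new OgaHandle();

using Model model = new Model("path/to/model");  // placeholder path
using Tokenizer tokenizer = new Tokenizer(model);
using TokenizerStream tokenizerStream = tokenizer.CreateStream();

using Sequences sequences = tokenizer.Encode("<|user|>Hello<|end|><|assistant|>");

using GeneratorParams generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 200);
generatorParams.SetInputSequences(sequences);

using Generator generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();
    generator.GenerateNextToken();

    // Decode and print only the newest token.
    var sequence = generator.GetSequence(0);
    Console.Write(tokenizerStream.Decode(sequence[sequence.Length - 1]));
}

Because using declarations dispose in reverse order at scope exit, the handle declared first is disposed last, after the model, tokenizer, and generator have already been released.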

I couldn't find any documentation for OgaHandle, nor many examples that use it, which is surprising given how essential it now seems to be. This impacts applications using the ONNX Connector in Semantic Kernel, which is currently not aware of this new requirement (see microsoft/semantic-kernel#9628).

While making it easier to diagnose resource leaks is always welcome, crashing the application seems a bit heavy-handed. Maybe just keep the error message but remove the forced shutdown?

skyline75489 (Contributor) commented
I remember having a discussion with @RyanUnderhill about whether we should make this call std::abort. Maybe we can move that discussion here instead of Teams?

skyline75489 (Contributor) commented
In #799, proper release of resources became a hard requirement. This especially impacts C# and Java scenarios, where users need to release resources manually (a using statement in C#, try-with-resources in Java). We should probably at least add some documentation about it.
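
For cases where a using declaration doesn't fit (for example, the handle outlives a single method), the manual equivalent in C# is a try/finally — a sketch:

var ogaHandle = new OgaHandle();
try
{
    // ... create and use Model, Tokenizer, Generator, etc.
    //     (each of those must be disposed as well) ...
}
finally
{
    ogaHandle.Dispose();  // runs the required shutdown even if an exception escapes
}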

sjpritchard commented Nov 13, 2024

The same occurs in C++. I can't find any reference to this in the C++ examples.

[screenshot attachment]

skyline75489 (Contributor) commented
@sjpritchard Hi! Could you please post the C++ code you're using? The Adapters API is quite new and I don't think it should trigger the leaked-resource check.

sjpritchard commented Nov 15, 2024

@skyline75489 Here it is:

// RagLanguageModel.h
#pragma once

#include "ort_genai.h"

#include <QDebug>
#include <QObject>
#include <QString>
#include <QStringList>

#include <memory>  // std::unique_ptr
#include <string>  // std::string

class RagLanguageModel : public QObject {
    Q_OBJECT
public:
    explicit RagLanguageModel(QObject* parent = nullptr);
    void Generate(QString query, QStringList context);

signals:
    void Generated(QString text);

private:
    std::unique_ptr<OgaModel>     model_{nullptr};
    std::unique_ptr<OgaTokenizer> tokenizer_{nullptr};
};

// RagLanguageModel.cpp
#include "RagLanguageModel.h"

RagLanguageModel::RagLanguageModel(QObject* parent) : QObject{parent} {
    model_ = OgaModel::Create("phi-3.5-mini-instruct"); 
    tokenizer_        = OgaTokenizer::Create(*model_);
}

void RagLanguageModel::Generate(QString query, QStringList context) {
    QString joined_context = context.join(" ");

    auto prompt = QString("<|system|>\n"
                          "You are a helpful question answering bot. "
                          "<|end|>\n"
                          "<|user|>\n"
                          "Context: %1\n"
                          "Question: %2."
                          "<|end|>\n"
                          "<|assistant|>")
                      .arg(joined_context)
                      .arg(query);

    auto tokenizer_stream = OgaTokenizerStream::Create(*tokenizer_);
    auto sequences        = OgaSequences::Create();
    tokenizer_->Encode(prompt.toUtf8().constData(), *sequences);

    auto params = OgaGeneratorParams::Create(*model_);
    params->SetInputSequences(*sequences);
    params->SetSearchOption("max_length", 1024);

    try {
        auto        generator = OgaGenerator::Create(*model_, *params);
        while (!generator->IsDone()) {
            generator->ComputeLogits();
            generator->GenerateNextToken();
            size_t         sequence_index  = 0;
            size_t         sequence_length = generator->GetSequenceCount(sequence_index);
            const int32_t* sequence_data   = generator->GetSequenceData(sequence_index);
            int32_t        latest_token    = sequence_data[sequence_length - 1];
            const char*    decoded_chunk   = tokenizer_stream->Decode(latest_token);
            auto           text            = std::string(decoded_chunk);
            emit Generated(QString::fromStdString(text));
        }
    } catch (const std::exception& e) {
        qDebug() << "Error:" << e.what();
    }
}
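
In C++ the fix should be analogous: make sure the library's shutdown runs after all Oga* objects are destroyed but before the process exits. A minimal sketch, assuming OgaShutdown() from the C API (ort_genai_c.h) is the C++-side counterpart of disposing OgaHandle in C#:

#include "RagLanguageModel.h"
#include "ort_genai_c.h"  // declares OgaShutdown() (assumption: also reachable via ort_genai.h)

int main() {
    {
        // Keep every Oga* object in a scope that closes before shutdown runs,
        // so the unique_ptr members release the model and tokenizer first.
        RagLanguageModel model;
        // ... run the event loop / generate as needed ...
    }
    OgaShutdown();  // v0.5.0 requires this before process exit
    return 0;
}

The scoping matters: if shutdown runs while a model or tokenizer is still alive, the leak check would presumably still fire.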

luomaojiang2016 commented
I encountered the same issue: onnxruntime-genai.dll crashes when the program closes. It works fine in version 0.4 but has problems in version 0.5. I also ran an experiment and found that the application crashes on closing whenever it calls the OgaModelCreate function. This is likely a bug in onnxruntime-genai.dll, and I hope it can be fixed as soon as possible. Thank you.

skyline75489 (Contributor) commented
@sjpritchard Are you using DML? Or just CPU?

@RyanUnderhill This also looks like an adapter-related crash, similar to the DML one.
