Description
The chat example below throws a `TypeInitializationException` at startup on an Apple M1 MacBook Air with the CPU backend package installed; the failure happens as soon as `ModelParams` is constructed, which is the first call that touches the native library.
The project file:

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="LlamaSharp" Version="0.13.0" />
    <PackageReference Include="LlamaSharp.Backend.Cpu" Version="0.13.0" />
  </ItemGroup>

</Project>
```
Program.cs:

```cs
using LLama;
using LLama.Common;

string modelPath = @"/Users/m1test/Downloads/Meta-Llama-3-8B-Instruct-IQ3_M.gguf";

var parameters = new ModelParams(modelPath)
{
    ContextSize = 1024, // The longest chat length kept as memory.
    GpuLayerCount = 5   // Number of layers to offload to the GPU; adjust to your GPU memory.
};
using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);
var executor = new InteractiveExecutor(context);

// Seed the chat history with a prompt that tells the AI how to act.
var chatHistory = new ChatHistory();
chatHistory.AddMessage(AuthorRole.System, "Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.");
chatHistory.AddMessage(AuthorRole.User, "Hello, Bob.");
chatHistory.AddMessage(AuthorRole.Assistant, "Hello. How may I help you today?");

ChatSession session = new(executor, chatHistory);

InferenceParams inferenceParams = new InferenceParams()
{
    MaxTokens = 256, // Cap the answer at 256 tokens; remove if the antiprompt alone is enough control.
    AntiPrompts = new List<string> { "User:" } // Stop generation once an antiprompt appears.
};

Console.ForegroundColor = ConsoleColor.Yellow;
Console.Write("The chat session has started.\nUser: ");
Console.ForegroundColor = ConsoleColor.Green;
string userInput = Console.ReadLine() ?? "";

while (userInput != "exit")
{
    // Stream the response token by token.
    await foreach (var text in session.ChatAsync(
                       new ChatHistory.Message(AuthorRole.User, userInput),
                       inferenceParams))
    {
        Console.ForegroundColor = ConsoleColor.White;
        Console.Write(text);
    }
    Console.ForegroundColor = ConsoleColor.Green;
    userInput = Console.ReadLine() ?? "";
}
```
The following exception is thrown at startup:
```
Unhandled exception. System.TypeInitializationException: The type initializer for 'LLama.Native.NativeApi' threw an exception.
 ---> LLama.Exceptions.RuntimeError: The native library cannot be correctly loaded. It could be one of the following reasons:
 - No LLamaSharp backend was installed. Please search LLamaSharp.Backend and install one of them.
 - You are using a device with only CPU but installed cuda backend. Please install cpu backend instead.
 - One of the dependency of the native library is missed. Please use `ldd` on linux, `dumpbin` on windows and `otool` to check if all the dependency of the native library is satisfied. Generally you could find the libraries under your output folder.
 - Try to compile llama.cpp yourself to generate a libllama library, then use `LLama.Native.NativeLibraryConfig.WithLibrary` to specify it at the very beginning of your code. For more information about compilation, please refer to LLamaSharp repo on github.
   at LLama.Native.NativeApi..cctor()
   at Program.<Main>$(String[] args) in /Users/m1test/Downloads/LlamaSharpTest/Program.cs:line 12
   --- End of inner exception stack trace ---
   at LLama.Native.NativeApi.llama_max_devices()
   at LLama.Abstractions.TensorSplitsCollection..ctor()
   at LLama.Common.ModelParams..ctor(String modelPath)
   at Program.<Main>$(String[] args)
```
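The last bullet in the message refers to `LLama.Native.NativeLibraryConfig`. For context, this is roughly how a self-compiled libllama would be wired in; a minimal sketch, assuming the `NativeLibraryConfig.Instance` accessor and a single-path `WithLibrary` overload (some releases also take a separate llava library path as a second argument), with a hypothetical path to a local llama.cpp build:

```cs
using LLama.Native;

// This must run before any other LLamaSharp call, because the native library
// is resolved in NativeApi's static constructor the first time any LLama type
// is touched.
// The .dylib path is hypothetical; substitute your own llama.cpp build output.
NativeLibraryConfig.Instance.WithLibrary("/Users/m1test/llama.cpp/build/libllama.dylib");
```

(Not tried on this machine yet, hence Known Workarounds below is still NA.)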
Reproduction Steps
Create a net8.0 console project with the project file above, paste the program into Program.cs, and `dotnet run` on the machine described below.
Environment & Configuration
- Operating system: macOS on a MacBook Air (M1, 2020), Apple M1 chip
- .NET runtime version: 8.0.302
- LLamaSharp version: LlamaSharp 0.13.0, LlamaSharp.Backend.Cpu 0.13.0
- CUDA version (if you are using cuda backend): NA
- CPU & GPU device: CPU
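
A quick way to capture the runtime and architecture details from code, using the standard `System.Runtime.InteropServices` APIs (useful to rule out the app running as x64 under Rosetta 2, which would change which native binary is needed):

```cs
using System.Runtime.InteropServices;

// OSDescription reports the Darwin kernel version on macOS.
Console.WriteLine($"OS:      {RuntimeInformation.OSDescription}");
// OSArchitecture should be Arm64 on an M1 machine.
Console.WriteLine($"OS arch: {RuntimeInformation.OSArchitecture}");
// ProcessArchitecture shows whether the app itself runs as Arm64
// or as X64 under Rosetta 2.
Console.WriteLine($"Proc:    {RuntimeInformation.ProcessArchitecture}");
// The .NET runtime version in use.
Console.WriteLine($"Runtime: {Environment.Version}");
```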
Known Workarounds
NA