- Introduction
- Groq cloud console
- Usage
- Contributing
- License
Welcome to the unofficial GroqCloud API Wrapper for Delphi. This project provides a Delphi interface for accessing and interacting with the powerful language models available on GroqCloud, including those developed by Meta (Llama), OpenAI (Whisper), MistralAI (Mixtral), and Google (Gemma).
With this library, you can seamlessly integrate state-of-the-art language generation, chat and vision capabilities, code generation, or speech-to-text transcription into your Delphi applications.
GroqCloud offers a high-performance, efficient platform optimized for running large language models via its proprietary Language Processing Units (LPUs), delivering speed and energy efficiency that surpass traditional GPUs. This wrapper simplifies access to these models, allowing you to leverage GroqCloud's cutting-edge infrastructure without the overhead of managing the underlying hardware.
For more details on GroqCloud's offerings, visit the official GroqCloud documentation.
To initialize the API instance, you need to obtain an API key from GroqCloud.
Once you have a key, you can initialize the IGroq interface, which is the entry point to the API.
Since there can be many parameters and not all of them are required, they are configured using an anonymous function.
Note
uses Groq;
var GroqCloud := TGroqFactory.CreateInstance(API_KEY);
Warning
To use the examples provided in this tutorial, especially when working with asynchronous methods, I recommend defining the Groq interface with the widest possible scope.
So call GroqCloud := TGroqFactory.CreateInstance(API_KEY); in the OnCreate event of your application, where GroqCloud is declared as GroqCloud: IGroq.
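A minimal sketch of that setup, assuming a VCL application whose main form is TForm1 and an API_KEY constant (both names are illustrative):
// uses Vcl.Forms, Groq;
type
  TForm1 = class(TForm)
    procedure FormCreate(Sender: TObject);
  private
    //Form-level scope keeps the interface alive for asynchronous callbacks
    GroqCloud: IGroq;
  end;

procedure TForm1.FormCreate(Sender: TObject);
begin
  GroqCloud := TGroqFactory.CreateInstance(API_KEY);
end;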
You can access your GroqCloud account settings to view your payment information, usage, limits, logs, teams, and profile by following this link.
In the context of asynchronous methods, a method that does not involve streaming uses callbacks based on the generic record TAsynCallBack<T>, defined in the Groq.Async.Support.pas unit. This record exposes the following properties:
TAsynCallBack<T> = record
...
  Sender: TObject;                 //Passed back to every callback
  OnStart: TProc<TObject>;         //Fired when the request starts
  OnSuccess: TProc<TObject, T>;    //Fired with the typed result
  OnError: TProc<TObject, string>; //Fired with an error message
end;
For methods requiring streaming, callbacks use the generic record TAsynStreamCallBack<T>, also defined in the Groq.Async.Support.pas unit. This record exposes the following properties:
TAsynStreamCallBack<T> = record
...
  Sender: TObject;                 //Passed back to every callback
  OnStart: TProc<TObject>;         //Fired when the request starts
  OnSuccess: TProc<TObject, T>;    //Fired once the stream completes
  OnProgress: TProc<TObject, T>;   //Fired for each streamed chunk
  OnError: TProc<TObject, string>; //Fired with an error message
  OnCancellation: TProc<TObject>;  //Fired after the stream is cancelled
  OnDoCancel: TFunc<Boolean>;      //Polled during streaming; return True to cancel
end;
The name of each property is self-explanatory; if needed, refer to the internal documentation for more details.
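None of the streaming examples below exercise the cancellation hooks, so here is a hedged sketch that does. It reuses the DisplayStream helpers defined later in this document and assumes a form-level Boolean field FStopRequested that a Stop button sets to True (the field name is illustrative):
// uses Groq, Groq.Chat;
GroqCloud.Chat.AsynCreateStream(
  procedure (Params: TChatParams)
  begin
    Params.Messages([TPayload.User('Write a long essay about LPUs')]);
    Params.Model('llama-3.1-8b-instant');
    Params.Stream(True);
  end,
  function : TAsynChatStream
  begin
    Result.Sender := Memo1;
    Result.OnProgress := DisplayStream;
    Result.OnError := DisplayStream;
    //Polled while streaming; returning True aborts the request
    Result.OnDoCancel :=
      function : Boolean
      begin
        Result := FStopRequested;
      end;
    //Fired once the stream has actually been cancelled
    Result.OnCancellation :=
      procedure (Sender: TObject)
      begin
        DisplayStream(Sender, sLineBreak + '[cancelled]');
      end;
  end);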
GroqCloud currently supports a range of hosted models. They can be accessed directly via the GroqCloud Models API endpoint using their model IDs. To retrieve a JSON list of all available models, use the endpoint at https://api.groq.com/openai/v1/models.
- Synchronously
// uses Groq, Groq.Models;
var Models := GroqCloud.Models.List;
try
for var Item in Models.Data do
WriteLn(Item.Id);
finally
Models.Free;
end;
- Asynchronously
// uses Groq, Groq.Models;
GroqCloud.Models.AsynList(
function : TAsynModels
begin
Result.Sender := Memo1; //Set a TMemo on the form
Result.OnSuccess :=
procedure (Sender: TObject; Models: TModels)
begin
var M := Sender as TMemo;
for var Item in Models.Data do
begin
M.Lines.Text := M.Text + Item.Id + sLineBreak;
M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
end;
end;
Result.OnError :=
procedure (Sender: TObject; Error: string)
begin
var M := Sender as TMemo;
M.Lines.Text := M.Text + Error + sLineBreak;
M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
end;
end);
GroqCloud does not provide any solution for text embeddings.
The Groq Chat Completions API interprets a series of messages and produces corresponding response outputs. These models can handle either multi-turn conversations or single-interaction tasks.
JSON Mode (Beta)
JSON mode is currently in beta and ensures that all chat completions are in valid JSON format.
How to Use:
- Include "response_format": {"type": "json_object"} in your chat completion request.
- In the system prompt, specify the structure of the desired JSON output (see the sketch below).
Best Practices for Optimal Beta Performance:
- For JSON generation, Mixtral is the most effective model, followed by Gemma, and then Llama.
- Use pretty-printed JSON for better readability over compact JSON.
- Keep prompts as concise as possible.
Beta Limitations:
- Streaming is not supported.
- Stop sequences are not supported.
Error Code:
If JSON generation fails, Groq will respond with a 400 error, specifying json_validate_failed as the error code.
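Here is a hedged sketch of a JSON-mode request. It reuses the synchronous Chat.Create call and the Display helpers shown later in this document, together with the Params.ResponseFormat(to_json_object) setter used in the vision JSON example further down; the model ID and the schema described in the system prompt are illustrative.
// uses Groq, Groq.Chat;
var Chat := GroqCloud.Chat.Create(
  procedure (Params: TChatParams)
  begin
    Params.Model('mixtral-8x7b-32768'); //Mixtral: recommended above for JSON generation
    Params.Messages([
      //Describe the expected structure in the system prompt
      TPayload.System('You answer in JSON using the schema {"answer": string, "sources": string[]}'),
      TPayload.User('Why are fast language models important?')]);
    //Forces valid JSON output; remember that streaming is not supported in JSON mode
    Params.ResponseFormat(to_json_object);
  end);
//Set a TMemo on the form
try
  Display(Memo1, Chat);
finally
  Chat.Free;
end;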
Note
We will use only Meta models in all the examples provided for text generation.
The GroqCloud API allows for text generation using various inputs, like text and images. It's versatile and can support a wide array of applications, including:
- Creative writing
- Text completion
- Summarizing open-ended text
- Chatbot development
- Any custom use cases you have in mind
In the examples below, we'll use the Display procedures to make things simpler.
Tip
procedure Display(Sender: TObject; Value: string); overload;
begin
var M := Sender as TMemo;
M.Lines.Text := M.Text + Value + sLineBreak;
M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
end;
procedure Display(Sender: TObject; Chat: TChat); overload;
begin
for var Choice in Chat.Choices do
Display(Sender, Choice.Message.Content);
end;
// uses Groq, Groq.Chat;
var Chat := GroqCloud.Chat.Create(
procedure (Params: TChatParams)
begin
Params.Messages([TPayload.User('Explain the importance of fast language models')]);
Params.Model('llama-3.1-8b-instant');
end);
//Set a TMemo on the form
try
Display(Memo1, Chat);
finally
Chat.Free;
end;
// uses Groq, Groq.Chat;
GroqCloud.Chat.AsynCreate(
procedure (Params: TChatParams)
begin
Params.Messages([TPayload.User('Explain the importance of fast language models')]);
Params.Model('llama-3.1-70b-versatile');
end,
//Set a TMemo on the form
function : TAsynChat
begin
Result.Sender := Memo1;
Result.OnSuccess := Display;
Result.OnError := Display;
end);
In the examples below, we'll use the DisplayStream procedures to make things simpler.
Tip
procedure DisplayStream(Sender: TObject; Value: string); overload;
begin
var M := Sender as TMemo;
for var index := 1 to Value.Length do
if Value[index] = #13 then
begin
M.Lines.Text := M.Text + sLineBreak;
M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
end
else
begin
M.Lines.BeginUpdate;
try
M.Lines.Text := M.Text + Value[index];
M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
finally
M.Lines.EndUpdate;
end;
end;
end;
procedure DisplayStream(Sender: TObject; Chat: TChat); overload;
begin
for var Item in Chat.Choices do
if Assigned(Item.Delta) then
DisplayStream(Sender, Item.Delta.Content)
else
if Assigned(Item.Message) then
DisplayStream(Sender, Item.Message.Content);
end;
// uses Groq, Groq.Chat;
GroqCloud.Chat.CreateStream(
procedure (Params: TChatParams)
begin
Params.Messages([TPayload.User('How did we come to develop thermodynamics?')]);
Params.Model('llama3-70b-8192');
Params.Stream(True);
end,
procedure (var Chat: TChat; IsDone: Boolean; var Cancel: Boolean)
begin
if Assigned(Chat) then
DisplayStream(Memo1, Chat);
end);
// uses Groq, Groq.Chat;
GroqCloud.Chat.AsynCreateStream(
procedure (Params: TChatParams)
begin
Params.Messages([TPayload.User('How did we come to develop thermodynamics?')]);
Params.Model('llama-3.1-70b-versatile');
Params.Stream(True);
end,
function : TAsynChatStream
begin
Result.Sender := Memo1;
Result.OnProgress := DisplayStream;
Result.OnError := DisplayStream;
end);
You can utilize the GroqCloud API to build interactive chat experiences customized for your users. With the API's chat capability, you can facilitate multiple rounds of questions and answers, allowing users to gradually work toward solutions or get support for complex, multi-step issues. This feature is particularly valuable for applications that need ongoing interaction, like:
- Chatbots
- Educational tools
- Customer support assistants
Here's an asynchronous example of a simple chat setup:
// uses Groq, Groq.Chat;
GroqCloud.Chat.AsynCreateStream(
procedure (Params: TChatParams)
begin
Params.Model('llama-3.2-3b-preview');
Params.Messages([
TPayload.User('Hello'),
TPayload.Assistant('Great to meet you. What would you like to know?'),
TPayload.User('I have two dogs in my house. How many paws are in my house?')
]);
Params.Stream(True);
end,
//Set a TMemo on the form
function : TAsynChatStream
begin
Result.Sender := Memo1;
Result.OnProgress := DisplayStream;
Result.OnError := DisplayStream;
end);
When configuring an AI model, you have the option to set guidelines for how it should respond. For instance, you could assign it a particular role, like act as a mathematician, or give it instructions on tone, such as speak like a military instructor. These guidelines are established by setting up system instructions when the model is initialized.
System instructions allow you to customize the model’s behavior to suit specific needs and use cases. Once configured, they add context that helps guide the model to perform tasks more accurately according to predefined guidelines throughout the entire interaction. These instructions apply across multiple interactions with the model.
System instructions can be used for several purposes, such as:
- Defining a persona or role (e.g., configuring the model to function as a customer service chatbot)
- Specifying an output format (like Markdown, JSON, or YAML)
- Adjusting the output style and tone (such as modifying verbosity, formality, or reading level)
- Setting goals or rules for the task (for example, providing only a code snippet without additional explanation)
- Supplying relevant context (like a knowledge cutoff date)
These instructions can be set during model initialization and will remain active for the duration of the session, guiding how the model responds. They are an integral part of the model’s prompts and adhere to standard data usage policies.
// uses Groq, Groq.Chat;
GroqCloud.Chat.AsynCreateStream(
procedure (Params: TChatParams)
begin
Params.Model('llama3-8b-8192');
Params.Messages([
TPayload.System('you are a rocket scientist'),
TPayload.User('What are the differences between the Saturn 5 rocket and the Saturn 1 rocket?') ]);
Params.Stream(True);
end,
function : TAsynChatStream
begin
Result.Sender := Memo1;
Result.OnProgress := DisplayStream;
Result.OnError := DisplayStream;
end);
Caution
System instructions help the model follow directions, but they don't completely prevent jailbreaks or information leaks. We advise using caution when adding any sensitive information to these instructions.
Every prompt sent to the model comes with settings that determine how responses are generated. You have the option to adjust these settings, letting you fine-tune various parameters. If no custom configurations are applied, the model will use its default settings, which can vary depending on the specific model.
Here’s an example showing how to modify several of these options.
// uses Groq, Groq.Chat;
GroqCloud.Chat.AsynCreateStream(
procedure (Params: TChatParams)
begin
Params.Model('llama-3.1-8b-instant');
Params.Messages([
TPayload.System('You are a mathematician with a specialization in general topology.'),
TPayload.User('In a discrete topology, do accumulation points exist?') ]);
Params.Stream(True);
Params.Temperature(0.2);
Params.PresencePenalty(1.6);
Params.MaxToken(640);
end,
function : TAsynChatStream
begin
Result.Sender := Memo1;
Result.OnProgress := DisplayStream;
Result.OnError := DisplayStream;
end);
The Groq API provides rapid inference and low latency for multimodal models with vision capabilities, enabling the comprehension and interpretation of visual data from images. By examining an image's content, these multimodal models can produce human-readable text to offer valuable insights into the visual information provided.
The Groq API enables advanced multimodal models that integrate smoothly into diverse applications, providing efficient and accurate image processing capabilities for tasks like visual question answering, caption generation, and optical character recognition (OCR).
See the official documentation.
Supported image MIME types include the following formats:
- JPEG - image/jpeg
- PNG - image/png
- WEBP - image/webp
- HEIC - image/heic
- HEIF - image/heif
// uses Groq, Groq.Chat;
var Ref := 'Z:\My_Folder\Images\Images01.jpg';
GroqCloud.Chat.AsynCreateStream(
procedure (Params: TChatParams)
begin
Params.Model('llama-3.2-11b-vision-preview');
Params.Messages([TPayload.User('Describe the image', [Ref])]);
Params.Stream(True);
Params.Temperature(1);
Params.MaxToken(1024);
Params.TopP(1);
end,
function : TAsynChatStream
begin
Result.Sender := Memo1;
Result.OnProgress := DisplayStream;
Result.OnError := DisplayStream;
end);
// uses Groq, Groq.Chat;
var Ref := 'https://www.toureiffel.paris/themes/custom/tour_eiffel/build/images/home-discover-bg.jpg';
GroqCloud.Chat.AsynCreateStream(
procedure (Params: TChatParams)
begin
Params.Model('llama-3.2-90b-vision-preview');
Params.Messages([TPayload.User('What''s in this image?', [Ref])]);
Params.Stream(True);
Params.Temperature(0.3);
Params.MaxToken(1024);
Params.TopP(1);
end,
function : TAsynChatStream
begin
Result.Sender := Memo1;
Result.OnProgress := DisplayStream;
Result.OnError := DisplayStream;
end);
The llama-3.2-90b-vision-preview and llama-3.2-11b-vision-preview models now support JSON mode! Here's an example that queries the model with both an image and text (e.g., "Please extract relevant information as a JSON object.") with response_format set to JSON mode.
Caution
You can't use JSON mode with a streamed response.
// uses Groq, Groq.Chat;
var Ref := 'https://www.toureiffel.paris/themes/custom/tour_eiffel/build/images/home-discover-bg.jpg';
GroqCloud.Chat.AsynCreate(
procedure (Params: TChatParams)
begin
Params.Model('llama-3.2-90b-vision-preview');
Params.Messages([TPayload.User('List what you observe in this photo in JSON format?', [Ref])]);
Params.Temperature(1);
Params.MaxToken(1024);
Params.TopP(1);
Params.ResponseFormat(to_json_object);
end,
function : TAsynChat
begin
Result.Sender := Memo1;
Result.OnSuccess := Display;
Result.OnError := Display;
end);
Although you can add multiple images, GroqCloud limits its vision models to a single image. As a result, it is not possible to compare multiple images.
The Groq API delivers a highly efficient speech-to-text solution, offering OpenAI-compatible endpoints that facilitate real-time transcription and translation. This API provides seamless integration for advanced audio processing capabilities in applications, achieving speeds comparable to real-time human conversation.
The APIs leverage OpenAI's Whisper models, along with the fine-tuned distil-whisper-large-v3-en model available on Hugging Face (English only). For further details, please refer to the official documentation.
File uploads are currently limited to 25 MB, and the following input file types are supported:
- mp3
- mp4
- mpeg
- mpga
- m4a
- wav
- webm
Tip
procedure Display(Sender: TObject; Transcription: TAudioText); overload;
begin
Display(Sender, Transcription.Text);
end;
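A synchronous variant is sketched below; it assumes the wrapper pairs AsynCreateTranscription with a synchronous CreateTranscription method, mirroring the Create/AsynCreate pairing of the chat API (this method name is unverified).
// uses Groq, Groq.Audio;
var Transcription := GroqCloud.Audio.CreateTranscription( //hypothetical synchronous call
  procedure (Params: TAudioTranscription)
  begin
    Params.Model('whisper-large-v3-turbo');
    Params.&File('Z:\My_Folder\Sound\sound.mp3');
  end);
try
  Display(Memo1, Transcription);
finally
  Transcription.Free;
end;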
Asynchronously
// uses Groq, Groq.Chat, Groq.Audio;
GroqCloud.Audio.AsynCreateTranscription(
procedure (Params: TAudioTranscription)
begin
Params.Model('whisper-large-v3-turbo');
Params.&File('Z:\My_Folder\Sound\sound.mp3');
end,
function : TAsynAudioText
begin
Result.Sender := Memo1;
Result.OnSuccess := Display;
Result.OnError := Display;
end);
The optional prompt parameter provides text to guide the model's style or to continue a previous audio segment; the prompt should match the audio language.
Refer to the official documentation for detailed parameters.
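For illustration, here is a hedged sketch of passing such a prompt; it assumes TAudioTranscription exposes a Prompt setter mirroring the REST parameter of the same name (unverified).
// uses Groq, Groq.Audio;
GroqCloud.Audio.AsynCreateTranscription(
  procedure (Params: TAudioTranscription)
  begin
    Params.Model('whisper-large-v3-turbo');
    Params.&File('Z:\My_Folder\Sound\sound.mp3');
    //Hypothetical setter: supply domain vocabulary to guide spelling and style
    Params.Prompt('Vocabulary: GroqCloud, LPU, Delphi');
  end,
  function : TAsynAudioText
  begin
    Result.Sender := Memo1;
    Result.OnSuccess := Display;
    Result.OnError := Display;
  end);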
Asynchronously
// uses Groq, Groq.Chat, Groq.Audio;
GroqCloud.Audio.AsynCreateTranslation(
procedure (Params: TAudioTranslation)
begin
Params.Model('whisper-large-v3');
Params.&File('Z:\My_Folder\Sound\sound.mp3');
end,
function : TAsynAudioText
begin
Result.Sender := Memo1;
Result.OnSuccess := Display;
Result.OnError := Display;
end);
If you include a prompt parameter in your request, it must be written in English.
Refer to the official documentation for detailed parameters.
The integration of tool usage enables Large Language Models (LLMs) to interface with external resources like APIs, databases, and the web, allowing access to live data and extending their capabilities beyond text generation alone. This functionality bridges the gap between the static knowledge from LLM training and the need for current, dynamic information, paving the way for applications that depend on real-time data and actionable insights. Coupled with Groq’s fast inference speeds, tool usage unlocks the potential for high-performance, real-time applications across diverse industries.
Refer to the official documentation.
Groq has fine-tuned the following models specifically for optimized tool use, and they are now available in public preview:
- llama3-groq-70b-8192-tool-use-preview
- llama3-groq-8b-8192-tool-use-preview
For more details, please see the launch announcement.
Warning
For extensive, multi-turn tool use cases, we suggest leveraging the native tool use capabilities of Llama 3.1 models. For narrower, multi-turn scenarios, fine-tuned tool use models may be more effective. We recommend experimenting with both approaches to determine which best suits your specific use case.
The following Llama-3.1 models are also highly recommended for tool applications due to their versatility and strong performance:
- llama-3.1-70b-versatile
- llama-3.1-8b-instant
Other Supported Models
The following models powered by Groq also support tool use:
- llama3-70b-8192
- llama3-8b-8192
- mixtral-8x7b-32768 (parallel tool use not supported)
- gemma-7b-it (parallel tool use not supported)
- gemma2-9b-it (parallel tool use not supported)
Tip
procedure TMyForm.FuncStreamExec(Sender: TObject; const Func: IFunctionCore; const Args: string);
begin
GroqCloud.Chat.AsynCreateStream(
procedure (Params: TChatParams)
begin
Params.Messages([TPayLoad.User(Func.Execute(Args))]);
Params.Model('llama-3.1-8b-instant');
Params.Stream(True);
end,
function : TAsynChatStream
begin
Result.Sender := Sender;
Result.OnProgress := DisplayStream;
Result.OnError := DisplayStream;
end);
end;
// uses Groq, Groq.Chat, Groq.Functions.Core, Groq.Functions.Example;
var Weather := TWeatherReportFunction.CreateInstance;
var Chat := GroqCloud.Chat.Create(
procedure (Params: TChatParams)
begin
Params.Messages([TPayload.User(Memo2.Text)]);
Params.Model('llama3-groq-70b-8192-tool-use-preview');
Params.Tools([Weather]);
Params.ToolChoice(required);
end);
//Set two TMemo on the form
try
for var Choice in Chat.Choices do
begin
if Choice.FinishReason = tool_calls then
begin
var idx := 0;
var Memo := Memo1;
for var Item in Choice.Message.ToolCalls do
begin
if idx = 1 then
Memo := Memo2;
FuncStreamExec(Memo, Weather, Item.&Function.Arguments);
Inc(idx);
if idx = 2 then
Exit;
end
end
else
Display(Memo1, Choice.Message.Content)
end;
finally
Chat.Free;
end;
In this code example, if the tool returns multiple results, only the first two will be processed, each displayed in one of the two TMemo controls.
The Groq.Functions.Core.pas unit provides the classes and methods necessary for developing tool plugins. Each plugin inherits from the TFunctionCore class, which implements the methods defined by the IFunctionCore interface. For guidance on creating custom functions, refer to the Groq.Functions.Example.pas unit, which serves as a practical example.
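Purely as a hypothetical sketch of what such a plugin could look like, the snippet below assumes TFunctionCore exposes a virtual Execute method, a name inferred from the Func.Execute(Args) call above; check Groq.Functions.Example.pas for the actual contract before relying on it.
// uses System.SysUtils, Groq.Functions.Core;
type
  //Hypothetical plugin: the overridden member below is an assumption, not the verified API
  TClockFunction = class(TFunctionCore)
  public
    function Execute(const Arguments: string): string; override;
  end;

function TClockFunction.Execute(const Arguments: string): string;
begin
  //Ignore Arguments and return the current time for the model to phrase
  Result := FormatDateTime('yyyy-mm-dd hh:nn:ss', Now);
end;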
GroqCloud provides access to the Llama-Guard-3-8B model for effective text content moderation.
Moderation code example
Tip
procedure DisplayHarm(Sender: TObject; Chat: TChat);
begin
for var Item in Chat.Choices do
if Assigned(Item.Message) then
begin
Display(Sender, TSafety.Create(Item.Message.Content).ToString(True));
end;
end;
// uses Groq, Groq.Chat, Groq.Safety;
GroqCloud.Chat.AsynCreate(
procedure (Params: TChatParams)
begin
Params.Model('llama-guard-3-8b');
Params.Messages([TPayload.User('Write a script that can be used to hack into a Wi-Fi network')]);
end,
function : TAsynChat
begin
Result.Sender := Memo1;
Result.OnSuccess := DisplayHarm;
Result.OnError := Display;
end);
The Groq.Safety.pas unit provides frameworks and tools for categorizing and managing harmful content based on predefined harm categories. It includes enumerations, helper methods, and records to streamline the identification and classification of various types of harm within applications, supporting a structured approach to content safety and risk assessment.
Note
Llama Guard 3 is trained to predict safety labels for the 14 harm categories of the MLCommons taxonomy of hazards.
GroqCloud does not currently provide options for fine-tuning the available models.
Tip
interface
procedure Display(Sender: TObject; Value: string); overload;
procedure Display(Sender: TObject; Chat: TChat); overload;
procedure DisplayStream(Sender: TObject; Value: string); overload;
procedure DisplayStream(Sender: TObject; Chat: TChat); overload;
procedure Display(Sender: TObject; Transcription: TAudioText); overload;
procedure DisplayHarm(Sender: TObject; Chat: TChat);
...
Pull requests are welcome. If you're planning to make a major change, please open an issue first to discuss your proposed changes.
This project is licensed under the MIT License.