A powerful Dart client library for interacting with the Groq Cloud API, empowering you to easily harness the capabilities of state-of-the-art Large Language Models (LLMs) within your Dart and Flutter applications.
Note: This is an independent project and not an official package maintained by Groq. For official resources, please visit groq.com.
- Intuitive Chat Interface: Seamlessly create and manage chat sessions with Groq's LLMs.
- Model Management: Retrieve metadata about available Groq models and dynamically switch between them.
- Customization: Configure chat settings to fine-tune responses (temperature, max tokens, etc.).
- Tool Use: Let the model invoke functions to retrieve additional data or perform tasks.
- Resource Usage Tracking: Get detailed insights into token usage and request/response times.
- Rate Limit Information: Stay informed about your Groq API usage limits.
- Future-proof: Easily support new Groq models as they become available.
- Audio Transcription: Transcribe audio files into text using Groq's powerful Whisper models.
- Audio Translation: Translate audio files directly into english.
- Content Moderation: Easily check if texts are harmful.
-
Obtain a Groq API Key:
- Visit the Groq Cloud console to create your API key: https://console.groq.com/keys
-
Install the Groq Dart SDK:
- Add
groq_sdk
to yourpubspec.yaml
file:dependencies: groq_sdk: ^0.1.0 # add the latest version here
- Run
dart pub get
.
- Add
This initiates a new chat session with the specified model, optionally customizing settings like temperature and max tokens.
final groq = Groq('YOUR_GROQ_API_KEY');
//Start a chat with default settings
if(!await groq.canUseModel(GroqModels.llama3_8b)) return;
final chat = groq.startNewChat(GroqModels.llama3_8b);
//Start a chat with custom settings
final customChat = groq.startNewChat(GroqModels.llama3_70b, settings: GroqChatSettings(
temperature: 0.8, //More creative responses
maxTokens: 512, //shorter responses
));
This allows you to process each message (both user requests and model responses) as they are sent and received in real-time.
final chat = groq.startNewChat(GroqModels.llama3_8b);
chat.stream.listen((event) {
event.when(request: (requestEvent) {
//Listen for user prompts
print('Request sent...');
print(requestEvent.message.content);
}, response: (responseEvent) {
//Listen for llm responses
print(
'Received response: ${responseEvent.response.choices.first.message}');
});
});
Sends a message to the model and awaits the response. The usage object provides details about token consumption and timing. It also sends a request and either a response or an error to the chat's stream
.
You can additionally retrieve the response and usage via the return values of sendMessage
final (response, usage) = await chat.sendMessage('Explain LLMs to me please');
print(response.choices.first.message);
You can request the model to return a response in JSON format by setting the expectJSON
parameter to true
when sending a message. For this feature you need to explain the JSON structure in the prompt.
final (response, usage) = await chat.sendMessage(
'Is the following city name a capital? Answer in a json format with the key "capital", which takes a bool as value: New York',
expectJSON: true,
);
It will return
{
"capital": false
}
This allows you to dynamically change the language model used in the chat session.
chat.switchModel(GroqModels.mixtral8_7b); //Also available during a running chat
The Groq SDK allows you to register tools that can be invoked dynamically during a chat. Tools encapsulate specific functionality and can accept parameters to customize their behavior.
The following example demonstrates how to create a weather tool, register it in a chat, and handle a tool call dynamically.
// Define the weather tool
final weatherTool = GroqToolItem(
functionName: 'get_weather',
functionDescription: 'Get weather information for a specified location',
parameters: [
GroqToolParameter(
parameterName: 'location',
parameterDescription: 'City or location name',
parameterType: GroqToolParameterType.string,
isRequired: true,
),
GroqToolParameter(
parameterName: 'units',
parameterDescription: 'Temperature units (metric or imperial)',
parameterType: GroqToolParameterType.string,
isRequired: false,
allowedValues: ['metric', 'imperial'],
),
],
function: (args) {
final location = args['location'] as String;
final units = args['units'] as String? ?? 'metric';
return MyWeatherApi.getWeather(location, units);
},
);
final chat = groq.startNewChat(GroqModels.llama3_groq_70b_tool_use_preview);
// Register the tool with the chat
chat.registerTool(weatherTool);
// Send a message to the chat and handle tool calls
final (response, usage) = await chat.sendMessage(
'What is the weather in Boston like (in metric units)?',
);
final message = response.choices.first.messageData;
// Handle tool calls dynamically
if (message.isToolCall) {
for (final toolCall in message.toolCalls) {
print('Tool call: ${toolCall.functionName}');
final retrieveWeatherInBoston = chat.getToolCallable(toolCall);
print('Weather result: ${retrieveWeatherInBoston()}');
}
}
weatherTool
specifies theget_weather
function, requiring a location parameter and an optionalunits
parameter, which can only have specific values or null.- The
registerTool
method adds the tool to the chat, making it available for invocation. - When the model makes a tool call (
isToolCall
is true), retrieve the callable function usingchat.getToolCallable
and execute it as you want.
The output of the above example should look something like this:
Tool call: get_weather
Weather result: {location: Boston, temperature: 22, units: metric}
Tool call: get_weather
Weather result: {location: Boston, temperature: 71.6, units: imperial}
Provides information about the remaining API calls and tokens you can use within your current rate limit period.
final rateLimitInfo = chat.rateLimitInfo;
print(rateLimitInfo.remainingRequestsToday);
This gives you the token usage details (prompt tokens, completion tokens, total tokens) for the most recent response. It also gives you response times and prompt times
final latestUsage = chat.latestResponse.usage;
print(latestUsage.totalTokens);
Calculates the cumulative token usage for all requests and responses within the current chat session.
final totalTokensUsed = chat.totalTokens;
print('Total tokens used in this chat: $totalTokensUsed');
Transcribe audio files using Groq's supported whisper-large-v3
model (or other available models). Replace './path/to/your/audio.mp3'
with the actual path to your audio file.
final groq = Groq('YOUR_GROQ_API_KEY');
try {
final (transcriptionResult, rateLimitInformation) = await groq.transcribeAudio(
filePath: './path/to/your/audio.mp3', // Adjust file path as needed
);
print(transcriptionResult.text); // The transcribed text
} on GroqException catch (e) {
print('Error transcribing audio: $e');
}
Easily check if a text is harmful using the isTextHarmful method. It analyzes the text and returns whether it's harmful, the harmful category, and usage details.
final (isHarmful, harmfulCategory, usage, rateLimit) = await groq.isTextHarmful(
text: 'YOUR_TEXT',
);
if (isHarmful) {
print('Harmful content detected: $harmfulCategory');
}
Instead of looking up the standard models, you can use the ids via provided constants in GroqModels
:
const String mixtral8_7b = 'mixtral-8x7b-32768';
const String gemma_7b = 'gemma-7b-it';
const String llama3_8b = 'llama3-8b-8192';
const String llama3_70b = 'llama3-70b-8192';
const String whisper_large_v3 = 'whisper-large-v3';
static const String llama31_70b_versatile = 'llama-3.1-70b-versatile';
static const String llama31_8b_instant = 'llama-3.1-8b-instant';
static const String llama3_groq_70b_tool_use_preview =
'llama3-groq-70b-8192-tool-use-preview';
static const String llama3_groq_8b_tool_use_preview =
'llama3-groq-8b-8192-tool-use-preview';
static const String llama_guard_3_8b = 'llama-guard-3-8b';
You can use these constants directly when starting a new chat or switching models:
final chat = groq.startNewChat(GroqModels.mixtral8_7b);
Parameter | Description | Default |
---|---|---|
maxConversationalMemoryLength | The number of previous messages to include in the context for the model's response. Higher values provide more context-aware responses. | 1024 |
temperature | Controls the randomness of responses (0.0 - deterministic, 2.0 - very random). | 1.0 |
maxTokens | Maximum number of tokens allowed in the generated response. | 8192 |
topP | Controls the nucleus sampling probability mass (0.0 - narrow focus, 1.0 - consider all options). | 1.0 |
stop | Optional stop sequence(s) to terminate response generation. | null |
- Replace
"YOUR_GROQ_API_KEY"
with your actual Groq API key, obtained from the Groq Cloud console: https://console.groq.com/keys - The Groq Cloud console is your central hub for managing API keys, exploring documentation, and accessing other Groq Cloud features: https://console.groq.com/
- Multiple choices in GroqResponses are not supported yet.