Tool Use - FlexAI Docs

Tool use (also known as “function calling”) lets the model decide when to call a function you’ve defined. The model returns the arguments it wants to pass, you run the function yourself, and you feed the result back in for a final answer. Models that support tool use are labeled tool_use in the catalog; this guide uses Qwen2.5-32B-Instruct-FP8.

Full round trip

import json
import os
from openai import OpenAI

client = OpenAI(base_url="https://tokens.flex.ai/v1", api_key=os.environ["FLEXAI_API_KEY"])

def get_weather(city: str) -> dict:
    # In a real app this would hit a weather API.
    return {"city": city, "temperature_c": 21, "conditions": "clear"}

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# First turn: model decides to call the tool.
first = client.chat.completions.create(
    model="Qwen2.5-32B-Instruct-FP8",
    messages=messages,
    tools=tools,
)
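message = first.choices[0].message
# The model may answer directly instead of calling a tool; tool_calls is then empty.
if not message.tool_calls:
    raise RuntimeError(f"Expected a tool call, got: {message.content!r}")
tool_call = message.tool_calls[0]
# Arguments arrive as a JSON string, so decode them before calling the function.
args = json.loads(tool_call.function.arguments)
result = get_weather(**args)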

# Second turn: append the assistant message that requested the call,
# then the tool result that answers it.
messages.extend([
    first.choices[0].message,
    {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    },
])

second = client.chat.completions.create(
    model="Qwen2.5-32B-Instruct-FP8",
    messages=messages,
    tools=tools,
)
print(second.choices[0].message.content)

Forcing a specific tool

Set tool_choice to force the model to call a specific function:

{ "tool_choice": { "type": "function", "function": { "name": "get_weather" } } }

Alternatively, set "tool_choice": "required" to force the model to call some tool, or "tool_choice": "none" to forbid tool calls entirely.
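
The three settings above can be sketched in Python as a small helper that builds the keyword arguments for client.chat.completions.create(). The helper name build_request and its force parameter are hypothetical, introduced only for illustration; the tool definition is copied from the round-trip example. Because "required" and "none" are passed through as modes, this sketch assumes you have no function with either of those names.

```python
from typing import Any, Optional


def build_request(
    messages: list[dict[str, Any]],
    tools: list[dict[str, Any]],
    force: Optional[str] = None,
) -> dict[str, Any]:
    """Build kwargs for client.chat.completions.create() (hypothetical helper).

    force=None                 -> tool_choice is left unset
    force="required" / "none"  -> the string is passed through unchanged
    any other value            -> treated as a function name to force
    """
    kwargs: dict[str, Any] = {
        "model": "Qwen2.5-32B-Instruct-FP8",
        "messages": messages,
        "tools": tools,
    }
    if force is None:
        return kwargs
    if force in ("required", "none"):
        kwargs["tool_choice"] = force
    else:
        kwargs["tool_choice"] = {"type": "function", "function": {"name": force}}
    return kwargs


# Same tool definition as in the round-trip example.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request = build_request(
    [{"role": "user", "content": "What's the weather in Paris?"}],
    [weather_tool],
    force="get_weather",
)
print(request["tool_choice"])
```

You would pass the result as client.chat.completions.create(**request). When forcing a tool on the first turn, it is usually sensible to leave force unset on the follow-up call so the model is free to produce a text answer from the tool result.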

Parallel tool calls

Most tool-capable models we host can emit multiple tool calls per turn (message.tool_calls will have length > 1). Handle each one, then add one role: "tool" message per call (each with the matching tool_call_id) before the next completion call.
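
A minimal sketch of that loop follows, assuming each tool call exposes id, function.name, and function.arguments as in the round-trip example. TOOL_REGISTRY and run_tool_calls are hypothetical names, and the SimpleNamespace objects are stand-ins for message.tool_calls so the snippet runs without a network call.

```python
import json
from types import SimpleNamespace
from typing import Any, Callable


def get_weather(city: str) -> dict:
    # Stub, same as in the round-trip example.
    return {"city": city, "temperature_c": 21, "conditions": "clear"}


# Hypothetical registry mapping tool names to local Python functions.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_calls(tool_calls) -> list[dict[str, str]]:
    """Run every tool call and return one role="tool" message per call."""
    tool_messages = []
    for call in tool_calls:
        func = TOOL_REGISTRY.get(call.function.name)
        if func is None:
            # Report unknown tools back to the model instead of crashing.
            result = {"error": f"unknown tool: {call.function.name}"}
        else:
            args = json.loads(call.function.arguments)
            result = func(**args)
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call.id,  # must match the call being answered
            "content": json.dumps(result),
        })
    return tool_messages


# Stand-ins shaped like the entries of message.tool_calls.
fake_calls = [
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(name="get_weather", arguments='{"city": "Paris"}'),
    ),
    SimpleNamespace(
        id="call_2",
        function=SimpleNamespace(name="get_weather", arguments='{"city": "Tokyo"}'),
    ),
]

tool_messages = run_tool_calls(fake_calls)
for tool_message in tool_messages:
    print(tool_message["tool_call_id"], tool_message["content"])
```

In the real flow you would append first.choices[0].message to messages, then call messages.extend(run_tool_calls(first.choices[0].message.tool_calls)) before requesting the second completion.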
