When using Gemma models (2B/4B) via Ollama through the LiteLLM adapter, agents with tools enter an infinite tool-calling loop and never produce a final response.
Root cause: _content_to_message_param serializes tool result messages with role="tool" (the OpenAI-compatible default), but Gemma's chat template expects role="tool_responses" (per the documentation: https://ai.google.dev/gemma/docs/core/prompt-formatting-gemma4).
This mismatch causes the model to misinterpret the tool result as a new turn instead of a response to its own tool call.
This is not a hardware or quantization issue — the same behavior occurs on high-end GPUs.
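A minimal sketch of a client-side workaround, assuming the messages are plain OpenAI-style dicts as produced by _content_to_message_param: remap role="tool" to the role Gemma's template expects before the request is sent. The function name `remap_tool_roles` and the constant `GEMMA_TOOL_ROLE` are illustrative, not part of the adapter's API; the real fix likely belongs in the serialization layer itself.

```python
# Illustrative workaround, not the upstream fix: rename OpenAI-style
# tool-result roles to the role Gemma's chat template expects.
# "tool_responses" is taken from the Gemma prompt-formatting docs cited above.
GEMMA_TOOL_ROLE = "tool_responses"

def remap_tool_roles(messages: list[dict]) -> list[dict]:
    """Return a copy of `messages` with tool-result roles renamed for Gemma."""
    remapped = []
    for msg in messages:
        if msg.get("role") == "tool":
            msg = {**msg, "role": GEMMA_TOOL_ROLE}  # shallow copy, new role
        remapped.append(msg)
    return remapped

# Example: a tool-call turn followed by its serialized tool result.
messages = [
    {"role": "assistant", "tool_calls": [{"id": "call_1", "type": "function",
        "function": {"name": "get_weather", "arguments": "{}"}}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "72F and sunny"},
]
print(remap_tool_roles(messages)[1]["role"])  # tool_responses
```

Applying this remap just before the LiteLLM completion call lets the model recognize the message as a response to its own tool call rather than a new turn, which is what breaks the loop.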