Base Client¶

llm_annotator.clients.base ¶

Abstract interface for LLM provider clients.

ProviderRuntimeOptions `dataclass` ¶

ProviderRuntimeOptions(
    max_tokens: int | None = None,
    json_schema: dict[str, Any] | None = None,
)

Shared generation options for provider calls; can be subclassed and extended.

Attributes:

Name	Type	Description
`max_tokens`	`int \| None`	Optional maximum output token count.
`json_schema`	`dict[str, Any] \| None`	Optional JSON schema dict for structured output. When provided, clients that support guided decoding (e.g. vLLM) will constrain generation to valid JSON matching the schema. Other clients will use the schema for post-processing / parsing only.

to_payload ¶

to_payload() -> dict[str, Any]

Convert options to a provider-specific API request payload dict.

Subclasses override this to build the exact kwargs expected by their SDK. The default implementation returns an empty dict.

Returns:

Type	Description
`dict[str, Any]`	A dict of provider-specific request parameters.

View source on GitHub: src/llm_annotator/clients/base.py lines 41–50

Response `dataclass` ¶

Response(
    text: str,
    stop_reason: str | None = None,
    model: str | None = None,
    provider: Provider | None = None,
    num_output_tokens: int | None = None,
    full_response: object | None = None,
    error: str | None = None,
    error_type: str | None = None,
)

Structured response object returned by provider clients.

Client ¶

Client(
    model: str,
    max_workers: int | None = None,
    on_error: OnError = "warn",
)

Bases: ABC, Generic[T_Options]

Base client interface used by all provider adapters.

Initialize a provider client.

Parameters:

Name	Type	Description	Default
`model`	`str`	Provider-specific model name.	required
`max_workers`	`int \| None`	Maximum number of concurrent worker threads for `batch_generate`. Clients that support native batching may ignore this parameter.	`None`
`on_error`	`OnError`	Error behavior for provider failures. - `"raise"`: raise a :class:`ProviderError` (default). - `"ignore"`: return a :class:`Response` with `error` set. - `"warn"`: log a warning and return an error :class:`Response`.	`'warn'`

View source on GitHub: src/llm_annotator/clients/base.py lines 72–96

enter ¶

__enter__() -> Self

Enter the context manager, returning this client instance.

View source on GitHub: src/llm_annotator/clients/base.py lines 160–162

exit ¶

__exit__(exc_type: Any, exc: Any, tb: Any) -> None

Exit the context manager cleanup.

View source on GitHub: src/llm_annotator/clients/base.py lines 164–166

generate `abstractmethod` ¶

generate(
    *,
    messages: list[dict[str, str]],
    options: T_Options | None = None,
    gen_kwargs: dict[str, Any] | None = None,
) -> Response

Generate a response from the provider.

Parameters:

Name	Type	Description	Default
`messages`	`list[dict[str, str]]`	List of message dicts with "role" and "content" keys.	required
`options`	`T_Options \| None`	Provider-specific generation options. NOTE: using this over gen_kwargs is preferred and implemented to facilitate sub-classing and satisfying typing and code-hinting.	`None`
`gen_kwargs`	`dict[str, Any] \| None`	Additional provider-specific generation kwargs that are not covered by the standard options. Has precedence over `options`.	`None`

Returns:

Type	Description
`Response`	A Response object containing the generated response.

View source on GitHub: src/llm_annotator/clients/base.py lines 175–198

batch_generate ¶

batch_generate(
    *,
    messages: list[list[dict[str, str]]],
    options: T_Options | None = None,
    gen_kwargs: dict[str, Any] | None = None,
) -> list[Response]

Generate responses for a batch of inputs.

The default implementation calls :meth:generate sequentially. Override this method in subclasses that support native batching (e.g. vLLM offline and vLLM server) for better throughput.

Parameters:

Name	Type	Description	Default
`messages`	`list[list[dict[str, str]]]`	List of message lists, where each message dict has "role" and "content" keys.	required
`options`	`T_Options \| None`	Provider-specific generation options.	`None`
`gen_kwargs`	`dict[str, Any] \| None`	Additional provider-specific generation kwargs that are not covered by the standard options. Has precedence over `options`.	`None`

Returns:

Type	Description
`list[Response]`	A list of Response objects containing the generated responses.

View source on GitHub: src/llm_annotator/clients/base.py lines 200–229

warm_up ¶

warm_up(
    *,
    system_message: str | None = None,
    prompt_prefix: str | None = None,
    options: T_Options | None = None,
) -> None

Prime the client before the main workload (no-op by default).

Override in clients that benefit from a warm-up pass (e.g. :class:~llm_annotator.clients.VLLMOfflineClient uses this to prime the KV-cache with a shared prefix before the first real batch).

Parameters:

Name	Type	Description	Default
`system_message`	`str \| None`	Optional system message shared across all requests.	`None`
`prompt_prefix`	`str \| None`	Optional fixed prefix that starts every user turn.	`None`
`options`	`T_Options \| None`	Optional generation options used to derive the warm-up params.	`None`

View source on GitHub: src/llm_annotator/clients/base.py lines 231–248

destroy ¶

destroy() -> None

Clean up any resources used by the client.

View source on GitHub: src/llm_annotator/clients/base.py lines 250–251

Base Client¶

llm_annotator.clients.base ¶

ProviderRuntimeOptions dataclass ¶

to_payload ¶

Response dataclass ¶

Client ¶

__enter__ ¶

__exit__ ¶

generate abstractmethod ¶

batch_generate ¶

warm_up ¶

destroy ¶

ProviderRuntimeOptions `dataclass` ¶

Response `dataclass` ¶

enter ¶

exit ¶

generate `abstractmethod` ¶