OpenAI Client¶

llm_annotator.clients.openai_client ¶

OpenAI provider implementation.

OpenAIRuntimeOptions `dataclass` ¶

OpenAIRuntimeOptions(
    max_tokens: int | None = None,
    json_schema: dict[str, Any] | None = None,
    frequency_penalty: float | None = None,
    reasoning_effort: Literal[
        "none", "minimal", "low", "medium", "high", "xhigh"
    ]
    | None = None,
    temperature: float | None = None,
    top_p: float | None = None,
    presence_penalty: float | None = None,
)

Bases: ProviderRuntimeOptions

frequency_penalty `class-attribute` `instance-attribute` ¶

frequency_penalty: float | None = None

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

reasoning_effort `class-attribute` `instance-attribute` ¶

reasoning_effort: (
    Literal[
        "none", "minimal", "low", "medium", "high", "xhigh"
    ]
    | None
) = None

Only for supported reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

temperature `class-attribute` `instance-attribute` ¶

temperature: float | None = None

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.

top_p `class-attribute` `instance-attribute` ¶

top_p: float | None = None

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

presence_penalty `class-attribute` `instance-attribute` ¶

presence_penalty: float | None = None

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

OpenAIClient ¶

OpenAIClient(
    model: str,
    max_workers: int | None = None,
    base_url: str | None = None,
    api_key: str | None = None,
    on_error: OnError = "warn",
)

Bases: Client[T_OpenAIOptions]

Client wrapper for OpenAI APIs.

Initialize the OpenAI client.

Parameters:

Name	Type	Description	Default
`model`	`str`	OpenAI model identifier.	required
`max_workers`	`int \| None`	Maximum number of concurrent worker threads for `batch_generate`. Lower this value if you are getting rate limited. If set to None, 1 or lower, multithreading will be disabled.	`None`
`base_url`	`str \| None`	Base URL for the OpenAI API endpoint.	`None`
`api_key`	`str \| None`	OpenAI API key. If omitted, the SDK will use `OPENAI_API_KEY` from the environment.	`None`
`on_error`	`OnError`	Error behavior when generation fails. Valid options are: - `"raise"`: raise a :class:`ProviderError` (default). - `"ignore"`: return a :class:`Response` with `error` set. - `"warn"`: log a warning and return an error :class:`Response`.	`'warn'`

View source on GitHub: src/llm_annotator/clients/openai_client.py lines 77–107

destroy ¶

destroy() -> None

Cancel any in-flight batches and clean up resources.

View source on GitHub: src/llm_annotator/clients/openai_client.py lines 151–158

generate ¶

generate(
    *,
    messages: list[dict[str, str]],
    options: T_OpenAIOptions | None = None,
    gen_kwargs: dict[str, Any] | None = None,
) -> Response

Generate a response using OpenAI.

Parameters:

Name	Type	Description	Default
`messages`	`list[dict[str, str]]`	List of message dictionaries.	required
`options`	`T_OpenAIOptions \| None`	Optional generation configuration.	`None`
`gen_kwargs`	`dict[str, Any] \| None`	Additional provider-specific generation kwargs that are not covered by the standard options. Has precedence over `options`.	`None`

Returns:

Type	Description
`Response`	A Response object containing the generated response.

Raises:

Type	Description
`ProviderError`	If the provider call fails.

View source on GitHub: src/llm_annotator/clients/openai_client.py lines 336–391

batch_generate ¶

batch_generate(
    *,
    messages: list[list[dict[str, str]]],
    options: T_OpenAIOptions | None = None,
    gen_kwargs: dict[str, Any] | None = None,
    use_batch_api: bool = False,
    poll_interval: float = 10.0,
) -> list[Response]

Generate responses for a batch of inputs.

By default, requests are dispatched in parallel using a thread pool. When use_batch_api=True, the OpenAI Batch API is used instead: all requests are submitted as a single batch job and results are retrieved once the job completes. The Batch API supports a completion window of up to 24 hours and offers lower cost, but adds latency.

Parameters:

Name	Type	Description	Default
`messages`	`list[list[dict[str, str]]]`	List of message lists, one per request.	required
`options`	`T_OpenAIOptions \| None`	Optional generation configuration.	`None`
`gen_kwargs`	`dict[str, Any] \| None`	Additional provider-specific generation kwargs that are not covered by the standard options. Has precedence over `options`.	`None`
`use_batch_api`	`bool`	When `True`, use the OpenAI Batch API instead of concurrent individual requests. Defaults to `False`.	`False`
`poll_interval`	`float`	Seconds between batch status polls. Only used when `use_batch_api=True`. Defaults to `10.0`.	`10.0`

Returns:

Type	Description
`list[Response]`	A list of Response objects in the same order as the input.

Raises:

Type	Description
`ProviderError`	If any individual request fails.

View source on GitHub: src/llm_annotator/clients/openai_client.py lines 393–475

OpenAI Client¶

llm_annotator.clients.openai_client ¶

OpenAIRuntimeOptions dataclass ¶

frequency_penalty class-attribute instance-attribute ¶

reasoning_effort class-attribute instance-attribute ¶

temperature class-attribute instance-attribute ¶

top_p class-attribute instance-attribute ¶

presence_penalty class-attribute instance-attribute ¶

OpenAIClient ¶

destroy ¶

generate ¶

batch_generate ¶

OpenAIRuntimeOptions `dataclass` ¶

frequency_penalty `class-attribute` `instance-attribute` ¶

reasoning_effort `class-attribute` `instance-attribute` ¶

temperature `class-attribute` `instance-attribute` ¶

top_p `class-attribute` `instance-attribute` ¶

presence_penalty `class-attribute` `instance-attribute` ¶