OpenAI Client¶
llm_annotator.clients.openai_client
¶
OpenAI provider implementation.
OpenAIRuntimeOptions
dataclass
¶
OpenAIRuntimeOptions(
max_tokens: int | None = None,
json_schema: dict[str, Any] | None = None,
frequency_penalty: float | None = None,
reasoning_effort: Literal[
"none", "minimal", "low", "medium", "high", "xhigh"
]
| None = None,
temperature: float | None = None,
top_p: float | None = None,
presence_penalty: float | None = None,
)
Bases: ProviderRuntimeOptions
frequency_penalty
class-attribute
instance-attribute
¶
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
reasoning_effort
class-attribute
instance-attribute
¶
Only for supported reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
temperature
class-attribute
instance-attribute
¶
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
top_p
class-attribute
instance-attribute
¶
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
presence_penalty
class-attribute
instance-attribute
¶
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
OpenAIClient
¶
OpenAIClient(
model: str,
max_workers: int = 4,
base_url: str | None = None,
api_key: str | None = None,
on_error: OnError = "warn",
)
Bases: Client[T_OpenAIOptions]
Client wrapper for OpenAI APIs.
Initialize the OpenAI client.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model
|
str
|
OpenAI model identifier. |
required |
max_workers
|
int
|
Maximum number of concurrent worker threads for |
4
|
base_url
|
str | None
|
Base URL for the OpenAI API endpoint. |
None
|
api_key
|
str | None
|
OpenAI API key. If omitted, the SDK will use
|
None
|
on_error
|
OnError
|
Error behavior when generation fails. Valid options are:
- |
'warn'
|
View source on GitHub: src/llm_annotator/clients/openai_client.py lines 73–103
destroy
¶
Cancel any in-flight batches and clean up resources.
View source on GitHub: src/llm_annotator/clients/openai_client.py lines 147–154
generate
¶
generate(
*,
messages: list[dict[str, str]],
options: T_OpenAIOptions | None = None,
gen_kwargs: dict[str, Any] | None = None,
) -> Response
Generate a response using OpenAI.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
messages
|
list[dict[str, str]]
|
List of message dictionaries. |
required |
options
|
T_OpenAIOptions | None
|
Optional generation configuration. |
None
|
gen_kwargs
|
dict[str, Any] | None
|
Additional provider-specific generation kwargs that are not covered by the standard options.
Has precedence over |
None
|
Returns:
| Type | Description |
|---|---|
Response
|
A Response object containing the generated response. |
Raises:
| Type | Description |
|---|---|
ProviderError
|
If the provider call fails. |
View source on GitHub: src/llm_annotator/clients/openai_client.py lines 329–384
batch_generate
¶
batch_generate(
*,
messages: list[list[dict[str, str]]],
options: T_OpenAIOptions | None = None,
gen_kwargs: dict[str, Any] | None = None,
use_batch_api: bool = False,
poll_interval: float = 10.0,
) -> list[Response]
Generate responses for a batch of inputs.
By default, requests are dispatched in parallel using a thread pool.
When use_batch_api=True, the OpenAI Batch API is used instead:
all requests are submitted as a single batch job and results are
retrieved once the job completes. The Batch API supports a completion
window of up to 24 hours and offers lower cost, but adds latency.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
messages
|
list[list[dict[str, str]]]
|
List of message lists, one per request. |
required |
options
|
T_OpenAIOptions | None
|
Optional generation configuration. |
None
|
gen_kwargs
|
dict[str, Any] | None
|
Additional provider-specific generation kwargs that are not covered by the standard options.
Has precedence over |
None
|
use_batch_api
|
bool
|
When |
False
|
poll_interval
|
float
|
Seconds between batch status polls. Only used when
|
10.0
|
Returns:
| Type | Description |
|---|---|
list[Response]
|
A list of Response objects in the same order as the input. |
Raises:
| Type | Description |
|---|---|
ProviderError
|
If any individual request fails. |
View source on GitHub: src/llm_annotator/clients/openai_client.py lines 386–451