LlmClient¶
This class provides the Robot interface for interactions with an LLM server.
Type: LIBRARY
Scope: TEST
Keywords¶
Assert State¶
Assert that the screen matches a state description.
Args:
- description: Description of the expected screen state.
- image: Image to inspect. If omitted, a screenshot is grabbed.
- custom_system_prompt: Optional system prompt override.

Raises:
- AssertionError: If the state does not match the description.
Positional and named arguments¶
| Name | Type | Default Value | Kind | Required |
|---|---|---|---|---|
| description | string | | POSITIONAL_OR_NAMED | Yes |
| image | None | None | POSITIONAL_OR_NAMED | No |
| custom_system_prompt | None | None | POSITIONAL_OR_NAMED | No |
Check For Visual Corruption¶
Detect if an image is corrupted.
Args:
- image: The image to check. If no image is provided, a new screenshot is grabbed.
- custom_prompt: Optional custom prompt to guide the LLM.

Returns: A dict containing the LLM's assessment of whether the image is corrupted and a description.

Raises:
- VQAValidationError: If the image is assessed as corrupted by the LLM.
Return¶
`dict[str, Any]` (dictionary with string keys and values of any type)
Positional and named arguments¶
| Name | Type | Default Value | Kind | Required |
|---|---|---|---|---|
| image | None | None | POSITIONAL_OR_NAMED | No |
| custom_prompt | None | None | POSITIONAL_OR_NAMED | No |
Configure Llm Client¶
Configure the LLM client with the given parameters.
Args:
- `**kwargs`: Configuration parameters for the LLM client.

Raises:
- TypeError: If unknown parameters are provided.
- ValueError: If parameter values are of incorrect type.
Positional and named arguments¶
| Name | Type | Default Value | Kind | Required |
|---|---|---|---|---|
| kwargs | Any | | VAR_NAMED | No |
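
The two documented failure modes (TypeError for unknown parameters, ValueError for values of the wrong type) can be illustrated with a minimal Python sketch. This is not the library's implementation: the function `configure` and the setting names `base_url`, `model`, and `timeout` are hypothetical, because this reference does not list the accepted parameters.

```python
from typing import Any

# Hypothetical settings and their expected types; the real parameter
# names accepted by Configure Llm Client are not listed in this reference.
ALLOWED_SETTINGS: dict[str, type] = {
    "base_url": str,
    "model": str,
    "timeout": float,
}


def configure(current: dict[str, Any], **kwargs: Any) -> dict[str, Any]:
    """Return a copy of ``current`` updated with validated ``kwargs``."""
    unknown = sorted(set(kwargs) - set(ALLOWED_SETTINGS))
    if unknown:
        # Mirrors the documented TypeError for unknown parameters.
        raise TypeError(f"Unknown configuration parameters: {unknown}")
    for name, value in kwargs.items():
        expected = ALLOWED_SETTINGS[name]
        if not isinstance(value, expected):
            # Mirrors the documented ValueError for values of the wrong type.
            raise ValueError(
                f"{name} must be {expected.__name__}, got {type(value).__name__}"
            )
    return {**current, **kwargs}
```

Validating every value before merging means a rejected call leaves the existing configuration untouched, which keeps a failed keyword call from half-applying settings.
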
Execute Gui Action¶
Execute a GUI action as specified by the LLM response.
Args:
- action: A dict containing the action_type, text, and point_2d.
- description: The description provided to the LLM.

Raises:
- ValueError: If the action type is unsupported or if required fields are missing.
Positional and named arguments¶
| Name | Type | Default Value | Kind | Required |
|---|---|---|---|---|
| action | dictionary | | POSITIONAL_OR_NAMED | Yes |
| description | string | | POSITIONAL_OR_NAMED | No |
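
As a rough illustration of the kind of check that leads to the documented ValueError, the sketch below validates an action dict before it is executed. The field names `action_type`, `text`, and `point_2d` come from the description above; the action type names `Click` and `Type`, the helper `validate_action`, and the rule for which fields each type requires are assumptions made for this example.

```python
from typing import Any

# Hypothetical action types; only "Finish" appears elsewhere in this reference.
POINTER_ACTIONS = {"Click"}
TEXT_ACTIONS = {"Type"}
OTHER_ACTIONS = {"Finish"}


def validate_action(action: dict[str, Any]) -> None:
    """Raise ValueError if ``action`` is unsupported or incomplete."""
    action_type = action.get("action_type")
    if action_type not in POINTER_ACTIONS | TEXT_ACTIONS | OTHER_ACTIONS:
        raise ValueError(f"Unsupported action type: {action_type!r}")
    if action_type in POINTER_ACTIONS:
        # Pointer actions need a two-element coordinate pair.
        point = action.get("point_2d")
        if not isinstance(point, (list, tuple)) or len(point) != 2:
            raise ValueError(f"{action_type} requires a two-element point_2d")
    if action_type in TEXT_ACTIONS and not action.get("text"):
        # Text actions need something to type.
        raise ValueError(f"{action_type} requires a non-empty text field")
```

For example, `validate_action({"action_type": "Click", "point_2d": [500, 250]})` passes silently, while an action without `point_2d` or with an unknown `action_type` raises ValueError.
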
Get Object Position¶
Get the position of an object on the screen in relative coordinates.
Args:
- description: Description of the object to locate.
- image: Image to inspect. If omitted, a screenshot is grabbed.
- custom_system_prompt: Optional system prompt override.

Returns: The object position as normalized relative coordinates [x, y], where each value is typically in the range 0..1.

Raises:
- VQADetectionError: If the LLM indicates that the object was not
Return¶
`list[float]` (list of floats)
Positional and named arguments¶
| Name | Type | Default Value | Kind | Required |
|---|---|---|---|---|
| description | string | | POSITIONAL_OR_NAMED | Yes |
| image | None | None | POSITIONAL_OR_NAMED | No |
| custom_system_prompt | None | None | POSITIONAL_OR_NAMED | No |
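
Because the returned position is relative, callers that need pixel coordinates have to scale it by the screen size. The sketch below shows one way to do that; the helper `relative_to_pixels` is hypothetical, and the clamping step and the top-left origin are assumptions (the reference only says values are typically in 0..1).

```python
def relative_to_pixels(position, width, height):
    """Map normalized [x, y] to integer pixel indices on a width x height screen."""
    x, y = position
    # Clamp to 0..1, since values are only "typically" inside that range.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    # Scale so that 1.0 lands on the last valid pixel index.
    return round(x * (width - 1)), round(y * (height - 1))
```

Scaling by `width - 1` and `height - 1` keeps a position of exactly 1.0 inside the image bounds instead of one pixel past the edge.
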
Get Single Gui Action¶
Get a single GUI action from the LLM.
Args:
- task: The task description to provide to the LLM.
- image: Image to inspect. If omitted, a screenshot is grabbed.
- custom_system_prompt: Optional system prompt override.

Returns: The next GUI action as returned by the LLM. For pointer-based actions, point_2d contains the raw coordinates from the LLM's 1000x1000 grid.
Return¶
`dict[str, Any]` (dictionary with string keys and values of any type)
Positional and named arguments¶
| Name | Type | Default Value | Kind | Required |
|---|---|---|---|---|
| task | string | | POSITIONAL_OR_NAMED | Yes |
| image | None | None | POSITIONAL_OR_NAMED | No |
| custom_system_prompt | None | None | POSITIONAL_OR_NAMED | No |
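
Since point_2d is expressed on a 1000x1000 grid rather than in screen pixels, it usually needs rescaling before use. The following sketch assumes a linear mapping with the origin in the top-left corner, which this reference does not state; the helper names `grid_to_relative` and `grid_to_pixels` are hypothetical.

```python
GRID_SIZE = 1000  # side length of the grid described above


def grid_to_relative(point_2d):
    """Convert raw grid coordinates to normalized [x, y] values."""
    gx, gy = point_2d
    return [gx / GRID_SIZE, gy / GRID_SIZE]


def grid_to_pixels(point_2d, width, height):
    """Convert raw grid coordinates to pixel indices on a width x height screen."""
    gx, gy = point_2d
    # Clamp so that a grid value of 1000 stays on the last valid pixel.
    px = min(int(gx * width / GRID_SIZE), width - 1)
    py = min(int(gy * height / GRID_SIZE), height - 1)
    return px, py
```

With these assumptions, a raw point of `[500, 250]` corresponds to the relative position `[0.5, 0.25]`, or pixel `(960, 270)` on a 1920x1080 screen.
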
Multiple Step Action¶
Perform a multiple step action by prompting the LLM iteratively until a "Finish" action is returned.
Args:
- task: The task description to complete.
- custom_system_prompt: Optional system prompt override.
- max_steps: Number of actions to attempt before stopping.

Raises:
- RuntimeError: If the LLM cannot finish within max_steps.
Positional and named arguments¶
| Name | Type | Default Value | Kind | Required |
|---|---|---|---|---|
| task | string | | POSITIONAL_OR_NAMED | Yes |
| custom_system_prompt | None | None | POSITIONAL_OR_NAMED | No |
| max_steps | integer | 50 | POSITIONAL_OR_NAMED | No |
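
The iterate-until-"Finish" behavior described above can be sketched as a bounded loop. This is only an illustration under stated assumptions: `run_until_finished`, `get_action`, and `execute_action` are hypothetical stand-ins for the library's internals, and whether the "Finish" response itself counts toward max_steps is not specified in this reference.

```python
from typing import Any, Callable

Action = dict[str, Any]


def run_until_finished(
    get_action: Callable[[str], Action],
    execute_action: Callable[[Action], None],
    task: str,
    max_steps: int = 50,
) -> int:
    """Return the number of actions executed before "Finish" was returned."""
    for step in range(max_steps):
        # Ask for the next action; a real client would also send a screenshot.
        action = get_action(task)
        if action.get("action_type") == "Finish":
            return step
        execute_action(action)
    # Mirrors the documented RuntimeError when the step budget is exhausted.
    raise RuntimeError(f"Task not finished within {max_steps} steps: {task}")
```

Passing the two callables in as parameters makes the loop easy to exercise with scripted actions, without an LLM server or a real GUI.
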
Prompt Llm¶
Send a prompt (text-only or text+image) to the LLM and get the response.
Args:
- prompt: The text prompt to send to the LLM.
- image: Optional image (PIL Image or path) to include in the prompt.
- system_prompt: Optional system prompt to guide the LLM.

Returns: The response from the LLM.
Return¶
`str` (string)
Positional and named arguments¶
| Name | Type | Default Value | Kind | Required |
|---|---|---|---|---|
| prompt | string | | POSITIONAL_OR_NAMED | Yes |
| image | None | None | POSITIONAL_OR_NAMED | No |
| system_prompt | None | None | POSITIONAL_OR_NAMED | No |