Skip to main content

Background Automation (Windows)

Run automation while you work:
with VisionAgent(background=True) as agent:
    agent.act("Open Excel, create report, email to team")
# You can keep working

Multi-Display

with VisionAgent() as agent:
    displays = agent.tools.os.list_displays()

    agent.tools.os.set_active_display(1)
    agent.act("Open dashboard")

    agent.tools.os.set_active_display(2)
    agent.act("Open monitoring")

Window Targeting

agent.act("""
    Find Chrome window with "Email" in title
    Focus that window
    Click "Compose"
""")

Direct Control

# Keyboard
agent.tools.os.keyboard_tap("v", modifier_keys=["control"])  # Paste
agent.tools.os.keyboard_type("Hello!")

# Mouse
agent.tools.os.mouse_move(100, 200)
agent.tools.os.mouse_click("left")
agent.tools.os.mouse_scroll(0, -5)

Companion Devices

Control mobile devices via capture card + Bluetooth HID:
DeviceConnection
iOSBluetooth HID
AndroidBluetooth/USB HID
Industrial HMISerial/USB

Unicode Support

agent.act('Type "Hello 世界 🎉"')  # Works with all Unicode