AutoControl
Installation
Requirements
Basic Installation
With GUI Support
Linux Prerequisites
Raspberry Pi
Dependencies
Platform Support
Quick Start
Mouse Control
Keyboard Control
Image Recognition
Screenshot
Action Recording & Playback
JSON Action Scripting
What’s Next?
Running AutoControl in CI / Cloud
Quick start
GitHub Actions
GitLab CI
Kubernetes
Limitations
English Documentation
Installation
Install from PyPI
Requirements
With GUI Support
Linux Prerequisites
Raspberry Pi
Development Environment
Mouse Control
Getting Mouse Button Table
Press and Release
Click
Position
Scroll
Keyboard Control
Getting Key Tables
Press and Release
Type a Single Key
Check Key State
Type a String
Hotkey Combinations
Screen Operations
Screenshot
Screen Size
Get Pixel Color
Image Recognition
Locate All Matches
Locate Image Center
Locate and Click
Parameters
Recording & Playback
Record and Replay
How It Works
Editing a recording
Keywords & Executor
Keyword Format
Available Action Commands
Executing a JSON File
Executing All JSON Files in a Directory
Extending the Executor
Report Generation
Setup
Generating Reports
HTML Report
JSON Report
XML Report
Getting Report Content as String
Report Contents
Callback Executor
Basic Usage
How It Works
Parameters
Extending the Callback Executor
Scheduler
Basic Example
Blocking vs Non-Blocking
Interval Scheduling
Cron Scheduling
Removing Jobs
Shutting Down
Socket Server (Remote API)
Starting the Server
Sending Commands (Client)
Protocol Details
MCP Server (Use AutoControl from Claude)
Tool catalogue
Resources, prompts, sampling
Logging notifications, progress, cancellation
Starting the server (programmatic)
Starting the server (command line)
CLI inspection flags
Registering with Claude Desktop
Registering with Claude Code
HTTP transport (with SSE, auth, TLS)
Read-only / safe mode
Confirmation prompts (elicitation)
Audit log
Auto-screenshot on tool error
Rate limiting
Plugin hot-reload
CI smoke tests with the fake backend
Security notes
Critical Exit
Enabling Critical Exit
Changing the Hotkey
Example: Recovering from Runaway Mouse
Command-Line Interface
Subcommand CLI (
python
-m
je_auto_control.cli
)
Run a script
List scheduler jobs
Start the TCP socket server
Start the REST API server
Legacy flag-style CLI (
python
-m
je_auto_control
)
Execute a single action file
Execute all files in a directory
Execute a JSON string directly
Create a project template
Launch the GUI
Project Management
Creating a Project
Generated Structure
New Features (2026-04)
Clipboard
Dry-run / step-debug executor
Global hotkey daemon (Windows)
Event triggers
Cron scheduling
Plugin loader
REST API server
CLI runner
Multi-language GUI (i18n)
Closable tabs + menu bar
OCR (text on screen)
Accessibility element finder
VLM (AI) element locator
Run history + error-snapshot artifacts
OCR — region dump and regex search
Runtime variables and data-driven control flow
LLM action planner
Remote desktop (host + viewer)
Remote desktop — Quick Connect + Phase 4/5 hardening
Quick Connect headless API
Connection approval + view-only mode
IP allowlist (CIDR + exact IPs)
One-time share codes
TOTP 2FA (RFC 6238, stdlib only)
Multi-monitor selection
Remote cursor overlay
Multi-viewer collaborative cursors + chat
Relative mouse mode (FPS / CAD)
Motion-aware capture
Live stats
JPEG sequence recorder (no PyAV needed)
TCP relay (WebRTC fallback)
Service installer (unattended host)
Remote desktop — secure transports, audio, clipboard, file transfer
Host ID handshake
TLS
WebSocket transport
Audio streaming
Clipboard sync (text + image)
File transfer with progress
Remote desktop — AnyDesk-style popout window
Remote desktop — responsive sub-tab sizing
Remote desktop — MCP tool surface
Driver-level input backends — drive games that ignore SendInput / XTest
Interception (Windows)
uinput (Linux)
ViGEm virtual gamepad (Windows)
Anti-cheat caveat (all three)
Per-action profiler
Run history timeline + failure thumbnails
Encrypted secret manager
Webhook (HTTP push) trigger
IMAP email trigger
New Features (2026-05)
Locator + selector intelligence
Self-healing locator
Anchor-based locator
OCR with structured output
Smart waits
A/B locator framework
Operations + observability
Cost telemetry
Trace replay UI
Failure → ticket automation
Container CI templates
Cross-host DAG orchestrator
Multi-viewer presence
Agent + integrations
Computer-use high-level API
WebRunner executor + MCP integration
Chat-ops bot
Platform coverage
Wayland CLI backend
Wayland libei native backend
macOS Accessibility: tree dump + recorder
Developer experience
autocontrol-lsp completion
.pyi
stub generator
VS Code extension
Browser extension recorder
pytest plugin + Gherkin BDD
Visual flow editor
Generic agent loop (JSON + MCP)
Screenshot PII redaction
Android backend (uiautomator2 widget tree)
iOS backend (XCUITest via WebDriverAgent)
New Features (2026-06) — QA Layer
Assertions
Assertion DSL
Off-screen and system assertions
Assertion combinators (group / OR / poll)
Media assertions (audio / video)
Data-driven execution
Flaky-test detection & quarantine
Flaky report
Quarantine (closing the loop)
QA suite runner + CI reports
Suite orchestration
CI-native reports (JUnit / Allure)
Accessibility & i18n audit
Mobile device matrix
OCR backends
Backend selection
Language codes
Tesseract setup notes
EasyOCR / PaddleOCR setup notes
Headless usage
JSON action surface
Diagnostics
Observability — metrics and traces
Quick start
Built-in metrics
Adding your own metrics
Tracing
Production deployment
Security note
Operations & Admin Layer
Folder sync (additive mirror)
coturn TURN config bundle
Hardened REST API
Auth gate
Endpoint surface
Prometheus metrics
Multi-host admin console
Audit log hash chain
WebRTC packet inspector
USB device enumeration
USB hotplug events
System diagnostics
Web admin dashboard
OpenAPI 3.1 + Swagger UI
Configuration bundle
USB Passthrough — Phase 2 Design
Goals
Non-Goals
Transport
Channel
Framing
Operations
Backpressure
Per-OS driver wrappers
Windows — WinUSB
macOS — IOKit
Linux — libusb
Security & ACL
Per-device allow-list
Audit
Privilege
Phasing
Design decisions (formerly open questions)
USB Passthrough — Phase 2e Security Review Checklist
Threat model
ACL
Audit
Protocol hardening
Resource bounds
Lifecycle
Per-OS requirements
Pen-test scenarios
Sign-off
USB Passthrough — Operator Guide
Prerequisites
Driver setup (per OS)
Linux (libusb)
Windows (WinUSB) —
hardware-unverified
macOS (IOKit) —
hardware-unverified
Enabling the feature
ACL setup
ACL file integrity (HMAC)
Starting the host
Viewer-side: claim and transfer
Enumerate
Open + transfer
Troubleshooting matrix
Headless control (
AC_usb_*
commands)
What is
not
shipped yet
繁體中文文件
安裝
從 PyPI 安裝
系統需求
安裝 GUI 支援
Linux 前置套件
樹莓派
開發環境
滑鼠控制
取得滑鼠按鍵表
按下與釋放
點擊
游標位置
捲動
鍵盤控制
取得按鍵表
按下與釋放
按下單一按鍵
檢查按鍵狀態
輸入字串
熱鍵組合
螢幕操作
截圖
螢幕尺寸
取得像素顏色
圖片辨識
定位所有匹配
定位圖片中心
定位並點擊
參數說明
錄製與回放
使用範例
運作方式
編輯錄製內容
關鍵字與執行者
關鍵字格式
可用的動作指令
執行 JSON 檔案
執行資料夾內所有 JSON 檔案
擴充執行者
報告產生
設定
產生報告
HTML 報告
JSON 報告
XML 報告
取得報告內容為字串
報告內容
回調函數執行器
基本用法
運作方式
參數說明
擴充回調執行器
排程器
基本範例
阻塞與非阻塞模式
間隔排程
Cron 排程
移除任務
關閉排程器
Socket 伺服器(遠端 API)
啟動伺服器
傳送指令(客戶端)
協定細節
MCP 伺服器 (讓 Claude 使用 AutoControl)
工具目錄
Resources、Prompts、Sampling
Logging 通知 / Progress / Cancellation
以程式啟動伺服器
以命令列啟動伺服器
CLI 檢視旗標
註冊到 Claude Desktop
註冊到 Claude Code
HTTP 傳輸(含 SSE / Auth / TLS)
唯讀 / 安全模式
破壞性動作確認(Elicitation)
稽核 Log
工具失敗自動截圖
Rate Limiting
Plugin Hot-Reload
CI 煙霧測試 (Fake Backend)
安全注意事項
緊急退出
啟用緊急退出
更改熱鍵
範例:從失控的滑鼠恢復
命令列介面
執行單一動作檔案
執行資料夾內所有檔案
直接執行 JSON 字串
建立專案範本
啟動 GUI
專案管理
建立專案
產生的目錄結構
新功能 (2026-04)
剪貼簿 (Clipboard)
試跑 / 逐步除錯 (Dry-run / Step Debug)
全域熱鍵 (Windows)
事件觸發器 (Triggers)
Cron 排程
外掛載入器 (Plugin Loader)
REST API 伺服器
CLI 子指令介面
GUI 多語系
可關閉分頁 + 選單列
OCR 螢幕文字辨識
Accessibility 元件搜尋
VLM(AI)元件定位
執行歷史 + 錯誤截圖附件
OCR — 區域 dump 與 regex 搜尋
執行期變數與資料驅動流程控制
LLM 動作規劃器
遠端桌面(Host + Viewer)
遠端桌面 — 快速連線 + Phase 4/5 強化
快速連線的 headless API
連線審批 + 僅檢視模式
IP 白名單(CIDR + 單一 IP)
一次性分享碼
TOTP 2FA(RFC 6238,純 stdlib)
多螢幕選擇
遠端游標 overlay
多 viewer 協作游標 + 文字 chat
相對滑鼠模式(FPS / CAD)
動態擷取
即時統計
JPEG 序列錄影(不需要 PyAV)
TCP relay(WebRTC fallback)
服務安裝器(無人值守 host)
遠端桌面 — 加密傳輸、音訊、剪貼簿、檔案傳輸
Host ID 握手
TLS
WebSocket 傳輸
音訊串流
剪貼簿同步(文字 + 圖片)
檔案傳輸 + 進度
遠端桌面 — AnyDesk 風格彈出視窗
遠端桌面 — 自適應的子分頁尺寸
遠端桌面 — MCP 工具
驅動層輸入後端 — 驅動不接受 SendInput / XTest 的遊戲
Interception(Windows)
uinput(Linux)
ViGEm 虛擬手把(Windows)
反作弊注意事項
效能分析器(Profiler)
執行紀錄時間軸 + 失敗截圖
密鑰管理器(Secret Manager)
Webhook(HTTP push)觸發
IMAP Email 觸發
新功能 (2026-05)
定位器與選擇器智慧化
自我修復定位器
錨點定位器
結構化 OCR
智慧等待
A/B 定位器框架
維運與觀察性
成本遙測
追蹤重播 UI
失敗 → 工單自動化
容器化 CI 模板
跨主機 DAG 編排
多 viewer 名單
代理與整合
Computer-use 高階 API
WebRunner 接入 executor + MCP
Chat-ops 機器人
平台覆蓋
Wayland CLI backend
Wayland libei native backend
macOS Accessibility:tree dump + recorder
開發者體驗
autocontrol-lsp 完整化
.pyi
stub 產生器
VS Code 擴充
瀏覽器擴充錄製器
pytest plugin + Gherkin BDD
視覺流程編輯器
通用 agent 迴圈(JSON + MCP)
截圖 PII 遮罩
Android backend(uiautomator2 widget tree)
iOS backend(XCUITest via WebDriverAgent)
新功能 (2026-06) — 測試框架層
斷言
斷言 DSL
畫面外與系統斷言
斷言組合器(群組 / OR / 輪詢)
媒體斷言(音訊 / 影片)
資料驅動執行
不穩定測試偵測與隔離
不穩定報告
隔離(閉環處理)
QA 套件執行器 + CI 報告
套件編排
CI 原生報告(JUnit / Allure)
無障礙 & i18n 稽核
行動裝置矩陣
OCR 後端
選擇後端的順序
語言代碼
Tesseract 安裝注意
EasyOCR / PaddleOCR 安裝注意
無 GUI 使用
JSON 動作介面
診斷
可觀測性 — Metrics 與 Traces
快速開始
內建 metrics
自訂 metrics
Tracing
正式部署建議
安全提醒
維運與管理層
資料夾同步(增量鏡像)
coturn TURN 設定包
強化版 REST API
認證閘道
端點清單
Prometheus 指標
多主機管理主控台
稽核紀錄雜湊鏈
WebRTC 封包監測
USB 裝置列舉
USB hotplug 事件
系統診斷
網頁管理 dashboard
OpenAPI 3.1 + Swagger UI
設定包
USB Passthrough — 第二階段設計
目標
非目標
傳輸
Channel
Framing
操作
Backpressure
各 OS driver 包裝
Windows — WinUSB
macOS — IOKit
Linux — libusb
安全與 ACL
每裝置白名單
稽核
權限
階段
設計決策(原未決問題)
USB Passthrough — Phase 2e 安全審查清單
威脅模型
ACL
稽核
協定強化
資源上界
生命週期
各 OS 需求
滲透測試情境
Sign-off
USB Passthrough — 操作員指南
前置需求
Driver 設定(依 OS)
Linux(libusb)
Windows(WinUSB)—
硬體未驗證
macOS(IOKit)—
硬體未驗證
啟用 feature
ACL 設定
ACL 檔案完整性(HMAC)
啟動 host
Viewer 端:claim 與 transfer
列舉
Open + transfer
疑難排解對照表
無 GUI 控制(
AC_usb_*
指令)
尚未發布的部分
API Reference
Wrapper API
Mouse API
get_mouse_table
mouse_preprocess
get_mouse_position
set_mouse_position
press_mouse
release_mouse
click_mouse
mouse_scroll
Keyboard API
get_special_table
get_keyboard_keys_table
press_keyboard_key
release_keyboard_key
type_keyboard
check_key_is_press
Image API
locate_all_image
locate_image_center
locate_and_click
screenshot
Screen API
screen_size
screenshot
Record API
record
stop_record
Utils API
Executor API
execute_action
execute_files
add_command_to_executor
Callback Function API
callback_function
Report Generation API
generate_html
generate_html_report
generate_json
generate_json_report
generate_xml
generate_xml_report
Scheduler API
SchedulerManager
File Processing API
get_dir_files_as_list
Package Manager API
PackageManager
Socket Server API
start_autocontrol_socket_server
TCPServer
TCPServerHandler
Critical Exit API
CriticalExit
Key Reference
Keyboard Keys
Windows
Linux (X11)
macOS
Mouse Keys
Windows
Linux (X11)
macOS
AutoControl
Index
Index
A
|
B
|
C
|
E
|
G
|
I
|
J
|
L
|
M
|
P
|
R
|
S
|
T
A
add_blocking_job() (SchedulerManager method)
add_command_to_executor()
built-in function
add_cron_blocking() (SchedulerManager method)
add_cron_nonblocking() (SchedulerManager method)
add_interval_blocking_daily() (SchedulerManager method)
add_interval_blocking_hourly() (SchedulerManager method)
add_interval_blocking_minutely() (SchedulerManager method)
add_interval_blocking_secondly() (SchedulerManager method)
add_interval_blocking_weekly() (SchedulerManager method)
add_interval_nonblocking_daily() (SchedulerManager method)
add_interval_nonblocking_hourly() (SchedulerManager method)
add_interval_nonblocking_minutely() (SchedulerManager method)
add_interval_nonblocking_secondly() (SchedulerManager method)
add_interval_nonblocking_weekly() (SchedulerManager method)
add_nonblocking_job() (SchedulerManager method)
add_package_to_callback_executor() (PackageManager method)
add_package_to_executor() (PackageManager method)
add_package_to_target() (PackageManager method)
B
built-in function
add_command_to_executor()
callback_executor.callback_function()
check_key_is_press()
click_mouse()
execute_action()
execute_files()
generate_html()
generate_html_report()
generate_json()
generate_json_report()
generate_xml()
generate_xml_report()
get_dir_files_as_list()
get_keyboard_keys_table()
get_mouse_position()
get_mouse_table()
get_special_table()
mouse_preprocess()
mouse_scroll()
press_keyboard_key()
press_mouse()
record()
release_keyboard_key()
release_mouse()
screen_size()
screenshot()
set_mouse_position()
start_autocontrol_socket_server()
stop_record()
type_keyboard()
C
callback_executor.callback_function()
built-in function
check_key_is_press()
built-in function
check_package() (PackageManager method)
click_mouse()
built-in function
close_flag (TCPServer attribute)
CriticalExit (built-in class)
E
execute_action()
built-in function
execute_files()
built-in function
G
generate_html()
built-in function
generate_html_report()
built-in function
generate_json()
built-in function
generate_json_report()
built-in function
generate_xml()
built-in function
generate_xml_report()
built-in function
get_blocking_scheduler() (SchedulerManager method)
get_dir_files_as_list()
built-in function
get_keyboard_keys_table()
built-in function
get_member() (PackageManager method)
get_mouse_position()
built-in function
get_mouse_table()
built-in function
get_nonblocking_scheduler() (SchedulerManager method)
get_special_table()
built-in function
I
init_critical_exit() (CriticalExit method)
J
je_auto_control
module
L
locate_all_image() (in module je_auto_control)
locate_and_click() (in module je_auto_control)
locate_image_center() (in module je_auto_control)
M
module
je_auto_control
mouse_preprocess()
built-in function
mouse_scroll()
built-in function
P
PackageManager (built-in class)
press_keyboard_key()
built-in function
press_mouse()
built-in function
R
record()
built-in function
release_keyboard_key()
built-in function
release_mouse()
built-in function
remove_blocking_job() (SchedulerManager method)
remove_nonblocking_job() (SchedulerManager method)
run() (CriticalExit method)
S
SchedulerManager (built-in class)
screen_size()
built-in function
screenshot()
built-in function
screenshot() (in module je_auto_control)
set_critical_key() (CriticalExit method)
set_mouse_position()
built-in function
shutdown_blocking_scheduler() (SchedulerManager method)
shutdown_nonblocking_scheduler() (SchedulerManager method)
start_all_scheduler() (SchedulerManager method)
start_autocontrol_socket_server()
built-in function
start_block_scheduler() (SchedulerManager method)
start_nonblocking_scheduler() (SchedulerManager method)
stop_record()
built-in function
T
TCPServer (built-in class)
TCPServerHandler (built-in class)
type_keyboard()
built-in function