Automate Windows GUI interactions (mouse, keyboard, windows) using PowerShell. Use when the user needs to simulate user input on the desktop, such as moving the cursor, clicking buttons, typing text in non-web apps, or managing window states.
Install
Documentation
Windows UI Automation
Control the Windows desktop environment programmatically.
Core Capabilities
- -Mouse: Move, click (left/right/double), drag.
- -Keyboard: Send text, press special keys (Enter, Tab, Alt, etc.).
- -Windows: Find, focus, minimize/maximize, and screenshot windows.
Usage Guide
Mouse Control
Use the provided PowerShell script mouse_control.ps1.txt:
Move to X, Y
powershell -File skills/windows-ui-automation/mouse_control.ps1.txt -Action move -X 500 -Y 500
Click at current position
powershell -File skills/windows-ui-automation/mouse_control.ps1.txt -Action click
Right click
powershell -File skills/windows-ui-automation/mouse_control.ps1.txt -Action rightclick
Keyboard Control
Use keyboard_control.ps1.txt:
Type text
powershell -File skills/windows-ui-automation/keyboard_control.ps1.txt -Text "Hello World"
Press Enter
powershell -File skills/windows-ui-automation/keyboard_control.ps1.txt -Key "{ENTER}"
Window Management
To focus a window by title:
$wshell = New-Object -ComObject WScript.Shell; $wshell.AppActivate("Notepad")
Best Practices
1. Safety: Always move the mouse slowly or include delays between actions.
2. Verification: Take a screenshot before and after complex UI actions to verify state.
3. Coordinates: Remember that coordinates (0,0) are at the top-left of the primary monitor.
Launch an agent with Windows UI Automation on Termo.