Programmatic interfaces for executing agent-driven actions on graphical user interfaces.
- click(x, y): Simulates a left mouse click at the given coordinates.
- type(text): Injects keyboard events for the specified text string.
- scroll(direction, amount): Executes mouse wheel scroll events.
- drag(startX, startY, endX, endY): Clicks, holds, and moves the cursor before releasing.

[...] (x, y) coordinates.
3. The agent invokes the click tool with those coordinates.
4. The system executes the click, takes a new screenshot, and repeats the loop until the task is complete.
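
The four tools and the screenshot-act-repeat loop can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not any particular product's API: RecordingBackend, run_loop, and scripted_policy are hypothetical names, the backend only logs events instead of driving a real mouse or keyboard, and the "agent" is a scripted stand-in for a model that would choose actions from each screenshot.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# An action is a tool name plus its positional arguments (hypothetical format).
Action = Tuple[str, tuple]
ALLOWED_TOOLS = {"click", "type", "scroll", "drag"}


@dataclass
class RecordingBackend:
    """Hypothetical stand-in for an input driver: it logs events, nothing more."""
    events: List[tuple] = field(default_factory=list)
    screenshots_taken: int = 0

    def click(self, x: int, y: int) -> None:
        self.events.append(("click", x, y))

    def type(self, text: str) -> None:
        # One keyboard event per character of the text string.
        for ch in text:
            self.events.append(("key", ch))

    def scroll(self, direction: str, amount: int) -> None:
        if direction not in ("up", "down", "left", "right"):
            raise ValueError(f"unknown scroll direction: {direction}")
        self.events.append(("scroll", direction, amount))

    def drag(self, start_x: int, start_y: int, end_x: int, end_y: int) -> None:
        # Press at the start point, move while holding, release at the end point.
        self.events.append(("press", start_x, start_y))
        self.events.append(("move", end_x, end_y))
        self.events.append(("release", end_x, end_y))

    def screenshot(self) -> str:
        # A real backend would return image data; a label is enough for a sketch.
        self.screenshots_taken += 1
        return f"screenshot-{self.screenshots_taken}"


def run_loop(backend: RecordingBackend,
             policy: Callable[[str], Optional[Action]],
             max_steps: int = 10) -> int:
    """Screenshot, ask the policy for an action, execute it, repeat.

    Returns the number of actions executed. The policy returns None
    to signal that the task is complete.
    """
    for step in range(max_steps):
        shot = backend.screenshot()
        action = policy(shot)
        if action is None:
            return step
        name, args = action
        if name not in ALLOWED_TOOLS:
            raise ValueError(f"unknown tool: {name}")
        getattr(backend, name)(*args)
    return max_steps


def scripted_policy(script: List[Action]) -> Callable[[str], Optional[Action]]:
    """Assumed placeholder for the agent: replays a fixed list of actions."""
    remaining = list(script)

    def policy(_screenshot: str) -> Optional[Action]:
        return remaining.pop(0) if remaining else None

    return policy


if __name__ == "__main__":
    backend = RecordingBackend()
    run_loop(backend, scripted_policy([("click", (120, 340)), ("type", ("hi",))]))
    print(backend.events)
```

Restricting dispatch to an explicit set of tool names keeps the loop from calling arbitrary backend methods, and the max_steps cap bounds the loop if the policy never reports completion.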