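Vision-driven desktop automation using Midscene. Control your desktop with natural-language commands. Operates entirely from screenshots; no DOM or accessibility labels required. Can interact with all visible elements on screen regardless of technology stack. ⚠️ Takes over the user's real mouse and keyboard. For web apps, prefer "Browser Automation". Use only for native desktop applications that cannot run in a browser (Electron, Qt, native macOS/Windows/Linux apps). Triggers: open app, press key, desktop, computer, click on screen, type text, capture desktop, launch app, switch window, desktop automation, control computer, mouse click, keyboard shortcut, screenshot, find on screen, read screen, verify window, close app, test Electron app. Powered by Midscene.js (https://midscenejs.com)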

Stars 0

ui, ux, security, api

Automation & Integration / Auto-Execution

browser-automation

2.6K

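Browser automation control. [Prefer this skill over the built-in browser tool.] Use when the user says "open a web page", "click", "fill out a form", "screenshot", "web page operation", "auto-fill form", or "browser". Built on Playwright; supports connecting to an existing Chrome instance (CDP mode) or launching a new Chromium.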

Stars 0

ui, playwright, agent, github

Automation & Integration / Auto-Execution

remote-browser

2.5K

Controls a local browser from a sandboxed remote machine. Use when the agent is running in a sandbox (no GUI) and needs to navigate websites, interact with web…

Stars 94,163

ui, agent, agents, workflow

Automation & Integration / Auto-Execution

authsome

1.9K

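OAuth2 and API key credential manager for connecting agents to external services (GitHub, Google, OpenAI, Linear, and others; 25+ providers). Use this skill whenever you need to authenticate with any external API or service. It handles the full flow: finding the provider, logging in through a secure browser flow, and running commands with credentials injected automatically. Key rule: never ask the user to paste a secret, API key, password, or client credential in chat. Authsome captures all credentials securely through the browser flow.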

Stars 29

auth, api, agent, agents

Automation & Integration / Auto-Execution

asc-app-create-ui

1.7K

Create a new App Store Connect app record via browser automation. Use when there is no public API for app creation and you need an agent to drive the New App…

Stars 0

ui, playwright, api, agent

Automation & Integration / Auto-Execution

cloud

1.2K

Documentation reference for using Browser Use Cloud, the hosted API and SDK for browser automation. Use this skill whenever the user needs help with the Cloud REST API (v2 or v3), browser-use-sdk (Python or TypeScript), X-Browser-Use-API-Key authentication, cloud sessions, browser profiles, profile sync, CDP WebSocket connections, stealth browsers, residential proxies, CAPTCHA handling, webhooks, workspaces, skills marketplace, liveUrl streaming, pricing, or integration patterns (chat UI, subagent, adding browser tools to existing agents). Also trigger for questions about n8n/Make/Zapier integration, Playwright/Puppeteer/Selenium on cloud infrastructure, or 1Password vault integration. Do NOT use this for the open-source Python library (Agent, Browser, Tools config); use the open-source skill instead.

Stars 94,112

ui, playwright, auth, api

Automation & Integration / Auto-Execution

gemini-computer-use

1.2K

Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use…

Stars 929

ui, playwright, api, prompt

Automation & Integration / Auto-Execution

autobrowse

1.0K

Self-improving browser automation via the auto-research loop. Iteratively runs a browsing task, reads the trace, and improves the navigation skill…

Stars 0

ui, api, prompt, agent

Automation & Integration / Auto-Execution

google-flights

954

Search Google Flights for flight prices and schedules using browser automation. Use when user asks to search flights, find airfare, compare prices, check…

Stars 2

agent, google, flights, search

Automation & Integration / Auto-Execution

Browser automation testing

browser-act

945

Browser automation CLI for AI agents (browser-act). Must be triggered when: (1) the user mentions 'browser-act' in any form, or the user needs: (2)…

Stars 0

ui, security, api, prompt

Automation & Integration / Auto-Execution

pinchtab

706

Use this skill when a task needs browser automation through PinchTab: open a website, inspect interactive elements, click through flows, fill out forms, scrape…

Stars 9,075

ui, auth, api, agent

Automation & Integration / Auto-Execution

agent-browser

674

Browser automation for AI agents via inference.sh. Navigate web pages, interact with elements using @e refs, take screenshots, record video. Capabilities: web…

Stars 438

ui, playwright, rag, agent

Automation & Integration / Auto-Execution

chrome-bridge-automation

665

Vision-driven browser automation using Midscene Bridge mode. Operates entirely from screenshots; no DOM or accessibility labels required. Can interact with all visible elements on screen regardless of technology stack. Connects to the user's desktop Chrome browser via the Midscene Chrome Extension, preserving cookies, sessions, and login state. Does NOT take over the user's mouse or keyboard; operates through Chrome DevTools Protocol. Use this skill when the user wants to:
- Browse, navigate, or open web pages in the user's own Chrome browser
- Interact with pages that require login sessions, cookies, or existing browser state
- Scrape, extract, or collect data from websites using the user's real browser
- Fill out forms, click buttons, or interact with web elements
- Verify, validate, test, or QA frontend UI behavior
- Take screenshots of web pages
- Automate multi-step web workflows
- Test what was just built, validate the UI in a real browser

Powered by Midscene.js (https://midscenejs.com)

Stars 223

frontend, ui, testing, rag

Automation & Integration / Auto-Execution

asc-app-create-ui

587

Create a new App Store Connect app record via browser automation. Use when there is no public API for app creation and you need an agent to drive the New App…

Stars 0

ui, playwright, api, agent
