SherpaDownload
Windows tray app · bring your own AI

Point at anything.
Know what it is.

Sherpa lives in your system tray. Hover any button, menu, thumbnail, or chart — in any app or website — tap Pause, and an AI vision model tells you exactly what it is and what it does.

Free app · runs on your own OpenRouter key · about a tenth of a cent per question

youtube.com
8:00
9:01
10:02
11:03
12:04
13:05
youtube.com
chrome.exe · cursor (612, 348)
Sherpa

That’s a video thumbnail — the clickable preview that

opens the video when you click it. The duration sits in

the bottom-right corner, and hovering it plays a short,

silent preview of the first few seconds.

How it works

Three keystrokes from “what is this?” to knowing.

01Pause

Hover & press your key

Point at whatever’s confusing you and tap the hotkey. Pause by default — rebind it to anything in Settings.

02

Sherpa marks the exact spot

It grabs the window under your cursor and pins a crosshair on the precise pixel — so the AI explains that, not the flashiest thing nearby.

03

Get a plain-English answer

A floating chat tells you what it is and what it does — then stays open so you can ask follow-up questions.

What you get

A pocket guide for software you’ve never seen.

Works on anything

Native apps, web apps, settings panels, games. If it’s on your screen, Sherpa can read it — controls and content alike.

Pinpoint, don’t guess

A crosshair marks the exact pixel under your cursor, so you get an answer about that control — not the nearest big button.

Any model you want

Pick from every vision model on OpenRouter, ranked by GUI-grounding score right in Settings. Ships defaulting to GPT-5.2.

Pennies, not subscriptions

Bring your own OpenRouter key. A question costs a fraction of a cent — and you pay the model maker directly.

Out of the way

Lives in your tray with a global hotkey and a floating, always-on-top chat. Left-click the tray to reopen your last answer.

No middleman

No Sherpa account, no Sherpa server. Your screenshot goes straight from your PC to the model you chose.

No accounts. No servers. No catch.

Sherpa runs entirely on your machine. Captures go directly to the AI model you pick, authenticated with your own key. We never see them — there’s nothing of ours in the middle to see them with.

Questions

Good things to know.

What does it cost?+

The app itself is free. You bring an OpenRouter API key and pay the model provider directly — typically well under one cent per question on a fast model, and you can watch usage on your OpenRouter dashboard.

Where do my screenshots go?+

Straight from your computer to whichever model you select, over your own OpenRouter key. There is no Sherpa backend sitting in the middle, so there’s nowhere for us to store or see your captures.

Which AI models can I use?+

Any vision-capable model on OpenRouter — GPT-5.2, Gemini 3, Claude, Qwen3-VL and more. Settings ranks them by their ScreenSpot-Pro grounding score, so you can trade speed for accuracy at a glance.

Which operating systems are supported?+

Windows 10 and 11 today. Sherpa hooks directly into Windows’ windowing APIs to grab the right window and your exact cursor position.

How do I trigger a capture?+

Tap Pause by default, or rebind it to any shortcut in Settings. You can also capture from the tray menu, and left-click the tray icon to reopen your last chat.

Stop Googling “what does this button do.”

Install Sherpa, add your OpenRouter key, and press Pause. That’s the whole setup.

Download for WindowsWindows 10 / 11 · ~12 MB installer