Point at anything.
Know what it is.
Sherpa lives in your system tray. Hover any button, menu, thumbnail, or chart — in any app or website — tap Pause, and an AI vision model tells you exactly what it is and what it does.
Free app · runs on your own OpenRouter key · about a tenth of a cent per question
That’s a video thumbnail — the clickable preview that
opens the video when you click it. The duration sits in
the bottom-right corner, and hovering it plays a short,
silent preview of the first few seconds.
Three keystrokes from “what is this?” to knowing.
Hover & press your key
Point at whatever’s confusing you and tap the hotkey. Pause by default — rebind it to anything in Settings.
Sherpa marks the exact spot
It grabs the window under your cursor and pins a crosshair on the precise pixel — so the AI explains that, not the flashiest thing nearby.
Get a plain-English answer
A floating chat tells you what it is and what it does — then stays open so you can ask follow-up questions.
A pocket guide for software you’ve never seen.
Works on anything
Native apps, web apps, settings panels, games. If it’s on your screen, Sherpa can read it — controls and content alike.
Pinpoint, don’t guess
A crosshair marks the exact pixel under your cursor, so you get an answer about that control — not the nearest big button.
Any model you want
Pick from every vision model on OpenRouter, ranked by GUI-grounding score right in Settings. Ships defaulting to GPT-5.2.
Pennies, not subscriptions
Bring your own OpenRouter key. A question costs a fraction of a cent — and you pay the model maker directly.
Out of the way
Lives in your tray with a global hotkey and a floating, always-on-top chat. Left-click the tray to reopen your last answer.
No middleman
No Sherpa account, no Sherpa server. Your screenshot goes straight from your PC to the model you chose.
No accounts. No servers. No catch.
Sherpa runs entirely on your machine. Captures go directly to the AI model you pick, authenticated with your own key. We never see them — there’s nothing of ours in the middle to see them with.
Good things to know.
What does it cost?+
The app itself is free. You bring an OpenRouter API key and pay the model provider directly — typically well under one cent per question on a fast model, and you can watch usage on your OpenRouter dashboard.
Where do my screenshots go?+
Straight from your computer to whichever model you select, over your own OpenRouter key. There is no Sherpa backend sitting in the middle, so there’s nowhere for us to store or see your captures.
Which AI models can I use?+
Any vision-capable model on OpenRouter — GPT-5.2, Gemini 3, Claude, Qwen3-VL and more. Settings ranks them by their ScreenSpot-Pro grounding score, so you can trade speed for accuracy at a glance.
Which operating systems are supported?+
Windows 10 and 11 today. Sherpa hooks directly into Windows’ windowing APIs to grab the right window and your exact cursor position.
How do I trigger a capture?+
Tap Pause by default, or rebind it to any shortcut in Settings. You can also capture from the tray menu, and left-click the tray icon to reopen your last chat.
Stop Googling “what does this button do.”
Install Sherpa, add your OpenRouter key, and press Pause. That’s the whole setup.