Screen Context Awareness: VoxyAI Can See What You See
Most AI assistants are blind. They have no idea what you are looking at when you ask a question. You copy text, paste it into a chat window, and then explain the context yourself. VoxyAI removes that friction. With Screen Context Awareness, VoxyAI captures the visible content of your frontmost window the instant you trigger it, so by the time you start speaking or typing, the AI already has the text you are working with.
How Screen Context Awareness Works
When you activate VoxyAI — whether by pressing a hotkey, clicking the menu bar icon, or using a voice command — three things happen in rapid succession before the VoxyAI interface even appears:
- Window identification. VoxyAI identifies the frontmost application and its active window. It knows whether you are looking at Xcode, a web browser, a terminal, a PDF, or any other app.
- Screen capture. Using Apple's ScreenCaptureKit framework, VoxyAI captures a high-resolution image of the frontmost window. This happens in milliseconds, with no visible flash or disruption.
- On-device text extraction. Apple's Vision framework performs optical character recognition (OCR) on the captured image, extracting all visible text. This processing happens entirely on your Mac — the screenshot image is never stored or transmitted anywhere.
The extracted text is then automatically included in the prompt sent to your AI provider. The AI receives both your question and the text that was visible in your window, so it can give relevant, specific answers without you having to explain the context.
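As a rough illustration of that last step, the sketch below shows one way a question and the extracted window text could be merged into a single prompt. The function name build_prompt, the delimiter lines, and the prompt wording are assumptions made for this example; VoxyAI's actual prompt format is not documented here.

```python
def build_prompt(question: str, screen_text: str, app_name: str) -> str:
    """Combine the user's question with text extracted from the frontmost window.

    Hypothetical helper: the layout of the prompt is an assumption for illustration.
    """
    question = question.strip()
    screen_text = screen_text.strip()
    if not screen_text:
        # Nothing was captured (or OCR found no text): send the question alone.
        return question
    return (
        f"Context: visible text from {app_name}\n"
        "---\n"
        f"{screen_text}\n"
        "---\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("what does this function do?", "func greet() { print(\"hi\") }", "Xcode"))
```

Keeping the context in a clearly delimited block lets the model tell the captured text apart from the question itself.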
What You Can Do With Screen Context
Screen Context Awareness transforms how you interact with AI. Instead of copying and pasting, you simply ask your question while looking at the content you need help with.
Explain Code Instantly
Open a source file in your editor and trigger VoxyAI. Say "explain this code" or "what does this function do?" VoxyAI reads the visible code directly from your editor window and provides a detailed explanation — no copying required. This works with any editor: Xcode, VS Code, Sublime Text, Nova, or even a web-based IDE.
Debug Error Messages
When your terminal shows a stack trace or your build log is full of errors, just trigger VoxyAI and say "fix this" or "what went wrong?" The AI reads the error output directly from your screen and suggests solutions in context. No more manually copying multi-line stack traces.
Summarize Documents and Web Pages
Reading a long article, PDF, or email? Trigger VoxyAI and ask "summarize this" or "what are the key points?" The AI reads the visible portion of the document and gives you a concise summary. This works across any application that displays text.
Get Contextual Answers
Screen context shines when your question depends on what you are looking at. Looking at a database schema? Ask "how would I add a foreign key here?" Reading API documentation? Ask "show me how to call this endpoint in Python." The AI works from the same text you see on screen, so its answers can be precise and relevant.
Selected Text Takes Priority
If you select a specific block of text or code before triggering VoxyAI, the selected text is used instead of the full screen capture. This gives you fine-grained control: select a single function to ask about just that function, or leave nothing selected to let VoxyAI analyze the entire visible window. The behavior is intuitive — VoxyAI uses the most specific context available.
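The priority rule can be sketched in a few lines. This is a minimal illustration, assuming the selection and the OCR text arrive as optional strings; the name choose_context and the returned source labels are hypothetical.

```python
from typing import Optional, Tuple


def choose_context(selected_text: Optional[str], window_text: Optional[str]) -> Tuple[str, str]:
    """Return (source, text), preferring the user's selection over the OCR'd window.

    Hypothetical helper illustrating "most specific context wins".
    """
    if selected_text and selected_text.strip():
        # A non-empty selection is the most specific context available.
        return ("selection", selected_text.strip())
    if window_text and window_text.strip():
        # Fall back to everything that was visible in the frontmost window.
        return ("window", window_text.strip())
    # Neither source produced text, so the question is sent without context.
    return ("none", "")


if __name__ == "__main__":
    print(choose_context("func greet() {}", "an entire file of code"))
    print(choose_context(None, "an entire file of code"))
```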
Privacy by Design
Screen Context Awareness is built with privacy as a foundational principle, not an afterthought. Every aspect of the feature is designed to give you full control over what is captured and where it is sent.
On-Device Processing
The screen capture and text extraction happen entirely on your Mac using Apple's native frameworks. The captured screenshot is processed in memory and immediately discarded — it is never saved to disk, never uploaded, and never shared. Only the extracted text is used, and only when you explicitly trigger VoxyAI.
Privacy Mode
Privacy Mode restricts screen context to local AI providers only — such as Ollama or Apple Intelligence. When Privacy Mode is enabled, screen content is never sent to cloud-based AI services like OpenAI, Anthropic, or Google. This is ideal for developers working with proprietary code, confidential documents, or sensitive data. Privacy Mode is enabled by default.
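A simplified view of that gate, assuming providers are identified by short string names: the identifiers in LOCAL_PROVIDERS and the function context_for_provider are hypothetical, and the sketch assumes the question itself is still sent while the screen text is withheld.

```python
# Hypothetical provider identifiers; the real names used by VoxyAI may differ.
LOCAL_PROVIDERS = {"ollama", "apple-intelligence"}


def context_for_provider(provider: str, screen_text: str, privacy_mode: bool = True) -> str:
    """Return the screen text that may accompany a request to the given provider.

    With Privacy Mode on (the default), only local providers receive screen text.
    """
    is_local = provider.strip().lower() in LOCAL_PROVIDERS
    if privacy_mode and not is_local:
        # Cloud provider while Privacy Mode is enabled: withhold screen content.
        return ""
    return screen_text


if __name__ == "__main__":
    print(repr(context_for_provider("openai", "secret source code")))
    print(repr(context_for_provider("ollama", "secret source code")))
```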
App Blocklist
Certain applications are automatically blocked from screen capture to protect sensitive information:
- Password managers. 1Password, Bitwarden, and LastPass are blocked by default, so your passwords and vault contents are never captured.
- Keychain Access. The macOS system credential manager is blocked to prevent capturing stored secrets.
- Custom blocklist. You can add any application to the blocklist in the Workflows tab of VoxyAI's Actions & Policies window. If you work with financial software, healthcare systems, or any other application that displays sensitive data, add it to the blocklist and VoxyAI will not capture its content.
Self-Exclusion
VoxyAI never captures its own windows. If VoxyAI is the frontmost application, screen capture is automatically skipped. This prevents recursive capture loops and ensures that your previous AI conversations are not inadvertently fed back into new prompts.
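The blocklist and self-exclusion rules amount to a check that runs before any capture. The sketch below assumes apps are matched by display name, case-insensitively; how VoxyAI actually identifies applications is not stated, and should_capture is a hypothetical name.

```python
# Apps named in the article as blocked by default; matching by display name is an assumption.
DEFAULT_BLOCKED_APPS = frozenset({"1Password", "Bitwarden", "LastPass", "Keychain Access"})
OWN_APP_NAME = "VoxyAI"


def should_capture(frontmost_app: str, custom_blocklist=frozenset()) -> bool:
    """Decide whether the frontmost window may be captured."""
    name = frontmost_app.strip().casefold()
    if name == OWN_APP_NAME.casefold():
        # Self-exclusion: never capture the assistant's own windows.
        return False
    blocked = {app.casefold() for app in DEFAULT_BLOCKED_APPS | set(custom_blocklist)}
    return name not in blocked


if __name__ == "__main__":
    for app in ("Xcode", "1Password", "VoxyAI"):
        print(app, should_capture(app))
```

Running the check before the capture, rather than filtering the text afterwards, means content from a blocked app is never read in the first place.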
Setting Up Screen Context Awareness
Getting started takes just a few steps:
- Grant Screen Recording permission. The first time VoxyAI attempts a screen capture, macOS will prompt you to grant Screen Recording access. Go to System Settings → Privacy & Security → Screen Recording and enable VoxyAI. This is a one-time setup.
- Enable the feature. Click the VoxyAI menu bar icon to open settings. Scroll to the Automation section and click "Manage Policies..." to open the Actions & Policies window. Switch to the Workflows tab and toggle Screen Context Awareness on.
- Configure privacy settings. In the same Workflows tab, decide whether to keep Privacy Mode enabled (local AI only) or disable it to allow screen context with cloud providers. Review the blocked apps list and add any applications you want to exclude.
Once enabled, screen context works automatically, with no extra step to trigger it. Every time you activate VoxyAI, it captures the frontmost window and includes the extracted text as context, unless that app is on the blocklist or the frontmost window belongs to VoxyAI itself.
The Technology Behind It
Screen Context Awareness combines two powerful Apple frameworks to deliver fast, private, on-device processing:
- ScreenCaptureKit — Apple's modern screen capture framework captures individual windows with high fidelity and minimal performance impact. VoxyAI targets only the frontmost window, not the entire screen, which reduces processing time and avoids capturing content from other applications.
- Vision framework — Apple's on-device OCR engine extracts text from the captured image with high accuracy. It supports multiple languages and handles a wide range of fonts, sizes, and layouts — from code in a monospace editor to body text in a PDF viewer.
The entire capture-and-extract pipeline completes in milliseconds. By the time VoxyAI's input window appears on screen, the context is already captured and ready, so you notice no delay.
Context That Learns Over Time
Screen Context Awareness also integrates with VoxyAI's user memory system. As you use VoxyAI across different applications, it learns patterns about your workflow — which apps you use most frequently, which programming languages you work with, and what tools are part of your daily routine. These observations are stored locally and used to personalize future AI responses.
For example, if VoxyAI repeatedly captures screen context from Xcode with Swift code, it learns that you are a Swift developer and adjusts its responses accordingly — defaulting to Swift syntax in code suggestions, referencing Apple frameworks first, and tailoring explanations to your experience level. This happens gradually and transparently, with all memory stored on your device.
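One simple way to picture this kind of local learning is a pair of counters that record which apps and languages keep showing up. The class below is only a sketch under that assumption; WorkflowMemory, its methods, and the threshold of three observations are hypothetical and do not describe VoxyAI's actual memory system.

```python
import json
from collections import Counter


class WorkflowMemory:
    """Hypothetical local store of workflow observations."""

    def __init__(self):
        self.apps = Counter()
        self.languages = Counter()

    def observe(self, app, language=None):
        """Record one screen-context capture from an app, with an optional language."""
        self.apps[app] += 1
        if language:
            self.languages[language] += 1

    def preferred_language(self, min_observations=3):
        """Return the most frequently seen language once it has been seen often enough."""
        if not self.languages:
            return None
        language, count = self.languages.most_common(1)[0]
        return language if count >= min_observations else None

    def to_json(self):
        """Serialize the observations, for example to save them to a local file."""
        return json.dumps(
            {"apps": dict(self.apps), "languages": dict(self.languages)},
            sort_keys=True,
        )


if __name__ == "__main__":
    memory = WorkflowMemory()
    for _ in range(3):
        memory.observe("Xcode", "Swift")
    print(memory.preferred_language())
    print(memory.to_json())
```

The threshold keeps a single stray capture from changing the assistant's defaults, which matches the gradual adjustment described above.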
A New Way to Work With AI
Screen Context Awareness represents a fundamental shift in how AI assistants interact with your work. Instead of you adapting to the AI — copying text, switching windows, explaining context — the AI adapts to you. It sees what you see, understands what you are working on, and responds accordingly.
The best AI assistant is one that already knows what you need help with before you finish asking the question. Screen Context Awareness moves VoxyAI closer to that goal.