Claude computer use
Claude can now control computers by looking at screens, moving cursors, clicking buttons, and typing text. This “Computer Use” capability launched in October 2024 as a public beta feature available through Anthropic’s API, Amazon Bedrock, and Google Cloud Vertex AI.
Important guidance for AI agent tasks
Your step-by-step instructions for the AI agent go into the Tallyfy task description. Start with short, bite-size tasks that are mundane and tedious. Do not ask an AI agent to do huge, complex, decision-driven jobs that are goal-driven: such jobs are prone to nondeterministic behavior and hallucination, and they can get very expensive quickly.
Claude Computer Use vs Claude MCP Integration
This article covers Claude Computer Use - where Claude visually perceives and controls computer interfaces through screenshots, mouse movements, and keyboard actions. This is different from Claude’s MCP integration, which provides text-based chat access to data sources and APIs.
When to use each:
- Claude Computer Use (this article): For automating visual UI tasks that require seeing and interacting with interface elements (clicking buttons, filling forms, navigating menus)
- Claude MCP Integration: For data queries, API-based workflow management, and text-based automation
Both capabilities can complement each other in comprehensive automation workflows.
Instead of building thousands of app-specific integrations, Anthropic gave Claude general computer skills. Claude uses an API to see and interact with any application inside a sandboxed environment.
What to notice:
- Tallyfy provides the task description and expected outputs that guide Claude’s actions
- Claude loops through screenshot-analyze-act cycles until the task is complete
- All results, logs, and screenshots are captured back into Tallyfy fields
Available models with Computer Use:
- Claude 3.5 Sonnet - The primary model for computer use (public beta)
- Claude 3.5 Haiku - Available for simpler automation tasks
- Claude Sonnet 4.5 (September 2025) - Latest model with significant improvements
Performance benchmarks:
- Claude Sonnet 4.5 achieves 61.4% on OSWorld (up from 42.2% with Sonnet 4)
- Human performance on OSWorld remains at 72.4%
- The feature is still experimental and error-prone at this stage
This diagram shows how Tallyfy orchestrates Claude’s computer use capabilities through an iterative agent loop where Claude perceives, acts, and receives feedback until your task is complete.
What to notice:
- Tallyfy triggers your intermediary app via webhook with task data
- The loop between Claude and the sandbox continues until the task is complete
- All tool execution happens in an isolated sandbox for security
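The webhook hand-off in the first bullet might look like the following minimal sketch. Tallyfy's actual webhook payload schema is not given in this article, so the keys `task_id` and `description`, along with the `AgentTask` type and `parse_webhook` function, are placeholders chosen for illustration; map them to whatever your webhook really delivers.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentTask:
    """What the intermediary app needs before starting an agent run."""
    task_id: str
    instructions: str


def parse_webhook(body: bytes) -> AgentTask:
    """Validate a webhook body and extract the task for the agent.

    The keys "task_id" and "description" are hypothetical placeholders,
    not Tallyfy's documented payload schema.
    """
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("webhook body must be a JSON object")
    task_id = str(payload.get("task_id", "")).strip()
    instructions = str(payload.get("description", "")).strip()
    if not task_id or not instructions:
        raise ValueError("webhook payload is missing task_id or description")
    return AgentTask(task_id=task_id, instructions=instructions)
```

Rejecting incomplete payloads before any API call is made keeps a malformed webhook from starting a costly agent run.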
Sandboxed Computing Environment: The environment (typically a Docker container) includes:
- A virtual X11 display server (like Xvfb) for rendering the desktop
- A lightweight Linux desktop environment
- Pre-installed applications (Firefox, LibreOffice, text editors)
- Your implementations of the Anthropic-defined tools
Three Core Tools (Anthropic-defined, user-executed):
- computer: Mouse/keyboard actions (key presses, typing, cursor movement, clicks, scrolling) and taking screenshots
- text_editor: View, create, and edit files within the environment
- bash: Run shell commands in the sandboxed environment
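Because these tools are defined by Anthropic but executed by you, your application needs something that routes each tool request to your own code. The sketch below shows only that routing. The three handlers are placeholder stubs, and `dispatch_tool` and `HANDLERS` are illustrative names; the registry keys mirror the list above and must match whatever `name` values your API request declares.

```python
from typing import Any, Callable, Optional

ToolHandler = Callable[[dict[str, Any]], str]


def handle_computer(tool_input: dict[str, Any]) -> str:
    # Placeholder: a real handler would drive the virtual display.
    return f"computer: {tool_input.get('action', 'unknown')}"


def handle_text_editor(tool_input: dict[str, Any]) -> str:
    # Placeholder: a real handler would read or modify files in the sandbox.
    return f"text_editor: {tool_input.get('command', 'unknown')}"


def handle_bash(tool_input: dict[str, Any]) -> str:
    # Placeholder: a real handler would run the command inside the sandbox.
    return f"bash: {tool_input.get('command', 'unknown')}"


# Keys must match the `name` values declared in the API request.
HANDLERS: dict[str, ToolHandler] = {
    "computer": handle_computer,
    "text_editor": handle_text_editor,
    "bash": handle_bash,
}


def dispatch_tool(
    name: str,
    tool_input: dict[str, Any],
    handlers: Optional[dict[str, ToolHandler]] = None,
) -> str:
    """Route one tool request to its handler and return the result text."""
    handler = (handlers or HANDLERS).get(name)
    if handler is None:
        # Return the problem as text so it can be relayed back to the model.
        return f"error: unknown tool {name!r}"
    return handler(tool_input)
```

Returning an error string instead of raising is a design choice: it lets the calling loop pass the failure back to Claude as a tool result rather than aborting the run.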
Tool Versions:
- computer_20250124 for newer models with enhanced actions
- computer_20241022 for Claude 3.5 Sonnet v2
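As a small illustration, the helper below builds the `computer` tool definition for one of the two versions named above and rejects anything else. The dictionary fields match the ones used in the request example later in this article; the helper name `computer_tool` and its defaults are assumptions made for this sketch.

```python
from typing import Any

# The two tool versions named in this article.
KNOWN_COMPUTER_TOOL_VERSIONS = ("computer_20250124", "computer_20241022")


def computer_tool(version: str, width_px: int = 1024, height_px: int = 768) -> dict[str, Any]:
    """Build the `computer` tool entry for the `tools` list of a request."""
    if version not in KNOWN_COMPUTER_TOOL_VERSIONS:
        raise ValueError(f"unrecognized computer tool version: {version!r}")
    if width_px <= 0 or height_px <= 0:
        raise ValueError("display dimensions must be positive")
    return {
        "type": version,
        "name": "computer",
        "display_width_px": width_px,
        "display_height_px": height_px,
    }
```

Keeping the version string in one place makes it easier to switch when you move to a different model.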
API Pricing:
- Claude 3.5 Sonnet: $3 per million input tokens, $15 per million output tokens
- Claude 3.5 Haiku: $0.80 per million input tokens, $4 per million output tokens
- Additional overhead: Computer use adds 466-499 tokens to the system prompt
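To see how these rates translate into the cost of a run, here is a minimal estimator using the per-million-token prices listed above. The dictionary keys are informal labels for this sketch, not API model identifiers, and the token counts you pass in are your own measurements or guesses.

```python
# USD per million tokens as (input, output), copied from the list above.
PRICES = {
    "claude-3.5-sonnet": (3.00, 15.00),
    "claude-3.5-haiku": (0.80, 4.00),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a run from its total input and output tokens."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For example, a run that consumes 2,000,000 input tokens and 100,000 output tokens on Claude 3.5 Sonnet comes to $6.00 + $1.50 = $7.50. Screenshots are sent as input, and each loop iteration resends the conversation so far, so input tokens tend to dominate the bill on long tasks.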
Access Requirements:
- Anthropic API key with sufficient credits
- Available through Anthropic API, Amazon Bedrock, or Google Cloud Vertex AI
- Docker required for the reference implementation
Computer Use works well for specific automation scenarios. Early adopters include Asana, Canva, Replit, and DoorDash.
Suitable applications:
- Automating form filling across desktop applications
- Extracting data from legacy systems without APIs
- QA and testing tasks with synthetic test case generation
- Navigating multiple applications to complete multi-step workflows
- Desktop navigation and file management tasks
Example from Replit: Replit uses Claude 3.5 Sonnet’s computer use capabilities to evaluate apps as they’re built, a key feature that requires visual verification.
Claude’s computer use capability is still developing. Anthropic acknowledges these constraints:
Technical Limitations:
- Latency: Tasks requiring dozens or hundreds of steps can be slow
- Error-prone: Scrolling, dragging, and zooming remain challenging
- Resolution constraints: May struggle with screen resolutions higher than 1024×768 or 1280×800 due to image scaling
- Action reliability: Some actions that people perform effortlessly present challenges for Claude
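One way to work within the resolution constraint is to send Claude a downscaled screenshot and map the coordinates it returns back to the real display. The sketch below assumes uniform scaling in each axis; `scale_to_screen` is an illustrative name, not part of any Anthropic library.

```python
def scale_to_screen(
    x: int,
    y: int,
    screenshot_size: tuple[int, int],
    screen_size: tuple[int, int],
) -> tuple[int, int]:
    """Map a point on the downscaled screenshot to real screen pixels."""
    shot_w, shot_h = screenshot_size
    screen_w, screen_h = screen_size
    if shot_w <= 0 or shot_h <= 0:
        raise ValueError("screenshot dimensions must be positive")
    scaled_x = round(x * screen_w / shot_w)
    scaled_y = round(y * screen_h / shot_h)
    # Clamp so a click on the far edge still lands inside the display.
    clamped_x = max(0, min(scaled_x, screen_w - 1))
    clamped_y = max(0, min(scaled_y, screen_h - 1))
    return clamped_x, clamped_y
```

With a 2048×1536 display shown to Claude as a 1024×768 screenshot, a click at (512, 384) maps to (1024, 768) on the real screen.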
Safety Concerns:
- Claude may follow instructions found in screen content, even if they conflict with user instructions
- Risk of prompt injection from webpages or images
- Potential for scaled abuse if not properly isolated
Rate Limits:
- API rate limits apply based on your tier
- Processing time varies significantly based on task complexity
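Since rate limits apply to every call in the agent loop, the intermediary app usually needs a retry policy. The following is a minimal exponential-backoff sketch. `RateLimited` is a stand-in exception defined here for illustration; in a real application you would catch whatever rate-limit error your API client raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Stand-in for the rate-limit error raised by your API client."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the policy can be tested without actually waiting.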
You’ll need to build an intermediary application that connects Tallyfy to the Anthropic API. Anthropic provides a reference implementation with Docker.
1. Set Up Anthropic API Access:
   - Obtain an Anthropic API key from the Anthropic Console
   - Familiarize yourself with the API documentation on “Tool Use” and “Computer Use (beta)”
2. Install Docker:
   - Install the latest version of Docker on your system
   - This is required for the sandboxed environment
3. Use the Reference Implementation:
   - Anthropic provides a Docker-based reference implementation
   - Includes containerized environment, tool implementations, and agent loop
   - Pull the image:
     docker pull ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
4. Configure the Computing Environment:
   - Run the Docker container with proper security settings
   - Container runs with minimal privileges (1 CPU, 2GB RAM default)
   - Access the interface at http://localhost:8080
   - Never run Claude Computer Use unattended - monitor its actions
5. Develop the Intermediary Application:
   - Receive requests via webhook from Tallyfy
   - Construct appropriate prompts and tool lists for the Claude API
   - Manage the agent loop between Claude and your sandbox
   - Relay final outputs back to Tallyfy
6. Prompt Engineering Best Practices:
   - Specify simple, well-defined tasks
   - Tell Claude to verify outcomes with screenshots after each step
   - Suggest keyboard shortcuts for complex UI elements
   - Provide examples of successful interactions when available
   - Use XML tags for structured data inputs
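Two of the prompt practices above, XML tags for structured inputs and screenshot verification after each step, can be combined in a small prompt builder. This is a sketch under assumptions: the tag names `task` and `inputs` and the function `build_prompt` are illustrative choices, and the field names you pass in are assumed to be valid XML tag names.

```python
from xml.sax.saxutils import escape


def build_prompt(task_description: str, fields: dict[str, str]) -> str:
    """Wrap the task and its input data in XML tags for the model."""
    field_lines = "\n".join(
        f"  <{name}>{escape(value)}</{name}>" for name, value in fields.items()
    )
    return (
        f"<task>\n{escape(task_description)}\n</task>\n"
        f"<inputs>\n{field_lines}\n</inputs>\n"
        "After each step, take a screenshot and confirm the step "
        "succeeded before continuing."
    )
```

Escaping the values keeps characters such as `&` and `<` in form data from breaking the tag structure.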
Here’s a simplified example of the integration flow:
from anthropic import Anthropic
import os

client = Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])

# Basic computer use request. The computer use tools are beta features,
# so the request goes through the beta Messages endpoint with a beta flag.
response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    betas=["computer-use-2024-10-22"],
    tools=[
        {
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        },
        {
            # The API expects this fixed name for the 20241022 text editor tool
            "type": "text_editor_20241022",
            "name": "str_replace_editor",
        },
        {
            "type": "bash_20241022",
            "name": "bash",
        },
    ],
    messages=[
        {
            "role": "user",
            "content": "Open the file manager and navigate to Documents folder",
        }
    ],
)

# Handle tool use requests in the response
# Execute tools in your sandbox
# Return results to Claude
# Continue loop until task complete

Note: This is a simplified example. Real implementations require full agent loop handling, tool execution in Docker sandbox, and result processing.
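The agent loop that the note above refers to can be sketched as follows. To keep the example self-contained, the API call and the sandbox are passed in as callables: `call_model` and `execute_tool` are hypothetical stand-ins for your own wrappers. The message shapes follow the `tool_use` and `tool_result` content blocks of the Messages API, simplified to plain dictionaries.

```python
from typing import Any, Callable

Message = dict[str, Any]


def run_agent_loop(
    call_model: Callable[[list[Message]], Message],
    execute_tool: Callable[[str, dict[str, Any]], str],
    task: str,
    max_steps: int = 20,
) -> list[Message]:
    """Alternate between model and sandbox until no more tools are requested."""
    messages: list[Message] = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply["content"]})
        tool_calls = [b for b in reply["content"] if b.get("type") == "tool_use"]
        if not tool_calls:
            return messages  # The model answered without asking for a tool.
        # Run every requested tool and send the results back as one user turn.
        results = [
            {
                "type": "tool_result",
                "tool_use_id": call["id"],
                "content": execute_tool(call["name"], call["input"]),
            }
            for call in tool_calls
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError(f"task did not finish within {max_steps} steps")
```

The `max_steps` cap is a deliberate guard: because every iteration costs tokens and time, a hard limit stops a confused run from looping indefinitely.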
Critical security measures:
- Run Claude Computer Use in a dedicated virtual machine or container with minimal privileges
- Limit internet access to approved domains only
- Never provide access to sensitive data or account credentials
- Isolate Claude from production systems
- Require human confirmation for critical actions
- Enable comprehensive audit logging
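As an illustration of the approved-domains measure, the check below tests whether a URL's host belongs to an allowlist. Real enforcement belongs at the network layer (for example a proxy or firewall around the sandbox), so treat this as a sketch of the matching logic only. The `ALLOWED_DOMAINS` entries and the function name are placeholders.

```python
from urllib.parse import urlsplit

# Placeholder allowlist; replace with the domains your task really needs.
ALLOWED_DOMAINS = frozenset({"example.com"})


def is_allowed_url(url: str, allowed: frozenset = ALLOWED_DOMAINS) -> bool:
    """Return True only if the URL's host is an allowed domain or a subdomain."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False
    # Match on whole labels so "example.com.evil.net" is not accepted.
    return any(host == domain or host.endswith("." + domain) for domain in allowed)
```

Comparing whole host labels rather than substrings matters here: a plain substring test would accept look-alike hosts that merely contain an approved name.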
Known Risks:
- Prompt injection vulnerabilities (Claude may follow on-screen instructions)
- Potential for malicious code execution if not properly sandboxed
- Risk of information theft if given access to sensitive data
Good fit for:
- Desktop application automation (Excel, legacy software)
- Data extraction from systems without APIs
- Automated testing of desktop applications
- Form filling across multiple applications
- Low-risk, repetitive UI tasks
Not suitable for:
- Real-time operations (latency issues)
- Time-critical tasks (can take dozens or hundreds of steps)
- Tasks requiring creative judgment
- Social media content creation (restricted by Anthropic)
- High-security environments without proper isolation
Success factors:
- Start with simple, well-defined tasks
- Implement strong security boundaries
- Monitor performance closely
- Maintain human oversight
- Test with low-risk data first
Claude Computer Use offers unique desktop control capabilities but has trade-offs:
Advantages:
- Works with any desktop or web application
- No need for app-specific APIs or integrations
- General-purpose approach adapts to UI changes
Disadvantages:
- Currently slower than traditional RPA for simple tasks
- Still experimental with error-prone execution
- Requires Docker and sandbox infrastructure
- Higher latency than direct API integrations
Alternative approaches:
- Traditional RPA tools for stable, high-volume workflows
- Direct API integrations when available
- Web-only automation tools for browser-based tasks
- Identify repetitive desktop tasks suitable for automation
- Document exact steps with screenshots
- Set up Anthropic API access with credits
- Install Docker and pull reference implementation
- Create Tallyfy process with clear task instructions
- Test with low-risk, non-sensitive data first
- Implement security isolation and monitoring
- Monitor success rates and refine prompts
- Scale gradually with proven workflows
Computer Use is in public beta as of October 2025. Anthropic describes it as “still experimental - at times cumbersome and error-prone” but expects rapid improvement over time.
The technology shows promise for automating repetitive tasks, but significant limitations around speed, reliability, and safety need addressing before widespread production adoption.