Mcp Server > Using Tallyfy MCP Server with Claude (Text Chat)
Claude Computer Use
Anthropic’s “Computer Use” capability is available for Claude 4 (Opus and Sonnet), Claude 3.7 Sonnet, and Claude 3.5 Sonnet v2, allowing developers to build applications where Claude can interact with computer desktop environments much like a human. This includes perceiving screen information, moving a cursor, clicking buttons, and typing text, opening new possibilities for automating UI-based tasks within Tallyfy processes.
With the release of Claude 4 models on May 22, 2025, Computer Use capabilities have been significantly enhanced, featuring new tool versions (computer_20250124), improved performance benchmarks, and the first AI model released under Anthropic’s ASL-3 safety protections. Claude 4 models lead the industry on software engineering benchmarks, with Claude Opus 4 achieving 72.5% and Claude Sonnet 4 achieving 72.7% on SWE-bench Verified.
Important guidance for AI agent tasks
Your step-by-step instructions for the AI agent to perform work go into the Tallyfy task description. Start with short, bite-size and easy tasks that are just mundane and tedious. Do not try and ask an AI agent to do huge, complex decision-driven jobs that are goal-driven - they are prone to indeterministic behavior, hallucination, and it can get very expensive quickly.
Claude Computer Use vs Claude MCP Integration
This article covers Claude Computer Use - where Claude visually perceives and controls computer interfaces through screenshots, mouse movements, and keyboard actions. This is different from Claude’s MCP integration, which provides text-based chat access to data sources and APIs.
When to use each:
- Claude Computer Use (this article): For automating visual UI tasks that require seeing and interacting with interface elements (clicking buttons, filling forms, navigating menus)
- Claude MCP Integration: For data queries, API-based automation, and text-based workflow management
Example comparison:
- Computer Use: “Open Excel, click on cell A1, type a formula, then save the file”
- MCP Integration: “Query Tallyfy API for all overdue tasks and generate a report”
Both capabilities can complement each other in comprehensive automation workflows.
Anthropic enables Claude with general computer interaction skills rather than creating specific integrations for countless applications. This is achieved through an API that allows Claude to perceive and act within a sandboxed computing environment.
Key aspects of Claude’s Computer Use include:
- Models Supported: Claude 4 (Opus and Sonnet), Claude 3.7 Sonnet, and Claude 3.5 Sonnet v2. Claude 4 models, released May 22, 2025, offer enhanced computer use capabilities with extended thinking and tool use integration.
- Leading Performance: Claude 4 models achieve state-of-the-art results on coding benchmarks, with Claude Opus 4 scoring 72.5% and Claude Sonnet 4 scoring 72.7% on SWE-bench Verified.
- API-Driven Interaction: Developers use Anthropic’s Messages API, providing Claude with a set of Anthropic-defined computer use tools. Claude then requests to use these tools to achieve a user’s goal.
- Enhanced Tool Versions: New tool versions are available for different models:
computer_20250124
for Claude 4 and Claude 3.7 models with enhanced actionscomputer_20241022
for Claude 3.5 Sonnet v2
- The Agent Loop:
- Your application sends a user prompt and the list of available computer use tools to Claude.
- Claude decides if a tool can help and responds with a
tool_use
request, specifying the tool and its inputs. - Your application (the client) is responsible for executing this tool request in a secure, sandboxed computing environment and then sending the
tool_result
(e.g., screenshot, command output, success/failure) back to Claude in a new user message. - Claude analyzes the result and decides on the next action, which could be another tool call or a final text response. This loop continues until the task is completed.
- Sandboxed Computing Environment: Essential for safety, this environment (often a Docker container) typically includes:
- A virtual X11 display server (e.g., Xvfb) for rendering the desktop.
- A lightweight Linux desktop environment (e.g., Mutter window manager, Tint2 panel).
- Pre-installed applications (e.g., Firefox, LibreOffice, text editors, file managers).
- Your implementations of the Anthropic-defined tools that translate Claude’s requests into actual UI operations.
- Anthropic-Defined Tools (User Executed): Anthropic specifies the tools, but your application executes them. Core tools include:
computer
: For mouse/keyboard actions (key presses, typing, cursor movement, clicks, drags, scrolling) and taking screenshots. Requiresdisplay_width_px
anddisplay_height_px
.text_editor
(str_replace_editor
): To view, create, and edit files within the environment.bash
: To run shell commands in the sandboxed environment.- Each tool has versions optimized for specific models.
- Extended Thinking with Tool Use: Claude 4 models support extended thinking capabilities during tool use, allowing Claude to alternate between reasoning and tool execution for improved responses.
- Parallel Tool Execution: Claude 4 models can use tools in parallel, following instructions more precisely and demonstrating improved memory capabilities.
- Natural Language Instructions: Claude translates user prompts (goals) into a sequence of tool uses.
- Broad Availability: The Claude API is accessible directly from Anthropic, on Amazon Bedrock, and Google Cloud’s Vertex AI.
- ASL-3 Protections: Claude Opus 4 is the first model released under Anthropic’s ASL-3 (AI Safety Level 3) safety protections, representing a higher standard of safety measures.
- Enhanced Safety Evaluations: Comprehensive testing includes agentic safety, alignment assessment, and CBRN (Chemical, Biological, Radiological, Nuclear) evaluations.
- Reduced Shortcut Behavior: Claude 4 models are 65% less likely to engage in shortcut or loophole behavior compared to previous models on agentic tasks.
Integrating Tallyfy with Claude’s Computer Use involves an intermediary application or service that you build to bridge Tallyfy and the Anthropic API.
-
Set Up Anthropic API Access:
- Obtain an Anthropic API key and familiarize yourself with their API documentation, particularly the sections on “Tool Use” and “Computer Use (beta)”.
-
Choose Your Model:
- Select between Claude 4 models (Opus 4 for maximum capability, Sonnet 4 for balanced performance) or Claude 3.7/3.5 models based on your requirements and budget.
-
Build or Use a Reference Implementation:
- Anthropic provides a reference implementation (Docker setup) that includes a containerized environment, implementations of the computer use tools, an agent loop, and a web interface. This supports both Claude 4 and earlier model versions.
-
Configure the Computing Environment:
- Ensure the sandboxed environment has the necessary applications (e.g., specific browser, office suite) that tasks automated via Tallyfy will need to interact with.
-
Develop the Intermediary Application/Service:
- This service will receive requests (e.g., via a webhook from Tallyfy or a Tallyfy-compatible middleware).
- It will construct the appropriate prompt and tool list for the Claude API based on the Tallyfy task instructions and input data.
- It will manage the “agent loop,” sending Claude’s
tool_use
requests to your sandboxed environment for execution and returningtool_result
back to Claude. - For Claude 4 models, consider enabling extended thinking for complex tasks.
- Once Claude indicates task completion, this service will relay the final output back to Tallyfy.
-
Prompt Engineering:
- Craft clear and explicit prompts for Claude. Anthropic recommends:
- Specifying simple, well-defined tasks.
- Instructing Claude to verify outcomes with screenshots after each step.
- Suggesting keyboard shortcuts for difficult UI elements.
- Providing examples of successful interactions if possible.
- Using XML tags like
<robot_credentials>
for sensitive data, while being mindful of prompt injection risks.
- Craft clear and explicit prompts for Claude. Anthropic recommends:
Tallyfy Task: “Open Product_Feedback.xlsx
from desktop, filter for ‘Urgent’ issues, and count them.”
- Inputs from Tallyfy Form Fields:
File Name
:Product_Feedback.xlsx
Filter Column Name
:Priority
Filter Value
:Urgent
Output Field Name
:Urgent Issue Count
- Integration Steps (Conceptual):
- Tallyfy task starts, triggering your intermediary application via a webhook, passing the form field data.
- Your application constructs an initial prompt for Claude: “Open the Excel file named
Product_Feedback.xlsx
located on the desktop. This file contains columns includingPriority
. Filter the data to show only rows where thePriority
column is ‘Urgent’. Count the number of such rows and provide the total count.” - The application initiates a session with the Claude API, providing the
computer_20250124
,text_editor_20250124
, andbash_20250124
tools (for Claude 4/3.7) or appropriate versions for Claude 3.5. - Agent Loop Begins:
- Claude might first request a
screenshot
to see the desktop. - Your app executes this in the sandbox, returns the image.
- Claude identifies the Excel file icon, requests a
double_click
at its coordinates. - Your app executes, returns a new screenshot showing Excel open.
- Claude identifies the filter controls, requests clicks to apply the filter based on
Priority
=Urgent
. - Your app executes, returns screenshots.
- Claude identifies the number of visible rows (or a status bar count), then formulates a final text response: “The number of urgent issues is 15.”
- Claude might first request a
- Your intermediary application receives this final text response, parses the count (15), and updates the
Urgent Issue Count
form field in the Tallyfy task via Tallyfy’s API.
- Automate UI Interactions for Diverse Applications: Interact with a wide array of desktop and web applications without needing application-specific APIs.
- Leverage Claude’s Advanced Reasoning: Utilize the enhanced language understanding and reasoning of Claude 4 models to interpret tasks and navigate UIs.
- Extended Thinking Capabilities: For complex tasks, Claude 4 models can think through problems step-by-step while using tools.
- Structured Task Management by Tallyfy: Tallyfy defines the “what” and “why” of the task, passes necessary inputs, and receives structured outputs, ensuring the AI’s actions are part of a documented, trackable process.
- Security Risks: Computer use poses unique risks. Always run in a dedicated virtual machine or container with minimal privileges. Avoid giving the model access to sensitive data directly. Limit internet access if possible. Require human confirmation for impactful decisions.
- Prompt Injection: Claude might follow instructions found in on-screen content even if they conflict with your primary instructions. Anthropic has classifiers to detect and flag potential prompt injections in screenshots, steering the model to ask for user confirmation.
- Model Selection and Cost: Claude 4 models offer superior performance but may have higher token costs. Choose the appropriate model based on task complexity and budget requirements.
- Latency: Human-AI interaction latency might be slower than direct human action for some tasks. Best for non-time-critical tasks or background processing.
- Accuracy and Reliability: While Claude 4 models show significant improvements, mistakes in interpreting screens or selecting actions can still occur. The extended thinking capability can help debug issues.
- Restricted Actions: Anthropic limits Claude’s ability to create accounts or generate content on social media and communication platforms to prevent impersonation.
Always carefully review and verify Claude’s computer use actions, especially in a production environment integrated with Tallyfy. Start with low-risk tasks and incorporate human review steps in Tallyfy where appropriate.
Integrations > Computer AI Agents
Computer Ai Agents > Local Computer Use Agents
Computer Ai Agents > AI Agent Vendors
- 2025 Tallyfy, Inc.
- Privacy Policy
- Terms of Use
- Report Issue
- Trademarks