Skip to main content

Enhanced Result Submission System

The Enhanced Result Submission System allows agents to declare "I found the solution" with automatic validation and configurable workflow termination. This system is designed for workflows that have definitive solutions or completion criteria.

Overview

Key Features

  • Workflow-level result submission: Agents can submit results that represent complete workflow solutions
  • Automatic validation: Results are validated against configured criteria by specialized validator agents
  • Configurable actions: Workflows can be configured to terminate or continue when valid results are found
  • Evidence-based validation: Results must include comprehensive evidence and proof
  • Audit trail: Complete tracking of result submissions, validations, and workflow actions

When to Use

Use this system for workflows that have:

  • Definitive solutions (e.g., crackme challenges, bug fixes)
  • Clear success criteria
  • Need to prevent unnecessary work after solution is found
  • Require validation of claimed solutions

Configuration

phases_config.yaml

Each workflow can include a phases_config.yaml file in its phases folder:

# Whether this workflow expects a definitive result
has_result: true

# Criteria that submitted results must meet
result_criteria: |
Must provide:
1. The correct password or flag
2. Execution proof showing successful unlock
3. Method used to find the solution

# Action to take when a valid result is found
on_result_found: stop_all # or "do_nothing"

Configuration Options

  • has_result: Boolean indicating whether the workflow expects a definitive result
  • result_criteria: Detailed criteria for validation (required if has_result=true)
  • on_result_found: Action to take when a valid result is validated
    • stop_all: Terminate all agents and cancel remaining tasks
    • do_nothing: Log the result but continue workflow execution

Usage Flow

1. Agent Submission

When an agent believes they have found the solution:

# Via MCP client
submit_result(
workflow_id="workflow-123",
markdown_file_path="/path/to/solution.md",
agent_id="agent-456"
)

2. Automatic Validation

If the workflow has has_result: true and criteria configured:

  • A result validator agent is automatically spawned
  • The validator reads the result file and evaluates it against criteria
  • The validator submits a pass/fail decision with detailed feedback

3. Workflow Actions

Based on the validation outcome and configuration:

  • Validation passed + stop_all: All agents terminated, tasks cancelled, workflow completed
  • Validation passed + do_nothing: Result logged, workflow continues normally
  • Validation failed: Result rejected, workflow continues, agents can submit new results

Result File Format

Results must be submitted as markdown files with comprehensive evidence:

# Solution Title

## Solution Statement
Clear statement of what was found/solved

## Primary Evidence
Direct proof that the solution works:
- Execution outputs
- Screenshots
- Test results

## Supporting Evidence
Additional verification:
- Alternative confirmations
- Cross-checks
- Independent validation

## Methodology
Step-by-step explanation of how the solution was discovered

## Reproduction Steps
Instructions for others to verify the result independently

## Confidence Assessment
How certain you are about the solution

Example Workflows

Crackme Challenge

Configuration:

has_result: true
result_criteria: |
Must provide:
1. The correct password or flag
2. Execution proof showing successful unlock
3. Method used to find the solution
on_result_found: stop_all

Expected Result Evidence:

  • Binary analysis methodology
  • Password/flag discovery process
  • Successful execution screenshot
  • Verification of access granted

Research Task

Configuration:

has_result: true
result_criteria: |
Must include:
1. Comprehensive analysis with 5+ sources
2. Actionable recommendations
3. Evidence-based conclusions
on_result_found: do_nothing

Expected Result Evidence:

  • Research methodology
  • Source analysis and citations
  • Findings summary
  • Actionable recommendations with supporting data

Bug Hunt

Configuration:

has_result: true
result_criteria: |
Must demonstrate:
1. Root cause identified
2. Fix implemented and tested
3. All tests passing
on_result_found: stop_all

Expected Result Evidence:

  • Bug analysis and root cause
  • Code changes with before/after
  • Test results showing fix works
  • Regression test verification

MCP API Endpoints

Submit Result

Endpoint: POST /submit_result

Request:

{
"workflow_id": "workflow-123",
"markdown_file_path": "/path/to/result.md"
}

Response:

{
"status": "submitted",
"result_id": "result-456",
"workflow_id": "workflow-123",
"agent_id": "agent-789",
"validation_triggered": true,
"message": "Result submitted successfully and validation triggered",
"created_at": "2024-01-01T12:00:00Z"
}

Submit Result Validation

Endpoint: POST /submit_result_validation

Request:

{
"result_id": "result-456",
"validation_passed": true,
"feedback": "All criteria met with convincing evidence",
"evidence": [
{"type": "criteria_check", "criterion": "correct flag", "passed": true},
{"type": "evidence_found", "evidence": "execution proof", "location": "result file"}
]
}

Response:

{
"status": "workflow_terminated",
"message": "Validation passed - workflow terminated",
"workflow_action_taken": "workflow_terminated",
"result_id": "result-456"
}

Get Workflow Results

Endpoint: GET /workflows/{workflow_id}/results

Response:

[
{
"result_id": "result-456",
"agent_id": "agent-789",
"workflow_id": "workflow-123",
"status": "validated",
"validation_feedback": "All criteria met",
"created_at": "2024-01-01T12:00:00Z",
"validated_at": "2024-01-01T12:30:00Z",
"validated_by_agent_id": "validator-001",
"result_file_path": "/path/to/result.md"
}
]

MCP Client Tools

submit_result

submit_result(
workflow_id="workflow-123",
markdown_file_path="/path/to/solution.md",
agent_id="agent-456"
)

Submit a workflow result with evidence for validation.

submit_result_validation

submit_result_validation(
result_id="result-456",
validation_passed=True,
feedback="Detailed validation feedback",
evidence=[...],
validator_agent_id="validator-789"
)

Submit validation review for a workflow result (validator agents only).

get_workflow_results

get_workflow_results(workflow_id="workflow-123")

Get all submitted results for a workflow.

Security Considerations

Input Validation

  • File paths validated to prevent directory traversal
  • Result file size limited to 1MB
  • Markdown format validation
  • Agent authentication required

Access Control

  • Only workflow agents can submit results
  • Only result validator agents can validate results
  • Results are immutable once validated
  • Workflow termination requires validation

Audit Trail

  • All result submissions logged
  • Validation decisions tracked with evidence
  • Workflow terminations recorded
  • Complete evidence chain maintained

Error Handling

Common Errors

  • File not found: Result file doesn't exist at specified path
  • Workflow not found: Invalid workflow ID provided
  • Already validated: Workflow already has a validated result
  • Invalid criteria: Missing or malformed validation criteria
  • Validator timeout: Validation agent fails to respond

Error Recovery

  • Failed validations don't affect workflow execution
  • Multiple result submissions allowed until one is validated
  • Validator failures logged but don't block other operations
  • Graceful degradation when validation system unavailable

Best Practices

For Result Submissions

  1. Include comprehensive evidence in markdown files
  2. Test your solution before submitting
  3. Provide clear reproduction steps
  4. Document your methodology thoroughly
  5. Be honest about confidence levels

for Workflow Configuration

  1. Write clear, specific validation criteria
  2. Consider whether workflow should stop or continue
  3. Test your criteria with example results
  4. Provide detailed requirements in criteria text
  5. Use appropriate file size limits

For Result Validation

  1. Read the entire result file carefully
  2. Check each criterion systematically
  3. Provide specific feedback about what passed/failed
  4. Include evidence to support your decision
  5. Be thorough but fair in evaluation

Monitoring and Observability

Metrics Tracked

  • Result submission rates per workflow
  • Validation success/failure rates
  • Workflow termination frequency
  • Validator agent performance
  • Evidence quality scores

Logs Available

  • Result submission events
  • Validation decisions with reasoning
  • Workflow termination actions
  • Error conditions and recovery
  • Performance timing data

Future Enhancements

Planned Features

  • Multiple result criteria (OR conditions)
  • Partial result tracking and scoring
  • Machine learning validation assistance
  • Cross-workflow result correlation
  • Automatic criteria generation from examples

Integration Opportunities

  • External validation hooks
  • Result export to other systems
  • Integration with CI/CD pipelines
  • Quality scoring and ranking
  • Pattern detection and learning