Results Reporting System Documentation
Table of Contents
- Overview
- Architecture
- Prerequisites
- API Reference
- Markdown Template and Guidelines
- Agent Integration Guide
- Verification System
- Frontend Integration Guide
- Testing Procedures
- Troubleshooting Guide
- End-to-End Examples
- Configuration Options
Overview
The Results Reporting System is the third and final component in the Hephaestus trilogy (following Git Worktree Isolation and Validation Agent System). It provides a formal mechanism for agents to document their achievements, creating a permanent record of work completed with verification tracking.
Purpose
- Accountability: Creates an audit trail of agent work
- Quality Assurance: Enables verification of claimed achievements
- Knowledge Preservation: Documents solutions for future reference
- Trust Building: Provides evidence-based validation of results
Key Features
- Markdown-based result reporting
- Two result types: Task-level (AgentResult) and Workflow-level (WorkflowResult)
- Multiple results per task support
- Immutable result storage
- Verification status tracking
- Integration with validation system
- File size and format validation
- Path traversal protection
Result Types
Hephaestus supports two types of results:
- Task-Level Results (AgentResult):
  - Created via the POST /report_results endpoint
  - Associated with a specific task
  - Verification status: unverified, verified, or disputed
  - Multiple results can be submitted for the same task
- Workflow-Level Results (WorkflowResult):
  - Created via the POST /submit_result endpoint
  - Associated with an entire workflow
  - Status: pending_validation, validated, or rejected
  - Automatically triggers the result validator agent when configured
  - Marks workflow completion when validated
Architecture
System Components
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│                 │     │                  │     │                 │
│  Agent Process  │────▶│   MCP Server     │────▶│    Database     │
│                 │     │                  │     │                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘
        │                       │                         │
        │                       │                         │
        ▼                       ▼                         ▼
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Markdown File  │     │ Result Service   │     │ agent_results   │
│   (results.md)  │     │   Validation     │     │     table       │
└─────────────────┘     └──────────────────┘     └─────────────────┘
Database Schema
CREATE TABLE agent_results (
    id VARCHAR PRIMARY KEY,
    agent_id VARCHAR NOT NULL,
    task_id VARCHAR NOT NULL,
    markdown_content TEXT NOT NULL,
    markdown_file_path TEXT NOT NULL,
    result_type VARCHAR NOT NULL CHECK (
        result_type IN ('implementation', 'analysis', 'fix',
                       'design', 'test', 'documentation')
    ),
    summary TEXT NOT NULL,
    verification_status VARCHAR NOT NULL DEFAULT 'unverified' CHECK (
        verification_status IN ('unverified', 'verified', 'disputed')
    ),
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    verified_at TIMESTAMP,
    verified_by_validation_id VARCHAR,
    FOREIGN KEY (agent_id) REFERENCES agents(id),
    FOREIGN KEY (task_id) REFERENCES tasks(id),
    FOREIGN KEY (verified_by_validation_id) REFERENCES validation_reviews(id)
);
Data Flow
- Agent Completes Work: Agent finishes task implementation
- Create Markdown Report: Agent writes results to a markdown file
- Submit via MCP: Agent calls the report_results endpoint
- Validation: System validates file and ownership
- Storage: Result stored in database
- Task Update: Task marked as having results
- Verification: Validator agent can verify results
- Status Update: Verification status updated
Prerequisites
Required Systems
- Git Worktree Isolation System (must be implemented)
- Validation Agent System (must be implemented)
- SQLite database with proper schema
- Python 3.8+ with required dependencies
- MCP server running on port 8000
Dependencies
# Required Python packages
sqlalchemy>=1.4.0
fastapi>=0.68.0
pydantic>=1.8.0
httpx>=0.19.0
pytest>=6.2.0
pytest-asyncio>=0.15.0
API Reference
POST /report_results
Submit formal results for a completed task.
Request Headers
X-Agent-ID: <agent-id>
Content-Type: application/json
Request Body
{
    "task_id": "task-uuid",
    "markdown_file_path": "/path/to/results.md",
    "result_type": "implementation",
    "summary": "Brief summary of results"
}
Result Types
- implementation: Code or feature implementation
- analysis: Research or analysis results
- fix: Bug fix or issue resolution
- design: Design documents or architecture
- test: Test implementation or results
- documentation: Documentation creation or updates
Response (200 OK)
{
    "status": "stored",
    "result_id": "result-uuid",
    "task_id": "task-uuid",
    "agent_id": "agent-uuid",
    "verification_status": "unverified",
    "created_at": "2024-01-01T12:00:00Z"
}
Error Responses
400 Bad Request
{
    "detail": "File too large: 150.00KB exceeds maximum of 100KB"
}
404 Not Found
{
    "detail": "Markdown file not found: /path/to/missing.md"
}
403 Forbidden
{
    "detail": "Task task-123 is not assigned to agent agent-456"
}
Validation Rules
- File Existence: File must exist and be readable
- File Format: Must be markdown (.md extension)
- File Size: Maximum 100KB
- Path Security: No directory traversal allowed
- Task Ownership: Agent must own the task
- Multiple Results: Agents can submit multiple results per task
- Immutability: Results cannot be modified after submission
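The rules above can be sketched in a few lines of Python. This is a hedged illustration of the checks, not the actual implementation in src/services/validation_helpers.py; the function name and constant are assumptions:
import os

MAX_SIZE_KB = 100  # mirrors the documented hard-coded limit

def validate_result_file(file_path: str) -> None:
    """Illustrative sketch of the validation rules above."""
    if ".." in file_path:  # path security: no directory traversal
        raise ValueError("Directory traversal detected")
    if not file_path.endswith(".md"):  # file format: markdown only
        raise ValueError("Only .md files are accepted")
    if not os.path.isfile(file_path):  # file existence
        raise FileNotFoundError(f"Markdown file not found: {file_path}")
    size_kb = os.path.getsize(file_path) / 1024
    if size_kb > MAX_SIZE_KB:  # file size: 100KB maximum
        raise ValueError(
            f"File too large: {size_kb:.2f}KB exceeds maximum of {MAX_SIZE_KB}KB"
        )
    # Task ownership and immutability are enforced at the service layer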
POST /submit_result
Submit a workflow-level result (different from task-level results).
Request Headers
X-Agent-ID: <agent-id>
Content-Type: application/json
Request Body
{
    "markdown_file_path": "/path/to/result.md",
    "explanation": "Brief explanation of what was accomplished",
    "evidence": ["Evidence item 1", "Evidence item 2"]
}
Response (200 OK)
{
    "status": "submitted",
    "result_id": "result-uuid",
    "workflow_id": "workflow-uuid",
    "agent_id": "agent-uuid",
    "validation_triggered": true,
    "message": "Result submitted successfully and validation triggered",
    "created_at": "2024-01-01T12:00:00Z"
}
Note: This endpoint is for workflow-level results that mark completion of an entire workflow. It automatically derives the workflow_id from the agent's current task and may trigger result validation.
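By analogy with the report_results example in the Agent Integration Guide below, a minimal workflow-level submission could look like the following; the endpoint and payload match the reference above, while the helper name and evidence strings are illustrative:
import httpx

async def submit_workflow_result(agent_id: str, report_path: str) -> dict:
    # Workflow-level result: workflow_id is derived server-side from the agent's task
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:8000/submit_result",
            json={
                "markdown_file_path": report_path,
                "explanation": "Completed all workflow tasks",
                "evidence": ["All unit tests passing", "Per-task results reported"],
            },
            headers={"X-Agent-ID": agent_id},
        )
        response.raise_for_status()
        return response.json()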
Markdown Template and Guidelines
Template Location
templates/result_report_template.md
Required Sections
- Task Results Header - Clear title describing the task
- Summary - 2-3 sentence overview of achievements
- Detailed Achievements:
  - Bullet list of completed items
  - Technical implementation details
- Artifacts Created:
  - Table of files created/modified
  - Code metrics if applicable
- Validation Evidence:
  - Test results
  - Manual verification steps
  - Performance metrics
- Known Limitations:
  - Current limitations
  - Workarounds available
- Recommended Next Steps:
  - Immediate actions needed
  - Future enhancements
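For reference, a minimal skeleton covering the required sections might look like this (illustrative only; templates/result_report_template.md remains the canonical template):
# Task Results: <clear task title>
## Summary
<2-3 sentence overview of achievements>
## Detailed Achievements
- <completed item with technical detail>
## Artifacts Created
| File Path | Type | Description |
|-----------|------|-------------|
| <path> | <type> | <what it does> |
## Validation Evidence
<test results, manual verification steps, performance metrics>
## Known Limitations
<current limitations and available workarounds>
## Recommended Next Steps
<immediate actions and future enhancements>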
Best Practices
# Good Example
## Summary
Successfully implemented JWT authentication with refresh tokens,
achieving 100% test coverage and sub-100ms response times.
# Bad Example
## Summary
Done with auth stuff.
Agent Integration Guide
Step 1: Complete Your Task
# Perform the actual work
result = implement_feature()
run_tests()
verify_output()
Step 2: Create Markdown Report
report_content = f"""# Task Results: {task_description}
## Summary
{summary_of_work}
## Detailed Achievements
{list_of_achievements}
## Artifacts Created
{table_of_files}
## Validation Evidence
{test_results}
"""
# Write to file
report_path = f"/tmp/results_{task_id}.md"
with open(report_path, 'w') as f:
    f.write(report_content)
Step 3: Submit Results via MCP
import httpx
async def report_results(task_id, report_path, agent_id):
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:8000/report_results",
            json={
                "task_id": task_id,
                "markdown_file_path": report_path,
                "result_type": "implementation",
                "summary": "Implemented feature successfully"
            },
            headers={"X-Agent-ID": agent_id}
        )
        if response.status_code == 200:
            result_data = response.json()
            print(f"Results stored: {result_data['result_id']}")
        else:
            print(f"Error: {response.text}")
Step 4: Update Task Status
# After reporting results, mark task as complete
await update_task_status(task_id, "done", summary, learnings)
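update_task_status is used throughout this guide but not defined here; the hedged sketch below assumes an MCP task-update endpoint, so the path and payload fields are assumptions rather than the documented API:
import httpx

async def update_task_status(task_id, status, summary="", learnings="", agent_id=""):
    # Illustrative only: endpoint path and field names are assumptions
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:8000/update_task",
            json={
                "task_id": task_id,
                "status": status,
                "summary": summary,
                "learnings": learnings,
            },
            headers={"X-Agent-ID": agent_id},
        )
        response.raise_for_status()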
Verification System
How Verification Works
- Initial State: Results start as unverified
- Validation Trigger: When task enters validation
- Validator Review: Validator agent examines results
- Evidence Check: Validator verifies claims against evidence
- Status Update: Results marked as verified or disputed
Verification States
| Status | Description | When Applied | 
|---|---|---|
| unverified | Default state | On creation | 
| verified | Claims validated | Validation passed | 
| disputed | Claims questioned | Validation failed | 
Integration with Validation System
# In validation review handler
if validation_passed:
    # Update all results for the task to verified
    for result in task.results:
        ResultService.verify_result(
            result_id=result.id,
            validation_review_id=review.id,
            verified=True
        )
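A plausible shape for ResultService.verify_result, consistent with the agent_results schema; the session factory (SessionLocal) and session handling are assumptions, not the shipped service code:
from datetime import datetime, timezone

class ResultService:
    @staticmethod
    def verify_result(result_id, validation_review_id, verified):
        # Illustrative sketch: SessionLocal and AgentResult are assumed to
        # come from src/core/database.py
        with SessionLocal() as session:
            result = session.get(AgentResult, result_id)
            result.verification_status = "verified" if verified else "disputed"
            result.verified_at = datetime.now(timezone.utc)
            result.verified_by_validation_id = validation_review_id
            session.commit()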
Frontend Integration Guide
Fetching Results
// Get results for a specific task
async function getTaskResults(taskId) {
    const response = await fetch(`/api/results?task_id=${taskId}`);
    const results = await response.json();
    return results;
}
Displaying Results
function ResultCard({ result }) {
    return (
        <div className="result-card">
            <div className="result-header">
                <h3>{result.summary}</h3>
                <VerificationBadge status={result.verification_status} />
            </div>
            <div className="result-content">
                <MarkdownRenderer content={result.markdown_content} />
            </div>
            <div className="result-metadata">
                <span>Type: {result.result_type}</span>
                <span>Created: {result.created_at}</span>
                {result.verified_at && (
                    <span>Verified: {result.verified_at}</span>
                )}
            </div>
        </div>
    );
}
Verification Badge Component
function VerificationBadge({ status }) {
    const badges = {
        unverified: { icon: '⏳', color: 'gray', text: 'Unverified' },
        verified: { icon: '✅', color: 'green', text: 'Verified' },
        disputed: { icon: '⚠️', color: 'red', text: 'Disputed' }
    };
    const badge = badges[status];
    return (
        <span className={`badge badge-${badge.color}`}>
            {badge.icon} {badge.text}
        </span>
    );
}
Testing Procedures
Unit Tests
Run unit tests for the result service:
pytest tests/test_result_service.py -v
Run unit tests for the MCP endpoint:
pytest tests/test_mcp_results_endpoint.py -v
Integration Tests
Run the full integration test:
python tests/mcp_integration/test_mcp_flow.py
Expected output:
✅ Server Health Check
✅ Create Task via MCP
✅ Get Tasks List
✅ Save Memory
✅ Update Task (Wrong Agent)
✅ Report Results
✅ Report Multiple Results
✅ Update Task (Correct Agent)
✅ Update Task (Missing Fields)
✅ Get Agent Status
All 10 tests passed!
Manual Testing
- Start the MCP server:
python run_server.py
- Create a test markdown file:
printf '# Test Results\n\nTest content\n' > /tmp/test_results.md
- Submit results via curl:
curl -X POST http://localhost:8000/report_results \
  -H "Content-Type: application/json" \
  -H "X-Agent-ID: test-agent-001" \
  -d '{
    "task_id": "task-123",
    "markdown_file_path": "/tmp/test_results.md",
    "result_type": "implementation",
    "summary": "Test implementation"
  }'
Troubleshooting Guide
Common Issues and Solutions
Issue: "File not found" error
Cause: Markdown file doesn't exist at the specified path
Solution:
- Verify file exists: ls -la /path/to/file.md
- Check file permissions: chmod 644 /path/to/file.md
- Use absolute paths instead of relative
Issue: "File too large" errorβ
Cause: Markdown file exceeds 100KB limit Solution:
- Check file size: du -h /path/to/file.md
- Split into multiple results if needed
- Compress images or move to external storage
Issue: "Not assigned to agent" errorβ
Cause: Agent trying to report results for wrong task Solution:
- Verify task assignment in database
- Check agent ID in request header
- Ensure task wasn't reassigned
Issue: Results not being verified
Cause: Validation system not running or misconfigured
Solution:
- Check validation agent is spawned
- Verify validation_reviews table has entries
- Check worktree manager is initialized
Debug Logging
Enable debug logging for troubleshooting:
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("src.services.result_service")
Database Queries for Debugging
-- Check results for a task
SELECT * FROM agent_results WHERE task_id = 'task-123';
-- Check verification status
SELECT ar.*, vr.feedback
FROM agent_results ar
LEFT JOIN validation_reviews vr ON ar.verified_by_validation_id = vr.id
WHERE ar.task_id = 'task-123';
-- Count results by status
SELECT verification_status, COUNT(*)
FROM agent_results
GROUP BY verification_status;
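The same counts can be scripted from Python with the standard library; the database file name here is an assumption about the deployment:
import sqlite3

conn = sqlite3.connect("hephaestus.db")  # path is an assumption
for status, count in conn.execute(
    "SELECT verification_status, COUNT(*) FROM agent_results "
    "GROUP BY verification_status"
):
    print(f"{status}: {count}")
conn.close()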
End-to-End Examples
Example 1: Simple Implementation Task
# Agent completes implementation
async def complete_implementation_task(task_id, agent_id):
    # 1. Do the work
    implement_feature()
    run_tests()
    # 2. Create report
    report = """# Task Results: Implement User Authentication
## Summary
Implemented JWT-based authentication with refresh tokens.
## Detailed Achievements
- [x] JWT token generation
- [x] Token validation middleware
- [x] Refresh token mechanism
- [x] 15 unit tests passing
## Artifacts Created
| File Path | Type | Description |
|-----------|------|-------------|
| src/auth.py | Python | Authentication logic |
| tests/test_auth.py | Python | Unit tests |
## Validation Evidence
Ran 15 tests in 0.5s OK
"""
    # 3. Write to file
    report_path = f"/tmp/results_{task_id}.md"
    with open(report_path, 'w') as f:
        f.write(report)
    # 4. Submit results
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:8000/report_results",
            json={
                "task_id": task_id,
                "markdown_file_path": report_path,
                "result_type": "implementation",
                "summary": "Implemented JWT authentication"
            },
            headers={"X-Agent-ID": agent_id}
        )
        result_id = response.json()["result_id"]
        print(f"Results reported: {result_id}")
    # 5. Mark task complete
    await update_task_status(task_id, "done")
Example 2: Multiple Results for Analysis Task
async def report_analysis_results(task_id, agent_id):
    # First result: Initial analysis
    analysis_report = """# Task Results: Performance Analysis
## Summary
Analyzed application performance and identified bottlenecks.
## Detailed Achievements
- Profiled application under load
- Identified N+1 query problems
- Found memory leaks in cache layer
"""
    with open("/tmp/analysis.md", 'w') as f:
        f.write(analysis_report)
    await submit_result(task_id, "/tmp/analysis.md",
                       "analysis", "Performance analysis complete")
    # Second result: Optimization recommendations
    recommendations = """# Additional Results: Optimization Plan
## Summary
Detailed plan for addressing performance issues.
## Recommendations
1. Implement query batching
2. Add Redis caching layer
3. Optimize database indices
"""
    with open("/tmp/recommendations.md", 'w') as f:
        f.write(recommendations)
    await submit_result(task_id, "/tmp/recommendations.md",
                       "design", "Optimization plan created")
Example 3: Fix with Validation
async def fix_bug_with_validation(task_id, agent_id):
    # 1. Fix the bug
    apply_fix()
    # 2. Create detailed report with evidence
    report = """# Task Results: Fix Memory Leak in Cache Manager
## Summary
Fixed memory leak by implementing proper cleanup in cache eviction.
## Detailed Achievements
- Identified leak using memory profiler
- Implemented cleanup callbacks
- Added unit tests for cleanup
- Verified fix under load
## Validation Evidence
### Before Fix
Memory usage: 2.5GB after 1 hour
Growth rate: 50MB/minute
### After Fix
Memory usage: 500MB after 1 hour
Growth rate: 0MB/minute (stable)
## Test Results
pytest tests/test_cache.py::test_cleanup
===== 5 passed in 0.3s =====
"""
    # 3. Save and submit
    with open("/tmp/fix_results.md", 'w') as f:
        f.write(report)
    result = await submit_result(task_id, "/tmp/fix_results.md",
                                "fix", "Memory leak fixed")
    # 4. Mark task done (triggers validation if enabled)
    await update_task_status(task_id, "done")
    # 5. Result will be verified during validation
    print(f"Result {result['result_id']} awaiting verification")
Configuration Options
Current Implementation
The Results Reporting System uses hard-coded configuration values implemented in the validation layer:
File Size Limit:
- Maximum file size: 100 KB (hard-coded in src/services/validation_helpers.py)
- Enforced by the validate_file_size() function
File Format:
- Allowed format: Markdown (.md) only (hard-coded in src/services/validation_helpers.py)
- Enforced by the validate_markdown_format() function
Security Validation:
- Path traversal protection enabled (checks for ".." in paths)
- Task ownership verification required
- Results are immutable after creation
Database Indexing
Indexing follows from the column definitions in the AgentResult model:
# src/core/database.py - AgentResult model
class AgentResult(Base):
    __tablename__ = "agent_results"
    id = Column(String, primary_key=True)
    agent_id = Column(String, ForeignKey("agents.id"), nullable=False)
    task_id = Column(String, ForeignKey("tasks.id"), nullable=False)
    # ... other fields
Only the primary key (id) gets an index implicitly. SQLAlchemy does not automatically index foreign key columns (agent_id, task_id, verified_by_validation_id), so those need explicit index declarations if the lookups become hot.
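If, say, lookups by task warrant it, an explicit index can be requested on the column; this is a sketch of the SQLAlchemy idiom, not current Hephaestus code:
from sqlalchemy import Column, ForeignKey, String

# index=True makes SQLAlchemy emit CREATE INDEX for this column on create_all
task_id = Column(String, ForeignKey("tasks.id"), nullable=False, index=True)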
Future Configuration Enhancements (Planned)
The following configuration options are planned for future releases:
# Planned environment variables (not yet implemented)
HEPHAESTUS_MAX_RESULT_SIZE_KB=100
HEPHAESTUS_RESULTS_DIR=/var/hephaestus/results
HEPHAESTUS_ARCHIVE_RESULTS=true
HEPHAESTUS_ARCHIVE_RETENTION_DAYS=30
# Planned configuration class (not yet implemented)
class ResultsConfig:
    max_file_size_kb: int = 100
    allowed_file_extensions: List[str] = [".md"]
    results_directory: str = "/tmp"
    enable_archival: bool = False
    archive_retention_days: int = 30
    enable_duplicate_detection: bool = True
    duplicate_threshold: float = 0.95
Performance Considerations
Optimization Tips
- Batch Result Queries: Fetch all results for a task in one query (see the sketch after this list)
- Cache Markdown Content: Store rendered HTML to avoid re-parsing
- Index Frequently Queried Fields: Add database indices as needed
- Compress Old Results: Archive and compress results after 30 days
- Limit Result Size: Enforce 100KB limit to prevent storage issues
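For the first tip, a single query per task might look like the following hedged sketch; it assumes the AgentResult model shown earlier and a SQLAlchemy 1.4+ session:
from sqlalchemy import select

def get_results_for_task(session, task_id):
    # One round-trip fetches every result for the task
    return session.execute(
        select(AgentResult).where(AgentResult.task_id == task_id)
    ).scalars().all()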
Monitoring Metrics
# Key metrics to track
metrics = {
    "results_per_task": "Average number of results per task",
    "result_size_avg": "Average result file size",
    "verification_rate": "Percentage of verified results",
    "submission_latency": "Time to store result",
    "retrieval_latency": "Time to fetch results",
}
Security Considerations
Path Traversal Protection
import os

# Illustrative: ALLOWED_DIRS must be a tuple of permitted path prefixes
ALLOWED_DIRS = ("/tmp",)

def validate_file_path(file_path: str) -> None:
    """Prevent directory traversal attacks."""
    if ".." in file_path:
        raise ValueError("Directory traversal detected")
    # Resolve to absolute path
    abs_path = os.path.abspath(file_path)
    # Ensure within allowed directories (str.startswith accepts a tuple)
    if not abs_path.startswith(ALLOWED_DIRS):
        raise ValueError("File outside allowed directories")
Input Sanitization
- Validate all file paths
- Check file extensions
- Limit file sizes
- Sanitize markdown content
- Validate task ownership
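For the "sanitize markdown content" item above, one conservative approach is to escape raw HTML before rendering; this standard-library sketch is an assumption, not the shipped behavior:
import html

def sanitize_markdown(content: str) -> str:
    # Escape embedded HTML so tags render as text, not markup
    return html.escape(content, quote=False)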
Access Control
- Agents can only report results for their tasks
- Results are immutable after creation
- Verification limited to validator agents
- Read access controlled by task visibility
Support and Resources
Documentation
- Git Worktree Isolation System
- Validation Agent System
- MCP Server Documentation (see main codebase)
Code Examples
- See the End-to-End Examples section above
Troubleshooting
- Check logs in the console output or agent tmux sessions
- Enable debug mode with LOG_LEVEL=DEBUG
- Query database directly to inspect results:
SELECT * FROM agent_results WHERE task_id = 'your-task-id';
- Use the frontend dashboard to view results and validation status
Document Version: 1.0.0
Last Updated: January 2024
System Version: Hephaestus Results Reporting System v1.0.0