Worker Design Philosophy
Core Principle: Database as Single Source of Truth
Workers are designed as stateless executors that rely entirely on the database for state management, workflow decisions, and coordination. This approach ensures reliability, observability, and recovery capabilities in our distributed processing system.
What is a Worker?
All workers in our system follow the same foundational principles:
Stateless Execution
- No memory between operations - Workers don't retain context from previous executions
- No shared state - Workers don't communicate with each other directly
- Database dependency - All state and decisions come from database queries
- Crash resilient - Workers can be killed and restarted without losing anything
Database-Driven Architecture
- Single source of truth - Database contains all processing state and business rules
- Atomic operations - All database interactions are transactional
- Complete audit trail - Every action is recorded for debugging and monitoring
- Decision delegation - Workers ask the database "what should I do?" rather than deciding for themselves
Core Worker Pattern
Every worker operation follows this pattern:
- Query database for work or state information
- Execute business logic based on database-provided context
- Update database with results and new state
- Create next job if part of multi-step workflow
Unified Worker Architecture
All workers in our system follow the same foundational pattern - they are stateless job processors that use the database as their single source of truth. Workers are differentiated by the job type they process, not by their execution pattern.
Core Worker Principles
- Stateless execution - No memory between job executions
- Database-driven - All state and decisions come from database queries
- Job queue based - All work flows through the `processing_jobs` table
- Crash resilient - Workers can be killed and restarted without losing anything
- Type-specific - Each worker only processes jobs of its designated type
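For concreteness in the examples below, here is a minimal sketch of what the `processing_jobs` table could look like. The document doesn't specify a schema, so every column beyond `type` and `status` is an illustrative assumption (PostgreSQL syntax):

```sql
-- Minimal sketch of the processing_jobs table; the exact column set is
-- an assumption for illustration, not a documented schema.
CREATE TABLE processing_jobs (
    id           bigserial PRIMARY KEY,
    type         text NOT NULL,                 -- e.g. 'ocr', 'basic_discovery'
    status       text NOT NULL DEFAULT 'queued',-- queued | running | completed | failed
    payload      jsonb,                         -- input for the job
    result       jsonb,                         -- output written on completion
    error        text,                          -- failure message, if any
    worker_id    text,                          -- which worker claimed the job
    created_at   timestamptz NOT NULL DEFAULT now(),
    started_at   timestamptz,
    completed_at timestamptz
);
```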
Universal Job Processing Pattern
Every worker follows this identical 5-step pattern for every job:
"Give me a job to do" (poll database)
- Query database for available work:
SELECT * FROM processing_jobs WHERE type = 'my_job_type' AND status = 'queued' - Use atomic operations to claim jobs and prevent race conditions
- Natural load balancing - idle workers automatically pick up more work
- Query database for available work:
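One way to make the claim atomic in a single statement, assuming PostgreSQL and the sketch schema above; note that this folds the next step's status update into the claim itself:

```sql
-- Single-statement atomic claim (PostgreSQL): take the oldest queued
-- job of this type and mark it running in one step. SKIP LOCKED lets
-- concurrent workers pass over a row another worker is claiming.
UPDATE processing_jobs
SET status = 'running',
    worker_id = 'worker-42',   -- hypothetical worker identifier
    started_at = now()
WHERE id = (
    SELECT id
    FROM processing_jobs
    WHERE type = 'my_job_type'
      AND status = 'queued'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING *;
```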
"I'm starting this job" (update status to running)
- Immediately mark job as
runningwith timestamp - Record worker ID for tracking and debugging
- Establishes clear ownership and prevents duplicate processing
- Immediately mark job as
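If polling and claiming are done as separate statements instead, the transition can still be made race-free with a guarded update; `worker_id` and `started_at` are illustrative columns from the sketch schema:

```sql
-- Portable alternative: a guarded status transition. If another worker
-- claimed the job first, this updates zero rows and the worker polls
-- again; checking the affected-row count prevents duplicate processing.
UPDATE processing_jobs
SET status = 'running',
    worker_id = 'worker-42',   -- hypothetical worker identifier
    started_at = now()
WHERE id = 123                 -- id found by the step-1 poll
  AND status = 'queued';
```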
3. Do the actual work (business logic execution)
- Execute the business logic for this specific job type
- Handle all necessary external API calls, file operations, etc.
- Maintain focus on the single task at hand
"I finished, here's the result" (update status and result)
- Update job status to
completedorfailed - Store all results, error messages, and metadata in database
- Provide complete audit trail of what happened
- Update job status to
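A sketch of the completion update, again using the illustrative columns from the schema above:

```sql
-- Record the outcome and result metadata; a 'failed' job would set the
-- error column instead. The result payload shown is hypothetical.
UPDATE processing_jobs
SET status = 'completed',
    result = '{"pages": 12}'::jsonb,
    completed_at = now()
WHERE id = 123;
```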
"Create next job if needed" (multi-step workflows)
- For multi-step workflows, create the next job in the sequence
- Example:
document_collectionworker createsocrjob when complete - Single-step jobs (like
basic_discovery) don't create follow-up jobs
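The follow-up job from the example above would be enqueued with a plain insert; the payload shape is hypothetical:

```sql
-- Enqueue the follow-up OCR job as the final act of the current one.
INSERT INTO processing_jobs (type, status, payload)
VALUES ('ocr', 'queued', '{"document_id": 987}'::jsonb);
```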
User Experience vs Worker Implementation
From the worker's perspective: all jobs are processed identically through the job queue system.
User experience differences are handled by the frontend:
- Wait for results: Frontend polls job status until completion (basic discovery)
- Background processing: Frontend shows "processing in background" and notifies when done (document collection, OCR, chunking)
Workers are completely unaware of how the frontend presents the job to users.
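Whatever API layer the frontend talks to, the status poll ultimately reduces to a single lookup; the job id here is hypothetical:

```sql
-- What a frontend status poll boils down to at the database, regardless
-- of the presentation mode chosen above.
SELECT status, result, error
FROM processing_jobs
WHERE id = 123;
```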
Implementation Benefits
Observability
- Complete visibility into system state through database queries
- Real-time monitoring of all job processing
- Historical analysis of performance and error patterns across all worker types
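As a sketch of what that visibility looks like in practice, a single query can give live counts and queue age per worker type and status (column names as assumed in the sketch schema above):

```sql
-- Live job counts and queue age per worker type and status
-- (an illustrative monitoring query).
SELECT type,
       status,
       count(*) AS jobs,
       now() - min(created_at) AS oldest_age
FROM processing_jobs
GROUP BY type, status
ORDER BY type, status;
```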
Recovery and Debugging
- Granular recovery - restart individual failed jobs
- Complete audit trail of all processing steps
- Manual intervention capabilities through database updates
- Failed jobs remain in queue for retry
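A manual-intervention sketch: requeuing one failed job is a single guarded update (illustrative columns as above):

```sql
-- Requeue one failed job for retry by resetting its status; the guard
-- on status = 'failed' avoids clobbering a job in another state.
UPDATE processing_jobs
SET status = 'queued',
    worker_id = NULL,
    started_at = NULL,
    error = NULL
WHERE id = 123
  AND status = 'failed';
```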
Scalability
- Linear scaling - add workers without coordination overhead
- Natural load balancing through database polling
- No inter-worker communication or state synchronization needed
- Each worker type can be scaled independently
Reliability
- Single source of truth eliminates consistency problems
- Atomic database operations ensure data integrity
- Worker failures don't corrupt system state
- Failed jobs remain in queue for retry
Trade-offs and Considerations
Performance Overhead
- Additional database round-trips between job steps
- Serialization/deserialization of intermediate results
- Acceptable cost for operational benefits
Database Load
- Constant polling creates steady database traffic
- Requires proper indexing and connection pooling
- Database becomes critical system component
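As a sketch of the kind of indexing this implies, a partial index on queued rows keeps the hot polling query cheap even as completed jobs accumulate (PostgreSQL):

```sql
-- Partial index covering only queued rows: the polling query stays
-- fast no matter how many completed/failed rows pile up.
CREATE INDEX processing_jobs_queued_idx
ON processing_jobs (type, created_at)
WHERE status = 'queued';
```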
Complexity
- Requires building job scheduling and coordination infrastructure
- More complex than simple in-memory processing
- Justified by reliability and observability requirements
This philosophy prioritizes reliability, observability, and operational simplicity over raw performance, providing a consistent pattern for all types of processing workflows.