Processing Schema - Pipeline Orchestration
Access: Service role only (workers and admin)
processing_jobs Table
Manages the job pipeline for data processing tasks such as basic discovery and document processing. It tracks job status, worker assignment, and workflow progression, and is the core table of the stateless worker architecture.
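The status values listed for processing_job_status_enum suggest a small state machine. The sketch below is illustrative only: the schema names the four statuses but does not define which transitions are legal, so the `ALLOWED_TRANSITIONS` map and the `can_transition` helper are assumptions, including the idea that a retried job moves from running back to queued.

```python
from enum import Enum


class JobStatus(str, Enum):
    """Mirrors the documented values of processing_job_status_enum."""
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# ASSUMPTION: the schema does not spell out legal transitions.
# This map is one plausible reading of the documented statuses.
ALLOWED_TRANSITIONS = {
    JobStatus.QUEUED: {JobStatus.RUNNING},
    # running -> queued models a job released for a delayed retry
    JobStatus.RUNNING: {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.QUEUED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def can_transition(current: JobStatus, target: JobStatus) -> bool:
    """Return True if moving from `current` to `target` is allowed by the map."""
    return target in ALLOWED_TRANSITIONS[current]
```

A worker could call `can_transition` before updating the status column, so that a completed job is never picked up again by mistake.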
Field Definitions
| Field | Type | Description | Enum Values |
|---|---|---|---|
| id | uuid | Primary key for the job | - |
| type | text | The type of job to be processed | basic_discovery, xml_parsing, document_collection, ocr, chunking |
| status | processing_job_status_enum | Current processing stage of the job | queued, running, completed, failed |
| entity_id | uuid | Foreign key to the entities table, if applicable | - |
| data | jsonb | Input data required for the job to run | - |
| result | jsonb | Output or result data from a completed job. For successful completion, it will contain {'data': '...'}. For failures, it will contain {'error': '...'}. | - |
| worker_id | text | ID of the worker currently processing the job | - |
| retry_count | integer | Number of times the job has been retried | - |
| max_retries | integer | Maximum number of retries allowed for the job | - |
| run_at | timestamp | Timestamp of when the job should be executed (used for delayed retries) | - |
| created_at | timestamp | Timestamp of when the job was created | - |
| started_at | timestamp | Timestamp of when the job processing started | - |
| completed_at | timestamp | Timestamp of when the job completed | - |
| search_id | uuid | Foreign key to a searches record for basic_discovery jobs | - |
| search_term | text | The search term used for basic_discovery jobs | - |
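The retry_count, max_retries, run_at, and result fields together imply a delayed-retry flow. The following is a minimal in-memory sketch of how a worker might handle a failure, not the actual implementation: the `Job` dataclass, the `is_due` and `record_failure` helpers, the exponential backoff policy, the 30-second base delay, and the clearing of worker_id on release are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Job:
    """Subset of processing_jobs columns relevant to retries."""
    status: str
    retry_count: int
    max_retries: int
    run_at: datetime
    result: Optional[dict] = None
    worker_id: Optional[str] = None


def is_due(job: Job, now: datetime) -> bool:
    """A job is eligible for pickup when it is queued and run_at has passed."""
    return job.status == "queued" and job.run_at <= now


def record_failure(job: Job, error: str, now: datetime,
                   base_delay_seconds: int = 30) -> Job:
    """Requeue the job with a delay, or mark it failed once retries run out.

    ASSUMPTION: exponential backoff (30s, 60s, 120s, ...) is a hypothetical
    policy; the schema only says run_at is used for delayed retries.
    """
    job.worker_id = None  # ASSUMPTION: the worker releases the job on failure
    if job.retry_count < job.max_retries:
        job.retry_count += 1
        delay = base_delay_seconds * 2 ** (job.retry_count - 1)
        job.status = "queued"
        job.run_at = now + timedelta(seconds=delay)
    else:
        job.status = "failed"
        job.result = {"error": error}  # documented failure shape
    return job
```

For example, a job with max_retries of 1 is requeued on its first failure with run_at pushed 30 seconds into the future, and is marked failed with an {'error': '...'} result on the second.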
job_step_history Table
Complete audit trail of job progression through workflow steps. It records when each step started, completed, failed, or was retried, enabling debugging, performance analysis, and system monitoring.
Field Definitions
| Field | Type | Description | Enum Values |
|---|---|---|---|
| id | uuid | Primary key for the job step history record | - |
| job_id | uuid | Foreign key to the processing_jobs table | - |
| step_type | text | The type of step in the job workflow | - |
| status | text | Current status of the job step | - |
| attempt_number | integer | The retry attempt number for the step | - |
| error_message | text | The descriptive error message if the step failed | - |
| error_type | text | A broad categorization of the error (e.g., api_error, validation_error) | - |
| error_source | text | The origin of the error (e.g., worker, rpc_function) | - |
| error_sub_type | text | A specific, machine-readable code for the error (e.g., api_timeout, job_not_found) | - |
| error_severity | integer | Indicates the error's impact, from 1 (low) to 5 (very high) | - |
| worker_id | text | ID of the worker that processed this step | - |
| created_at | timestamp | Timestamp of when the job step record was created | - |
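As an illustration of how the error fields fit together, the sketch below builds a job_step_history row as a plain dictionary and checks that error_severity stays within the documented 1 to 5 range. The `build_step_record` helper and the keys of its `error` argument are hypothetical, and the schema does not enumerate step_type or status values, so the ones used in the usage note are examples only.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_step_record(job_id: str, step_type: str, status: str,
                      attempt_number: int, worker_id: str,
                      error: Optional[dict] = None) -> dict:
    """Assemble one job_step_history row; error fields stay None on success.

    ASSUMPTION: `error` is a dict with the keys message, type, source,
    sub_type, and severity. This shape is hypothetical.
    """
    error = error or {}
    severity = error.get("severity")
    if severity is not None and not 1 <= severity <= 5:
        raise ValueError("error_severity must be between 1 and 5")
    return {
        "id": str(uuid.uuid4()),
        "job_id": job_id,
        "step_type": step_type,
        "status": status,
        "attempt_number": attempt_number,
        "error_message": error.get("message"),
        "error_type": error.get("type"),
        "error_source": error.get("source"),
        "error_sub_type": error.get("sub_type"),
        "error_severity": severity,
        "worker_id": worker_id,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
```

A failed OCR attempt might be recorded with `build_step_record(job_id, "ocr", "failed", 2, "worker-a", error={"message": "upstream timed out", "type": "api_error", "source": "worker", "sub_type": "api_timeout", "severity": 3})`. Keeping error_type broad and error_sub_type specific lets monitoring group errors coarsely while still alerting on exact codes.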