Processing Schema - Pipeline Orchestration

Access: Service role only (workers and admin)

processing_jobs Table

Manages the job pipeline for data processing tasks such as basic discovery and document processing. It tracks job status, worker assignments, and workflow progression, and it is the core table of the stateless worker architecture.
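
As a rough illustration of how a stateless worker might pick up work from this table, the sketch below models a few of the columns in memory and claims the next due job. The `Job` class and the `claim_next_job` helper are hypothetical names, and the rule that a job with no `run_at` is due immediately is an assumption, not something the schema states.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Job:
    """In-memory stand-in for a few processing_jobs columns."""
    id: str
    type: str
    status: str = "queued"  # queued, running, completed, failed
    worker_id: Optional[str] = None
    run_at: Optional[datetime] = None
    started_at: Optional[datetime] = None


def claim_next_job(jobs: list[Job], worker_id: str, now: datetime) -> Optional[Job]:
    """Claim the earliest due queued job for worker_id (hypothetical helper)."""
    # Assumption: a job with no run_at is due immediately.
    due = [j for j in jobs
           if j.status == "queued" and (j.run_at is None or j.run_at <= now)]
    if not due:
        return None
    job = min(due, key=lambda j: j.run_at or now)
    job.status = "running"
    job.worker_id = worker_id
    job.started_at = now
    return job


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
jobs = [
    Job(id="a", type="ocr", run_at=now + timedelta(minutes=5)),  # delayed, not yet due
    Job(id="b", type="chunking"),
]
claimed = claim_next_job(jobs, "worker-1", now)
print(claimed.id, claimed.status)  # b running
```

This version ignores concurrency. Against a real database the claim would need to be atomic so that two workers cannot take the same row; if the table lives in PostgreSQL, as the `jsonb` columns suggest, a row-locking query such as `SELECT ... FOR UPDATE SKIP LOCKED` is one common way to do that.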

Field Definitions

| Field | Type | Description | Enum Values |
| --- | --- | --- | --- |
| id | uuid | Primary key for the job | - |
| type | text | The type of job to be processed | basic_discovery, xml_parsing, document_collection, ocr, chunking |
| status | processing_job_status_enum | Current processing stage of the job | queued, running, completed, failed |
| entity_id | uuid | Foreign key to the entities table, if applicable | - |
| data | jsonb | Input data required for the job to run | - |
| result | jsonb | Output or result data from a completed job. For successful completion, it will contain {'data': '...'}. For failures, it will contain {'error': '...'}. | - |
| worker_id | text | ID of the worker currently processing the job | - |
| retry_count | integer | Number of times the job has been retried | - |
| max_retries | integer | Maximum number of retries allowed for the job | - |
| run_at | timestamp | Timestamp of when the job should be executed (used for delayed retries) | - |
| created_at | timestamp | Timestamp of when the job was created | - |
| started_at | timestamp | Timestamp of when the job processing started | - |
| completed_at | timestamp | Timestamp of when the job completed | - |
| search_id | uuid | Foreign key to a searches record for basic_discovery jobs | - |
| search_term | text | The search term used for basic_discovery jobs | - |
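
The interplay between `result`, `retry_count`, `max_retries`, and `run_at` can be sketched as a single state transition applied after each processing attempt. The `finish_job` function is a hypothetical helper, and the exponential backoff, the clearing of `worker_id` on requeue, and the setting of `completed_at` on final failure are assumptions chosen for illustration; the schema itself only says that `run_at` is used for delayed retries.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def finish_job(job: dict[str, Any], now: datetime,
               output: Any = None, error: Optional[str] = None,
               base_delay: timedelta = timedelta(seconds=30)) -> dict[str, Any]:
    """Return an updated copy of a job row after one processing attempt."""
    updated = dict(job)
    if error is None:
        # Success: result carries the output under the 'data' key.
        updated.update(status="completed", result={"data": output}, completed_at=now)
    elif job["retry_count"] < job["max_retries"]:
        retries = job["retry_count"] + 1
        # Assumed policy: exponential backoff, doubling the delay per retry.
        updated.update(status="queued", retry_count=retries, worker_id=None,
                       run_at=now + base_delay * 2 ** (retries - 1))
    else:
        # Retries exhausted: result carries the message under the 'error' key.
        updated.update(status="failed", result={"error": error}, completed_at=now)
    return updated


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
job = {"id": "j1", "type": "ocr", "status": "running", "retry_count": 0,
       "max_retries": 1, "worker_id": "worker-1", "run_at": None, "result": None}

retried = finish_job(job, now, error="api_timeout")
print(retried["status"], retried["run_at"] - now)  # queued 0:00:30

failed = finish_job(retried, now, error="api_timeout")
print(failed["status"], failed["result"])  # failed {'error': 'api_timeout'}
```

Returning a copy rather than mutating the input keeps the transition easy to test; a real worker would apply the same field changes in one database update.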

job_step_history Table

Provides a complete audit trail of each job's progression through its workflow steps. It records when each step started, completed, failed, or was retried, which supports debugging, performance analysis, and system monitoring.

Field Definitions

| Field | Type | Description | Enum Values |
| --- | --- | --- | --- |
| id | uuid | Primary key for the job step history record | - |
| job_id | uuid | Foreign key to the processing_jobs table | - |
| step_type | text | The type of step in the job workflow | - |
| status | text | Current status of the job step | - |
| attempt_number | integer | The retry attempt number for the step | - |
| error_message | text | The descriptive error message if the step failed. | - |
| error_type | text | A broad categorization of the error (e.g., api_error, validation_error). | - |
| error_source | text | The origin of the error (e.g., worker, rpc_function). | - |
| error_sub_type | text | A specific, machine-readable code for the error (e.g., api_timeout, job_not_found). | - |
| error_severity | integer | Indicates the error's impact (1 low - 5 very high). | - |
| worker_id | text | ID of the worker that processed this step | - |
| created_at | timestamp | Timestamp of when the job step record was created | - |
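
To show how the error columns fit together, here is a minimal sketch of a history record with a range check on `error_severity` and a small filter for monitoring. The `JobStepRecord` class and `severe_errors` helper are hypothetical, and the `step_type` and `status` values used in the example (`ocr`, `started`, `failed`) are illustrative assumptions, since the schema lists no enum values for those columns.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class JobStepRecord:
    """In-memory stand-in for one job_step_history row."""
    job_id: str
    step_type: str
    status: str
    attempt_number: int
    worker_id: str
    error_message: Optional[str] = None
    error_type: Optional[str] = None      # e.g. api_error, validation_error
    error_source: Optional[str] = None    # e.g. worker, rpc_function
    error_sub_type: Optional[str] = None  # e.g. api_timeout, job_not_found
    error_severity: Optional[int] = None  # 1 (low) to 5 (very high)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        # Enforce the documented 1-5 severity range when a severity is given.
        if self.error_severity is not None and not 1 <= self.error_severity <= 5:
            raise ValueError("error_severity must be between 1 and 5")


def severe_errors(history: list[JobStepRecord], min_severity: int = 4) -> list[JobStepRecord]:
    """Return records whose severity is at or above min_severity."""
    return [r for r in history if (r.error_severity or 0) >= min_severity]


history = [
    JobStepRecord("j1", "ocr", "started", 1, "worker-1"),
    JobStepRecord("j1", "ocr", "failed", 1, "worker-1",
                  error_message="Upstream API timed out", error_type="api_error",
                  error_source="worker", error_sub_type="api_timeout",
                  error_severity=4),
]
print([r.error_sub_type for r in severe_errors(history)])  # ['api_timeout']
```

Because each step event is a separate row, the table stays append-only, and filters like this one can run over the history without touching the job row itself.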