Custom Celery Workflows
The following diagram represents the setup of a typical Celery application which integrates with Riberry.
Custom Components
Entry Point
The bridge between a Celery application and Riberry is the entry point task. This is the only requirement for integration; everything else is either optional (streams, artifacts), handled by Celery (job execution status, error capturing) or exists outside of our Celery app (Riberry's background workers).
A Riberry Celery application should have at least one entry point which will receive the job's input data. The entry point itself is bound to one application form. Once the entry point begins execution, it will mark that job's execution as ACTIVE
.
Streams
Streams are an optional but recommended feature of Riberry which is used to track a logical grouping of tasks. They are most valuable when we have a large group
of chains
, and we want to track and quantify how many of these chains we've processed.
Events
All custom tasks can yield events. If a task fails, an error artifact event is created. If a stream begins or ends, a stream state event is created. When a workflow begins or completes, a notify event is created.
Events themselves are just tasks which are fed to the Events Worker.
Artifacts
Artifacts are nothing more than binary streams. They can be plain text, images, documents, videos or any binary-based format. Artifacts are stored as blobs in the database, and can be downloaded via Riberry's web application.
If a fatal error occurs during the execution of our workflow, an error artifact will be created, with the contents of that artifact being the full Python exception stack trace.
Custom Tasks
These are our plain-old Celery tasks with added behaviour. These tasks will be automatically revoked if our workflow's job execution has failed and all errors raised by the our custom Celery tasks will also be intercepted and stored as error artifacts.
Core Components
Riberry Tasks Scheduler
This is a stock-standard Celery beat instance which can schedule both Riberry and custom tasks. When we integrate with Riberry, some Riberry core task schedules are injected into the beat configuration.
Execution Polling Worker
This is a core Riberry Celery worker running within our application which is listening on a core queue. The Riberry Tasks Scheduler will yield core tasks onto the core queue for the core worker to consume. This is pretty much a black box and cannot be customized at the present time.
It is required for jobs to be executed.
Poll task
This tasks checks to see if there are any jobs for the current instance in the RECEIVED
state. If so, it will pick up the job, find the related entry point task and queue it with the job's input values.
Once queued, the job's execution is marked as READY
.
Heartbeat task
The heartbeat task is responsible for updating the HEARTBEAT_APP_INSTANCE
periodically with the current timestamp. This timestamp is used to monitor when the application was last alive.
If the last heartbeat is behind by a certain amount, then Riberry makes the assumption that the application is offline.
Dynamic-Parameters task
The dynamic-parameters task is used to deliver the application's parameter schedules to the application itself. For example, if the active
parameter is "N"
, it will trigger the appropriate handler with this value each time the task is executed. It is therefore recommended that our dynamic-parameter handlers are idempotent.
Events Worker
This is another core Riberry Celery worker running within our custom application. It has it's own core queue which is used exclusively for events processing. The events which are created by our custom application will be received by this worker, who will convert the event into an entry in the EVENT
table.
It is required for generating streams, artifacts and notifications. Without this worker, our custom Celery application will continue to process, but we won't be able to track its progress.