Restarting, Resubmitting, and Retrying Failed Jobs

When a Job or Sequence Job fails in JAMS, it can be restarted or resubmitted manually, configured to recover automatically, or configured to ignore the failure and continue on its preset recurrence schedule.

Manually Restart or Resubmit Failed Jobs

Failed Jobs in JAMS can be manually Restarted or manually Resubmitted.

Restarting a Job will release an existing Job instance to run again, and will preserve existing Parameters associated with that Job instance. Most users will wish to Restart their failed Jobs to preserve Parameters. 

Resubmitting a Job will create a new Job instance (with a new Entry ID number), and will not preserve the Parameters of the failed Job. This method is useful in cases where a Job failed due to invalid Parameters, as it allows the user to set new Parameters on the resubmitted Job.
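
The difference between the two operations can be pictured with a small Python model. This is only an illustrative sketch, not the JAMS API: the Entry class, the restart and resubmit functions, the state names, and the starting Entry ID are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical Entry ID counter; real Entry IDs are assigned by JAMS.
_entry_ids = count(1001)


@dataclass
class Entry:
    """Toy stand-in for a Job instance shown in the Monitor."""
    job_name: str
    parameters: dict
    entry_id: int = field(default_factory=lambda: next(_entry_ids))
    state: str = "Failed"


def restart(entry):
    """Release the same instance to run again: Entry ID and Parameters are kept."""
    entry.state = "Executing"
    return entry


def resubmit(entry, new_parameters):
    """Create a new instance with a new Entry ID and freshly supplied Parameters."""
    return Entry(job_name=entry.job_name,
                 parameters=dict(new_parameters),
                 state="Executing")
```

In this model, restart returns the same object with its original entry_id and parameters, while resubmit returns a separate Entry whose parameters are whatever the caller passes in.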

Manually Restart a failed Job:

  1. Right-Click on the failed Job from within the Monitor.
  2. Select Release from the menu that appears.
    V7ReleaseFailedJob.png
  3. Ensure Release to run again is checked, and set other options as needed.
    V7ReleaseFailedJobDialog.png
  4. Click OK. The existing Job instance will be restarted, preserving any existing Parameters.

Manually Restart a Failed Job within a Sequence

Jobs within Sequences may halt their Sequence upon failure.

If the released Job exists within a Sequence, ensure the Sequence is also released from its halted state.

  1. Right-Click on the failed Job from within the Monitor.
  2. Select Release from the menu that appears.
    V7ReleaseFailedJobinSequence.png
  3. Ensure Release to run again is checked, and set other options as needed.
    V7ReleaseFailedJobinSequenceDialog.png
  4. Click OK. The released Job will begin executing.

Manually Resubmit a failed Job:

  1. Select the Job from the Monitor or History view.
  2. Click the Submit button on the ribbon bar.
    V7ResubmitFailedJob.png
  3. Enter all applicable information and set Parameters as necessary in the Submit dialog.
  4. Click Submit Run Request. The new Job instance will be created. 

Recovery Options for Jobs

Using JAMS Recovery options, a Job can be configured to automatically retry on a failure. Like restarting a Job, a Job Recovery will re-run an existing Job instance, preserving any existing Parameters on the Job.

If a Job with Recovery options inside of a Sequence or Workflow fails, the Job will automatically execute its retry options within the Sequence or Workflow.

Settings to know

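Minimum Completion Severity is the severity threshold at which action should be taken on the Job. In the example later in this section, the Job is configured to retry after a completion of Error or worse.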

Retry Count is the maximum number of retries if the Job fails. For instance, a Job with a Retry Count of 3 would attempt to execute a total of 4 times before ultimately failing: once from the initial submission, then three retry attempts.
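
The attempt arithmetic can be checked with a short Python sketch. It only models the counting described above; run_with_retries and the job callable are hypothetical names, not part of JAMS.

```python
def run_with_retries(job, retry_count):
    """Run job() once, then up to retry_count more times while it keeps failing.

    job is any callable returning True on success and False on failure.
    Returns a (succeeded, attempts) tuple.
    """
    attempts = 0
    # One initial submission plus retry_count retries.
    for _ in range(1 + retry_count):
        attempts += 1
        if job():
            return True, attempts
    return False, attempts
```

With a job that always fails and a retry_count of 3, the function reports four attempts, matching the example above.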

Retry Interval is the wait time interval between attempts to run the Job. This is configured as dd.HH:MM:SS. Note that the interval is measured between the end time of the previous failure and the Scheduled time of the retry attempt.
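
As a rough illustration of the dd.HH:MM:SS format and of how the interval is measured, the following Python sketch converts an interval string to a duration and adds it to the end time of the previous failure. The function names are hypothetical, and accepting a value without the day part (HH:MM:SS) is an assumption made for convenience, not documented JAMS behavior.

```python
from datetime import datetime, timedelta


def parse_interval(text):
    """Parse 'dd.HH:MM:SS' into a timedelta.

    Assumption: a value with no day part ('HH:MM:SS') is also accepted.
    """
    days = 0
    if "." in text:
        day_part, text = text.split(".", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def next_retry_time(previous_failure_end, interval_text):
    """Scheduled time of the retry: end of the previous failure plus the interval."""
    return previous_failure_end + parse_interval(interval_text)
```

For example, a failure that ends at 12:00:30 with an interval of 00.00:01:00 would give a retry scheduled for 12:01:30 in this model.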

Elements to know

Recovery Job Elements allow users to respond to a failure by running a secondary Job. For some, this may be a cleanup job to deal with the failure event. Note that users should NEVER attempt to configure a Job as its own recovery Job, as this could cause an endless loop of failed Jobs. If the Job should execute again on failure, set a Retry Count.

Documentation Elements contain the instructions that are included in the notification email sent on a Job failure. 

Configure Retry Properties at the Job Level

  1. Open the Job Properties dialog for the Job in question, then select the Properties tab.
  2. In the Completion section, adjust or add the acceptable Completion Severity. In the Retry section, adjust or add the Retry Count and Retry Interval options as desired. In the example below, the Job is configured to Retry 3 times after Error or worse, with an interval of one minute between retries.
    V7RecoveryOptions.png
  3. Save and Close the Job. The Job will now retry in the event of a failure.

Recovery Options for Sequences

Sequences are defined as Jobs, so every recovery option available for an individual Job is available on a Sequence Job, including Minimum Completion Severity, Retry Count, Retry Interval, Recovery Job, and Documentation.

Jobs within a Sequence will automatically execute their recovery options, but a failed Job within a Sequence will not cause the Sequence itself to fail unless a FailureAction task is configured. 

 

Configure Retry Options on a Sequence

  1. Open the Sequence Properties dialog, then select the Properties tab.
  2. Adjust the Retry Count and Retry Interval options for the Sequence as desired.
    V7SequenceRetryProperties.png
  3. Select the Source tab, then add a Failure Action where Job failures should cause the Sequence to Fail.
    V7SequenceFailureAction.png
  4. Save and Close the Sequence.

Recovery Options for Workflows

A Workflow Job's Recovery options are the same as those of other JAMS Jobs: Minimum Completion Severity, Retry Count, and Retry Interval. Like Sequences, Workflows will not fail unless they are configured to do so.

Jobs and Sequences inside the Workflow

Jobs and Sequences inside of a Workflow exist as SubmitEntry activities. The Workflow can be configured to wait for a Job activity to complete, to wait after a Job activity fails, or to continue on in the Workflow, even before a Job completes.

To configure Job activity options, select the Job activity in question and set its Wait and WaitAfterFailure properties. When Wait is set, the Workflow will not continue past the Job activity until that activity initially completes. When WaitAfterFailure is checked, the Workflow halts if the Job activity fails. Halting the Workflow with WaitAfterFailure is a useful way to allow a Job or Sequence activity to attempt its own recovery options before the Workflow continues.

Causing a Workflow to Fail

A Workflow will not fail based on the failure of an individual Job activity unless the Completion Action for that Job activity has either a TerminateWorkflow or Throw activity configured. TerminateWorkflow and Throw activities are not limited to Completion Actions; they can also be configured in line with any other activity or logic that should cause a Workflow to fail. 

Configure Retry Options for Workflow Jobs

  1. Open the Job Properties dialog and select the Properties tab.
  2. Adjust the Retry Count and Retry Interval options as desired.
    V7WorkflowRetryProperties.png
  3. Select the Source tab to view the Workflow Editor.
  4. Ensure that Wait and WaitAfterFailure properties have been configured as needed on Job and Sequence activities.
    WorkflowSubmitEntryWaits.png
  5. Ensure that TerminateWorkflow or Throw activities have been configured for situations that should cause the entire Workflow to fail. In the image below, the Workflow will terminate if the Sleep60 Job fails.
    WorkflowTerminateWorkflowActivity.png
  6. Save and Close the Workflow Job. 

Failed JAMS Jobs with Recurrences

JAMS Jobs with recurrences will resubmit (creating a new Job instance with a unique Entry ID number) on configured intervals, until a set end time. When a recurring Job fails, it will either cease to create any further recurrences or continue normally depending on the Job's Resubmit on Error setting.

Configure Resubmit Options on a Recurring Job

  1. Open the Job Properties dialog and select the Elements tab.
  2. If a Resubmit Element already exists, edit it. Otherwise, add a new Resubmit Element.
    V7ResubmitAddElement.png
  3. In the Resubmit Element Properties, set the Resubmit options as desired. In the example below, the Job is configured to Resubmit every 90 seconds until 11:59 PM.
    V7ResubmitElement.png
  4. Set Resubmit on Error as desired. If a recurrence fails with this box unchecked, there will be no further recurrent submissions of the Job. Checking the box will cause the recurrent submissions to continue after a failure is observed.
  5. Save and Close the Job.

NOTE: If no Until: value is specified, the Job will resubmit until midnight.
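
The effect of the Resubmit on Error setting and the Until value can be sketched in Python. This is a simplified model under stated assumptions, not JAMS's scheduling logic: recurrence_times and its parameters are hypothetical, each recurrence is assumed to be scheduled at a fixed interval from the previous scheduled time, and a run landing exactly on the Until time is assumed to be included.

```python
from datetime import datetime, timedelta


def recurrence_times(start, interval, until=None, failures=(), resubmit_on_error=False):
    """Return the scheduled times of a recurring Job in this simplified model.

    failures lists the scheduled times whose runs fail. When resubmit_on_error
    is False, the first failure ends the recurrence.
    """
    if until is None:
        # Per the NOTE above: with no Until value, resubmit until midnight.
        until = (start + timedelta(days=1)).replace(
            hour=0, minute=0, second=0, microsecond=0)
    times = []
    current = start
    while current <= until:  # assumption: the Until time itself is inclusive
        times.append(current)
        if current in failures and not resubmit_on_error:
            break  # unchecked box: no further recurrent submissions
        current += interval
    return times
```

Calling the function with a 90-second interval, an until time of 11:59 PM, and one failed run shows the schedule stopping at the failure when resubmit_on_error is False and continuing to the Until time when it is True.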
