feat: add rollback_on_failure feature #88

Merged
Philipp merged 3 commits from feature/rollback-on-failure into main 2026-05-17 13:13:59 +02:00
Owner

This PR adds the ability to download existing code before deployment and automatically roll back if the post-deployment monitor detects a failure.

This PR adds the ability to download existing code before deployment and automatically roll back if the post-deployment monitor detects a failure.
Philipp added 1 commit 2026-05-16 23:20:29 +02:00
feat: add rollback_on_failure feature
Lint / pre-commit Linting (push) Successful in 50s
Test / Run Tests (push) Successful in 1m13s
f9920db232
CODEOWNERS rules requested review from AutoReview 2026-05-16 23:20:29 +02:00
AutoReview approved these changes 2026-05-16 23:20:42 +02:00
AutoReview left a comment
Member

Summary

  • Critical: None.
  • Testing: No new test files were added to verify the rollback logic. It is recommended to add a test case to simulate a monitor failure and verify api.code.set is called with the oldCode.
  • Improvement: The check for shouldFail manually duplicates logic that is partially handled by applyOnAction. Also, if rollback_on_failure is enabled but the initial code download fails, the action continues without a safety net; consider if it should fail early if a requested rollback capability cannot be guaranteed.
  • Nitpick: Added a note to GEMINI.md regarding pre-commit hooks, which seems slightly out of scope for this feature but harmless.

The PR correctly implements the requested functionality by backing up the existing code before deployment when the rollback_on_failure flag is set, and restoring it if the console monitor detects a failure condition.

**Summary** * **Critical:** None. * **Testing:** No new test files were added to verify the rollback logic. It is recommended to add a test case to simulate a monitor failure and verify `api.code.set` is called with the `oldCode`. * **Improvement:** The check for `shouldFail` manually duplicates logic that is partially handled by `applyOnAction`. Also, if `rollback_on_failure` is enabled but the initial code download fails, the action continues without a safety net; consider if it should fail early if a requested rollback capability cannot be guaranteed. * **Nitpick:** Added a note to `GEMINI.md` regarding pre-commit hooks, which seems slightly out of scope for this feature but harmless. The PR correctly implements the requested functionality by backing up the existing code before deployment when the `rollback_on_failure` flag is set, and restoring it if the console monitor detects a failure condition.
Member

The api.auth call is now explicitly awaited outside the conditional block. This is cleaner than the previous .then() chain.

The `api.auth` call is now explicitly awaited outside the conditional block. This is cleaner than the previous `.then()` chain.
Philipp marked this conversation as resolved
@@ -183,0 +210,4 @@
`Successfully downloaded existing code (modules: ${Object.keys(oldCode).join(", ")})`,
);
} else {
core.warning(
Member

If the user explicitly requested rollback_on_failure, and the download fails, should the action continue? Currently it results in a warning. If the deployment is critical, failing here might be safer.

If the user explicitly requested `rollback_on_failure`, and the download fails, should the action continue? Currently it results in a warning. If the deployment is critical, failing here might be safer.
Philipp marked this conversation as resolved
@@ -215,6 +248,28 @@ export async function postCode() {
core.setOutput("saw_error_log", String(result.sawErrorLog));
core.setOutput("saw_warning_log", String(result.sawWarningLog));
let shouldFail = false;
Member

This block duplicates the logic found inside applyOnAction. While necessary to determine if a rollback should occur before the final setFailed call, it might be worth refactoring to avoid inconsistent behavior if more failure conditions are added in the future.

This block duplicates the logic found inside `applyOnAction`. While necessary to determine if a rollback should occur before the final `setFailed` call, it might be worth refactoring to avoid inconsistent behavior if more failure conditions are added in the future.
Philipp marked this conversation as resolved
Philipp added 1 commit 2026-05-16 23:30:30 +02:00
refactor: apply PR feedback for rollback feature
Lint / pre-commit Linting (push) Successful in 46s
Test / Run Tests (push) Successful in 1m6s
dce79ba192
AutoReview approved these changes 2026-05-16 23:30:45 +02:00
AutoReview left a comment
Member

Summary

The PR introduces a useful rollback_on_failure feature that automatically reverts to the previous code version if the log monitor detects issues (tracebacks, errors, or warnings) based on user configuration. The implementation is generally sound and well-tested, though there are minor improvements to be made regarding configuration consistency.

  • Critical: None.
  • Testing: New tests have been added to __tests__/index.test.js to cover the rollback logic and mock the Screeps API. This provides good coverage for the feature.
  • Improvement:
    • In index.js, the default values for on_traceback, on_error_log, and on_warning_log are hardcoded in the postCode function. While these match action.yaml, it would be safer to rely on the core.getInput({ default: '...' }) or ensure the fallback logic is unified to prevent drifts.
    • The rollback_on_failure logic correctly aborts the deployment if the "old code" cannot be downloaded, which addresses previous concerns about safety.
  • Nitpick:
    • Documentation in GEMINI.md was updated with a specific note about pre-commit hook timing which seems specific to an LLM development workflow but doesn't hurt.

Approval Policy: The code is well-structured, incorporates previous feedback (explicitly awaiting auth), and includes necessary test cases.

Summary

  • Critical: None.
  • Testing: mocking and logic verification for rollback added to __tests__/index.test.js.
  • Improvement: Centralize default values for monitor actions to avoid duplication between action.yaml and index.js.
  • Nitpick: None.
**Summary** The PR introduces a useful `rollback_on_failure` feature that automatically reverts to the previous code version if the log monitor detects issues (tracebacks, errors, or warnings) based on user configuration. The implementation is generally sound and well-tested, though there are minor improvements to be made regarding configuration consistency. * **Critical:** None. * **Testing:** New tests have been added to `__tests__/index.test.js` to cover the rollback logic and mock the Screeps API. This provides good coverage for the feature. * **Improvement:** * In `index.js`, the default values for `on_traceback`, `on_error_log`, and `on_warning_log` are hardcoded in the `postCode` function. While these match `action.yaml`, it would be safer to rely on the `core.getInput({ default: '...' })` or ensure the fallback logic is unified to prevent drifts. * The `rollback_on_failure` logic correctly aborts the deployment if the "old code" cannot be downloaded, which addresses previous concerns about safety. * **Nitpick:** * Documentation in `GEMINI.md` was updated with a specific note about pre-commit hook timing which seems specific to an LLM development workflow but doesn't hurt. **Approval Policy:** The code is well-structured, incorporates previous feedback (explicitly awaiting `auth`), and includes necessary test cases. **Summary** * **Critical:** None. * **Testing:** mocking and logic verification for rollback added to `__tests__/index.test.js`. * **Improvement:** Centralize default values for monitor actions to avoid duplication between `action.yaml` and `index.js`. * **Nitpick:** None.
@@ -218,2 +255,2 @@
applyOnAction(
core.getInput("on_traceback"),
const fail1 = applyOnAction(
core.getInput("on_traceback") || "fail",
Member

Consider defining these defaults in a central place or ensuring they are strictly pulled from action.yaml. If action.yaml defaults change but these fallbacks don't, the behavior might become inconsistent.

Consider defining these defaults in a central place or ensuring they are strictly pulled from `action.yaml`. If `action.yaml` defaults change but these fallbacks don't, the behavior might become inconsistent.
Philipp marked this conversation as resolved
Philipp added 1 commit 2026-05-16 23:39:35 +02:00
refactor: centralize monitor action default values
Lint / pre-commit Linting (push) Successful in 54s
Test / Run Tests (push) Successful in 1m18s
0212889604
AutoReview approved these changes 2026-05-16 23:39:49 +02:00
AutoReview left a comment
Member

Summary

  • Critical: None.
  • Testing: The PR includes a new test case in __tests__/index.test.js covering the rollback logic, which is good.
  • Improvement:
    • The rollback logic uses api.code.set(branch, oldCode), which effectively overwrites the "failed" deploy. However, if the "failed" deploy was intended for a specific branch, ensure that oldCode (retrieved via api.code.get) doesn't accidentally revert to an even older version if multiple deploys happen in rapid succession, though this is likely out of scope for the action's current architecture.
    • Refactored applyOnAction to return a boolean effectively addresses the previous reviewer's concern about logic duplication.
  • Nitpick:
    • In index.js, the rollbackOnFailure constant is initialized inside a try-catch block but then used outside. While functional, core.getBooleanInput is generally reliable if the action.yaml default is set correctly.

The changes effectively implement a safety net for Screeps deployments by leveraging the console monitor to detect runtime failures immediately after push.

**Summary** * **Critical:** None. * **Testing:** The PR includes a new test case in `__tests__/index.test.js` covering the rollback logic, which is good. * **Improvement:** * The rollback logic uses `api.code.set(branch, oldCode)`, which effectively overwrites the "failed" deploy. However, if the "failed" deploy was intended for a specific branch, ensure that `oldCode` (retrieved via `api.code.get`) doesn't accidentally revert to an even older version if multiple deploys happen in rapid succession, though this is likely out of scope for the action's current architecture. * Refactored `applyOnAction` to return a boolean effectively addresses the previous reviewer's concern about logic duplication. * **Nitpick:** * In `index.js`, the `rollbackOnFailure` constant is initialized inside a `try-catch` block but then used outside. While functional, `core.getBooleanInput` is generally reliable if the `action.yaml` default is set correctly. The changes effectively implement a safety net for Screeps deployments by leveraging the console monitor to detect runtime failures immediately after push.
@@ -233,0 +284,4 @@
`Successfully rolled back to previous code on branch ${branch}.`,
);
} catch (err) {
core.error(`Rollback failed: ${err}`);
Member

Wait-for-ticks logic in monitorConsole handles the timing, but consider if a failed rollback should also setFailed the action explicitly if api.code.set fails during the rollback phase. Currently, it only logs an error.

Wait-for-ticks logic in `monitorConsole` handles the timing, but consider if a failed rollback should also `setFailed` the action explicitly if `api.code.set` fails during the rollback phase. Currently, it only logs an error.
Philipp marked this conversation as resolved
Philipp merged commit d0a08da728 into main 2026-05-17 13:13:59 +02:00
Philipp deleted branch feature/rollback-on-failure 2026-05-17 13:13:59 +02:00
Sign in to join this conversation.
No Reviewers
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: Screeps/screeps-deploy-action#88