feat: add experimental campaign protocol — 8-phase research workflow #25

Merged
magnus merged 1 commit from feat/campaign-protocol into main 2026-05-23 17:16:40 -04:00
Owner

Phase 2 of #22

Adds references/experimental-campaign-protocol.md — a structured 8-phase workflow for data science research campaigns:

  1. Problem Formulation
  2. Baseline Heuristic (with sklearn Pipeline examples)
  3. Bleeding-Edge Survey (research discovery workflow)
  4. Moonshot Experiments (with PyTorch training loop templates)
  5. Transfer Learning (feature extraction, LoRA, fine-tuning)
  6. Hyperparameter Optimization (Optuna with pruning)
  7. Distillation (knowledge distillation, pruning, quantization)
  8. Synthesis & Handoff

Test results: 28/28 passing

Key design decisions:

  • Entry/exit criteria and failure modes for every phase
  • Executable code snippets (not just prose guidance)
  • Cross-references to all companion references in the skill
  • Appendix with skip-guide, directory structure, seed-setting, and logging template
Structured protocol for running data science research campaigns:
- Phase 1-8 workflow from problem formulation through synthesis
- Entry/exit criteria and failure modes for every phase
- Executable code examples: sklearn pipelines, PyTorch training loops,
  Optuna HP search, distillation, pruning
- See Also references to all companion documents

Part of #22
magnus merged commit b13b3917e8 into main 2026-05-23 17:16:40 -04:00
jasper left a comment

Review: feat/campaign-protocol → main

Reviewer: Jasper (automated review)
Files: 4 changed, +686 / -0
Tests: 28/28 passing ✓


Overview

Well-structured addition. The 8-phase protocol follows a logical progression (boring → smart), with consistent entry/exit criteria and failure modes per phase. The test suite gives good structural coverage.

Issues Found

Bug — TensorBoard step counter (Phase 4, line ~344):
writer.add_scalar("train/loss", loss.item(), epoch) inside the batch loop uses epoch as the global step. All batches in the same epoch get the same step number, so their points stack at a single x-position in TensorBoard instead of forming a per-batch curve. Fix with a per-batch global_step counter. (Inline comment posted.)
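The per-batch counter suggested here can be sketched as follows. This is a minimal illustration, not the template's actual code: `RecordingWriter` is a hypothetical stand-in that only mimics the `add_scalar(tag, scalar_value, global_step)` call shape of `torch.utils.tensorboard.SummaryWriter`, and the loss values are made up so the sketch runs without a GPU or a TensorBoard install.

```python
class RecordingWriter:
    """Hypothetical stand-in mirroring SummaryWriter.add_scalar's call shape."""

    def __init__(self):
        self.events = []

    def add_scalar(self, tag, scalar_value, global_step):
        self.events.append((tag, scalar_value, global_step))


def train(writer, num_epochs, batch_losses):
    """Log one scalar per batch with a step that never repeats."""
    global_step = 0  # counts batches across all epochs
    for epoch in range(num_epochs):
        for loss in batch_losses:  # made-up per-batch loss values
            # Passing global_step (not epoch) gives every batch its own x-position.
            writer.add_scalar("train/loss", loss, global_step)
            global_step += 1
    return global_step


writer = RecordingWriter()
total_steps = train(writer, num_epochs=2, batch_losses=[0.9, 0.7, 0.5])
steps = [step for _, _, step in writer.events]
```

In a real loop the same counter would be passed to the actual `SummaryWriter`; an equivalent alternative is computing `epoch * len(loader) + batch_idx`, assuming every epoch has the same number of batches.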

Suggestion — Dead scheduler code (Phase 4, line ~340):
ReduceLROnPlateau is instantiated but never stepped. For plateau schedulers, scheduler.step(val_loss) is required after each validation epoch. Either add the call or switch to a scheduler that needs no metric, such as CosineAnnealingLR (which still requires a plain scheduler.step() call each epoch). (Inline comment posted.)
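A minimal sketch of the first option, stepping the plateau scheduler with the validation metric, might look like this. The toy model, the data, and the `simulated_val_losses` values are assumptions chosen only to make the learning-rate reduction visible; they are not taken from the protocol's template.

```python
import torch

model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=1
)

x = torch.ones(4, 1)
y = torch.zeros(4, 1)
initial_lr = optimizer.param_groups[0]["lr"]

# Made-up validation losses that stop improving after the first epoch.
simulated_val_losses = [1.0, 1.0, 1.0, 1.0]

for val_loss in simulated_val_losses:
    # One toy training step standing in for a full epoch.
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

    # The plateau scheduler only acts when it is handed the monitored metric.
    scheduler.step(val_loss)

final_lr = optimizer.param_groups[0]["lr"]
```

With `patience=1`, the scheduler tolerates one non-improving epoch and lowers the learning rate on the next, so `final_lr` ends up below `initial_lr`. Without the `scheduler.step(val_loss)` line the learning rate would never change.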

Forward-reference risk (Phase 3 and See Also section):
The protocol references 6 companion files (pytorch-integration.md, sklearn-integration.md, data-science-coding-workflow.md, subagent-experiment-supervision.md, docker-experiment-isolation.md, detect-compute.py). If these don't exist yet, they'll be dead links for any agent that loads this skill. Consider adding stub files or a note about what's planned.
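One way to catch dead references early is a small existence check that could run alongside the existing tests. The sketch below is an assumption about how such a check might look: the file names come from the list above, but the `missing_companions` helper, the recursive search, and the `references/` directory used in the demo are hypothetical.

```python
from pathlib import Path
import tempfile

COMPANION_FILES = [
    "pytorch-integration.md",
    "sklearn-integration.md",
    "data-science-coding-workflow.md",
    "subagent-experiment-supervision.md",
    "docker-experiment-isolation.md",
    "detect-compute.py",
]


def missing_companions(root, names=COMPANION_FILES):
    """Return the names that are not found anywhere under root."""
    present = {p.name for p in Path(root).rglob("*") if p.is_file()}
    return [name for name in names if name not in present]


# Demo against a throwaway directory containing a single stub file.
with tempfile.TemporaryDirectory() as tmp:
    refs = Path(tmp) / "references"  # hypothetical layout
    refs.mkdir()
    (refs / "pytorch-integration.md").write_text("# stub\n")
    demo_missing = missing_companions(tmp)
```

In the demo only one stub exists, so `demo_missing` lists the other five names. Pointing the helper at the skill's root would show which references still need stubs.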

Observations

  • Skip-guide appendix is excellent — the decision table telling agents when to skip to specific phases is exactly the kind of meta-instruction that makes agent skills work.
  • Seed-setting snippet is correct, but worth noting that torch.backends.cudnn.deterministic = True carries a significant performance penalty (2-10x slower on some GPU operations). Consider adding a tradeoff note.
  • Experiment logging JSON template is well-structured and will produce consistent records across campaigns.
  • Test script is thorough and correctly validates structural elements.
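For the seed-setting observation above, one way to surface the tradeoff is to make determinism an explicit opt-in. This is a sketch under assumptions, not the appendix's snippet: the `set_seed` name, the `deterministic` flag, and the optional imports (so the helper also works where NumPy or PyTorch is not installed) are all illustrative choices.

```python
import random


def set_seed(seed: int, deterministic: bool = False) -> None:
    """Seed the common RNGs; opt in to slower deterministic cuDNN kernels."""
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; stdlib/NumPy seeding is done

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    # Tradeoff: deterministic kernels aid reproducibility but can be slower,
    # so this is off unless the caller asks for it.
    torch.backends.cudnn.deterministic = deterministic
    torch.backends.cudnn.benchmark = not deterministic


set_seed(123)
first_draw = random.random()
set_seed(123)
second_draw = random.random()
```

Calling `set_seed(seed, deterministic=True)` for final reported runs and leaving the default for exploratory runs is one possible convention; the protocol may prefer a different default.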

Verdict

Looks good overall. The two code issues in the PyTorch template should be fixed before merge. The forward-reference concern is a judgment call — if the companion docs are being tracked in the same issue (#22) and this PR is just the first delivery, note that in the PR body or README to set expectations.

Reference
magnus/agent-skills!25