Data-Scientist Skill v2: Experimental Campaign Pipeline + Infrastructure Awareness + Subagent Supervision #22
Labels: community-feedback, enhancement, skill-upgrade
Reference: magnus/agent-skills#22
Feature Suite: Data-Scientist Skill Upgrades
Based on feedback from neopabo (Nous Research Discord) who reviewed the existing skill. Three interlocking features that transform the skill from a statistical reference into an experimental research conductor.
1. Experimental Campaign Protocol
The current skill is strong on methodology ("which test do I use?") but silent on process ("how do I run a research campaign?"). Add a new protocol/workflow layer:
Deliverables:
references/experimental-campaign-protocol.md: structured workflow with phase gates
SKILL.md: add "Running a research campaign?" branch to the question classifier
2. Infrastructure Awareness ("Know Your Compute")
neopabo: "You need to make the agent aware of its available compute. Like, check vram and GPU, CUDA availability, pytorch, etc."
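A minimal sketch of the kind of probe described in the quote above, not the actual scripts/detect-compute.py. It assumes nvidia-smi is the GPU source, treats PyTorch as optional, and uses only the standard library otherwise. The function names and JSON keys are hypothetical choices for illustration.

```python
import json
import os
import shutil
import subprocess


def probe_gpus():
    """Return a list of {name, vram_mib} dicts, or [] if nvidia-smi is unusable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        return []
    gpus = []
    for line in out.strip().splitlines():
        # Each line looks like "<gpu name>, <total memory in MiB>".
        name, _, mem = line.rpartition(",")
        try:
            vram = int(mem.strip())
        except ValueError:
            vram = None  # e.g. the driver reported "[N/A]"
        gpus.append({"name": name.strip(), "vram_mib": vram})
    return gpus


def probe_torch():
    """Report PyTorch and CUDA availability without requiring PyTorch."""
    try:
        import torch
    except Exception:  # missing or broken install: report it as not installed
        return {"installed": False}
    return {
        "installed": True,
        "version": str(torch.__version__),
        "cuda_available": bool(torch.cuda.is_available()),
        "cuda_version": torch.version.cuda,  # None on CPU-only builds
    }


def total_ram_bytes():
    """Total physical RAM via sysconf; None where sysconf is unsupported."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None


def detect_compute(path="."):
    """Collect all probes into one JSON-serializable dict (hypothetical schema)."""
    return {
        "gpus": probe_gpus(),
        "torch": probe_torch(),
        "ram_bytes": total_ram_bytes(),
        "disk_free_bytes": shutil.disk_usage(path).free,
    }


if __name__ == "__main__":
    print(json.dumps(detect_compute(), indent=2))
```

Every probe degrades to an empty or None value rather than raising, so an agent can run it on a CPU-only machine and still get a usable report.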
The current skill lists Python dependencies but has no mechanism for the agent to understand what hardware it's working with. This is critical because:
Deliverables:
scripts/detect-compute.py: probes GPU model, VRAM, CUDA version, PyTorch availability, RAM, disk space, outputs structured JSON
3. Subagent Supervision for Self-Healing Experiments
neopabo: "You can use subagents to supervise and auto-repair experiments. This saves a RIDICULOUS amount of time. And if things break in a way that requires human intervention, I have them text me on Telegram."
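A rough sketch of the supervise-and-repair loop in the quote above, assuming a single repairable failure class (a missing Python package) and a pluggable escalation hook. The supervise function, its parameters, and the notify callback are hypothetical. In neopabo's setup the callback would send a Telegram message; here it defaults to print. This is plain-process supervision, not the delegate_task or subagent mechanism of any particular harness.

```python
import re
import subprocess
import sys

# Matches the standard CPython message for a missing import.
MISSING_MODULE = re.compile(r"ModuleNotFoundError: No module named '([\w\.]+)'")


def supervise(cmd, max_attempts=3, run=subprocess.run, notify=print):
    """Run cmd; on a missing module, pip install it and retry.

    Returns True on success. Any failure that cannot be repaired is passed
    to notify (the human-escalation hook) and the function returns False.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return True

        match = MISSING_MODULE.search(result.stderr or "")
        if match is None:
            # Not a failure this sketch knows how to fix: escalate.
            notify(f"Experiment failed on attempt {attempt}; needs a human.")
            return False

        # Assumption: the top-level module name equals the pip package name.
        # That is not always true (for example cv2 vs. opencv-python).
        package = match.group(1).split(".")[0]
        install = run(
            [sys.executable, "-m", "pip", "install", package],
            capture_output=True, text=True,
        )
        if install.returncode != 0:
            notify(f"Could not install {package}; needs a human.")
            return False

    notify(f"Gave up after {max_attempts} attempts.")
    return False
```

A real failure catalog would hold many more patterns than this one regex. The run and notify parameters are injectable so the loop can be exercised without launching processes or sending alerts.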
This is the most powerful and most novel feature. The pattern:
pip install missing package
Deliverables:
references/subagent-experiment-supervision.md: pattern description, failure catalog with fixes, Telegram alert protocol
SKILL.md: reference this pattern in the "Campaign" workflow path
delegate_task, OpenCode subagents, etc.) - should be framed as a harness-specific pattern
Related: Docker Isolation
neopabo: "Run experiments isolated in docker containers so they don't crash the entire PC."
A companion concern. Add guidance for Docker-based experiment isolation, container resource limits (memory, CPU, GPU device reservation), and log collection from containers.
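As one possible shape for that guidance, the sketch below builds docker run and docker logs command lines with memory, CPU, and GPU limits. The helper names, default limits, and example image are hypothetical. The flags themselves (--memory, --cpus, --gpus, --detach, --name, --timestamps) are standard Docker CLI options, and --gpus additionally assumes the NVIDIA Container Toolkit is installed on the host.

```python
def docker_run_args(image, command, name, memory="8g", cpus=4.0, gpu_device=None):
    """Build the argv for a detached, resource-limited experiment container."""
    args = [
        "docker", "run", "--detach", "--name", name,
        "--memory", memory,   # hard memory cap, so a leak cannot exhaust the host
        "--cpus", str(cpus),  # CPU quota, expressed as a number of cores
    ]
    if gpu_device is not None:
        # Reserve a single GPU by index, leaving the others untouched.
        args += ["--gpus", f"device={gpu_device}"]
    args.append(image)
    args.extend(command)
    return args


def docker_logs_args(name):
    """Build the argv that collects timestamped logs from the container."""
    return ["docker", "logs", "--timestamps", name]
```

Passing these lists to subprocess.run(..., check=True) would start the container and fetch its logs. Returning argv lists instead of shell strings avoids quoting problems and makes the limits easy to inspect before anything runs.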
Deliverable:
Implementation Order
scripts/detect-compute.py: standalone, no dependencies on other features
references/experimental-campaign-protocol.md: the workflow layer
references/subagent-experiment-supervision.md: the agentic automation layer
Open Questions
research-campaign) that the data-scientist skill references, or live inside the data-scientist skill?
Triage by Jasper (automated)
Assigned to @magnus. No labels configured on this repo yet; consider adding a label taxonomy (e.g. enhancement, skill, data-science, discussion) for future triage workflows.
Assessment: Three well-scoped features with clear deliverables. The implementation order in the issue body is sound: detect-compute.py is standalone and delivers immediate value independent of the other two.
Recommendations on the open questions:
Separate skill vs. within data-scientist? The campaign protocol could live as its own research-campaign skill that the data-scientist skill references, mirroring how grokto-crawl is separate from web-search. This keeps the data-scientist skill focused on methodology (statistical decision tree) while the campaign skill owns workflow orchestration. They reference each other via SKILL.md cross-links.
Harness abstraction for subagent supervision? Given the user runs Hermes, start with a Hermes-native implementation using delegate_task. Document that pattern concretely in the reference. If interest emerges from other harnesses, add a companion doc at that point. Premature abstraction is more costly than porting later.
Bleeding-edge survey automation level? The survey should auto-run as a step in the campaign protocol but present its findings to the user for approval before proceeding to moonshot experiments. Automated discovery, human-directed selection. The agent does the reading; the human does the prioritizing.
Docker isolation — Worth including as a section in the campaign protocol rather than a standalone reference, since it ties directly to the experimental runner workflow.
No blocking issues found. Ready for feature work.
Triage by Jasper (automated)
Assessment: Well-scoped feature suite with clear implementation order. All three features are self-consistent with the existing skill architecture.
Labels Applied
Implementation Order Confirmed
Open Questions for Magnus
Prioritization suggestion: detect-compute.py can land as a standalone PR immediately — zero-risk and unlocks downstream features.
All features implemented across 4 PRs:
Closing — all deliverables complete.