Scald
Scalable Collaborative Agents for Data Science
Scald automates machine learning workflows using collaborative AI agents and the Model Context Protocol. Unlike traditional AutoML frameworks that rely on exhaustive search or rigid pipelines, Scald employs two specialized agents—Actor and Critic—that iteratively refine solutions through feedback loops.
Core Approach
The Actor agent analyzes data, engineers features, and trains models using six specialized MCP servers as tools. The Critic agent evaluates each solution and provides targeted feedback. Through iterative refinement (typically 5 cycles), this collaboration produces optimized models while learning from past experiences via ChromaDB-based memory.
Scald supports classification and regression tasks using gradient boosting algorithms (CatBoost, LightGBM, XGBoost), with automatic EDA, preprocessing, and hyperparameter tuning.
Quick Start
from scald import Scald
scald = Scald(max_iterations=5)
predictions = await scald.run(
train_path="train.csv",
test_path="test.csv",
target="price",
task_type="regression"
)
Why Scald?
Traditional AutoML performs exhaustive grid searches or follows predefined strategies. Scald's agents reason about data characteristics, adapt strategies dynamically, and transfer knowledge between tasks. This results in higher quality solutions with fewer wasted iterations and transparent, interpretable decision-making throughout the process.
Architecture
The system orchestrates Actor-Critic loops with workspace isolation, comprehensive logging, and cost tracking. Each session produces artifacts, predictions, and detailed execution logs for full reproducibility.
Navigation
- Installation - Setup in minutes
- Quick Start - First AutoML task
- Architecture - System design
- Actor-Critic Pattern - Agent collaboration
- MCP Servers - Available tools