Python API
Use Scald programmatically for full control over AutoML workflows.
Basic Usage
import asyncio
from scald import Scald
async def main():
scald = Scald(max_iterations=5)
predictions = await scald.run(
train_path="data/train.csv",
test_path="data/test.csv",
target="target_column",
task_type="classification"
)
print(f"Generated {len(predictions)} predictions")
asyncio.run(main())
API Reference
Scald(max_iterations=5) creates an instance. The parameter controls Actor-Critic refinement cycles.
await scald.run(train_path, test_path, target, task_type) executes the AutoML workflow. Returns a list of predictions matching test data rows. Task type must be "classification" or "regression".
Examples
Classification:
async def classify():
scald = Scald(max_iterations=5)
predictions = await scald.run(
train_path="customers_train.csv",
test_path="customers_test.csv",
target="will_purchase",
task_type="classification"
)
return predictions
results = asyncio.run(classify())
Regression:
async def predict_prices():
scald = Scald(max_iterations=3)
predictions = await scald.run(
train_path="housing_train.csv",
test_path="housing_test.csv",
target="sale_price",
task_type="regression"
)
return predictions
results = asyncio.run(predict_prices())
Return Values
The run() method returns predictions as a list. For classification, elements are class labels (int or str). For regression, elements are numeric values (float). Length matches the number of test data rows.
Error Handling
try:
predictions = await scald.run(...)
except FileNotFoundError:
print("Data file missing")
except ValueError:
print("Invalid parameters")
except Exception as e:
print(f"Execution error: {e}")
Batch Processing
Process multiple datasets sequentially:
async def process_batch(datasets):
scald = Scald(max_iterations=5)
results = {}
for name, config in datasets.items():
predictions = await scald.run(**config)
results[name] = predictions
return results
datasets = {
"housing": {
"train_path": "housing_train.csv",
"test_path": "housing_test.csv",
"target": "price",
"task_type": "regression"
},
"churn": {
"train_path": "churn_train.csv",
"test_path": "churn_test.csv",
"target": "churned",
"task_type": "classification"
}
}
results = asyncio.run(process_batch(datasets))
Async Context
Scald uses async/await for non-blocking execution. Always use await with run() or wrap in asyncio.run() for top-level calls.
Continue to Configuration for advanced settings.