OptimizationConfig

total_training_steps
int
required
Total number of training steps for the experiment
max_learning_rate
float
required
Maximum learning rate value
global_batch_size
int
required
Global batch size for training
learning_rate_schedule
str | callable
default:"constant"
Learning rate scheduling strategy:
  • String options: "constant", "linear", "cosine", "exponential"
  • Custom function with signature: (learning_rate, current_step, total_steps) → decayed_rate
Note: Warmup is applied on top of this schedule; set warmup_steps to 0 to disable it if warmup is not needed
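A custom schedule callable must follow the documented signature (learning_rate, current_step, total_steps) → decayed_rate. As a minimal sketch (the function name and decay shape are illustrative, not part of the library), here is a cosine decay written against that signature:

```python
import math

def cosine_schedule(learning_rate: float, current_step: int, total_steps: int) -> float:
    """Custom schedule matching (learning_rate, current_step, total_steps) -> decayed_rate.

    Decays from learning_rate down to 0 over total_steps following half a cosine wave.
    """
    progress = min(current_step / max(total_steps, 1), 1.0)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))

cosine_schedule(1e-3, 0, 1000)     # -> 1e-3  (start of training)
cosine_schedule(1e-3, 500, 1000)   # -> ~5e-4 (halfway)
cosine_schedule(1e-3, 1000, 1000)  # -> 0.0   (end of training)
```

Such a callable would be passed as learning_rate_schedule in place of one of the string options; recall that decay_steps must then be 0.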
warmup_steps
int
default:"0"
Number of learning rate warmup steps.
decay_steps
int
default:"0"
Number of learning rate decay steps. Must be set to 0 when using custom learning rate schedules.
Constraint: warmup_steps + decay_steps ≤ total_training_steps
min_learning_rate
float | None
default:"max_learning_rate / 10"
Minimum learning rate value.
optimizer_type
str
default:"Adam"
Optimizer algorithm. Options: "Adam", "SGD", "Lion"
weight_decay
float
default:"0.01"
L2 regularization coefficient.
z_loss
float
default:"0.0"
Z-loss regularization coefficient. Set to 0.0 to disable.
load_balancing
float | None
default:"None"
Load balancing coefficient for Mixture of Experts (MoE) models. Only applicable for MoE architectures.
clip_grad
float
default:"1.0"
Gradient clipping threshold based on global L2 norm.

ExperimentConfig

data_configs
DataConfig | List[DataConfig]
required
Data configuration(s); provide a list of DataConfigs for multi-objective modeling
optimization_config
OptimizationConfig
required
Optimization and training parameters
model_config
ModelConfig
required
Model architecture and initialization parameters
meta_config
MetaConfig
default:"Uses MetaConfig defaults"
Metadata and run-specific parameters

MetaConfig

name
str
default:"trial-run"
Name identifier for this experimental run.
seed
int
default:"42"
Random seed for reproducible training.
save_path
str
default:"current working directory / run_name"
Directory path for saving model checkpoints.
model_save_frequency
int
default:"-1"
Frequency (in steps) for saving model checkpoints. Set to -1 to save only at the end of training.
max_checkpoints
int
default:"-1"
Maximum number of model checkpoints to retain. Set to -1 for no limit.
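The documented sentinel for max_checkpoints is -1 (no limit); with a positive value, only the most recent checkpoints are kept. A hypothetical retention helper (not library API) makes the semantics concrete:

```python
def prune_checkpoints(paths: list[str], max_checkpoints: int) -> list[str]:
    """Keep the most recent `max_checkpoints` checkpoint paths.

    -1 means no limit, matching the documented default. Assumes `paths`
    is ordered oldest-first. Illustrative only, not the library's API.
    """
    if max_checkpoints == -1:
        return list(paths)
    return list(paths)[max(len(paths) - max_checkpoints, 0):]

prune_checkpoints(["step_100", "step_200", "step_300"], 2)   # -> ["step_200", "step_300"]
prune_checkpoints(["step_100", "step_200", "step_300"], -1)  # -> all three retained
```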

DataConfig

data_paths
str | List[str]
required
Path(s) to preprocessed data files
features
str | List[str] | callable | List[callable] | None
default:"None"
Feature engineering functions for lag tokens (historical lag features) and exogenous variables (external variables). Can be string identifier(s) or custom function(s).
sampling_weight
float | None
default:"Equal weight among all data configs"
Relative sampling weight for this data source (normalized to sum to 1 across all data configs).
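The normalization described above (relative weights rescaled to sum to 1 across all data configs) amounts to:

```python
def normalize_weights(weights: list[float]) -> list[float]:
    """Rescale relative sampling_weight values so they sum to 1.

    The library performs this normalization internally per the docs;
    this standalone helper just illustrates the arithmetic.
    """
    total = sum(weights)
    return [w / total for w in weights]

normalize_weights([2.0, 1.0, 1.0])  # -> [0.5, 0.25, 0.25]
```

So a config with sampling_weight 2.0 alongside two configs with 1.0 is sampled half the time.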
training_objective
str | callable
default:"cross_entropy"
Loss function specification:
  • "cross_entropy": Chronos-style or text cross-entropy loss
  • "mse": Mean Squared Error (TimesFM-style)
  • "quantile" or "pinball": Quantile/Pinball loss (TiRex-style)
  • "multi_task": Multi-task learning (TimesFM 2.0-style)
  • Custom callable loss function
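The exact signature the library expects for a custom callable is not documented here, but as a sketch of the kind of function involved, here is the quantile/pinball loss from the list above in plain Python (the name and argument order are assumptions):

```python
def pinball_loss(y_true: list[float], y_pred: list[float], quantile: float = 0.5) -> float:
    """Pinball (quantile) loss for a single quantile level.

    Penalizes under-prediction by `quantile` and over-prediction by
    (1 - quantile) per unit of error, averaged over all points.
    """
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return sum(max(quantile * e, (quantile - 1) * e) for e in errors) / len(errors)

pinball_loss([1.0, 2.0], [0.5, 2.5], quantile=0.5)  # -> 0.25 (half the MAE)
pinball_loss([1.0], [0.0], quantile=0.9)            # -> 0.9 (under-prediction penalized)
```

At quantile=0.5 this reduces to half the mean absolute error; higher quantiles push predictions upward by penalizing under-prediction more heavily.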
validation_split
float
default:"0.1"
Portion of the dataset to use as validation data (0.0-1.0, where 1.0 means all data is validation).
At least one DataConfig must have validation_split < 1.0 for training to proceed.
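The constraint above can be checked before launching a run; a hypothetical validation helper (not part of the library):

```python
def can_train(validation_splits: list[float]) -> bool:
    """True if at least one DataConfig leaves some data for training
    (validation_split < 1.0), per the documented constraint."""
    return any(split < 1.0 for split in validation_splits)

can_train([0.1, 1.0])  # True  -- the first config contributes training data
can_train([1.0, 1.0])  # False -- every config is validation-only; training cannot proceed
```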