Experiment Configuration

ExperimentConfig

The main configuration class that orchestrates all aspects of model training.
data_configs
DataConfig or List[DataConfig]
required
Data configuration(s) for training; pass a list of DataConfig for multi-objective modeling
optimization_config
OptimizationConfig
required
Optimization and training parameters
model_config
ModelConfig
required
Model architecture and initialization parameters
meta_config
MetaConfig
default:"Uses MetaConfig defaults"
Metadata and run-specific parameters

MetaConfig

Configuration for experiment metadata and checkpointing behavior.
name
str
default:"trial-run"
Name identifier for this experimental run
seed
int
default:"42"
Random seed for reproducible training
save_path
str
default:"current working directory / run_name"
Directory path for saving model checkpoints
model_save_frequency
int
default:"-1"
Frequency (in steps) for saving model checkpoints. Set to -1 to save only at the end of training.
max_checkpoints
int
default:"-1"
Maximum number of model checkpoints to retain. Set to -1 for no limit.
wandb_config
WandbConfig or None
default:"None"
Weights & Biases logging configuration for experiment tracking and visualization
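
For reference, a minimal MetaConfig sketch using the fields above. The import path is assumed to match the other config classes, and the run name and checkpoint directory are illustrative:

from pynolano import MetaConfig

meta_config = MetaConfig(
    name="etf-forecaster-v1",                       # run identifier (illustrative)
    seed=1234,                                      # fixed seed for reproducibility
    save_path="./checkpoints/etf-forecaster-v1",    # where checkpoints are written
    model_save_frequency=500,                       # checkpoint every 500 steps
    max_checkpoints=3,                              # keep only the 3 most recent checkpoints
)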

WandbConfig

Configuration for Weights & Biases experiment tracking and logging.
project
str
required
Weights & Biases project name for organizing experiments
entity
str or None
default:"None"
Weights & Biases team/organization name. If None, uses the default entity associated with your API key.
run_name
str or None
default:"None"
Custom run name for the experiment. If None, uses the MetaConfig name or auto-generates one.
tags
List[str] or None
default:"None"
List of tags to associate with the run for easy filtering and organization
notes
str or None
default:"None"
Optional notes or description for the experiment run
log_model
bool
default:"False"
Whether to log the model as a Weights & Biases artifact for version control
log_frequency
int
default:"100"
Frequency (in steps) for logging metrics to Weights & Biases
log_gradients
bool
default:"False"
Whether to log gradient histograms (can impact performance)
log_parameters
bool
default:"False"
Whether to log parameter histograms (can impact performance)
watch_model
str or None
default:"None"
Model watching mode for logging gradients and parameters:
  • "gradients": Log gradient histograms
  • "parameters": Log parameter histograms
  • "all": Log both gradients and parameters
  • None: Disable model watching
config
dict or None
default:"None"
Additional configuration dictionary to log to Weights & Biases
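
A sketch of a WandbConfig attached to a MetaConfig, assuming WandbConfig is importable from pynolano like the other config classes; the project, entity, and tag values are placeholders:

from pynolano import MetaConfig, WandbConfig

wandb_config = WandbConfig(
    project="nolano-experiments",          # required W&B project name (placeholder)
    entity="my-team",                      # leave as None to use your API key's default entity
    tags=["baseline", "quantile-loss"],    # tags for filtering runs
    log_frequency=50,                      # log metrics every 50 steps
    watch_model="gradients",               # log gradient histograms (can impact performance)
)

meta_config = MetaConfig(name="baseline-run", wandb_config=wandb_config)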

DataConfig

Configuration for training data and objectives. Can be specified as a single instance or list for multi-task learning.
data_paths
str or List[str]
required
Path(s) to preprocessed data files
features
str, List[str], callable, List[callable], or None
default:"None"
Feature engineering functions for lag tokens (historical lag features) and exogenous (external) variables. Can be string identifier(s) or custom function(s).
sampling_weight
float or None
default:"Equal weight among all data configs"
Relative sampling weight for this data source (normalized to sum to 1 across all data configs).
training_objective
str or callable
default:"cross_entropy"
Loss function specification:
  • "cross_entropy": Chronos-style or text cross-entropy loss
  • "mse": Mean Squared Error (TimesFM-style)
  • "quantile" or "pinball": Quantile/Pinball loss (TiRex-style)
  • "multi_task": Multi-task learning (TimesFM 2.0-style)
  • Custom callable loss function
validation_split
float
default:"0.1"
Portion of the dataset to use as validation data (0.0-1.0, where 1.0 means all data is validation).
At least one DataConfig must have validation_split < 1.0 for training to proceed.
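
A sketch of a multi-task setup with two DataConfig entries; the data paths and weights are illustrative. Note that at least one entry keeps validation_split below 1.0 so training can proceed:

from pynolano import DataConfig

data_configs = [
    DataConfig(
        data_paths="./prepared_data/retail",          # illustrative path
        training_objective="quantile",                # TiRex-style quantile/pinball loss
        sampling_weight=3.0,                          # sampled ~3x as often as the config below
        validation_split=0.1,                         # 10% held out for validation
    ),
    DataConfig(
        data_paths=["./prepared_data/energy", "./prepared_data/traffic"],
        training_objective="mse",                     # TimesFM-style MSE loss
        sampling_weight=1.0,
        validation_split=1.0,                         # used entirely for validation
    ),
]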

OptimizationConfig

Configuration for training optimization parameters.
total_training_steps
int
required
Total number of training steps for the experiment
max_learning_rate
float
required
Maximum learning rate value
global_batch_size
int
required
Global batch size for training
learning_rate_schedule
str or callable
default:"constant"
Learning rate scheduling strategy:
  • String options: "constant", "linear", "cosine", "exponential"
  • Custom function with signature: (learning_rate, current_step, total_steps) → decayed_rate
Warmup is applied on top of this schedule (including custom callables) and must be disabled separately via warmup_steps if not needed.
warmup_steps
int
default:"0"
Number of learning rate warmup steps
decay_steps
int
default:"0"
Number of learning rate decay steps. Must be set to 0 when using custom learning rate schedules.
Constraint: warmup_steps + decay_steps ≤ total_training_steps
min_learning_rate
float or None
default:"max_learning_rate / 10"
Minimum learning rate value
optimizer_type
str
default:"Adam"
Optimizer algorithm. Options: "Adam", "SGD", "Lion"
weight_decay
float
default:"0.01"
L2 regularization coefficient
z_loss
float
default:"0.0"
Z-loss regularization coefficient. Set to 0.0 to disable.
load_balancing
float or None
default:"None"
Load balancing coefficient for Mixture of Experts (MoE) models. Only applicable for MoE architectures.
clip_grad
float
default:"1.0"
Gradient clipping threshold based on global L2 norm
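
As an illustration of the custom-schedule signature above, a sketch of an OptimizationConfig with a user-defined inverse-square-root schedule. The schedule function is a hypothetical example, not part of pynolano; decay_steps stays at 0 as required for custom schedules:

import math
from pynolano import OptimizationConfig

def inverse_sqrt_schedule(learning_rate, current_step, total_steps):
    # Hypothetical schedule: decay proportional to 1/sqrt(step).
    # Signature matches the documented (learning_rate, current_step, total_steps) -> decayed_rate.
    return learning_rate / math.sqrt(max(current_step, 1))

optimization_config = OptimizationConfig(
    total_training_steps=10_000,
    max_learning_rate=3e-4,
    global_batch_size=64,
    learning_rate_schedule=inverse_sqrt_schedule,
    warmup_steps=500,        # warmup is applied on top of the custom schedule
    decay_steps=0,           # must be 0 when using a custom schedule
    optimizer_type="Adam",
    weight_decay=0.01,
    clip_grad=1.0,
)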

Example Configurations

from pynolano import ExperimentConfig, DataConfig, ModelConfig, OptimizationConfig

def build() -> ExperimentConfig:
    return ExperimentConfig(
        # Single data source; pass a list of DataConfig for multi-objective training
        data_configs=DataConfig(data_paths="./prepared_data"),
        # Model architecture and initialization parameters
        model_config=ModelConfig("Qwen/Qwen3-4B"),
        optimization_config=OptimizationConfig(
            total_training_steps=1000,
            max_learning_rate=3e-4,
            global_batch_size=32,
        ),
        # meta_config omitted: MetaConfig defaults are used
    )

Additional Features

Hyperparameter Sweep

Hyperparameter sweep functionality is currently in development and will be available in a future release.

convert_to_hf()

The convert_to_hf() function converts trained Nolano.AI models to Hugging Face format for easy sharing and deployment.
pynolano.convert_to_hf(
    input_dir: str,
    config_file: str,
    output_dir: str,
    upload: bool = False
)
input_dir
str
required
Path to the checkpoint directory (e.g., /path/to/checkpoint/global_step_XXXXX)
config_file
str
required
Path to the model configuration YAML file used during training
output_dir
str
required
Destination directory for the converted Hugging Face model
upload
bool
default:"False"
Whether to directly upload the converted model to Hugging Face Hub
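
A usage sketch with illustrative paths; the checkpoint step, config file name, and output location depend on your training run:

import pynolano

pynolano.convert_to_hf(
    input_dir="./checkpoints/etf-forecaster-v1/global_step_001000",  # illustrative checkpoint dir
    config_file="./configs/model_config.yaml",                       # YAML used during training
    output_dir="./hf_models/etf-forecaster-v1",                      # destination for the HF model
    upload=False,   # set to True to push the converted model to the Hugging Face Hub
)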