# EvaluationConfig

> API reference for evaluation and inference configuration

## EvaluationConfig

<ParamField path="model_path" type="str" required>
  Path to the trained model checkpoint directory (e.g., `/path/to/checkpoint/global_step_XXXXX`)
</ParamField>

<ParamField path="data_config" type="DataConfig" required>
  Configuration for the evaluation data. It is similar to the training data config, but typically sets `validation_split=1.0`.
</ParamField>

<ParamField path="eval_metrics" type="str | List[str] | callable" default="Auto-selected based on training objective">
  Evaluation metrics to compute:

  * For text/code models: `"perplexity"`, `"accuracy"`, `"bleu"`, `"rouge"`
  * For time series: `"mse"`, `"mae"`, `"mape"`, `"smape"`, `"quantile_loss"`
  * Custom callable functions with signature: `(predictions, targets) → metric_value`
</ParamField>
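
The callable form of `eval_metrics` can be illustrated with a small custom metric. This is a minimal sketch that follows the documented `(predictions, targets) → metric_value` signature. The function name `max_abs_error` is hypothetical, and it assumes that predictions and targets arrive as flat sequences of numbers, which this page does not specify.

```python
from typing import Sequence


def max_abs_error(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Largest absolute difference between a prediction and its target."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if len(predictions) == 0:
        raise ValueError("cannot compute a metric over empty inputs")
    return max(abs(p - t) for p, t in zip(predictions, targets))


# Assumption: the callable is passed on its own as the value of eval_metrics.
eval_metrics = max_abs_error
```

If the library passes tensors or batched arrays instead, the body would need to flatten or reduce them first.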

<ParamField path="batch_size" type="int" default="32">
  Batch size for evaluation.
</ParamField>

<ParamField path="output_predictions" type="bool" default="False">
  Whether to save predictions to a file.
</ParamField>

<ParamField path="output_path" type="str | None" default="model_path + '/evaluation'">
  Directory to save evaluation results and predictions.
</ParamField>

<ParamField path="eval_steps" type="int | None" default="None">
  Maximum number of evaluation steps. Set to `None` for full dataset evaluation.
</ParamField>
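
To show how the fields above fit together, the following sketch uses a stand-in dataclass that mirrors the documented names and defaults. `EvaluationConfigSketch` is hypothetical and is not the library's class; the real import path is not given on this page. Treating `None` as "auto-select" for `eval_metrics`, and using a plain dict in place of a `DataConfig`, are also assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Union


@dataclass
class EvaluationConfigSketch:
    """Stand-in mirroring the documented fields; NOT the library's class."""

    model_path: str
    data_config: Any  # a DataConfig in the real API
    # Assumption: None stands for "auto-selected based on training objective".
    eval_metrics: Union[str, List[str], Callable[..., Any], None] = None
    batch_size: int = 32
    output_predictions: bool = False
    output_path: Optional[str] = None
    eval_steps: Optional[int] = None  # None means evaluate the full dataset

    def __post_init__(self) -> None:
        # Documented default: model_path + '/evaluation'
        if self.output_path is None:
            self.output_path = self.model_path + "/evaluation"


config = EvaluationConfigSketch(
    model_path="/path/to/checkpoint/global_step_XXXXX",
    data_config={"validation_split": 1.0},  # placeholder for a DataConfig
    eval_metrics=["perplexity", "accuracy"],
    output_predictions=True,
)
print(config.output_path)  # /path/to/checkpoint/global_step_XXXXX/evaluation
```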

## InferenceConfig

<ParamField path="batch_size" type="int" default="1">
  Batch size for inference.
</ParamField>

<ParamField path="max_new_tokens" type="int" default="512">
  Maximum number of new tokens to generate (for generative models).
</ParamField>

<ParamField path="temperature" type="float" default="1.0">
  Sampling temperature for text generation. Higher values increase randomness.
</ParamField>

<ParamField path="top_p" type="float" default="1.0">
  Nucleus sampling parameter. Sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches this value.
</ParamField>

<ParamField path="top_k" type="int | None" default="None">
  Restricts sampling to the `k` most likely tokens at each step.
</ParamField>
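
The interaction of `temperature`, `top_k`, and `top_p` can be sketched on a toy vocabulary. This is an illustration of the general techniques, not the library's implementation: the function name `filtered_distribution` is hypothetical, and the order of operations (temperature, then top-k, then top-p, renormalizing after each filter) is an assumption borrowed from common practice.

```python
import math
from typing import Dict, Optional


def filtered_distribution(
    logits: Dict[str, float],
    temperature: float = 1.0,
    top_k: Optional[int] = None,
    top_p: float = 1.0,
) -> Dict[str, float]:
    """Return renormalized token probabilities after temperature, top-k, top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    norm = sum(exps.values())
    ranked = sorted(
        ((tok, e / norm) for tok, e in exps.items()),
        key=lambda item: item[1],
        reverse=True,
    )

    # top-k: keep only the k most likely tokens, then renormalize.
    if top_k is not None:
        ranked = ranked[:top_k]
        kept_mass = sum(prob for _, prob in ranked)
        ranked = [(tok, prob / kept_mass) for tok, prob in ranked]

    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}


toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}
print(filtered_distribution(toy_logits, top_k=2))   # only "a" and "b" remain
print(filtered_distribution(toy_logits, top_p=0.5))  # only "a" remains
```

With the defaults (`temperature=1.0`, `top_k=None`, `top_p=1.0`) no tokens are filtered out, which matches the documented default values being no-ops.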

<ParamField path="do_sample" type="bool" default="True">
  Whether to sample during generation. If `False`, greedy decoding is used.
</ParamField>

<ParamField path="repetition_penalty" type="float" default="1.0">
  Penalty for token repetition. Values > 1.0 discourage repetition.
</ParamField>
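
One common way to apply a repetition penalty is the formulation from the CTRL paper, also used by Hugging Face Transformers: logits of already-generated tokens are divided by the penalty when positive and multiplied by it when negative. Whether this library uses the same rule is an assumption; the sketch below, with the hypothetical name `apply_repetition_penalty`, only illustrates why values above 1.0 discourage repetition.

```python
from typing import Dict, Iterable


def apply_repetition_penalty(
    logits: Dict[str, float],
    generated: Iterable[str],
    penalty: float = 1.0,
) -> Dict[str, float]:
    """Lower the logits of tokens that already appear in the generated text."""
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = dict(logits)  # leave the caller's logits untouched
    for tok in set(generated):
        if tok in adjusted:
            value = adjusted[tok]
            # Dividing a positive logit or multiplying a negative one
            # both make the token less likely when penalty > 1.0.
            adjusted[tok] = value / penalty if value > 0 else value * penalty
    return adjusted


toy_logits = {"the": 2.0, "cat": -1.0, "sat": 0.5}
print(apply_repetition_penalty(toy_logits, ["the", "cat"], penalty=2.0))
# {'the': 1.0, 'cat': -2.0, 'sat': 0.5}
```

With the default `penalty=1.0` the logits are returned unchanged.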

<ParamField path="length_penalty" type="float" default="1.0">
  Penalty for sequence length. Values > 1.0 encourage longer sequences.
</ParamField>

<ParamField path="device" type="str" default="auto">
  Device for inference: `"cuda"`, `"cpu"`, or `"auto"`.
</ParamField>
