Documentation Index
Fetch the complete documentation index at: https://internal.nolano.ai/llms.txt
Use this file to discover all available pages before exploring further.
ModelConfig
Model architecture specification. Supports major dense and MoE Hugging Face architectures, including Qwen, LLaMA, and Gemma.
Weight initialization strategy:
"none": Load weights from a pre-trained model (Qwen/LLaMA/Gemma)
"normal": Normal distribution initialization
"xavier_uniform": Xavier uniform initialization
"wang_init": Wang initialization method
Path to pre-trained model for continual training. Must be None if init_method is not "none".
Whether to load optimizer state from checkpoint. Set to True for continual training from a checkpoint.
Model precision configuration:
"binary": Binary precision (1-bit)
"ternary": Ternary precision (1.58-bit)
"int2": 2-bit integer precision
"fp8": 8-bit floating point
"mxfp4": 4-bit microscaling floating point
"mxfp6": 6-bit microscaling floating point
"ue8m0": 8-bit unsigned exponent-only floating point (8 exponent bits, 0 mantissa bits)
"fp16": 16-bit floating point (default)
"fp32": 32-bit floating point

