Custom Configuration

Advanced model and training parameters can be configured via a YAML file passed with the --custom_config CLI argument. When not specified, defaults are used.

Example usage:

eir_auto_gp_multi_task \
    --genotype_data_path data/ \
    --label_file_path data/labels.csv \
    --global_output_folder runs/my_run \
    --output_con_columns trait_a trait_b \
    --custom_config my_config.yaml

Example YAML file:

use_fc0_to_output_skips: true
use_lcl_to_output_skips: false
use_fc0_to_fusion_skips: true
fusion_model_type: mlp-residual-sum
batch_size: 64
optimize_model: true
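
Because the YAML keys have to match the parameter names in the reference signature, it can help to check a file's keys before starting a run. The following is a minimal sketch, not part of eir_auto_gp: KNOWN_FIELDS is copied by hand from the CustomConfig signature, unknown_keys is a hypothetical helper, and the dict stands in for what a YAML parser would return. Whether the library itself rejects unknown keys is not stated here.

```python
# Field names copied by hand from the CustomConfig signature in the
# reference section. This set is an illustration, not a library object.
KNOWN_FIELDS = frozenset({
    "use_lcl_to_output_skips", "use_fc0_to_output_skips",
    "use_fc0_to_fusion_skips", "weighted_sampling", "optimize_model",
    "modelling_data_format", "n_fusion_layers", "fusion_dim",
    "skip_to_every_n_fusion_layers", "n_output_layers", "output_dim",
    "batch_size", "fusion_model_type", "mgmoe_num_experts",
    "output_num_experts", "expert_groups_file",
    "informed_moe_fusion_factor", "adversarial_enabled",
    "adversarial_lambda", "channel_exp_base",
})


def unknown_keys(parsed):
    """Return the keys of a parsed YAML mapping that are not known fields."""
    return sorted(key for key in parsed if key not in KNOWN_FIELDS)


# Stand-in for a parsed config file; "batchsize" is a deliberate typo.
parsed = {"fusion_model_type": "mlp-residual-sum", "batchsize": 64}
print(unknown_keys(parsed))  # ['batchsize']
```

An empty list means every key in the file corresponds to a documented parameter.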

Reference

class eir_auto_gp.multi_task.custom_config.CustomConfig(use_lcl_to_output_skips: bool | str = False, use_fc0_to_output_skips: bool = False, use_fc0_to_fusion_skips: bool = False, weighted_sampling: str = 'auto', optimize_model: bool = False, modelling_data_format: str = 'disk', n_fusion_layers: int | None = None, fusion_dim: int | None = None, skip_to_every_n_fusion_layers: int | None = None, n_output_layers: int | None = None, output_dim: int | str | None = None, batch_size: int | None = None, fusion_model_type: str = 'mlp-residual-sum', mgmoe_num_experts: int = 8, output_num_experts: int | None = None, expert_groups_file: str | None = None, informed_moe_fusion_factor: int = 1, adversarial_enabled: bool = True, adversarial_lambda: float = 0.5, channel_exp_base: int = 3)

Advanced configuration for model architecture and training.

These parameters can be set via a YAML file passed with --custom_config. When not specified, defaults are used.

Parameters:

use_lcl_to_output_skips – Controls LCL block skip connections to output heads. When True, fc_1 and fc_2 intermediate features are cached and sent to output heads alongside fc_0_output. When "fc_1_only", only fc_1 is used (more parameter-efficient). When False, output heads receive only fc_0_output.
weighted_sampling – Controls weighted sampling during training. "auto" enables it only when there are categorical targets but no continuous targets. "true"/"false" force it on/off.
optimize_model – Enables model optimizations including torch.compile and mixed precision (bf16) training when supported by hardware.
modelling_data_format – Storage format for data during modelling. "disk" reads from disk (lower memory), "memory" loads all data into RAM (faster), "auto" decides based on dataset size.
n_fusion_layers – Number of fusion layers. When set, all granular architecture parameters (fusion_dim, skip_to_every_n_fusion_layers, n_output_layers, output_dim) must also be specified. Cannot be combined with model_size.
fusion_dim – Dimension of fusion layers.
skip_to_every_n_fusion_layers – Tensor broker skip connection frequency to fusion layers.
n_output_layers – Number of layers in shared MLP residual output heads.
output_dim – Dimension of shared MLP residual output head layers. When "auto", scales per output group based on number of targets: ≤20 targets → 512, >20 targets → 1024.
batch_size – Training batch size. When None, automatically determined based on dataset size.
use_fc0_to_output_skips – When True, the fc_0 layer output is cached and sent via tensor broker to output heads. For informed MoE, routes each expert’s fc_0 to its corresponding output group.
use_fc0_to_fusion_skips – When True, the fc_0 layer output is cached and sent via tensor broker to fusion layers. For informed MoE, distributes each expert’s fc_0 round-robin across fusion layers.
fusion_model_type – Fusion module architecture type. "mlp-residual-sum" uses a standard MLP-residual fusion. "mgmoe" uses Multi-Gate Mixture of Experts fusion.
mgmoe_num_experts – Number of experts when fusion_model_type is "mgmoe". Ignored for other fusion model types.
output_num_experts – If set, splits the shared branch in shared_mlp_residual output heads into this many expert sub-branches, each output_dim // output_num_experts wide. Each target learns a static gating weight over the experts. Only used when output groups are enabled (i.e. with the shared_mlp_residual output head). If None, a single shared branch is used.
adversarial_enabled – Enables adversarial disentanglement training when tabular inputs and output groups are both present. The adversarial head encourages the genotype encoder to learn features that are independent of tabular covariates.
adversarial_lambda – Weight of the adversarial loss term. Higher values enforce stronger disentanglement between genotype and tabular features.
channel_exp_base – Base-2 exponent that sets the number of channels in the genome-local-net. The number of channel feature sets is 2**channel_exp_base, so the default of 3 gives 8.
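
Several of the parameters above follow simple rules that can be written out directly. The functions below are a sketch of those rules as the descriptions state them. All function names are hypothetical and this is not the library's implementation, which may handle edge cases differently.

```python
def resolve_weighted_sampling(setting, n_cat_targets, n_con_targets):
    """weighted_sampling: "auto" is on only for categorical-only targets."""
    if setting == "auto":
        return n_cat_targets > 0 and n_con_targets == 0
    if setting in ("true", "false"):
        return setting == "true"
    raise ValueError(f"unexpected weighted_sampling value: {setting!r}")


def auto_output_dim(n_targets):
    """output_dim="auto": 512 for up to 20 targets, 1024 above that."""
    return 512 if n_targets <= 20 else 1024


def expert_branch_width(output_dim, output_num_experts):
    """Width of each expert sub-branch when output_num_experts is set."""
    return output_dim // output_num_experts


def n_channel_feature_sets(channel_exp_base):
    """Number of channel feature sets in the genome-local-net."""
    return 2 ** channel_exp_base


def check_granular_params(n_fusion_layers, fusion_dim,
                          skip_to_every_n_fusion_layers,
                          n_output_layers, output_dim):
    """If n_fusion_layers is set, the other granular parameters must be too."""
    if n_fusion_layers is None:
        return
    others = {
        "fusion_dim": fusion_dim,
        "skip_to_every_n_fusion_layers": skip_to_every_n_fusion_layers,
        "n_output_layers": n_output_layers,
        "output_dim": output_dim,
    }
    missing = [name for name, value in others.items() if value is None]
    if missing:
        raise ValueError(f"n_fusion_layers is set but missing: {missing}")


print(resolve_weighted_sampling("auto", 3, 0))  # True
print(auto_output_dim(25))                      # 1024
print(expert_branch_width(512, 4))              # 128
print(n_channel_feature_sets(3))                # 8
```

For example, with the default channel_exp_base of 3 the genome-local-net uses 8 channel feature sets, and an output group of 25 targets with output_dim set to "auto" gets a width of 1024.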

adversarial_enabled: bool = True

adversarial_lambda: float = 0.5

batch_size: int | None = None

channel_exp_base: int = 3

expert_groups_file: str | None = None

classmethod from_yaml(path: str | Path) → CustomConfig

fusion_dim: int | None = None

fusion_model_type: str = 'mlp-residual-sum'

informed_moe_fusion_factor: int = 1

mgmoe_num_experts: int = 8

modelling_data_format: str = 'disk'

n_fusion_layers: int | None = None

n_output_layers: int | None = None

optimize_model: bool = False

output_dim: int | str | None = None

output_num_experts: int | None = None

skip_to_every_n_fusion_layers: int | None = None

to_dict() → dict[str, Any]

use_fc0_to_fusion_skips: bool = False

use_fc0_to_output_skips: bool = False

use_lcl_to_output_skips: bool | str = False

weighted_sampling: str = 'auto'