Custom Configuration
Advanced model and training parameters can be configured
via a YAML file passed with the --custom_config CLI argument.
When not specified, defaults are used.
Example usage:
eir_auto_gp_multi_task \
--genotype_data_path data/ \
--label_file_path data/labels.csv \
--global_output_folder runs/my_run \
--output_con_columns trait_a trait_b \
--custom_config my_config.yaml
Example YAML file:
use_fc0_to_output_skips: true
use_lcl_to_output_skips: false
use_fc0_to_fusion_skips: true
fusion_model_type: mlp-residual-sum
batch_size: 64
optimize_model: true
Reference
- class eir_auto_gp.multi_task.custom_config.CustomConfig(
      use_lcl_to_output_skips: bool | str = False,
      use_fc0_to_output_skips: bool = False,
      use_fc0_to_fusion_skips: bool = False,
      weighted_sampling: str = 'auto',
      optimize_model: bool = False,
      modelling_data_format: str = 'disk',
      n_fusion_layers: int | None = None,
      fusion_dim: int | None = None,
      skip_to_every_n_fusion_layers: int | None = None,
      n_output_layers: int | None = None,
      output_dim: int | str | None = None,
      batch_size: int | None = None,
      fusion_model_type: str = 'mlp-residual-sum',
      mgmoe_num_experts: int = 8,
      output_num_experts: int | None = None,
      expert_groups_file: str | None = None,
      informed_moe_fusion_factor: int = 1,
      adversarial_enabled: bool = True,
      adversarial_lambda: float = 0.5,
      channel_exp_base: int = 3,
  )
Advanced configuration for model architecture and training.
These parameters can be set via a YAML file passed with
--custom_config. When not specified, defaults are used.
Parameters:
- use_lcl_to_output_skips – Controls LCL block skip connections to output heads. When True, fc_1 and fc_2 intermediate features are cached and sent to output heads alongside fc_0_output. When "fc_1_only", only fc_1 is used (more parameter-efficient). When False, output heads receive only fc_0_output.
- weighted_sampling – Controls weighted sampling during training. "auto" enables it only when there are categorical targets but no continuous targets. "true"/"false" force it on/off.
- optimize_model – Enables model optimizations including torch.compile and mixed precision (bf16) training when supported by hardware.
- modelling_data_format – Storage format for data during modelling. "disk" reads from disk (lower memory), "memory" loads all data into RAM (faster), "auto" decides based on dataset size.
- n_fusion_layers – Number of fusion layers. When set, all granular architecture parameters (fusion_dim, skip_to_every_n_fusion_layers, n_output_layers, output_dim) must also be specified. Cannot be combined with model_size.
- fusion_dim – Dimension of fusion layers.
- skip_to_every_n_fusion_layers – Tensor broker skip connection frequency to fusion layers.
- n_output_layers – Number of layers in shared MLP residual output heads.
- output_dim – Dimension of shared MLP residual output head layers. When "auto", scales per output group based on number of targets: ≤20 targets → 512, >20 targets → 1024.
- batch_size – Training batch size. When None, automatically determined based on dataset size.
- use_fc0_to_output_skips – When True, the fc_0 layer output is cached and sent via tensor broker to output heads. For informed MoE, routes each expert’s fc_0 to its corresponding output group.
- use_fc0_to_fusion_skips – When True, the fc_0 layer output is cached and sent via tensor broker to fusion layers. For informed MoE, distributes each expert’s fc_0 round-robin across fusion layers.
- fusion_model_type – Fusion module architecture type. "mlp-residual-sum" uses a standard MLP-residual fusion. "mgmoe" uses Multi-Gate Mixture of Experts fusion.
- mgmoe_num_experts – Number of experts when fusion_model_type is "mgmoe". Ignored for other fusion model types.
- output_num_experts – If set, splits the shared branch in shared_mlp_residual output heads into this many expert sub-branches (each with output_dim // num_experts width). Each target learns a static gating weight over the experts. Only used when output groups are enabled (i.e. shared_mlp_residual output head). If None, uses a single shared branch.
- adversarial_enabled – Enables adversarial disentanglement training when tabular inputs and output groups are both present. The adversarial head encourages the genotype encoder to learn features that are independent of tabular covariates.
- adversarial_lambda – Weight of the adversarial loss term. Higher values enforce stronger disentanglement between genotype and tabular features.
- channel_exp_base – Base exponent for the number of channels in the genome-local-net. The number of channel feature sets is 2**channel_exp_base.
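Several of the parameter descriptions above state small rules that can be written out directly. The following is a minimal sketch of those rules as described in the text, not the library's implementation; every function name here is hypothetical, and the target counts passed in are assumed to be known by the caller.

```python
from typing import Optional, Union


def resolve_output_dim(
    output_dim: Union[int, str, None], n_targets_in_group: int
) -> Optional[int]:
    """Apply the documented "auto" rule: <=20 targets -> 512, >20 -> 1024."""
    if output_dim == "auto":
        return 512 if n_targets_in_group <= 20 else 1024
    if isinstance(output_dim, str):
        raise ValueError(f"Unsupported output_dim string: {output_dim!r}")
    return output_dim  # explicit int, or None when unset


def resolve_weighted_sampling(
    setting: str, n_cat_targets: int, n_con_targets: int
) -> bool:
    """Map "auto"/"true"/"false" to an on/off decision."""
    if setting == "auto":
        # Enabled only with categorical targets and no continuous targets.
        return n_cat_targets > 0 and n_con_targets == 0
    if setting in ("true", "false"):
        return setting == "true"
    raise ValueError(f"Unsupported weighted_sampling value: {setting!r}")


# Parameters that must accompany n_fusion_layers, per its description.
GRANULAR_PARAMS = (
    "fusion_dim",
    "skip_to_every_n_fusion_layers",
    "n_output_layers",
    "output_dim",
)


def missing_granular_params(config: dict) -> list:
    """List granular parameters still unset when n_fusion_layers is given."""
    if config.get("n_fusion_layers") is None:
        return []
    return [name for name in GRANULAR_PARAMS if config.get(name) is None]


def n_channel_feature_sets(channel_exp_base: int) -> int:
    """Number of channel feature sets: 2**channel_exp_base."""
    return 2 ** channel_exp_base


def expert_branch_width(output_dim: int, output_num_experts: int) -> int:
    """Width of each expert sub-branch: output_dim // num_experts."""
    return output_dim // output_num_experts
```

For example, with the default `channel_exp_base` of 3, `n_channel_feature_sets(3)` gives 8, and an `output_dim` of 512 split across 4 output experts gives sub-branches of width 128.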
- adversarial_enabled: bool = True
- adversarial_lambda: float = 0.5
- batch_size: int | None = None
- channel_exp_base: int = 3
- expert_groups_file: str | None = None
- classmethod from_yaml(path: str | Path) -> CustomConfig
- fusion_dim: int | None = None
- fusion_model_type: str = 'mlp-residual-sum'
- informed_moe_fusion_factor: int = 1
- mgmoe_num_experts: int = 8
- modelling_data_format: str = 'disk'
- n_fusion_layers: int | None = None
- n_output_layers: int | None = None
- optimize_model: bool = False
- output_dim: int | str | None = None
- output_num_experts: int | None = None
- skip_to_every_n_fusion_layers: int | None = None
- to_dict() -> dict[str, Any]
- use_fc0_to_fusion_skips: bool = False
- use_fc0_to_output_skips: bool = False
- use_lcl_to_output_skips: bool | str = False
- weighted_sampling: str = 'auto'
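To illustrate how unspecified fields fall back to the defaults listed above, here is a small stand-in dataclass. It is a sketch only: `CustomConfigSketch` and its `from_dict` method are hypothetical, it carries just a subset of the documented fields, and it assumes the real `from_yaml` behaves similarly after parsing the file, which this page does not confirm.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class CustomConfigSketch:
    """Stand-in with a few fields and defaults copied from the reference."""

    weighted_sampling: str = "auto"
    optimize_model: bool = False
    modelling_data_format: str = "disk"
    batch_size: Optional[int] = None
    fusion_model_type: str = "mlp-residual-sum"
    mgmoe_num_experts: int = 8

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "CustomConfigSketch":
        # Keys present in data override defaults; all others keep them.
        # An unknown key raises TypeError from the generated __init__.
        return cls(**data)

    def to_dict(self) -> dict[str, Any]:
        return asdict(self)


# Only two fields are set; the remaining four keep their defaults.
cfg = CustomConfigSketch.from_dict({"batch_size": 64, "optimize_model": True})
as_dict = cfg.to_dict()
```

With the real class, the equivalent calls would be `CustomConfig.from_yaml("my_config.yaml")` followed by `to_dict()`, both of which appear in the reference above.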