Skip to content

FLTest

Configuration reference

Configuration reference¶

An experiment is a single test_conf.yaml. Any top-level FL knob may be a scalar or a list — a list marks it for fuzzing. The runs: block lists the frameworks to execute (this is what enables cross-framework differential testing).

Full example¶

name: my_eval                    # report name
device: cpu                      # cpu | mps | cuda
seed: 786
deterministic: true

# ---- data ----
dataset: [mnist, cifar10]        # list => fuzzed
data_distribution: [iid, dirichlet]   # iid | dirichlet | pathological
dirichlet_alpha: 0.5             # lower = more non-IID (only for dirichlet)
classes_per_partition: 2         # only for pathological
dataset_partitions: 100          # how finely to shard before taking num_clients shards

# ---- model ----
model_name: LeNet                # LeNet | ConvNet | MLP

# ---- FL parameters ----
num_clients: 10
num_rounds: 10
client_epochs: 1
client_lr: 0.01
client_batch_size: 32
server_batch_size: 256
max_test_data_size: 2048
optimizer: SGD                   # SGD | Adam
loss_fn: CrossEntropyLoss

# ---- plugins ----
attacks:
  - {name: backdoor, params: {target_label: 0, infection_rate: 0.3}, target_clients: [0, 1]}
defenses:
  - {name: median}
metrics: [accuracy, loss, per_client]

# ---- which frameworks to run ----
runs:
  - {framework: reference, name: reference}
  - {framework: flwr,      name: flower}
  - {framework: nvflare,   name: nvflare}

# ---- what to assert ----
testing:
  differential: {enabled: true, mode: cross_framework, metric: accuracy, tolerance: 0.05}
  metamorphic:
    - {relation: clients_scale, values: [10, 20], metric: accuracy, tolerance: 0.05}

Knob reference¶

Reproducibility / hardware¶

Key	Default	Notes
`seed`	786	seeds python/numpy/torch
`device`	`cpu`	`cpu` is deterministic; `mps`/`cuda` for speed
`deterministic`	`true`	load cached identical initial weights for all clients/frameworks
`total_cpus` / `total_gpus`	4 / 0	Flower/Ray resource pool

Data¶

Key	Default	Notes
`dataset`	`mnist`	see Datasets
`data_distribution`	`iid`	`iid`, `dirichlet` (label skew), `pathological` (N classes/client)
`dirichlet_alpha`	0.5	only used when distribution is `dirichlet`; lower ⇒ more heterogeneous
`classes_per_partition`	2	only used when distribution is `pathological`
`dataset_partitions`	100	the dataset is split into this many shards; the first `num_clients` are used. Keep it fixed while sweeping `num_clients` so per-client data size is comparable.
`max_test_data_size`	2048	size of the central test subset (keeps eval fast)

Model¶

Key	Default	Options
`model_name`	`LeNet`	`LeNet` (32×32 conv), `ConvNet` (smooth activations, for DLG), `MLP` (fast)

FL parameters¶

Key	Default	Notes
`num_clients`	10	participants per round (all participate; `fraction_fit=1`)
`num_rounds`	10	global aggregation rounds
`client_epochs`	1	local epochs per round
`client_lr`	0.01	local learning rate
`client_batch_size`	32	local batch size
`optimizer`	`SGD`	`SGD`, `Adam`

Plugins¶

attacks: / defenses: — lists of {name, params, target_clients?}. See Attacks and Defenses for each plugin's parameters. target_clients (attacks only) restricts the attack to those client ids; omit for all.
metrics: — list of metric-listener names. accuracy and loss are always produced; add per_client for personalized evaluation. See Metrics.

`runs:`¶

A list of {framework, name?, ...overrides}. One entry per framework you want to execute. A run entry may also override any top-level knob and may carry its own attacks/defenses/ metrics. Frameworks: reference, flwr/flower, nvflare/flare.

`testing:`¶

differential: — see Differential testing.
metamorphic: — a list of relations; see Metamorphic testing.

Where parameters come from in code¶

Defaults and validation live in fltest/core/config.py (TestConfig and RunSpec). The fuzzable knob list is FUZZABLE_KNOBS in the same file.