# ABX-based evaluation

ABX is used to evaluate the quality of the obtained discrete units.

The life cycle of the ABX-based evaluation for the Speech-to-Unit contains the following steps:
1. Train an acoustic model (or use an existing acoustic model) ([description](./../..))
2. Quantize speech by learning a K-means clustering model ([description](./../..))
3. Compute discrete features for ABX computation using the learned clusters
4. Compute the ABX score over the discrete features, taking advantage of [libri-light's ABX evaluation script][ll-abx]

Here we assume that you have already gone through the first two steps and focus solely on extracting features and computing ABX scores.

## Libri-light setup

Follow [libri-light's instructions][ll-instructions] for installation and [ABX evaluation setup][ll-abx] (including the download of the data items required for ABX computation).

## Computing ABX

### Dumping quantized features

The first step for the ABX computation is to dump the quantized representations corresponding to the test files.

```shell
TYPE="hubert"
LAYER=6
CKPT_PATH=""
KM_MODEL_PATH=""
SUBSET="dev-clean"
MANIFEST=""
DATA_DIR="/$SUBSET"

PYTHONPATH=. python examples/textless_nlp/gslm/metrics/abx_metrics/dump_abx_feats.py \
    --feature_type $TYPE \
    --kmeans_model_path $KM_MODEL_PATH \
    --checkpoint_path $CKPT_PATH \
    --layer $LAYER \
    --manifest_path $MANIFEST \
    --out_dir_path $DATA_DIR \
    --extension ".flac"
```

Again, the manifest file follows the same structure as elsewhere in the codebase (see the sketch at the end of this page).

### Compute ABX with Libri-light

Use libri-light's `eval_ABX.py` script (with the appropriate environment set up) as follows:

```shell
LIBRILIGHT_ROOT=""
SUBSET="dev-clean"
DATA_DIR="/$SUBSET"
ITEM_FILE_PATH="$LIBRILIGHT_ROOT/eval/ABX_data/$SUBSET.item"
OUT_DIR="/$SUBSET"
FILE_EXTENSION=".npy"
FEATURE_SIZE=0.02 # depends on the model used

PYTHONPATH=$LIBRILIGHT_ROOT \
    python $LIBRILIGHT_ROOT/eval/eval_ABX.py \
        $DATA_DIR \
        $ITEM_FILE_PATH \
        --file_extension $FILE_EXTENSION \
        --feature_size $FEATURE_SIZE \
        --out $OUT_DIR \
        --mode "all"
```

Note that `FEATURE_SIZE` depends on the model type used to extract the acoustic features:
* For HuBERT and Wav2Vec2.0, use `FEATURE_SIZE=0.02`
* For CPC and Log Mel, use `FEATURE_SIZE=0.01`

If you have a GPU available, make sure you add the `--cuda` flag for faster computation.

[ll-instructions]: https://github.com/facebookresearch/libri-light
[ll-abx]: https://github.com/facebookresearch/libri-light/tree/master/eval#abx
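
For reference, the manifest mentioned above is the standard fairseq wav2vec-style `.tsv` file: an absolute root directory on the first line, followed by one tab-separated `relative_path<TAB>num_samples` entry per audio file. Below is a minimal sketch of generating it for LibriSpeech dev-clean, assuming fairseq's `examples/wav2vec/wav2vec_manifest.py` helper is available; all paths are illustrative.

```shell
# Sketch: build a manifest for the dev-clean .flac files (paths are illustrative).
# With --valid-percent 0, the helper writes a single train.tsv manifest in --dest;
# point MANIFEST at that file when running dump_abx_feats.py.
SUBSET="dev-clean"
LIBRISPEECH_ROOT="/path/to/LibriSpeech"   # assumption: local LibriSpeech download
MANIFEST_DIR="/path/to/manifests/$SUBSET" # assumption: any writable directory

python examples/wav2vec/wav2vec_manifest.py \
    "$LIBRISPEECH_ROOT/$SUBSET" \
    --dest "$MANIFEST_DIR" \
    --ext flac \
    --valid-percent 0

# The resulting train.tsv looks like:
#   /path/to/LibriSpeech/dev-clean
#   speaker/chapter/utt1.flac<TAB><num_samples>
#   speaker/chapter/utt2.flac<TAB><num_samples>
#   ...
```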
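
As a quick sanity check on `FEATURE_SIZE`, the number of frames in a dumped `.npy` file should be roughly the utterance duration divided by the frame step (20 ms for HuBERT/Wav2Vec2.0, 10 ms for CPC/Log Mel). The sketch below assumes `numpy` and `soundfile` are installed; file names are illustrative.

```shell
# Sketch: compare the dumped frame count of one utterance with the expected
# value (duration in seconds / FEATURE_SIZE). Paths below are illustrative.
FEATURE_SIZE=0.02  # 0.02 for HuBERT / Wav2Vec2.0, 0.01 for CPC / Log Mel

python -c '
import sys
import numpy as np
import soundfile as sf

feats = np.load(sys.argv[1])      # quantized features dumped above
audio, sr = sf.read(sys.argv[2])  # the matching .flac utterance
expected = len(audio) / sr / float(sys.argv[3])
print(f"dumped frames: {feats.shape[0]}, expected ~{expected:.0f}")
' "$DATA_DIR/utt1.npy" "/path/to/LibriSpeech/dev-clean/speaker/chapter/utt1.flac" "$FEATURE_SIZE"
```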