Model Details
This Sentence-BERT model maps sentences and paragraphs to a 768-dimensional dense vector space. It was fine-tuned for semantic search from the multi-qa-mpnet-base-cos-v1 base model on 2,917 question-answer pairs observed during the Question Period in the Canadian House of Commons from the 39th to the 43rd legislatures. The model can be used to evaluate the quality of responses in political Q&A sessions, including parliamentary questions.
- Developed by: R. Michael Alvarez and Jacob Morrier
- Model Type: Sentence-BERT
- Language: English
- License: MIT
- Fine-tuned from: multi-qa-mpnet-base-cos-v1
Uses
The model identifies the most relevant answer to a question and evaluates the quality of responses in political Q&A sessions.
Bias, Risks, and Limitations
Our article discusses the model’s biases, risks, and limitations, along with its application in evaluating the quality of responses in political Q&A settings. In particular, we emphasize the need for caution when applying the model outside the original context of the Question Period, due to potential domain drift.
How to Get Started with the Model
Inference with this model is straightforward using the sentence-transformers library. You can use the following code to compute the cosine similarity between questions and answers:
```python
from sentence_transformers import SentenceTransformer, util

# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer('jacobmorrier/political-answer-quality')

# questions and answers are parallel lists of strings
questions_emb = model.encode(questions)
answers_emb = model.encode(answers)

# Pairwise cosine similarities: rows index questions, columns index answers
cos_sim = util.cos_sim(questions_emb, answers_emb).cpu()
```
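The resulting similarity matrix can then be used to rank candidate answers for each question. A minimal sketch with a toy matrix (the values below are illustrative placeholders, not model output):

```python
import torch

# Toy cosine-similarity matrix: rows are questions, columns are candidate
# answers. In practice this would be the output of util.cos_sim above.
cos_sim = torch.tensor([
    [0.82, 0.15, 0.40],
    [0.10, 0.77, 0.33],
])

# For each question, rank candidate answers from most to least similar
ranked = torch.argsort(cos_sim, dim=1, descending=True)

# The top-ranked answer index for each question
best = ranked[:, 0]
print(best.tolist())  # [0, 1]
```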
Training Details
Training Data
The training data consists of 2,917 question-answer pairs from the Question Period in the Canadian House of Commons collected between the 39th and 43rd legislatures, spanning fifteen years from the January 23, 2006, election to the September 20, 2021, election.
Training Hyperparameters
| Parameter | Value |
|---|---|
| Loss Function | Multiple Negatives Ranking Loss (with questions as anchors) |
| Epochs | 10 |
| Batch Size | 8 |
| Optimizer | AdamW |
| Learning Rate | 2e-5 |
| Learning Rate Scheduler | Warm-up Linear |
| Warm-up Steps | 10,000 |
| Weight Decay | 0.01 |
| Maximum Gradient Norm | 1 |
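Multiple Negatives Ranking Loss treats each question as an anchor whose paired answer is the positive, while the other answers in the same batch serve as in-batch negatives. A minimal from-scratch sketch of that computation (the scale factor of 20 mirrors the sentence-transformers default; the embeddings here are random placeholders standing in for model outputs):

```python
import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(q_emb, a_emb, scale=20.0):
    """Cross-entropy over in-batch cosine similarities: each question
    (anchor) should score highest with its own answer (the diagonal);
    every other answer in the batch acts as a negative."""
    q = F.normalize(q_emb, dim=1)
    a = F.normalize(a_emb, dim=1)
    scores = scale * q @ a.T             # (batch, batch) similarity matrix
    labels = torch.arange(len(scores))   # correct answer sits on the diagonal
    return F.cross_entropy(scores, labels)

# Placeholder embeddings: batch size 8 (as in the table above), 768 dimensions
torch.manual_seed(0)
q_emb = torch.randn(8, 768)
a_emb = torch.randn(8, 768)
loss = multiple_negatives_ranking_loss(q_emb, a_emb)
```

When questions and answers embed close together (e.g. identical embeddings), the loss approaches zero; random pairings yield a loss near log(batch size).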
Citation
Alvarez, R. Michael and Jacob Morrier (2025). Measuring the Quality of Answers in Political Q&As with Large Language Models. https://doi.org/10.48550/arXiv.2404.08816