Model Card for fvossel/t5-base-nl-to-fol

This model is a fully fine-tuned version of google-t5/t5-base. It was trained to translate natural language statements into First-Order Logic (FOL) representations.

Model Details

Model Description

  • Developed by: Vossel et al. at Osnabrück University
  • Funded by: Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project number 456666331
  • Model type: Encoder-decoder sequence-to-sequence model (T5 architecture)
  • Language(s) (NLP): English, FOL
  • License: Apache 2.0. The base model, google-t5/t5-base, is likewise released under the Apache 2.0 License.
  • Finetuned from model: google-t5/t5-base

Uses

Direct Use

This model is designed to translate natural language (NL) sentences into corresponding first-order logic (FOL) expressions. Use cases include:

  • Automated semantic parsing and formalization of NL statements into symbolic logic.
  • Supporting explainable AI systems that require symbolic reasoning based on language input.
  • Research in neurosymbolic AI, logic-based natural language understanding, and formal verification.
  • Integration into pipelines for natural language inference, question answering, or knowledge base population.

Depending on the application, users should verify the generated formulas for syntactic well-formedness and semantic correctness before relying on them.
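
For example, a lightweight syntactic check can filter out obviously malformed outputs before they reach downstream reasoning. The function below is a minimal illustrative heuristic, not part of the model release:

import re

def looks_like_fol(formula: str) -> bool:
    """Heuristic well-formedness check for generated FOL strings (illustrative only)."""
    # Parentheses must be balanced.
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    if depth != 0:
        return False
    # Expect at least one quantifier or predicate-style token.
    return bool(re.search(r"∀|∃|FORALL|EXISTS|\w+\(", formula))

print(looks_like_fol("∀x (Dog(x) → Animal(x)))"))  # False: unbalanced parentheses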

Downstream Use

This model can be further fine-tuned or adapted for domain-specific formalization tasks (e.g., legal or biomedical text). It is also suitable for integration into interactive systems that require formal reasoning.
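
As a starting point for such adaptation, domain-specific NL-FOL pairs can be tokenized with the same prompt prefix this model uses. The snippet below is a minimal sketch; the CSV file and its "nl"/"fol" column names are hypothetical placeholders.

from datasets import load_dataset
from transformers import T5Tokenizer

PREFIX = "translate English natural language statements into first-order logic (FOL): "

tokenizer = T5Tokenizer.from_pretrained("fvossel/t5-base-nl-to-fol")

# Hypothetical CSV with "nl" and "fol" columns containing domain-specific pairs.
dataset = load_dataset("csv", data_files={"train": "domain_nl_fol.csv"})

def preprocess(batch):
    # Prepend the task prompt to the natural language inputs.
    model_inputs = tokenizer(
        [PREFIX + s for s in batch["nl"]],
        max_length=250,
        truncation=True,
    )
    # Tokenize the target FOL formulas as labels.
    labels = tokenizer(
        text_target=batch["fol"],
        max_length=250,
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)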

Out-of-Scope Use

  • Not designed for general natural language generation.
  • May struggle with ambiguous, highly figurative, or out-of-domain input.
  • Outputs should not be used as final decisions in critical areas without expert review.

Recommendations

  • Validate outputs carefully before use in critical applications.
  • Be aware of possible biases from training data and synthetic data sources.
  • Specialized for English NL and FOL; may not generalize to other languages or logics.
  • Use human-in-the-loop workflows for sensitive tasks.
  • Intended for research and prototyping, not standalone critical systems.

How to Get Started with the Model

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load tokenizer and model
model_path = "fvossel/t5-base-nl-to-fol"
tokenizer = T5Tokenizer.from_pretrained(model_path)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Example NL input
nl_input = "All dogs are animals."

# Prepend the task prompt used during fine-tuning
input_text = "translate English natural language statements into first-order logic (FOL): " + nl_input
inputs = tokenizer(input_text, return_tensors="pt", padding=True).to(device)

# Generate prediction with beam search
with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=256,
        min_length=1,
        num_beams=5,
        length_penalty=2.0,
        early_stopping=True,
    )

# Decode and print the predicted FOL formula
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
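
For the example above, the expected output is a formula along the lines of ∀x (Dog(x) → Animal(x)); the exact surface form (logical symbols versus plain-text tokens such as FORALL) depends on the output normalization described under Training Details.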

Training Details

Training Data

The model was fine-tuned on two datasets:

  • WillowNLtoFOL: Contains over 16,000 NL-FOL pairs. Published in:
    Deveci, İ. E. (2024). Transformer models for translating natural language sentences into formal logical expressions.
    Licensed under CC BY-NC-ND 4.0.

  • MALLS-v0: 34,000 NL-FOL pairs generated by GPT-4 and syntactically checked.
    Licensed under CC BY-NC 4.0 and subject to OpenAI's terms of use.

Training Procedure

The model was fully fine-tuned (no LoRA) from google-t5/t5-base with:

  • Prompt-based instruction tuning
  • Single-GPU training with float32 precision
  • Preprocessing replaced FOL quantifier symbols (e.g., ∀) with plain-text tokens such as FORALL (see the sketch after this list)
  • Maximum input/output sequence length was 250 tokens
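
Because of this normalization, generated formulas may contain plain-text operator tokens rather than logical symbols. The following sketch shows one possible postprocessing step; only FORALL is documented above, so the remaining token names are illustrative assumptions.

# Hypothetical token-to-symbol map; only FORALL is confirmed by the training notes,
# the other entries are illustrative assumptions.
TOKEN_TO_SYMBOL = {
    "FORALL": "∀",
    "EXISTS": "∃",
    "AND": "∧",
    "OR": "∨",
    "NOT": "¬",
    "IMPLIES": "→",
}

def restore_symbols(formula: str) -> str:
    """Map plain-text operator tokens back to FOL symbols."""
    for token, symbol in TOKEN_TO_SYMBOL.items():
        formula = formula.replace(token, symbol)
    return formula

print(restore_symbols("FORALL x (Dog(x) IMPLIES Animal(x))"))
# ∀ x (Dog(x) → Animal(x))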

Training Hyperparameters

  • Training regime: float32 precision
  • Batch size: 8 (per device)
  • Learning rate: 0.001
  • Number of epochs: 12
  • Optimizer: AdamW
  • Adam epsilon: 1e-8
  • Scheduler: Linear warmup with 500 steps
  • Gradient accumulation steps: 1
  • Weight decay: 0.01
  • LoRA: Not used (full fine-tuning)
  • Task type: SEQ_2_SEQ_LM
  • Early stopping patience: 4 epochs
  • Evaluation strategy: per epoch
  • Save strategy: per epoch
  • Save total limit: 5 checkpoints
  • Best model selection metric: eval_loss
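
For reference, these settings correspond roughly to a Hugging Face Seq2SeqTrainer configuration. The sketch below is an approximate reconstruction under that assumption, not the original training script; the output path and the tokenized dataset variables are hypothetical placeholders (see the preprocessing sketch under Downstream Use).

from transformers import (
    T5ForConditionalGeneration,
    T5Tokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
    EarlyStoppingCallback,
)

base_model = "google-t5/t5-base"
tokenizer = T5Tokenizer.from_pretrained(base_model)
model = T5ForConditionalGeneration.from_pretrained(base_model)

# Approximate mapping of the hyperparameters listed above; not the original training script.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-nl-to-fol",            # hypothetical output path
    per_device_train_batch_size=8,
    learning_rate=1e-3,
    num_train_epochs=12,
    adam_epsilon=1e-8,                         # AdamW is the Trainer default optimizer
    lr_scheduler_type="linear",
    warmup_steps=500,
    gradient_accumulation_steps=1,
    weight_decay=0.01,
    eval_strategy="epoch",                     # "evaluation_strategy" in older transformers releases
    save_strategy="epoch",
    save_total_limit=5,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,             # hypothetical tokenized NL-FOL training split
    eval_dataset=tokenized_val,                # hypothetical tokenized validation split
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=4)],
)
# trainer.train()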