Model Card for CDT-Domain-Tagger

This model is a component of the Cognition-Domain-Task (CDT) framework, a comprehensive capability framework for Large Language Models presented in our paper "CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task". It is fine-tuned to classify a given instruction into one of the 33 domains defined by the CDT framework.

Model Details

Model Description

This model categorizes any given instruction into one of 33 predefined knowledge domains, pinpointing the subject area of the request.

Model type: Qwen2ForCausalLM
Language(s) (NLP): English
License: Apache 2.0
Finetuned from model: Qwen/Qwen2.5-7B (base model)

Model Sources

Repository: https://github.com/Alessa-mo/CDT
Paper Link: https://arxiv.org/abs/2509.24422

Basic Usage

Please refer to https://github.com/Alessa-mo/CDT for full instructions. You can run the following script to tag instructions with domain labels.

cd tag_annotate
export CUDA_VISIBLE_DEVICES=0
python annotate.py \
    --data_path path/to/your/data \
    --output_dir path/to/output/dir \
    --model_path CDT-Domain-Tagger \
    --prompt_file ./prompt/annotation_prompt.jsonl \
    --cognition_skill_file ./prompt/cognition.json \
    --domain_skill_file ./prompt/domain.json \
    --task_skill_file ./prompt/task.json \
    --tag_type "Domain" \
    --batch_size 32

Note: Make sure your data is a JSON file and has the following format:

[
    {
        "messages": [
            {
                "role": "user",
                "content": "xxxx"
            },
            {
                "role": "assistant",
                "content": "xxxx"
            }
        ]
    }
]
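
The input format above can be produced and checked with a short script before running annotate.py. The following is a minimal sketch, not part of the CDT repository: the helper names to_record, validate_records, and write_dataset are hypothetical, and the validation only checks the layout shown above (a list of records, each with a "messages" list whose entries have string "role" and "content" fields). Any further constraints that annotate.py may enforce are not covered.

```python
import json
from pathlib import Path


def to_record(instruction: str, response: str) -> dict:
    """Wrap one instruction/response pair in the "messages" layout (hypothetical helper)."""
    return {
        "messages": [
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": response},
        ]
    }


def validate_records(records) -> None:
    """Raise ValueError if the data does not match the layout shown in this card."""
    if not isinstance(records, list):
        raise ValueError("top-level JSON value must be a list")
    for i, record in enumerate(records):
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            raise ValueError(f"record {i}: missing or empty 'messages' list")
        for j, message in enumerate(messages):
            if not isinstance(message, dict):
                raise ValueError(f"record {i}, message {j}: not an object")
            for key in ("role", "content"):
                if not isinstance(message.get(key), str):
                    raise ValueError(f"record {i}, message {j}: '{key}' must be a string")


def write_dataset(pairs, path) -> list:
    """Convert (instruction, response) pairs to records, validate them, and write JSON."""
    records = [to_record(instruction, response) for instruction, response in pairs]
    validate_records(records)
    text = json.dumps(records, ensure_ascii=False, indent=4)
    Path(path).write_text(text, encoding="utf-8")
    return records
```

For example, write_dataset([("What is the boiling point of water?", "100 degrees Celsius at sea level.")], "data.json") writes a one-record file whose path can then be passed to --data_path.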

Citation

If you find this model useful, please cite:

@misc{mo2025cdtcomprehensivecapabilityframework,
      title={CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task}, 
      author={Haosi Mo and Xinyu Ma and Xuebo Liu and Derek F. Wong and Yu Li and Jie Liu and Min Zhang},
      year={2025},
      eprint={2509.24422},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.24422}, 
}
