Model Card for CDT-Domain-Tagger

This model is a component of the Cognition-Domain-Task (CDT) framework, a comprehensive capability framework for Large Language Models presented in our paper "CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task". It has been fine-tuned to classify a given instruction into one of the 33 domains defined by the CDT framework.

Model Details

Model Description

This model categorizes any given instruction into one of 33 predefined knowledge domains, pinpointing the subject area of the request.

  • Model type: Qwen2ForCausalLM
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Finetuned from model: Qwen2.5-7B-Base

Model Sources

  • Repository: https://github.com/Alessa-mo/CDT
  • Paper: https://arxiv.org/abs/2509.24422

Basic Usage

Please refer to https://github.com/Alessa-mo/CDT for the full code. To tag instructions with domain labels, run the following script:

cd tag_annotate
export CUDA_VISIBLE_DEVICES=0
python annotate.py \
    --data_path path/to/your/data \
    --output_dir path/to/output/dir \
    --model_path CDT-Domain-Tagger \
    --prompt_file ./prompt/annotation_prompt.jsonl \
    --cognition_skill_file ./prompt/cognition.json \
    --domain_skill_file ./prompt/domain.json \
    --task_skill_file ./prompt/task.json \
    --tag_type "Domain" \
    --batch_size 32

Note: Make sure your data is a JSON file and has the following format:

[
    {
        "messages": [
            {
                "role": "user",
                "content": "xxxx"
            },
            {
                "role": "assistant",
                "content": "xxxx"
            }
        ]
    }
]
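
The layout above can be produced and sanity-checked with a few lines of Python before running the annotation script. This is a minimal sketch, not part of the CDT repository: the helper names `to_cdt_records` and `check_cdt_records` are hypothetical, and the checks only cover the structure shown above (a top-level array of objects, each with a "messages" list of role/content strings). Whether annotate.py enforces anything stricter is not stated here.

```python
import json


def to_cdt_records(pairs):
    """Wrap (instruction, response) pairs in the "messages" layout shown above.

    Hypothetical helper for illustration; not part of the CDT repository.
    """
    records = []
    for instruction, response in pairs:
        records.append({
            "messages": [
                {"role": "user", "content": instruction},
                {"role": "assistant", "content": response},
            ]
        })
    return records


def check_cdt_records(records):
    """Return a list of problems found; an empty list means the layout matches.

    Only the structure visible in the example is checked (assumption:
    annotate.py needs nothing beyond this).
    """
    if not isinstance(records, list):
        return ["top level must be a JSON array"]
    problems = []
    for i, record in enumerate(records):
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"record {i}: missing or empty 'messages' list")
            continue
        for j, message in enumerate(messages):
            well_formed = (
                isinstance(message, dict)
                and isinstance(message.get("role"), str)
                and isinstance(message.get("content"), str)
            )
            if not well_formed:
                problems.append(
                    f"record {i}, message {j}: needs string 'role' and 'content'"
                )
    return problems


if __name__ == "__main__":
    records = to_cdt_records([
        ("Explain how vaccines work.", "Vaccines train the immune system ..."),
    ])
    print(check_cdt_records(records))
    # Serialize without a trailing comma so the file is valid JSON.
    print(json.dumps(records, ensure_ascii=False, indent=4))
```

Writing the `json.dumps` output to a file gives a value for `--data_path`. Running the check first catches malformed records before a GPU job is started.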

Citation

If you find this model useful, please cite:

@misc{mo2025cdtcomprehensivecapabilityframework,
      title={CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task}, 
      author={Haosi Mo and Xinyu Ma and Xuebo Liu and Derek F. Wong and Yu Li and Jie Liu and Min Zhang},
      year={2025},
      eprint={2509.24422},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.24422}, 
}