🦉 CodeModernBERT-Owl v1.0: High-Accuracy Code Search & Code Understanding Model

CodeModernBERT-Owl v1.0 is a pretrained model designed from scratch for code search and code understanding tasks.

This model now supports Rust and improves search accuracy in Python, PHP, Java, JavaScript, Go, and Ruby.

🛠️ Key Features

Parameter	Value
vocab_size	50,004
hidden_size	768
num_hidden_layers	12
num_attention_heads	12
intermediate_size	3,072
max_position_embeddings	8,192 (trained with 2,048)
type_vocab_size	2
hidden_dropout_prob	0.1
attention_probs_dropout_prob	0.1
local_attention_window	128
rope_theta	160,000
local_attention_rope_theta	10,000
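
As a quick sanity check on the table above, the values can be collected in a small Python structure and a few derived quantities computed from them. This is only a sketch: the `OwlConfig` class and its property names are hypothetical and are not part of the released model or of any library; only the field values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OwlConfig:
    """Hypothetical container for the hyperparameters listed in the table."""
    vocab_size: int = 50_004
    hidden_size: int = 768
    num_hidden_layers: int = 12
    num_attention_heads: int = 12
    intermediate_size: int = 3_072
    max_position_embeddings: int = 8_192
    type_vocab_size: int = 2
    hidden_dropout_prob: float = 0.1
    attention_probs_dropout_prob: float = 0.1
    local_attention_window: int = 128
    rope_theta: float = 160_000.0
    local_attention_rope_theta: float = 10_000.0

    @property
    def head_dim(self) -> int:
        # Each attention head receives an equal slice of the hidden state.
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        return self.hidden_size // self.num_attention_heads

    @property
    def ffn_expansion(self) -> float:
        # Ratio between the feed-forward width and the hidden width.
        return self.intermediate_size / self.hidden_size

    @property
    def token_embedding_params(self) -> int:
        # Size of the token embedding matrix alone (vocab_size x hidden_size).
        return self.vocab_size * self.hidden_size


cfg = OwlConfig()
print(cfg.head_dim, cfg.ffn_expansion, cfg.token_embedding_params)
```

With the listed values, each head works on 64 dimensions, the feed-forward layer is 4 times the hidden width, and the token embedding matrix alone holds about 38.4 million parameters.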

Language	CodeModernBERT-Owl-1.0	CodeT5+	GraphCodeBERT	CodeBERTa-small	CodeBERT
Python	0.8936	0.8048	0.3496	0.6123	0.0927
Java	0.8479	0.7853	0.3299	0.4738	0.0816
JavaScript	0.7711	0.7111	0.2581	0.3593	0.0692
PHP	0.8056	0.7893	0.2507	0.4533	0.0623
Ruby	0.7993	0.7201	0.3186	0.4418	0.0762
Go	0.8426	0.7577	0.4453	0.5338	0.0856

✅ CodeModernBERT-Owl-1.0 (Mean Pooling) achieves the best MRR across all evaluated languages.
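
For readers unfamiliar with the two terms in that statement, the sketch below shows one common way to mean-pool token vectors into a single embedding and to compute MRR (mean reciprocal rank) for a retrieval task. It is an illustration under stated assumptions, not the evaluation script behind the table: the function names are hypothetical, query i is assumed to match code snippet i, similarity is assumed to be cosine, and the toy vectors stand in for real model outputs.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    return [t / max(count, 1) for t in total]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_reciprocal_rank(query_vecs, code_vecs):
    """MRR, assuming the correct match for query i is code snippet i."""
    total = 0.0
    for i, query in enumerate(query_vecs):
        scores = [cosine(query, code) for code in code_vecs]
        # Rank = 1 + number of candidates scored strictly higher than the
        # correct one (ties are resolved in favour of the correct snippet).
        rank = 1 + sum(1 for s in scores if s > scores[i])
        total += 1.0 / rank
    return total / len(query_vecs)


# Toy example: three token vectors, the last one is padding.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])

# Two queries and two code snippets; the second query ranks its match second.
codes = [[1.0, 0.0], [0.0, 1.0]]
queries = [[1.0, 0.0], [1.0, 0.5]]
mrr = mean_reciprocal_rank(queries, codes)
print(pooled, mrr)
```

In the toy example the first query ranks its match first and the second ranks it second, so the MRR is (1 + 1/2) / 2 = 0.75. An MRR of 1.0 would mean every query ranks its correct snippet first.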

📄 License: Apache-2.0

📩 For any questions, please contact: 📧 shun0212114@outlook.jp