arxiv:2512.11192
Pedro Ortiz Suarez
AI & ML interests
Language modeling, parsing, sequence tagging, NER, historical languages.
Recent Activity
Published a dataset about 8 hours ago: commoncrawl/CommonLID
Updated a dataset about 22 hours ago: commoncrawl/CommonLID
Authored a paper 12 days ago: SciLaD: A Large-Scale, Transparent, Reproducible Dataset for Natural Scientific Language Processing