---
license: apache-2.0
language:
- pl
base_model:
- sdadas/polish-roberta-large-v2
pipeline_tag: text-classification
library_name: transformers
tags:
- news
---
### Description
`polarity3c` is a classification model specialized in determining the polarity of texts from news portals. It was trained mostly on Polish texts.
<center><img src="https://cdn-uploads.huggingface.co/production/uploads/644addfe9279988e0cbc296b/v6pz2sBwc3GCPL1Il8wVP.png" width=20%></center>
Annotations from plWordNet were used as the basis for the data. A model pre-trained on these annotations served as the human-in-the-loop component,
supporting the annotation of the data used to train the final model. The final model was trained on web content that was manually collected and annotated.
The `sdadas/polish-roberta-large-v2` model with a classification head was used as the base. More about the model's construction can be found on our [blog](https://radlab.dev/2025/06/01/polaryzacja-3c-model-z-plg-na-hf/).
### Architecture
```
RobertaForSequenceClassification(
(roberta): RobertaModel(
(embeddings): RobertaEmbeddings(
(word_embeddings): Embedding(128001, 1024, padding_idx=1)
(position_embeddings): Embedding(514, 1024, padding_idx=1)
(token_type_embeddings): Embedding(1, 1024)
(LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): RobertaEncoder(
(layer): ModuleList(
(0-23): 24 x RobertaLayer(
(attention): RobertaAttention(
(self): RobertaSdpaSelfAttention(
(query): Linear(in_features=1024, out_features=1024, bias=True)
(key): Linear(in_features=1024, out_features=1024, bias=True)
(value): Linear(in_features=1024, out_features=1024, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): RobertaSelfOutput(
(dense): Linear(in_features=1024, out_features=1024, bias=True)
(LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): RobertaIntermediate(
(dense): Linear(in_features=1024, out_features=4096, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): RobertaOutput(
(dense): Linear(in_features=4096, out_features=1024, bias=True)
(LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
(classifier): RobertaClassificationHead(
(dense): Linear(in_features=1024, out_features=1024, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
(out_proj): Linear(in_features=1024, out_features=3, bias=True)
)
)
```
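This printout can be reproduced by loading the checkpoint and printing the module tree; a minimal sketch, assuming only the `radlab/polarity-3c` repository id used in the usage example below:
```python
from transformers import AutoModelForSequenceClassification

# Loading the checkpoint with the standard auto class; printing the model
# object emits the architecture listing shown above.
model = AutoModelForSequenceClassification.from_pretrained("radlab/polarity-3c")
print(model)
```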
### Usage
Example of use with the `transformers` pipeline:
```python
from transformers import pipeline

classifier = pipeline(model="radlab/polarity-3c", task="text-classification")
classifier("Text to classify")
```
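The pipeline can also be bypassed; below is a minimal sketch of direct inference with the tokenizer and model, where the explicit softmax and `id2label` lookup reproduce the label/score pairs the pipeline returns (the 512-token truncation follows from the 514-position embedding shown in the architecture above):
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("radlab/polarity-3c")
model = AutoModelForSequenceClassification.from_pretrained("radlab/polarity-3c")
model.eval()

# Truncate to the encoder's maximum input length.
inputs = tokenizer(
    "Text to classify", truncation=True, max_length=512, return_tensors="pt"
)
with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the three polarity classes; id2label maps class indices
# to the label names used by this checkpoint.
probs = torch.softmax(logits, dim=-1)[0]
for idx, p in enumerate(probs):
    print(model.config.id2label[idx], float(p))
```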
with sample data and `top_k=3`:
```python
classifier("""
Po upadku reżimu Asada w Syrii, mieszkańcy, borykający się z ubóstwem,
zaczęli tłumnie poszukiwać skarbów, zachęceni legendami o zakopanych
bogactwach i dostępnością wykrywaczy metali, które stały się popularnym
towarem. Mimo, że działalność ta jest nielegalna, rząd przymyka oko,
a sprzedawcy oferują urządzenia nawet dla dzieci. Poszukiwacze skupiają
się na obszarach historycznych, wierząc w legendy o skarbach ukrytych
przez starożytne cywilizacje i wojska osmańskie, choć eksperci ostrzegają
przed fałszywymi monetami i kradzieżą artefaktów z muzeów.""",
top_k=3
)
```
the output is:
```
[{'label': 'ambivalent', 'score': 0.9995126724243164},
{'label': 'negative', 'score': 0.00024663121439516544},
{'label': 'positive', 'score': 0.00024063512682914734}]
```
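For larger jobs the same pipeline accepts a list of texts and batches them internally; a small sketch with hypothetical example inputs (the `batch_size` value is an arbitrary choice):
```python
# Hypothetical example inputs; any list of Polish news texts works.
texts = [
    "Pierwszy tekst do klasyfikacji.",
    "Drugi tekst do klasyfikacji.",
]

# truncation=True guards against inputs longer than the 512-token limit.
results = classifier(texts, batch_size=8, truncation=True)
for text, result in zip(texts, results):
    print(result["label"], round(result["score"], 3), "-", text)
```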