---
license: apache-2.0
language:
- pl
base_model:
- sdadas/polish-roberta-large-v2
pipeline_tag: text-classification
library_name: transformers
tags:
- news
---


### Description
`polarity3c` is a classification model specialized in determining the polarity of texts from news portals. It was trained mostly on Polish texts.

<center><img src="https://cdn-uploads.huggingface.co/production/uploads/644addfe9279988e0cbc296b/v6pz2sBwc3GCPL1Il8wVP.png" width=20%></center>

Annotations from plWordNet were used as the basis for the data. A model pre-trained on these annotations served as the assistant in a human-in-the-loop setup,
supporting the annotation of the training data for the final model. The final model was trained on web content that was manually collected and annotated.

As the base, the `sdadas/polish-roberta-large-v2` model was used with a classification head on top. More about the model's construction can be found on our [blog](https://radlab.dev/2025/06/01/polaryzacja-3c-model-z-plg-na-hf/).

### Architecture
```
RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(128001, 1024, padding_idx=1)
      (position_embeddings): Embedding(514, 1024, padding_idx=1)
      (token_type_embeddings): Embedding(1, 1024)
      (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0-23): 24 x RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSdpaSelfAttention(
              (query): Linear(in_features=1024, out_features=1024, bias=True)
              (key): Linear(in_features=1024, out_features=1024, bias=True)
              (value): Linear(in_features=1024, out_features=1024, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=1024, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): RobertaIntermediate(
            (dense): Linear(in_features=1024, out_features=4096, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): RobertaOutput(
            (dense): Linear(in_features=4096, out_features=1024, bias=True)
            (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
  )
  (classifier): RobertaClassificationHead(
    (dense): Linear(in_features=1024, out_features=1024, bias=True)
    (dropout): Dropout(p=0.1, inplace=False)
    (out_proj): Linear(in_features=1024, out_features=3, bias=True)
  )
)
```
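
The printout above can be reproduced by loading the checkpoint and printing the module tree; a minimal sketch using the standard `transformers` auto classes:

```python
from transformers import AutoModelForSequenceClassification

# Loading the checkpoint materializes the RoBERTa encoder and the
# three-way classification head; printing the module shows the tree above.
model = AutoModelForSequenceClassification.from_pretrained("radlab/polarity-3c")
print(model)
```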

### Usage
Example of use with the `transformers` pipeline:
```python
from transformers import pipeline

classifier = pipeline(model="radlab/polarity-3c", task="text-classification")

classifier("Text to classification")
```

With sample data and `top_k=3`:
```python
classifier("""
  Po upadku reżimu Asada w Syrii, mieszkańcy, borykający się z ubóstwem,
  zaczęli tłumnie poszukiwać skarbów, zachęceni legendami o zakopanych
  bogactwach i dostępnością wykrywaczy metali, które stały się popularnym
  towarem. Mimo, że działalność ta jest nielegalna, rząd przymyka oko,
  a sprzedawcy oferują urządzenia nawet dla dzieci. Poszukiwacze skupiają
  się na obszarach historycznych, wierząc w legendy o skarbach ukrytych
  przez starożytne cywilizacje i wojska osmańskie, choć eksperci ostrzegają
  przed fałszywymi monetami i kradzieżą artefaktów z muzeów.""",
  top_k=3
)
```
The output is:
```
[{'label': 'ambivalent', 'score': 0.9995126724243164},
 {'label': 'negative', 'score': 0.00024663121439516544},
 {'label': 'positive', 'score': 0.00024063512682914734}]
```
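
For more control than the pipeline offers, the checkpoint can also be used through the lower-level `transformers` API. A minimal sketch (the input string is a placeholder; the label names are read from the model config):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "radlab/polarity-3c"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Any Polish news text; truncation guards against inputs longer than
# the model's maximum sequence length.
inputs = tokenizer("Tekst do klasyfikacji", return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 3)

# Softmax turns the three logits into scores for the labels
# (ambivalent / negative / positive) defined in the model config.
probs = logits.softmax(dim=-1).squeeze(0)
for idx, score in sorted(enumerate(probs.tolist()), key=lambda p: -p[1]):
    print(model.config.id2label[idx], round(score, 4))
```

This mirrors what the pipeline does internally and reproduces the `top_k=3` scores above, ordered from highest to lowest.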
```