---
license: apache-2.0
language:
- en
base_model:
- microsoft/deberta-v3-large
- HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: token-classification
tags:
- NER
- encoder
- decoder
- GLiNER
- information-extraction
library_name: gliner
---

**GLiNER** is a Named Entity Recognition (NER) model capable of identifying *any* entity type in a **zero-shot** manner.
This architecture combines:

* An **encoder** for representing entity spans
* A **decoder** for generating label names

This hybrid approach enables new use cases such as **entity linking** and expands GLiNER’s capabilities.
By integrating large modern decoders trained on vast datasets, GLiNER can leverage their **richer knowledge capacity** while maintaining competitive inference speed.

---
## Key Features
* **Open ontology**: Works when the label set is unknown
* **Multi-label entity recognition**: Assign multiple labels to a single entity
* **Entity linking**: Handle large label sets via constrained generation
* **Knowledge expansion**: Gain from large decoder models
* **Efficient**: Minimal speed reduction on GPU compared to single-encoder GLiNER
---
## Installation
Update to the latest version of GLiNER:
```bash
# until the new pip release, install from main to use the new architecture
pip install git+https://github.com/urchade/GLiNER.git
```
---
## Usage
For open-ontology entity extraction, use the tag `label` in the list of labels, as in the example below:
```python
from gliner import GLiNER

model = GLiNER.from_pretrained("knowledgator/gliner-decoder-large-v1.0")

text = "Hugging Face is a company that advances and democratizes artificial intelligence through open source and science."
# The "label" tag requests open-ontology extraction (no fixed entity types)
labels = ["label"]

entities = model.predict_entities(text, labels, threshold=0.3, num_gen_sequences=1)
print(entities)
```
To run the model on many texts and/or constrain the set of labels, see the example below:
```python
from gliner import GLiNER

model = GLiNER.from_pretrained("knowledgator/gliner-decoder-large-v1.0")

text = (
    "Apple was founded as Apple Computer Company on April 1, 1976, "
    "by Steve Wozniak, Steve Jobs (1955–2011) and Ronald Wayne to "
    "develop and sell Wozniak's Apple I personal computer."
)
labels = ["person", "company", "date"]

# run() is given a list of texts; the result holds one list of entities per text
results = model.run([text], labels, threshold=0.3, num_gen_sequences=1)
print(results)
```
---
### Example Output
```json
[
  [
    {
      "start": 21,
      "end": 26,
      "text": "Apple",
      "label": "company",
      "score": 0.6795641779899597,
      "generated labels": ["Organization"]
    },
    {
      "start": 47,
      "end": 60,
      "text": "April 1, 1976",
      "label": "date",
      "score": 0.44296327233314514,
      "generated labels": ["Date"]
    },
    {
      "start": 65,
      "end": 78,
      "text": "Steve Wozniak",
      "label": "person",
      "score": 0.9934439659118652,
      "generated labels": ["Person"]
    },
    {
      "start": 80,
      "end": 90,
      "text": "Steve Jobs",
      "label": "person",
      "score": 0.9725918769836426,
      "generated labels": ["Person"]
    },
    {
      "start": 107,
      "end": 119,
      "text": "Ronald Wayne",
      "label": "person",
      "score": 0.9964536428451538,
      "generated labels": ["Person"]
    }
  ]
]
```
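
The output is a nested list: in this example, one inner list for the single input text, with one dictionary per detected entity. As a minimal sketch of consuming that structure, the snippet below groups entity texts by label and drops low-scoring matches. The `results` literal is copied from the output above with rounded scores, and `group_by_label` is a hypothetical helper, not part of the GLiNER API.

```python
from collections import defaultdict

# Three entries copied from the example output above (scores rounded).
results = [
    [
        {"start": 21, "end": 26, "text": "Apple", "label": "company",
         "score": 0.68, "generated labels": ["Organization"]},
        {"start": 65, "end": 78, "text": "Steve Wozniak", "label": "person",
         "score": 0.99, "generated labels": ["Person"]},
        {"start": 80, "end": 90, "text": "Steve Jobs", "label": "person",
         "score": 0.97, "generated labels": ["Person"]},
    ]
]


def group_by_label(entities, min_score=0.0):
    """Hypothetical helper: map each label to the entity texts found for it."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["label"]].append(ent["text"])
    return dict(grouped)


# One inner list per input text
for entities in results:
    print(group_by_label(entities, min_score=0.5))
```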
---
### Restricting the Decoder
You can limit the decoder to generate labels only from a predefined set:
```python
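# Reuses `model`, `text`, and `labels` from the example above;
# the text is wrapped in a list, as in the earlier run() call
model.run(
    [text], labels,
    threshold=0.3,
    num_gen_sequences=1,
    gen_constraints=[
        "organization", "organization type", "city",
        "technology", "date", "person"
    ]
)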
```
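
Constrained generation of this kind is commonly implemented with a prefix trie over the allowed label sequences: at each decoding step, only tokens that continue some allowed label are permitted. The sketch below illustrates that general idea only. It is not GLiNER's implementation; the `LabelTrie` class is hypothetical, and whitespace splitting stands in for a real subword tokenizer.

```python
class LabelTrie:
    """Toy prefix trie over token sequences (illustrative, not GLiNER's code)."""

    def __init__(self, sequences):
        self.root = {}
        for seq in sequences:
            node = self.root
            for tok in seq:
                node = node.setdefault(tok, {})
            node[None] = {}  # None marks that a label may end here

    def allowed_next(self, prefix):
        """Return the tokens that may follow `prefix`; None means end of label."""
        node = self.root
        for tok in prefix:
            if tok not in node:
                return []  # prefix matches no allowed label
            node = node[tok]
        return list(node.keys())


# Assumption: whitespace tokens instead of the decoder's real vocabulary.
allowed = ["organization", "organization type", "city", "date", "person"]
trie = LabelTrie([label.split() for label in allowed])

print(trie.allowed_next([]))                # first tokens of every allowed label
print(trie.allowed_next(["organization"]))  # stop here, or continue with "type"
```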
---
## Performance Tips
Two label trie implementations are available.
For a **faster, memory-efficient C++ version**, install **Cython**:
```bash
pip install cython
```
This can significantly improve performance and reduce memory usage, especially with millions of labels.
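
To confirm that Cython is visible to the Python environment you run GLiNER in, a quick standard-library check along these lines can help. It only reports whether the package is importable and makes no claim about which trie implementation GLiNER ends up using; `cython_available` is a hypothetical helper name.

```python
import importlib.util


def cython_available() -> bool:
    """Return True if the Cython package can be found in this environment."""
    return importlib.util.find_spec("Cython") is not None


print("Cython installed:", cython_available())
```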