Spaces:
Sleeping
Sleeping
File size: 1,412 Bytes
ac901c7 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
# Indic NLP Library
This repository is a _de-bloated_ fork of the original [Indic NLP Library](https://github.com/anoopkunchukuttan/indic_nlp_library) and integrates [UrduHack](https://github.com/urduhack/urduhack) submodule and [Indic NLP Resources](https://github.com/anoopkunchukuttan/indic_nlp_resources) directly. This allows to work with Urdu normalization and tokenization without needing to install [urduhack](https://pypi.org/project/urduhack/) and `indic_nlp_resources` separately, which can be an issue sometimes as it is `TensorFlow` based. This repository is mainly created and mainted for [IndicTrans2](https://github.com/AI4Bharat/IndicTrans2) and [IndicTransTokenizer](https://github.com/VarunGumma/IndicTransTokenizer)
For any queries, please get in touch with the original authors/maintainers of the respective libraries:
- `Indic NLP Library`: [anoopkunchukuttan](https://github.com/anoopkunchukuttan)
- `Indic NLP Resources`: [anoopkunchukuttan](https://github.com/anoopkunchukuttan)
- `UrduHack`: [UrduHack](https://github.com/urduhack)
## Usage:
```
git clone https://github.com/VarunGumma/indic_nlp_library.git
cd indic_nlp_library
pip install --editable ./
```
## Updates:
- Integrated `urduhack` directly into the repository.
- Renamed `master` branch as `main`.
- Integrated `indic_nlp_resources` directly into the repository.
- _De-bloated_ the repository.
|