---
license: mit
language:
- en
base_model:
- google/gemma-3-27b-it
pipeline_tag: text-generation
---
We have quantised this model to 2-bit precision so that it can run inference at scale on low-end GPU cards. The quantisation was performed with the llama.cpp library.
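As a rough sketch of the workflow, a 2-bit GGUF like this one can be produced and run with llama.cpp's standard tools. The file names below (`gemma-3-27b-it-f16.gguf`, `gemma-3-27b-it-Q2_K.gguf`) are illustrative assumptions, not the actual artefact names in this repository, and `Q2_K` is one of llama.cpp's 2-bit quantisation types:

```shell
# Quantise a full-precision GGUF down to 2-bit (Q2_K) with llama.cpp.
# File names here are placeholders; substitute the actual GGUF in this repo.
./llama-quantize gemma-3-27b-it-f16.gguf gemma-3-27b-it-Q2_K.gguf Q2_K

# Run inference on the quantised model with llama.cpp's CLI.
# -ngl offloads layers to the GPU; tune it to fit your card's VRAM.
./llama-cli -m gemma-3-27b-it-Q2_K.gguf -ngl 32 -p "Explain quantisation in one sentence."
```

Note that 2-bit quantisation trades some output quality for the large reduction in memory footprint, which is what makes a 27B-parameter model usable on low-end GPUs.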