	---
	language: en
	library_name: splade-index
	tags:
	- splade
	- splade-index
	- retrieval
	- search
	- sparse
	---

	# Splade-Index

	This is an index created with the [splade-index](https://github.com/rasyosef/splade-index) library (version `0.1.1`).

	## Installation

	You can install the `splade-index` library with `pip`:

	```bash
	pip install "splade-index==0.1.1"

	# Include extra dependencies like stemmer
	pip install "splade-index[full]==0.1.1"

	# For huggingface hub usage
	pip install huggingface_hub
	```

	## Load this Index

	You can use the following code to load this SPLADE index from the Hugging Face Hub:

	```python
	from sentence_transformers import SparseEncoder
	from splade_index import SPLADE

	# Download from the Hugging Face Hub the SPLADE model that was used to create the index
	model_id = "rasyosef/splade-tiny"  # ID of the SPLADE model
	model = SparseEncoder(model_id)

	repo_id = "yosefw/natural_questions_3m_splade_index"

	# Load the SPLADE index from the Hugging Face Hub
	retriever = SPLADE.load_from_hub(repo_id, model=model)
	```
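
Once the index is loaded, a query helper along the following lines may be useful. This is a minimal sketch, not part of this model card: it assumes the retriever exposes a `retrieve(queries, k=...)` method whose result carries `doc_ids`, `documents`, and `scores`, each indexed as `[query][rank]`. Check the splade-index documentation for version `0.1.1` before relying on these names. The helper `top_hits` is hypothetical.

```python
def top_hits(retriever, query, k=3):
    """Return the k best matches for one query as (doc_id, score, document) tuples.

    ASSUMPTION: `retriever.retrieve(queries, k=...)` exists and returns an object
    whose `doc_ids`, `scores`, and `documents` are indexed as [query][rank].
    """
    # Wrap the single query in a list, since retrieval is assumed to be batched.
    results = retriever.retrieve([query], k=k)

    # Row 0 holds the results for our only query.
    return [
        (doc_id, float(score), doc)
        for doc_id, score, doc in zip(
            results.doc_ids[0], results.scores[0], results.documents[0]
        )
    ]
```

With the `retriever` from the snippet above, `top_hits(retriever, "who wrote the declaration of independence")` would return up to three ranked matches under these assumptions.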

	## Stats

	This index was built from a corpus with the following statistics:

	| Statistic | Value |
	| --- | --- |
	| Number of documents | 2681468 |
	| Number of tokens | 464573223 |
	| Average tokens per document | 173.25 |
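
As a quick consistency check, the reported average can be recomputed from the two totals. The sketch below uses only the figures from the table; the variable names are illustrative.

```python
# Totals copied from the stats table.
num_documents = 2_681_468
num_tokens = 464_573_223

# Average tokens per document is the ratio of the two totals.
avg_tokens_per_doc = num_tokens / num_documents

print(f"{avg_tokens_per_doc:.2f}")  # prints 173.25
```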