Haiku: Trimodal (CODEX + H&E + Text) Retrieval Model

This repo bundles a fine-tuned Haiku checkpoint together with the tokenizer and marker assets needed to run inference without any additional downloads from xiangjx/musk or microsoft/BiomedNLP-BiomedBERT-*.

Contents

  • haiku_state_dict.pt: model weights (CODEX + H&E + Text encoders + projections)
  • config.json: architecture config + marker lists
  • tokenizer/: BiomedBERT tokenizer files (+ bert config)
  • esm_embeddings/: per-biomarker ESM embeddings (also embedded in state_dict; kept here for downstream use)
  • vocab.pkl: marker vocabulary
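
If you keep a local copy of the repo, a quick sanity check can confirm that every entry listed above is present before you try to load anything. The sketch below is only an illustration: `missing_bundle_entries` is a hypothetical helper, not part of this repo, and it assumes the files sit at the top level of the downloaded directory exactly as named in the list.

```python
from pathlib import Path

# File and directory names taken from the contents list above.
EXPECTED_FILES = ("haiku_state_dict.pt", "config.json", "vocab.pkl")
EXPECTED_DIRS = ("tokenizer", "esm_embeddings")


def missing_bundle_entries(root):
    """Return the expected entries that are absent under `root`.

    Hypothetical helper for illustration; directories are reported
    with a trailing slash so they are easy to tell apart from files.
    """
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means the local copy has all five entries; anything else names what still needs to be downloaded.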

Quick start

from models import Haiku

model, tokenizer, marker_embedding = Haiku.from_pretrained(
    "zhihuanglab/Haiku",
    device="cuda",
    token="hf_...",  # omit if HF_TOKEN / hf auth login is set
)
model.eval()
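
Once the model is loaded, retrieval typically comes down to ranking candidate embeddings against a query embedding. This README does not document the encoding calls or the similarity metric, so the sketch below deliberately stays outside the Haiku API: it assumes you already have embeddings as plain lists of floats and assumes cosine similarity as the score. `cosine_similarity` and `rank_by_similarity` are hypothetical names used only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates, top_k=3):
    """Return up to `top_k` (index, score) pairs, highest similarity first."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In practice the same ranking would usually be done with batched tensor operations on the GPU; the pure-Python version is only meant to make the logic explicit.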