Apply for community grant: Academic project (gpu)

#1
by orpatashnik - opened

Hi, I would be happy to have a GPU for this demo :)
This is the paper: https://snap-research.github.io/NestedAttention/
I had to use a Docker environment because I am using dlib, which takes a long time to build, and I was hitting a timeout.

Hi @orpatashnik , we've assigned L4 to this Space for now. Would it be possible to remove dlib from the dependencies or prebuild a wheel of it and install it at startup so that your Space can run on ZeroGPU?
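For reference, here's a minimal sketch (not from this thread) of what installing a prebuilt wheel at startup could look like at the top of app.py; the wheel URL is only a placeholder for wherever the prebuilt dlib wheel would be hosted (e.g. uploaded to the Space repo):

# Hypothetical startup install of a prebuilt dlib wheel; the URL is a placeholder.
import subprocess
import sys

DLIB_WHEEL = "https://example.com/dlib-19.24.2-cp310-cp310-manylinux_2_17_x86_64.whl"  # placeholder URL
subprocess.run([sys.executable, "-m", "pip", "install", DLIB_WHEEL], check=True)

import dlib  # import dlib only after the wheel has been installed

A direct wheel URL (or a path to a wheel committed to the repo) can also simply be listed as a line in requirements.txt, which avoids the startup subprocess entirely.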

Regarding the dlib installation, I'm wondering if this wheel works in your Space.
I built it using the following Dockerfile:

FROM python:3.10.13-slim

ENV DEBIAN_FRONTEND=noninteractive

# Build tools needed to compile dlib from source (CMake plus a C++ toolchain).
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y --no-install-recommends \
    git \
    git-lfs \
    wget \
    curl \
    build-essential \
    cmake && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# PEP 517 build frontend used to produce the wheel.
RUN pip install build

# Build as a non-root user; the dlib source tree is expected to be mounted at /work.
USER 1000
WORKDIR /work

# Writes the built wheel to /work/dist/.
CMD ["python", "-m", "build", "--wheel"]

I removed the dlib dependency, so we can switch to ZeroGPU. Thanks!

Awesome! That works too. Thanks! I've just assigned ZeroGPU to this Space.

I get an error saying that there are no GPUs available.

I changed it, and I still get the same error.

BTW, before I removed the dlib dependency, when the Space was running on the L4, it worked.

Thanks for checking. I think the error is raised because you didn't import the spaces package before instantiating the model.
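For anyone hitting the same thing, here's a minimal sketch of the ordering ZeroGPU expects (the CLIP checkpoint below is only an illustrative stand-in for this Space's actual models):

# `spaces` has to be imported before any code that touches CUDA, so that its
# patching of torch is in place before models are instantiated.
import spaces
from transformers import CLIPVisionModelWithProjection

# Illustrative model; instantiate the real pipeline here, after the import above.
image_encoder = CLIPVisionModelWithProjection.from_pretrained("openai/clip-vit-large-patch14")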

Thanks! Looks like it recognizes the GPU now. I now get this error:

OSError: Unable to load weights from pytorch checkpoint file for '/home/user/.cache/huggingface/hub/models--orpatashnik--NestedAttentionEncoder/snapshots/daad334d5a022d66fe4d710d94694ef1aa4a0fa4/image_encoder/pytorch_model.bin' at '/home/user/.cache/huggingface/hub/models--orpatashnik--NestedAttentionEncoder/snapshots/daad334d5a022d66fe4d710d94694ef1aa4a0fa4/image_encoder/pytorch_model.bin'

On my GPU the exact same code works

From this error, it looks like it's related to ZeroGPU, but I haven't seen it in other Spaces and I'm not sure why it's raised in this one. I've asked internally and I'll let you know once I get a reply.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 556, in load_state_dict
    return torch.load(
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 1351, in load
    return _load(
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 1848, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.10/site-packages/torch/_weights_only_unpickler.py", line 385, in load
    self.append(self.persistent_load(pid))
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 1812, in persistent_load
    typed_storage = load_tensor(
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 1784, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 1685, in restore_location
    return default_restore_location(storage, map_location)
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 601, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 472, in _meta_deserialize
    return torch.UntypedStorage(obj.nbytes(), device="meta")
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/torch/patching.py", line 253, in _untyped_storage_new_register
    if (device := kwargs.get('device')) is not None and device.type == 'cuda':
AttributeError: 'str' object has no attribute 'type'

I ran into an error locally when using torch==2.5.1.

Traceback (most recent call last):
  File "/tmp/temp/NestedAttentionEncoder/app.py", line 32, in <module>
    ip_model = NestedAdapterInference(
  File "/tmp/temp/NestedAttentionEncoder/nested_attention_pipeline.py", line 64, in __init__
    self.image_encoder = CLIPVisionModelWithProjection.from_pretrained(
  File "/tmp/temp/NestedAttentionEncoder/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 315, in _wrapper
    return func(*args, **kwargs)
  File "/tmp/temp/NestedAttentionEncoder/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 5001, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/tmp/temp/NestedAttentionEncoder/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 5262, in _load_pretrained_model
    load_state_dict(checkpoint_files[0], map_location="meta", weights_only=weights_only).keys()
  File "/tmp/temp/NestedAttentionEncoder/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 560, in load_state_dict
    check_torch_load_is_safe()
  File "/tmp/temp/NestedAttentionEncoder/.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1606, in check_torch_load_is_safe
    raise ValueError(
ValueError: Due to a serious vulnerability issue in `torch.load`, even with `weights_only=True`, we now require users to upgrade torch to at least v2.6 in order to use the function. This version restriction does not apply when loading files with safetensors.
See the vulnerability report here https://nvd.nist.gov/vuln/detail/CVE-2025-32434

Looks like the problem is that the models are not saved as safetensors.
safetensors docs: https://huggingface.co/docs/safetensors/main/en/torch_shared_tensors
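For reference, a rough sketch of one way to do the conversion (run in an environment with torch >= 2.6 so the existing .bin can still be loaded; the output directory name is a placeholder):

# Load the image encoder from the repo mentioned in the traceback and re-save it
# with safetensors serialization. This is one possible conversion path, not
# necessarily the one used here.
from transformers import CLIPVisionModelWithProjection

image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "orpatashnik/NestedAttentionEncoder",
    subfolder="image_encoder",
)
image_encoder.save_pretrained("image_encoder_safetensors", safe_serialization=True)  # placeholder output dir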

When I converted the model to safetensors and added a @spaces.GPU decorator to the generate_images function, it seemed to work fine on ZeroGPU.

Could you give this a try?
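The decorator change could look roughly like this; only the function name generate_images comes from the message above, and its signature and body are elided here:

import spaces

@spaces.GPU
def generate_images(*args, **kwargs):
    # Body unchanged; the decorator attaches a GPU only for the duration of the call,
    # which is what ZeroGPU requires for CUDA work.
    ...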

Thanks so much for the help! It works now!
