Does it run on a CPU instance in SageMaker (ml.m5.2xlarge)?

by arviii - opened Aug 28, 2023

Discussion

arviii

Aug 28, 2023


edited Aug 29, 2023

Hey, I am trying to deploy a model on a CPU instance (`ml.m5.2xlarge`) on SageMaker, but it runs out of storage. The best way to resolve this might be to mount a storage volume (EBS, I suppose).

To do so, I should ideally be able to pass `volume_size=80` in the `huggingface_model.deploy` parameters. But that doesn't seem to work in my case; it still throws the same error about storage running out.

Model: https://huggingface.co/NumbersStation/nsql-llama-2-7B
Instance: ml.m5.2xlarge (it works perfectly fine on ml.g5.2xlarge)

Error: `Error: Download Error safetensors_rust.SafetensorError: Error while serializing: IoError(Os { code: 28, kind: StorageFull, message: "No space left on device" })`

Code:

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    # instance_type="ml.g5.2xlarge",
    instance_type="ml.m5.2xlarge",
    container_startup_health_check_timeout=300,
    volume_size=80,
)

The rest of the code is fine, since it deploys successfully on ml.g5.2xlarge.
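For anyone debugging the same `No space left on device` error, a quick sanity check is to compare the expected size of the weights against the free space on the disk the container writes to. The sketch below is only an estimate under stated assumptions: it takes roughly 7e9 parameters from the "7B" in the model name, assumes 2 bytes per parameter (fp16/bf16), and assumes the container may temporarily hold two copies of the weights while writing them out. The helper names and the default path `/` are hypothetical, so point it at wherever your container actually stores the download.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def estimate_weights_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Rough size of the raw weights in GiB (2 bytes/param assumes fp16/bf16)."""
    return num_params * bytes_per_param / GIB


def free_gib(path: str = "/") -> float:
    """Free space in GiB on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / GIB


def has_room(required_gib: float, path: str = "/", copies: int = 2) -> bool:
    """True if `path` has space for `copies` copies of the weights.

    copies=2 is an assumption: it leaves headroom in case the weights are
    downloaded and then rewritten in another format on the same disk.
    """
    return free_gib(path) >= required_gib * copies


if __name__ == "__main__":
    needed = estimate_weights_gib(7e9)  # "7B" taken from the model name
    print(f"estimated weights: {needed:.2f} GiB")
    print(f"free space on /:   {free_gib('/'):.2f} GiB")
    print(f"enough room:       {has_room(needed)}")
```

With these assumptions the weights alone come to roughly 13 GiB in half precision (about double that in fp32), so running this inside the container, or on a similar instance, shows whether the attached volume is actually the one being filled.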

senwu

NumbersStation org Aug 28, 2023

Thank you for sharing this information! It will be helpful for others who are interested in deploying on Sagemaker.

