runtime error
Exit code: 1. Reason: the shard download completed normally (model-00001-of-000001.safetensors, 6.67G, Downloading shards: 1/1 in ~27.55s); the process then failed with the traceback below.

Traceback (most recent call last):
  File "/home/user/app/app.py", line 18, in <module>
    model = AutoModel.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4091, in from_pretrained
    config = cls._autoset_attn_implementation(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1617, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1756, in _check_and_enable_flash_attn_2
    raise ValueError(
ValueError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available on CPU. Please make sure torch can access a CUDA device.
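The download itself succeeded; the failure comes from transformers' _autoset_attn_implementation check, which rejects flash_attention_2 when no CUDA device is visible. A minimal sketch of a fix for app.py, assuming it currently hard-codes attn_implementation="flash_attention_2" (the checkpoint id below is a placeholder, since the real one is not shown in the log): pick the attention backend at runtime and fall back to PyTorch's SDPA on CPU-only hosts.

import torch
from transformers import AutoModel

# Fall back to PyTorch scaled-dot-product attention when no CUDA
# device is visible; flash_attention_2 raises a ValueError on CPU.
attn_impl = "flash_attention_2" if torch.cuda.is_available() else "sdpa"

model = AutoModel.from_pretrained(
    "org/model-name",  # placeholder: the actual checkpoint id is not shown in the log
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    attn_implementation=attn_impl,
)

Alternatively, the error message points at the other fix: run where torch can access a CUDA device, e.g. by assigning GPU hardware to the Space instead of the default CPU hardware.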