Flash attention support
#4 · by jiosephlee · opened
Hi, I seem to be getting different performance based on the attention implementation. Is flash attention 2 also supported? Or just flash attention 1.
InternS1 follows the attention interface introduced in Hugging Face Transformers (https://huggingface.co/docs/transformers/attention_interface). You can set `attn_implementation` to control which attention implementation is used, including `flash_attention_2`.
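As a sketch, one way to pick the backend is a small helper that falls back to PyTorch's SDPA kernel when the `flash-attn` package is not installed (the helper name and the fallback choice here are illustrative, not part of the Transformers API):

```python
def pick_attn_implementation(flash_available: bool) -> str:
    # Prefer FlashAttention-2 when flash-attn is installed;
    # otherwise fall back to torch's scaled_dot_product_attention.
    return "flash_attention_2" if flash_available else "sdpa"


# Usage sketch (requires transformers and, for FlashAttention-2, the
# flash-attn package; run on a machine where the model weights fit):
#
# from transformers import AutoModelForCausalLM
# from transformers.utils import is_flash_attn_2_available
#
# model = AutoModelForCausalLM.from_pretrained(
#     "internlm/Intern-S1",
#     attn_implementation=pick_attn_implementation(is_flash_attn_2_available()),
#     torch_dtype="auto",
# )
```

Other valid values for `attn_implementation` include `"eager"` (the plain PyTorch implementation) and `"sdpa"`.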