Flash attention support

#4 opened by jiosephlee

Hi, I seem to be getting different performance based on the attention implementation. Is flash attention 2 also supported, or just flash attention 1?

Intern Large Models org

InternS1 follows the attention interface introduced in Hugging Face Transformers (https://huggingface.co/docs/transformers/attention_interface). You can set `attn_implementation` to control which attention implementation is used, e.g. `"eager"`, `"sdpa"`, or `"flash_attention_2"`.
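For reference, a minimal sketch of loading the model with flash attention 2 (the model id `internlm/Intern-S1` and the `trust_remote_code` flag are assumptions; `"flash_attention_2"` also requires the `flash-attn` package and a half-precision dtype):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "internlm/Intern-S1"  # assumed model id; adjust to your checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # flash attention needs fp16/bf16
    attn_implementation="flash_attention_2",  # or "sdpa" / "eager"
    trust_remote_code=True,
)
```

Swapping the `attn_implementation` string is all that's needed to compare implementations; `"sdpa"` falls back to PyTorch's built-in `scaled_dot_product_attention` if `flash-attn` is not installed.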
