Implement make_opt_flags function for XPU
This PR brings intel-xpu-backend-for-triton changes into this repo so that the MXFP4 version of GPT-OSS can run with transformers on XPU. The current torch 2.8 + triton 3.4 stack for XPU cannot run kernels-community/triton_kernels directly, so this PR is implemented against torch 2.9 + triton 3.5 for XPU.
I tested this PR on an A100 with no errors.
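To illustrate the shape of the change, here is a minimal sketch of how a `make_opt_flags`-style function can dispatch to an XPU-specific heuristic alongside the existing CUDA path. All names, defaults, and tile sizes below are hypothetical placeholders, not the actual triton_kernels implementation:

```python
# Hedged sketch: backend-dispatching opt-flag selection.
# The OptFlags fields and the per-backend numbers are illustrative only.
from dataclasses import dataclass


@dataclass
class OptFlags:
    block_m: int
    block_n: int
    num_warps: int
    num_stages: int


def make_opt_flags_cuda(m: int, n: int) -> OptFlags:
    # Placeholder CUDA heuristics.
    return OptFlags(block_m=128, block_n=128, num_warps=8, num_stages=4)


def make_opt_flags_xpu(m: int, n: int) -> OptFlags:
    # Placeholder XPU heuristics; Intel GPUs may prefer different
    # tile sizes and pipeline depths than CUDA devices.
    return OptFlags(block_m=64, block_n=64, num_warps=4, num_stages=2)


def make_opt_flags(backend: str, m: int, n: int) -> OptFlags:
    # Dispatch on the active Triton backend name.
    if backend == "xpu":
        return make_opt_flags_xpu(m, n)
    if backend == "cuda":
        return make_opt_flags_cuda(m, n)
    raise ValueError(f"unsupported backend: {backend}")
```

The point of the sketch is that an XPU branch can be added without touching the CUDA heuristics, which is why the PR is self-contained to XPU behavior.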
Hey @YangKai0616, thanks for the PR! I'm happy to merge this, but it would be better to upstream the modification into Triton directly, since you implemented this against triton 3.5 and torch 2.9. When those versions are out, I will most likely update the kernels, so any modification made beforehand will be overwritten and you will have to open a new PR. Another option would be to wait until I make the changes once the new versions are released, wdyt? Right now, most users won't be able to use this unless they manage to build Triton from source.
Thanks very much @marcsun13 for your kind review. The tricky part is that the Intel Triton backend hasn't been upstreamed into Triton yet: the Triton XPU backend is compatible with Triton, but it isn't built in. This makes it hard (or impossible) for us to upstream this change into Triton directly. So, to make transformers GPT-OSS work on XPU, we need to do the manual work of adding the XPU-related opt_flags to this kernel repo. I think this is unavoidable effort on Intel's side until the XPU backend is upstreamed into Triton. So, would the following be possible:
- You help merge this PR
- You update the kernels for the triton 3.5 changes when needed (without having to consider XPU)
- We Intel engineers will monitor model health on XPU (we already do), and will submit new PRs for your review if we find that XPU breaks.
Thanks very much