I need help pls ;-;

#1
by nyabeaam - opened

I'm making a web app to MTL from JP to EN and I can't get this model to work, I don't know why, it just keeps running. I saw a guy on YouTube doing MTL just by writing "translate en to de" and it worked.. welp
Screenshot 2025-04-18 at 16.04.18.png

Hi, can you be more specific about what is happening?
The screenshot is not showing any errors.
And what hardware are you running it on?

Also, please note you need to add print(response) at the end of the code to actually see the translation.
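
For example, with the snippet from the model page, the end of the cell could look something like this (the Japanese line is just a placeholder input):

```python
from transformers import pipeline

pipe = pipeline("translation", model="TechnoByte/Qwen2.5-7B-VNTL-JP-EN")

response = pipe("こんにちは、元気ですか？")  # replace with your own Japanese line
print(response)  # without print() a plain script won't show the result
```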

Thank you for answering!! I first imported a lot of stuff, then added the Transformers code like shown in the pics, but it just keeps on running.
I'm using a MacBook Air with 8 GB RAM, and it told me it crashed when running this line:

# Use a pipeline as a high-level helper

from transformers import pipeline
pipe = pipeline("translation", model="TechnoByte/Qwen2.5-7B-VNTL-JP-EN")

Screenshot 2025-04-18 at 16.57.38.png

Screenshot 2025-04-18 at 16.57.17.png

For reference, I'm trying to do it like this: https://youtu.be/feA-H6blwr4?si=lvQFxUGqQLcai4wI

I'm not sure what's causing that problem; the code looks correct.

But you appear to be using a Jupyter notebook, which I haven't tested this in.

I have created a Hugging Face Space with a Gradio UI just like the one shown in the video you sent; you can look at its code and try to get it working yourself.

Space: https://huggingface.co/spaces/TechnoByte/Qwen2.5-7B-VNTL-JP-EN-Demo

Code: https://huggingface.co/spaces/TechnoByte/Qwen2.5-7B-VNTL-JP-EN-Demo/blob/main/app.py

Tell me if you need any more help!
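
If you'd rather build your own UI than copy the Space, a minimal sketch of just the Gradio wiring could look like this (this is not the actual app.py; translate_text is only an example name and the translation call itself is left as a placeholder):

```python
import gradio as gr

def translate_text(japanese_text: str) -> str:
    # Placeholder: call the model here (see the Space's app.py for how it actually does it).
    return japanese_text  # echo for now, just to show the UI wiring

demo = gr.Interface(
    fn=translate_text,
    inputs=gr.Textbox(label="Japanese", lines=5),
    outputs=gr.Textbox(label="English", lines=5),
    title="JP to EN translation demo",
)

demo.launch()
```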

I think the problem is that you only have 8 GB of RAM; the full model requires 16 GB of RAM.

You can download this 4-bit quantized version of the model, which only needs 5 GB of RAM.

But to use it you'll need either llama.cpp to run it on your command line, or llama-cpp-python to use it in your Python code.
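
For example, a rough llama-cpp-python sketch (the GGUF filename is just an example, and the exact prompt format the model expects may differ, so check the model card):

```python
from llama_cpp import Llama

# Point this at wherever you saved the downloaded 4-bit GGUF file.
llm = Llama(model_path="Qwen2.5-7B-VNTL-JP-EN-Q4_K_M.gguf", n_ctx=2048)

output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "こんにちは、元気ですか？"}],
    max_tokens=256,
)
print(output["choices"][0]["message"]["content"])
```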

This is GREAT, I just used the demo on the LN I'm reading and it worked! But it doesn't translate more than one or two lines :(
Screenshot 2025-04-18 at 18.11.57.png

Thank you, and please try it again now; I just made a change that should fix the issue with multiple lines :)

You see, the thing is I wanted to make a mini project for myself to translate the novels I read (ChatGPT just refuses because they're too smutty... Dx) and also use it as my NLP project this semester.

Is there a limit to how many lines I can translate?
Also, do you know a way I can make the web app just like in the vid? He added translation_pipeline = pipeline('translation_en_to_de') and it translated directly, but when I change it to (jp_to_en) it gives an error ;-; errrm

The model is only trained to translate one line at a time; to translate more than one, you'll need to split the input text into lines, translate them one by one, and then join the translations back together.

That is what my space does as well, so you can just use my code for it.

This means there is no limit to how many lines you can translate, but more lines will take longer.
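
A rough sketch of that split / translate / join idea (translate_line stands for however you end up calling the model):

```python
from typing import Callable

def translate_text(japanese_text: str, translate_line: Callable[[str], str]) -> str:
    """Translate a multi-line text by translating each non-empty line separately."""
    translated = []
    for line in japanese_text.splitlines():
        # Keep blank lines as-is so the original layout survives.
        translated.append(translate_line(line) if line.strip() else "")
    return "\n".join(translated)
```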


I recommend you try Ollama; it will make it a lot easier for you to run the model on your hardware.

You can download the model with ollama pull technobyte/Qwen2.5-7B-VNTL-JP-EN:q4_k_m (this is the 4-bit quantized version which is only 5 GB).

You can then try running it with ollama run technobyte/Qwen2.5-7B-VNTL-JP-EN:q4_k_m and just paste in a Japanese line to get the English output.

And to use it in your project, use the ollama-python library in your Python code.

And my model does not support the translation pipeline; if you want a simple web app, you can copy the code from my Space and adjust it to use either llama-cpp-python or ollama-python with the 4-bit quantized version.
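
For example, once you've pulled the model with the ollama command above, a minimal ollama-python sketch looks something like this (whether a bare line is the ideal prompt is an assumption, so check the model page for the exact format):

```python
import ollama  # pip install ollama

def translate_line(japanese_line: str) -> str:
    response = ollama.chat(
        model="technobyte/Qwen2.5-7B-VNTL-JP-EN:q4_k_m",
        messages=[{"role": "user", "content": japanese_line}],
    )
    return response["message"]["content"]

print(translate_line("こんにちは、元気ですか？"))
```

This plugs straight into the split / translate / join sketch above, since translate_line is exactly the per-line function it expects.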
