fix: update LAST_METRICS type annotation and remove unused demo.load call 594495c tasal9 committed on 25 days ago
feat: add smoke_test and generation metrics (latency/token counts) 2e0bc05 tasal9 committed on 26 days ago
Refactor build_ui function to register model load event inside context for improved clarity 6e7d264 tasal9 committed on 26 days ago
Fix demo.load call in predict function to ensure proper loading of model interface c1f2553 tasal9 committed on 26 days ago
Remove unsupported 'every' parameter from demo.load in build_ui function 09655e9 tasal9 committed on 26 days ago
Refactor predict function to streamline prompt building and enhance output formatting; update UI components for better user experience and status reporting 7c58507 tasal9 committed on 26 days ago
Enhance ECHO_MODE functionality and logging for improved testing and environment configuration b7d1634 tasal9 committed on 26 days ago
Add echo/useless mode to improve testing efficiency without loading models cd06a48 tasal9 committed on 26 days ago
Add tokenizer config sanitization to ensure valid file references in model snapshots 7acf44a tasal9 committed on 27 days ago
Refactor tokenizer loading in get_generator function to prioritize fast tokenizer and improve error handling 0cdbb5c tasal9 committed on 27 days ago
Add huggingface_hub to requirements.txt for improved model support d9f0120 tasal9 committed on 27 days ago
Enhance tokenizer loading in get_generator function with local model snapshot support and improved error handling d565b55 tasal9 committed on 27 days ago
Improve tokenizer loading in get_generator function with enhanced error handling and logging 1e44b13 tasal9 committed on 27 days ago
Fix logging format in health server to ensure proper message output 5d40b6a tasal9 committed on 27 days ago
Refactor app.py to enhance configuration management, add health server, and improve prompt generation 3a8259f tasal9 committed on Aug 18
Refactor response handling in predict function to simplify text extraction 66ede10 tasal9 committed on Aug 14
Refactor prompt generation to use English instructions for better clarity and context 47a5462 tasal9 committed on Aug 14
Refactor get_generator and predict functions to simplify device handling and improve performance ef803a6 tasal9 committed on Aug 13
Update README and app.py for improved demo instructions and UI enhancements 0ef0a95 tasal9 committed on Aug 13
Refactor Gradio demo to support zero-GPU inference with dynamic settings 09c6768 tasal9 committed on Aug 13