---
# MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML
license: apache-2.0
base_model:
- Qwen/Qwen2.5-7B-Instruct
---

# MachineLearningLM

## model summary

Can LLMs learn from 1,000 in-context examples? Introducing **MachineLearningLM** 🧪📊, a model continually pretrained on millions of synthetic tabular ML tasks, enabling robust many-shot in-context learning.

📈 **Scales from 8 to 1,024 examples**

📈 **~15% improvement** on unseen tabular tasks compared to o3-mini / GPT-5-mini / Qwen-2.5-7B

🌲 **Random-Forest–level robustness**

🧠 **MMLU score: 75.4%**

📄 Read the paper: https://huggingface.co/papers/2509.06806

GitHub: https://github.com/HaoAreYuDong/MachineLearningLM

## evaluation and validation

We provide an automated evaluation framework: simply configure the parameters to run validation and evaluation. **The code is open-sourced on our GitHub.**

**Quick Start**

```bash
pip install -r requirements.txt

python ./src/evaluation/model_pred/dl_model_pred.py \
  --input_dir ./demo_input.jsonl \
  --output_dir ./demo_output.jsonl \
  --model_name MachineLearningLM/MachineLearningLM-7B-v1
```

**pipeline**

```bash
# modify the evaluate_parameters.sh file
source evaluate_parameters.sh

# Option 1: End-to-End Pipeline
./scripts/evaluate_pipeline.sh

# Option 2: Parallel Processing
./scripts/multi_process/data_prep.sh
./scripts/multi_process/prompt_gen.sh    # For deep learning only
./scripts/multi_process/model_pred.sh
./scripts/multi_process/evaluation.sh
./scripts/multi_process/report.sh

# Option 3: Sequential Processing
./scripts/single_process/data_prep.sh
./scripts/single_process/prompt_gen.sh   # For deep learning only
./scripts/single_process/model_pred.sh
./scripts/single_process/evaluation.sh
./scripts/single_process/report.sh
```

**Quants**

https://huggingface.co/mradermacher/MachineLearningLM-7B-v1-GGUF

For more usage details, please visit our GitHub.
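
**Direct inference (sketch)**

The evaluation pipeline above builds the many-shot prompts for you. If you just want to query the model interactively, the minimal sketch below loads it with Hugging Face `transformers` and sends a toy tabular prompt. The row serialization shown here is an illustrative assumption, not the exact prompt format produced by the repository's `prompt_gen` scripts.

```python
# Minimal sketch (assumes transformers and accelerate are installed).
# NOTE: the prompt layout below is illustrative; for the format used in the
# paper's evaluation, generate prompts with the GitHub pipeline instead.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "MachineLearningLM/MachineLearningLM-7B-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Toy tabular classification task: labeled rows followed by a query row.
prompt = (
    "Predict the label of the last row from the labeled examples.\n"
    "5.1, 3.5 -> A\n"
    "6.2, 2.9 -> B\n"
    "5.0, 3.6 -> A\n"
    "6.0, 3.0 -> ?"
)

# Qwen2.5-style chat template; generate a short completion with the label.
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=16)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For reproducing the reported numbers, use the prompts and evaluation scripts from the GitHub repository rather than this ad-hoc format.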