Spaces:
Runtime error
File size: 7,249 Bytes
NCHMARK_COLS: ['Perplexity']
=== END COLUMN SETUP ===
CHECKING MODEL TRACING AVAILABILITY...
 - Model tracing path: /home/user/app/src/evaluation/../../model-tracing
 - Path exists: True
 - main.py exists: True
Final MODEL_TRACING_AVAILABLE = True
.gitattributes: 0%| | 0.00/2.46k [00:00<?, ?B/s]
.gitattributes: 100%|██████████| 2.46k/2.46k [00:00<00:00, 10.1MB/s]
(…)therAI_gpt-neo-1.3B_20250726_010247.json: 0%| | 0.00/202 [00:00<?, ?B/s]
(…)therAI_gpt-neo-1.3B_20250726_010247.json: 100%|██████████| 202/202 [00:00<00:00, 748kB/s]
(…)s_facebook_opt-125m_20250726_020655.json: 0%| | 0.00/205 [00:00<?, ?B/s]
(…)s_facebook_opt-125m_20250726_020655.json: 100%|██████████| 205/205 [00:00<00:00, 909kB/s]
(…)s_facebook_opt-350m_20250726_021737.json: 0%| | 0.00/205 [00:00<?, ?B/s]
(…)s_facebook_opt-350m_20250726_021737.json: 100%|██████████| 205/205 [00:00<00:00, 850kB/s]
(…)ommunity_gpt2-large_20250726_013038.json: 0%| | 0.00/214 [00:00<?, ?B/s]
(…)ommunity_gpt2-large_20250726_013038.json: 100%|██████████| 214/214 [00:00<00:00, 1.03MB/s]
(…)mmunity_gpt2-medium_20250726_015555.json: 0%| | 0.00/216 [00:00<?, ?B/s]
(…)mmunity_gpt2-medium_20250726_015555.json: 100%|██████████| 216/216 [00:00<00:00, 730kB/s]
(…)enai-community_gpt2_20250725_231201.json: 0%| | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250725_231201.json: 100%|██████████| 209/209 [00:00<00:00, 533kB/s]
(…)enai-community_gpt2_20250725_233155.json: 0%| | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250725_233155.json: 100%|██████████| 209/209 [00:00<00:00, 905kB/s]
(…)enai-community_gpt2_20250725_235115.json: 0%| | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250725_235115.json: 100%|██████████| 209/209 [00:00<00:00, 801kB/s]
(…)enai-community_gpt2_20250725_235748.json: 0%| | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250725_235748.json: 100%|██████████| 209/209 [00:00<00:00, 856kB/s]
(…)enai-community_gpt2_20250726_000358.json: 0%| | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250726_000358.json: 100%|██████████| 209/209 [00:00<00:00, 696kB/s]
(…)enai-community_gpt2_20250726_000650.json: 0%| | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250726_000650.json: 100%|██████████| 209/209 [00:00<00:00, 792kB/s]
(…)enai-community_gpt2_20250726_015147.json: 0%| | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250726_015147.json: 100%|██████████| 209/209 [00:00<00:00, 1.12MB/s]
STARTING GRADIO APP INITIALIZATION
Initializing allowed models...
INITIALIZING ALLOWED MODELS
Models to initialize: ['lmsys/vicuna-7b-v1.5', 'ibm-granite/granite-7b-base', 'EleutherAI/llemma_7b']
CLEANING NON-ALLOWED RESULT FILES
Removing non-allowed model result: ./eval-results/EleutherAI/results_EleutherAI_gpt-neo-1.3B_20250726_010247.json (model: EleutherAI/gpt-neo-1.3B)
Removing non-allowed model result: ./eval-results/facebook/results_facebook_opt-125m_20250726_020655.json (model: facebook/opt-125m)
Removing non-allowed model result: ./eval-results/facebook/results_facebook_opt-350m_20250726_021737.json (model: facebook/opt-350m)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2-large_20250726_013038.json (model: openai-community/gpt2-large)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2-medium_20250726_015555.json (model: openai-community/gpt2-medium)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250725_231201.json (model: openai-community/gpt2)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250725_233155.json (model: openai-community/gpt2)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250725_235115.json (model: openai-community/gpt2)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250725_235748.json (model: openai-community/gpt2)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250726_000358.json (model: openai-community/gpt2)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250726_000650.json (model: openai-community/gpt2)
Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250726_015147.json (model: openai-community/gpt2)
Removed 12 non-allowed result files
CREATING RESULT FILE FOR: lmsys/vicuna-7b-v1.5
Result file path: ./eval-results/lmsys_vicuna_7b_v1.5_float16.json
Created result file: ./eval-results/lmsys_vicuna_7b_v1.5_float16.json
CREATING RESULT FILE FOR: ibm-granite/granite-7b-base
Result file path: ./eval-results/ibm_granite_granite_7b_base_float16.json
Created result file: ./eval-results/ibm_granite_granite_7b_base_float16.json
CREATING RESULT FILE FOR: EleutherAI/llemma_7b
Result file path: ./eval-results/EleutherAI_llemma_7b_float16.json
Created result file: ./eval-results/EleutherAI_llemma_7b_float16.json
Initialized 3 model result files
Creating initial results DataFrame...
CREATE_RESULTS_DATAFRAME CALLED
=== GET_LEADERBOARD_DF DEBUG ===
Starting leaderboard creation...
Looking for results in: ./eval-results
Expected columns: ['T', 'Model', 'Average ⬆️', 'Perplexity', 'Match P-Value ⬆️', 'Type', 'Architecture', 'Precision', 'Hub License', '#Params (B)', 'Hub ❤️', 'Available on the hub', 'Model sha']
Benchmark columns: ['Perplexity']
Searching for result files in: ./eval-results
Found 0 result files
Processing 0 evaluation results
Returning 0 processed results
Found 0 raw results
No raw data found, creating empty DataFrame
Creating empty fallback DataFrame...
Empty DataFrame created with columns: ['T', 'Model', 'Average ⬆️', 'Perplexity', 'Match P-Value ⬆️', 'Type', 'Architecture', 'Precision', 'Hub License', '#Params (B)', 'Hub ❤️', 'Available on the hub', 'Model sha']
Retrieved leaderboard df: (0, 13)
DataFrame is None or empty, returning empty DataFrame
Initial DataFrame created with shape: (0, 6)
Columns: ['Model', 'Perplexity', 'Match P-Value', 'Average Score', 'Type', 'Precision']
Creating Gradio interface...
GRADIO INTERFACE SETUP COMPLETE
LAUNCHING GRADIO APP WITH MODEL TRACING INTEGRATION
Features enabled:
 - Perplexity evaluation
 - Model trace p-value computation (vs GPT-2 base)
 - Match statistic with alignment
Ready to accept requests!
* Running on local URL: http://0.0.0.0:7860, with SSR ⚡ (experimental, to disable set `ssr=False` in `launch()`)
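
Note on the result-file names in this log: the three files created at startup are named from the model id and precision and sit directly in ./eval-results, while the twelve files removed earlier were named results_<org>_<model>_<date>_<time>.json inside per-organization subdirectories. The search step then reports "Found 0 result files". The log does not say why, so the sketch below only reproduces the two naming schemes as they appear above and checks one against the other. The helper names `result_filename` and `follows_old_convention` are hypothetical, and the regular expression is an assumption inferred from the removed filenames, not the application's actual loader logic.

```python
import re

# Hypothetical helper: mirrors the naming visible in the log, where
# "lmsys/vicuna-7b-v1.5" became "lmsys_vicuna_7b_v1.5_float16.json".
def result_filename(model_id: str, precision: str = "float16") -> str:
    # Both "/" and "-" appear as "_" in the created file names.
    safe = model_id.replace("/", "_").replace("-", "_")
    return f"{safe}_{precision}.json"

# Assumed pattern for the older files that the cleanup step removed,
# e.g. "results_facebook_opt-125m_20250726_020655.json".
OLD_STYLE = re.compile(r"^results_.+_\d{8}_\d{6}\.json$")

def follows_old_convention(filename: str) -> bool:
    # True when the name looks like results_<...>_<YYYYMMDD>_<HHMMSS>.json.
    return OLD_STYLE.match(filename) is not None

if __name__ == "__main__":
    models = [
        "lmsys/vicuna-7b-v1.5",
        "ibm-granite/granite-7b-base",
        "EleutherAI/llemma_7b",
    ]
    for model_id in models:
        name = result_filename(model_id)
        print(name, follows_old_convention(name))
```

If the result loader does filter on the older naming scheme or directory layout (an untested guess), the newly created files would be skipped, which would be consistent with the empty (0, 13) DataFrame reported above.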