NCHMARK_COLS: ['Perplexity']
=== END COLUMN SETUP ===
🔧 CHECKING MODEL TRACING AVAILABILITY...
   - Model tracing path: /home/user/app/src/evaluation/../../model-tracing
   - Path exists: True
   - main.py exists: True
🎯 Final MODEL_TRACING_AVAILABLE = True

.gitattributes:   0%|          | 0.00/2.46k [00:00<?, ?B/s]
.gitattributes: 100%|██████████| 2.46k/2.46k [00:00<00:00, 10.1MB/s]

(…)therAI_gpt-neo-1.3B_20250726_010247.json:   0%|          | 0.00/202 [00:00<?, ?B/s]
(…)therAI_gpt-neo-1.3B_20250726_010247.json: 100%|██████████| 202/202 [00:00<00:00, 748kB/s]

(…)s_facebook_opt-125m_20250726_020655.json:   0%|          | 0.00/205 [00:00<?, ?B/s]
(…)s_facebook_opt-125m_20250726_020655.json: 100%|██████████| 205/205 [00:00<00:00, 909kB/s]

(…)s_facebook_opt-350m_20250726_021737.json:   0%|          | 0.00/205 [00:00<?, ?B/s]
(…)s_facebook_opt-350m_20250726_021737.json: 100%|██████████| 205/205 [00:00<00:00, 850kB/s]

(…)ommunity_gpt2-large_20250726_013038.json:   0%|          | 0.00/214 [00:00<?, ?B/s]
(…)ommunity_gpt2-large_20250726_013038.json: 100%|██████████| 214/214 [00:00<00:00, 1.03MB/s]

(…)mmunity_gpt2-medium_20250726_015555.json:   0%|          | 0.00/216 [00:00<?, ?B/s]
(…)mmunity_gpt2-medium_20250726_015555.json: 100%|██████████| 216/216 [00:00<00:00, 730kB/s]

(…)enai-community_gpt2_20250725_231201.json:   0%|          | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250725_231201.json: 100%|██████████| 209/209 [00:00<00:00, 533kB/s]

(…)enai-community_gpt2_20250725_233155.json:   0%|          | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250725_233155.json: 100%|██████████| 209/209 [00:00<00:00, 905kB/s]

(…)enai-community_gpt2_20250725_235115.json:   0%|          | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250725_235115.json: 100%|██████████| 209/209 [00:00<00:00, 801kB/s]

(…)enai-community_gpt2_20250725_235748.json:   0%|          | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250725_235748.json: 100%|██████████| 209/209 [00:00<00:00, 856kB/s]

(…)enai-community_gpt2_20250726_000358.json:   0%|          | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250726_000358.json: 100%|██████████| 209/209 [00:00<00:00, 696kB/s]

(…)enai-community_gpt2_20250726_000650.json:   0%|          | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250726_000650.json: 100%|██████████| 209/209 [00:00<00:00, 792kB/s]

(…)enai-community_gpt2_20250726_015147.json:   0%|          | 0.00/209 [00:00<?, ?B/s]
(…)enai-community_gpt2_20250726_015147.json: 100%|██████████| 209/209 [00:00<00:00, 1.12MB/s]

🚀 STARTING GRADIO APP INITIALIZATION
📊 Initializing allowed models...

🚀 INITIALIZING ALLOWED MODELS
📋 Models to initialize: ['lmsys/vicuna-7b-v1.5', 'ibm-granite/granite-7b-base', 'EleutherAI/llemma_7b']

🧹 CLEANING NON-ALLOWED RESULT FILES
🗑️ Removing non-allowed model result: ./eval-results/EleutherAI/results_EleutherAI_gpt-neo-1.3B_20250726_010247.json (model: EleutherAI/gpt-neo-1.3B)
🗑️ Removing non-allowed model result: ./eval-results/facebook/results_facebook_opt-125m_20250726_020655.json (model: facebook/opt-125m)
🗑️ Removing non-allowed model result: ./eval-results/facebook/results_facebook_opt-350m_20250726_021737.json (model: facebook/opt-350m)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2-large_20250726_013038.json (model: openai-community/gpt2-large)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2-medium_20250726_015555.json (model: openai-community/gpt2-medium)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250725_231201.json (model: openai-community/gpt2)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250725_233155.json (model: openai-community/gpt2)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250725_235115.json (model: openai-community/gpt2)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250725_235748.json (model: openai-community/gpt2)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250726_000358.json (model: openai-community/gpt2)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250726_000650.json (model: openai-community/gpt2)
🗑️ Removing non-allowed model result: ./eval-results/openai-community/results_openai-community_gpt2_20250726_015147.json (model: openai-community/gpt2)
✅ Removed 12 non-allowed result files

🔧 CREATING RESULT FILE FOR: lmsys/vicuna-7b-v1.5
📁 Result file path: ./eval-results/lmsys_vicuna_7b_v1.5_float16.json
✅ Created result file: ./eval-results/lmsys_vicuna_7b_v1.5_float16.json

🔧 CREATING RESULT FILE FOR: ibm-granite/granite-7b-base
📁 Result file path: ./eval-results/ibm_granite_granite_7b_base_float16.json
✅ Created result file: ./eval-results/ibm_granite_granite_7b_base_float16.json

🔧 CREATING RESULT FILE FOR: EleutherAI/llemma_7b
📁 Result file path: ./eval-results/EleutherAI_llemma_7b_float16.json
✅ Created result file: ./eval-results/EleutherAI_llemma_7b_float16.json
✅ Initialized 3 model result files
📊 Creating initial results DataFrame...

📊 CREATE_RESULTS_DATAFRAME CALLED

=== GET_LEADERBOARD_DF DEBUG ===
Starting leaderboard creation...
Looking for results in: ./eval-results
Expected columns: ['T', 'Model', 'Average ⬆️', 'Perplexity', 'Match P-Value ⬇️', 'Type', 'Architecture', 'Precision', 'Hub License', '#Params (B)', 'Hub ❤️', 'Available on the hub', 'Model sha']
Benchmark columns: ['Perplexity']

Searching for result files in: ./eval-results
Found 0 result files

Processing 0 evaluation results

Returning 0 processed results

Found 0 raw results
No raw data found, creating empty DataFrame
Creating empty fallback DataFrame...
Empty DataFrame created with columns: ['T', 'Model', 'Average ⬆️', 'Perplexity', 'Match P-Value ⬇️', 'Type', 'Architecture', 'Precision', 'Hub License', '#Params (B)', 'Hub ❤️', 'Available on the hub', 'Model sha']
📋 Retrieved leaderboard df: (0, 13)
⚠️ DataFrame is None or empty, returning empty DataFrame
✅ Initial DataFrame created with shape: (0, 6)
📋 Columns: ['Model', 'Perplexity', 'Match P-Value', 'Average Score', 'Type', 'Precision']
🎨 Creating Gradio interface...
🎯 GRADIO INTERFACE SETUP COMPLETE
🚀 LAUNCHING GRADIO APP WITH MODEL TRACING INTEGRATION
📊 Features enabled:
   - Perplexity evaluation
   - Model trace p-value computation (vs GPT-2 base)
   - Match statistic with alignment
🎉 Ready to accept requests!
* Running on local URL:  http://0.0.0.0:7860, with SSR ⚡ (experimental, to disable set `ssr=False` in `launch()`)