# Predictions for the procedures to be reproduced in the lab


The prediction set consists of all combinations of the following atoms, support materials and synthesis methods:

- atoms = ["Pt", "Pd", "Fe", "Co", "Ni", "Cu", "Rh", "Ru", "Mn", "Ir"]
- support_materials = ["Nitrogen-doped carbon", "Graphene", "Carbon nitride", "CeO2", "TiO2"]
- synthesis_methods = ["Hybrid high temperature solution phase", "Solution Phase", "Electrochemical"]

The final dataset consists of 150 inputs (10 atoms × 5 support materials × 3 synthesis methods) that our model takes to generate the SAC procedures.
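
As a quick sanity check on the count, the combinations can be enumerated with a minimal sketch like the one below. The three lists are copied from above; the dictionary keys, the iteration order and the integer `id` scheme are assumptions made for illustration and may not match the ids used in the prediction files.

```python
from itertools import product

atoms = ["Pt", "Pd", "Fe", "Co", "Ni", "Cu", "Rh", "Ru", "Mn", "Ir"]
support_materials = ["Nitrogen-doped carbon", "Graphene", "Carbon nitride", "CeO2", "TiO2"]
synthesis_methods = ["Hybrid high temperature solution phase", "Solution Phase", "Electrochemical"]

# Every (atom, support material, synthesis method) triple is one model input.
# The "id" field and key names are hypothetical, chosen only for this sketch.
inputs = [
    {"id": i, "atom": a, "support_material": s, "synthesis_method": m}
    for i, (a, s, m) in enumerate(product(atoms, support_materials, synthesis_methods))
]

print(len(inputs))  # 10 * 5 * 3 = 150
```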

The folder contains 2 subfolders:

- `for-experiments`
- `for-experiments-no-lora`

Both folders contain the results for the dataset described above. The `for-experiments` predictions use the model previously introduced simply as 3.1.2, while the `for-experiments-no-lora` predictions use the no-LoRA models.

In each subfolder you will find `base`, `multi-base`, `synthesis` and `multi-synthesis`. The `base` and `multi-base` predictions use only the atom and the support material as input, while `synthesis` and `multi-synthesis` use all 3 input features. The `multi` prefix denotes the model that was trained in a multitask setting, i.e. trained on all the different input compositions.

The `base` and `multi-base` sets contain only 50 predictions each, since removing the synthesis method as a constraint leaves 10 atoms × 5 support materials. I kept the same ids as in the other sets, though, so they are easy to compare.
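
The relationship between the 50 base inputs and the 150 full inputs can be illustrated with the short sketch below. It only reproduces the counting argument; the tuple representation is an assumption, and it says nothing about how ids are actually stored in the prediction files.

```python
from itertools import product

atoms = ["Pt", "Pd", "Fe", "Co", "Ni", "Cu", "Rh", "Ru", "Mn", "Ir"]
support_materials = ["Nitrogen-doped carbon", "Graphene", "Carbon nitride", "CeO2", "TiO2"]
synthesis_methods = ["Hybrid high temperature solution phase", "Solution Phase", "Electrochemical"]

# Base inputs drop the synthesis method, so only (atom, support) pairs remain.
base_inputs = list(product(atoms, support_materials))
full_inputs = list(product(atoms, support_materials, synthesis_methods))

# Each base pair corresponds to one full input per synthesis method.
matches = {
    pair: [full for full in full_inputs if full[:2] == pair]
    for pair in base_inputs
}

print(len(base_inputs))                              # 50
print(all(len(v) == 3 for v in matches.values()))    # True
```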

## Human Evaluation Streamlit App

A Streamlit-based interface is provided to enable side-by-side comparison of model outputs and collect human preferences.

1. Install Streamlit if not already installed:
   ```bash
   pip install streamlit
   ```
2. Run the app from the project root:
   ```bash
   streamlit run app.py
   ```

Feedback from evaluations will be appended to `feedback.csv` in the project root directory.
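
For readers who want to post-process the collected preferences, the append-and-read pattern might look like the sketch below. The column names in `FIELDS` and the helper functions are hypothetical: the actual schema written by `app.py` is not documented here, so adjust them to the real header of `feedback.csv`. The demo writes to a temporary directory rather than the project root.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical columns; the real feedback.csv written by app.py may differ.
FIELDS = ["sample_id", "preferred", "comment"]


def append_feedback(path, row):
    """Append one feedback row, writing the header if the file is new or empty."""
    path = Path(path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)


def read_feedback(path):
    """Load all feedback rows as a list of dictionaries."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Demo in a temporary directory so the real feedback.csv is left untouched.
demo_path = Path(tempfile.mkdtemp()) / "feedback.csv"
append_feedback(demo_path, {"sample_id": "0", "preferred": "for-experiments", "comment": ""})
append_feedback(demo_path, {"sample_id": "1", "preferred": "for-experiments-no-lora", "comment": "clearer steps"})

rows = read_feedback(demo_path)
print(len(rows))
```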