
Predictions for the procedures to be reproduced in the lab

The prediction set consists of all combinations of the following atoms, support materials, and synthesis methods:

  • atoms = ["Pt", "Pd", "Fe", "Co", "Ni", "Cu", "Rh", "Ru", "Mn", "Ir"]
  • support_materials = ["Nitrogen-doped carbon", "Graphene", "Carbon nitride", "CeO2", "TiO2"]
  • synthesis_methods = ["Hybrid high temperature solution phase", "Solution Phase", "Electrochemical"]

The final dataset consists of 150 inputs (10 atoms × 5 support materials × 3 synthesis methods) that our model takes to generate the SAC procedures.
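For reference, the full input set can be enumerated as the Cartesian product of the three lists above. This is a minimal sketch, not the actual generation script:

    from itertools import product

    atoms = ["Pt", "Pd", "Fe", "Co", "Ni", "Cu", "Rh", "Ru", "Mn", "Ir"]
    support_materials = ["Nitrogen-doped carbon", "Graphene", "Carbon nitride", "CeO2", "TiO2"]
    synthesis_methods = ["Hybrid high temperature solution phase", "Solution Phase", "Electrochemical"]

    # 10 atoms x 5 supports x 3 methods = 150 model inputs
    inputs = list(product(atoms, support_materials, synthesis_methods))
    assert len(inputs) == 150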

The folder contains two subfolders:

  • for-experiments
  • for-experiments-no-lora

Both folders contain the results for the dataset introduced above. The for-experiments predictions use the model previously introduced as 3.1.2, while the for-experiments-no-lora predictions use the no-LoRA models.

In each subfolder you will find base, multi-base, synthesis, and multi-synthesis. The base and multi-base predictions use only the atom and support material as input, while synthesis and multi-synthesis use all three input features. The multi- prefix denotes a model that was trained multitask, i.e. trained on all the different input compositions.
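The resulting layout therefore looks roughly like this (the exact file names inside each prediction folder may differ):

    for-experiments/
        base/
        multi-base/
        synthesis/
        multi-synthesis/
    for-experiments-no-lora/
        base/
        multi-base/
        synthesis/
        multi-synthesis/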

Note that base and multi-base contain only 50 predictions each (10 atoms × 5 supports), since the synthesis method is removed as a constraint. Their IDs match those of the other prediction sets, so results are easy to compare.

Human Evaluation Streamlit App

A Streamlit-based interface is provided to enable side-by-side comparison of model outputs and collect human preferences.

  1. Install Streamlit if not already installed:
    pip install streamlit
    
  2. Run the app from the project root:
    streamlit run app.py
    

Feedback from evaluations will be appended to feedback.csv in the project root directory.
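The app's exact behavior is defined in app.py; as a rough illustration of the pattern (a hypothetical sketch, not the shipped app), a side-by-side comparison that appends preferences to feedback.csv could look like:

    # Hypothetical sketch of a side-by-side preference UI, not the actual app.py
    import csv
    import streamlit as st

    # Placeholder model outputs; the real app loads them from the prediction folders
    output_a = "Procedure generated by model A..."
    output_b = "Procedure generated by model B..."

    col_a, col_b = st.columns(2)
    col_a.text_area("Model A", output_a, height=300)
    col_b.text_area("Model B", output_b, height=300)

    choice = st.radio("Which procedure is better?", ["Model A", "Model B", "Tie"])
    if st.button("Submit"):
        # Append the preference to feedback.csv in the project root
        with open("feedback.csv", "a", newline="") as f:
            csv.writer(f).writerow([choice])
        st.success("Feedback recorded.")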