.. _examples:

Examples
************************

In the ``examples`` folder you can find several example training tasks. Check
the configs folder for the associated config files. ``examples.randomwalks``
does offline reinforcement learning on a set of graph random walks to stitch
shortest paths to some destination. ``examples.simulacra`` optimizes prompts
using a prompts-ratings dataset
(https://github.com/JD-P/simulacra-aesthetic-captions).
``examples.architext`` tries to optimize textually represented designs by
minimizing the number of rooms (the pretrained model is under a license on
Hugging Face). ``examples.ilql_sentiments`` and ``examples.ppo_sentiments``
train models to generate movie reviews with a positive sentiment: in the
offline setting, by fitting to IMDB dataset sentiment scores, and in the online
setting, by sampling from a model finetuned on IMDB and rating the samples with
a learned sentiment reward model. You can tweak these scripts to your liking
and tune hyperparameters to your problem if you wish to use trlx for some
custom task.
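
The difference between the offline and online sentiment settings can be
illustrated with a minimal, library-independent sketch. It is not taken from
the example scripts and does not call trlx. The word lists, ``sentiment_score``,
``reward_fn`` and ``build_offline_dataset`` are all hypothetical names, and the
word-count scorer is only a stand-in for a learned sentiment reward model. In
the offline case a fixed corpus is paired with precomputed scores. In the
online case the same scoring function would be applied to freshly sampled text.

```python
from typing import List, Tuple

# Hypothetical stand-in for a learned sentiment model: two tiny word lists.
POSITIVE = {"great", "wonderful", "enjoyable", "moving"}
NEGATIVE = {"boring", "awful", "dull", "tedious"}


def sentiment_score(text: str) -> float:
    """Return (#positive words - #negative words) / #words, a value in [-1, 1]."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def reward_fn(samples: List[str]) -> List[float]:
    """Online setting: rate a batch of freshly sampled texts."""
    return [sentiment_score(s) for s in samples]


def build_offline_dataset(reviews: List[str]) -> Tuple[List[str], List[float]]:
    """Offline setting: pair a fixed corpus with precomputed scores."""
    return reviews, reward_fn(reviews)


# Toy corpus standing in for scored IMDB reviews (assumption, not real data).
reviews = ["A great and moving film.", "Dull, tedious, awful."]
samples, rewards = build_offline_dataset(reviews)
```

For a custom task, the part most likely to change is the scoring function.
How the samples and rewards are handed to the trainer depends on the trlx
version, so consult the corresponding example script for the exact call.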