Update README.md
README.md
CHANGED
@@ -11,6 +11,9 @@ pinned: false
 
 Model/Data associated with research project *Autonomous Evaluation and Refinement of Digital Agents*.
 
+TLDR: We explore the design and use of model-based evaluators to both evaluate and autonomously refine the performance of digital agents. Experiments show that domain-general automated evaluators can significantly improve the performance of digital agents, without any extra supervision.
+
+
 [Jiayi Pan](https://www.jiayipan.me/), [Yichi Zhang](https://sled.eecs.umich.edu/author/yichi-zhang/), [Nicholas Tomlin](https://people.eecs.berkeley.edu/~nicholas_tomlin/), [Yifei Zhou](https://yifeizhou02.github.io/), [Sergey Levine](https://people.eecs.berkeley.edu/~svlevine/), [Alane Suhr](https://www.alanesuhr.com/)
 
 UC Berkeley, University of Michigan