Amazing! I have one question, though: if the task is just code retrieval based on function names, wouldn't plain grep or ripgrep likely give much higher accuracy? Even in the HashHop task, a regex search for * -> * would likely turn up a handful of matches, which we could then pass to an LM for filtering and final answer generation. My perspective is that the more structure we impose on a retrieval task, the more incentive we have to offload it to specialized tools (which can handle the structured part with 95% accuracy). Needle-in-a-haystack seems hard because it is semantic retrieval and there is no structure in it.
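
To make the regex idea concrete, here is a rough Python sketch of what I have in mind. It assumes each pair sits on its own line as `key -> value`; the actual HashHop layout may differ, and the names `find_pairs` and `follow_hops` are made up for illustration. In practice the matches could go to an LM for the final answer instead of being chained by hand.

```python
import re

# Assumed line format: "<key> -> <value>" (the real HashHop layout may differ).
PAIR_RE = re.compile(r"^[ \t]*(\S+)[ \t]*->[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def find_pairs(text):
    """Collect every 'key -> value' line in the context into a dict."""
    return dict(PAIR_RE.findall(text))


def follow_hops(pairs, start, max_hops):
    """Follow key -> value links from `start` for at most `max_hops` steps."""
    chain = [start]
    current = start
    for _ in range(max_hops):
        if current not in pairs:
            break  # chain ends early: no outgoing pair for this key
        current = pairs[current]
        chain.append(current)
    return chain
```

Usage would be something like `follow_hops(find_pairs(context), "some_key", 3)`, where `context` is the full haystack text.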
Again, awesome work! Looking forward to what you cook up next.