I've made some improvements to my custom Deep_Research tool in the
Nymbo/Tools MCP server. I've added a second LLM process and it still takes less than 1 minute to complete!
The original version of my Deep_Research tool would basically dump up to 50 fetched webpages onto the Researcher model (Qwen3-235B), with only a little bit of context shown from each page.

# New "Filterer" Process
The new process adds another LLM call ahead of the Researcher. The Filterer (also Qwen3-235B) gets the query summary and the original 50 pages with low context, and decides which pages are most relevant to the research topic. The Filterer then outputs the URLs of the relevant pages, which are re-fetched (with more context) and sent to the Researcher.
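For anyone curious what that looks like in code, here's a minimal Python sketch of the idea (not the actual Nymbo/Tools implementation; `chat` and `fetch_page` are hypothetical stand-ins for the real Cerebras/Qwen3-235B call and page fetcher):

```python
import json

def filter_results(query_summary: str, results: list[dict], chat) -> list[str]:
    """Filterer pass: show the model every result with only a short snippet
    and ask it which URLs are worth keeping."""
    listing = "\n".join(f"- {r['url']}: {r['snippet'][:300]}" for r in results)
    prompt = (
        f"Research topic: {query_summary}\n\n"
        f"Candidate pages (low context):\n{listing}\n\n"
        "Reply with a JSON array containing only the URLs relevant to the topic."
    )
    return json.loads(chat(prompt))

def refetch_relevant(urls: list[str], fetch_page) -> list[str]:
    """Re-fetch only the surviving pages, this time with a larger context budget."""
    return [fetch_page(url, max_chars=8000) for url in urls]
```

The shape is the point: one cheap pass over all 50 snippets, then a richer fetch of only the survivors.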
# Researcher Context
The Researcher now gets only the relevant webpages and then begins writing the report. When testing with 50 initial results, the Researcher would often end up with 10-20 relevant pages of context.
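The Researcher side, under the same assumptions (hypothetical `chat` helper, pages already filtered and re-fetched), might look like this:

```python
def write_report(query_summary: str, pages: list[str], chat) -> str:
    """Researcher pass: only the filtered, re-fetched pages make it into the prompt."""
    sources = "\n\n".join(f"[Source {i + 1}]\n{page}" for i, page in enumerate(pages))
    prompt = (
        f"Write a research report on: {query_summary}\n\n"
        f"Use only the sources below and cite them by number.\n\n{sources}"
    )
    return chat(prompt)
```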
The whole run still finishes in under a minute, thanks entirely to Cerebras inference: about 35-45 seconds from the moment the tool is invoked.
It's also worth noting that both the Filterer and the Researcher are now given the current date/time before they see any content, reducing hallucinations caused by knowledge cutoffs.
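One simple way to wire that in (again just a sketch, not the tool's exact prompt template) is to prefix both system prompts with the timestamp:

```python
from datetime import datetime, timezone

def with_current_time(system_prompt: str) -> str:
    """Prefix a prompt with the current UTC date/time so the model doesn't
    lean on its knowledge cutoff when judging what's recent."""
    now = datetime.now(timezone.utc).strftime("%A, %B %d, %Y %H:%M UTC")
    return f"Current date and time: {now}\n\n{system_prompt}"

filterer_system = with_current_time("Select only the pages relevant to the research topic.")
researcher_system = with_current_time("Write a report using only the provided sources.")
```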