Using LLMs for web search

I have two main use-cases for LLMs: writing code and searching the web. There’s a lot of discussion online about LLM-assisted programming, but very little about LLM-assisted search. So I’m just writing down some unstructured thoughts in the hope that they generate some discussion.


OpenAI calls their web search feature ChatGPT Deep Research, Google has Gemini Deep Research, and Anthropic just uses the word Research. Regardless of the label, all these products work the same way:

  • You enter a prompt, just like in any other LLM workflow.
  • The LLM may ask you a clarifying question (for some reason it’s always a single clarifying question, never more than that).
  • The LLM searches the web for pages matching your prompt using a traditional search engine.
  • It uses the web pages it found to generate a long report, citing its sources wherever it makes a claim within the text of the report.
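My mental model of that loop, written as a rough Python sketch. Every helper here is a placeholder I made up, not any vendor's real API; the actual products presumably do something far more elaborate:

```python
def ask_llm(prompt: str) -> str:
    """Placeholder for a call to the underlying language model."""
    raise NotImplementedError

def web_search(query: str) -> list[str]:
    """Placeholder for a call to a traditional search engine; returns URLs."""
    raise NotImplementedError

def fetch_page(url: str) -> str:
    """Placeholder for fetching and extracting the text of a web page."""
    raise NotImplementedError

def deep_research(user_prompt: str) -> str:
    # 1. One clarifying question (somehow it's always exactly one).
    clarification = ask_llm(f"Ask one clarifying question about: {user_prompt}")

    # 2. Turn the prompt (plus the clarification) into search-engine keywords.
    queries = ask_llm(
        f"List search queries for: {user_prompt}\nClarification: {clarification}"
    ).splitlines()

    # 3. Search, then read the pages that look promising.
    sources: dict[str, str] = {}
    for query in queries:
        for url in web_search(query):
            sources[url] = fetch_page(url)

    # 4. Write a long report, citing the URL of every source used for a claim.
    return ask_llm(
        "Write a report answering the prompt below. Cite the URL of each "
        f"source next to the claim it supports.\n\nPrompt: {user_prompt}\n\n"
        f"Sources: {list(sources)}"
    )
```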

I like this feature a lot, and I lean on it heavily when I’m looking for high-quality long-form writing by actual human beings on a topic that’s new to me. I use Kagi for most of my search queries, but it helps to turn to an LLM when I’m completely unfamiliar with a topic, when I’m not even sure what keywords to search for.

I rarely ask LLMs factual questions because I don’t trust the answers they return. Unless I have a way to verify their output, I consider anything an LLM says to be a hallucination. This is doubly true when I’m learning something new. If I’m new to Rust, how can I be sure that the Rust best practices Claude is telling me about are truly the best of all Rust practices?

I find it much easier to trust LLM-generated output when it cites web pages I can read myself. This lets me verify that the information comes from an entity I can trust, an entity that’s (hopefully) a real person or institution with real expertise. Grounding the LLM in web search also means I’m likely to find more up-to-date information that’s not in its training data yet (though, in my experience, this is not always guaranteed).


Whenever I get curious about something these days, I write down a detailed question, submit it to Claude, and go off to get some work done while it churns away browsing hundreds of web pages looking for answers. Sometimes I end up with multiple browser tabs with a different search query running in each of them.

For most questions, Claude is able to write me a detailed report with citations in five to ten minutes, and it looks at about a hundred pages in the process. Sometimes it decides to spend a lot more time browsing, reading several hundred web pages. For one question, it churned away for almost seventeen minutes and looked at five hundred and fifty sources to generate its report.

I don’t actually care for the report Claude produces at the end of its research process. I almost never read it. It’s a whole lot of LLM slop: unreadable prose, needlessly verbose, often misrepresenting the very sources it quotes. I only care about the links it finds, which are usually entirely different from what I get out of Kagi.

My workflow is to skim the report to get an idea of its general structure, open all the links in new tabs, and close the report itself. I wish there were a mode where the “report” could just be a list of useful links, though I suppose I could accomplish that with some clever prompting.


The web pages Claude surfaces always surprise me. I can’t tell what search index it uses, what search keywords it uses under the hood, or how it decides which links are worth clicking. Whatever the secret sauce is, I regularly end up on web pages that I’d never be able to find through a regular search engine: personal websites that were last updated 20 years ago, columns from long-defunct magazines, ancient Blogger and LiveJournal blogs, pages hidden deep inside some megacorporation’s labyrinthine support website, lecture notes hosted on university websites, PDFs from exposed wp-content directories, and other unexpected wonders from a web that search engines try their best to hide away.

Sometimes Claude links to pages that aren’t even online anymore! How is it able to cite these pages if it can’t actually read them? Unclear. I often have to pull up a cached version of such pages using the Internet Archive. For example, one of the reports it produced linked to On Outliners and Outline Processing and Getting Started Blogging with Drummer, both of which no longer exist on the web.
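Pulling those dead links out of the Archive is easy enough to script. Here’s a minimal sketch against the Wayback Machine’s public availability endpoint; the JSON field names are an assumption based on what the endpoint returned the last time I used it, not something the Archive guarantees:

```python
# Look up the closest archived snapshot of a dead link via the Internet
# Archive's availability endpoint. The response shape (archived_snapshots ->
# closest -> url/available) is assumed from past use of the API.
import json
import urllib.parse
import urllib.request

def closest_snapshot(url: str) -> str | None:
    query = urllib.parse.urlencode({"url": url})
    endpoint = f"https://archive.org/wayback/available?{query}"
    with urllib.request.urlopen(endpoint) as resp:
        data = json.load(resp)
    snapshot = data.get("archived_snapshots", {}).get("closest")
    return snapshot["url"] if snapshot and snapshot.get("available") else None

# Example: feed it one of the dead citations from a report.
print(closest_snapshot("http://example.com/some-dead-page"))
```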


I can’t tell whether any major LLM providers take their web search products seriously (outside of Perplexity, which is not technically an LLM provider). I certainly haven’t seen them change much since they were introduced, and nobody seems to talk about them very much. For me, though, web search is one of the main reasons I use LLMs at all. That’s partly why I’m giving Anthropic that $20 every month.

I have a long wishlist of features I want to see in LLM-powered search products:

  • Stop hiding away web search inside a menu! Let me directly click “New search” in the same way I click “New chat” or “Code” in the Claude sidebar.
  • Let me edit the LLM’s research plan before it starts searching (Gemini lets me do this to some degree).
  • Let me edit the keywords the LLM will use to start its research process. Or, let me ask it to automatically refine those keywords before it begins.
  • What if the LLM finds new information online that re-contextualizes my original query? In those cases, allow it to interrupt its research process and ask for clarifications.
  • Let me look at the raw search results for each keyword the LLM searched for.
  • Add a mode where the LLM picks the best search results for me and only returns a list of links, like a traditional search engine.
  • Let me use “lenses” in the same way I use them on Kagi. Allow me to restrict my sources to particular categories (e.g. social media, personal blogs, news websites, or academic journals).
  • Let me uprank/downrank/ban certain sources in the same way Kagi allows.

Maybe Kagi Assistant will grow into this in the future? Maybe I should try using Perplexity? I’ve had meh experiences with both these products, and I’m not sure whether they can compete with the quality of results ChatGPT/Claude/Gemini surface.


Anyway, yeah. I like LLM-powered search.