(RAG and Boolean)

Retrieval Augmented Generation and academic search engines – some suggestions for system builders

Aaron Tay persists with his awful Blogger theme, but luckily his detailed expert posts are worth it. This one takes us through the ins and outs of scholarly search tools based on Retrieval Augmented Generation (RAG). This is the technique that drives the common “summary with sources” approach, which is sometimes claimed to be “free of hallucinations”. Tay explains why this isn't really true, and offers some interesting suggestions for how tools aimed at academic researchers could be better designed.

Boolean is Dead AND I feel fine

Aaron Tay's post pairs well with this one from Mita Williams back in February. Here we get a crash course in the basics of LLM tokenisation, and an explanation of what Tay meant when he said Boolean searching doesn't really work any more with AI-driven semantic search tools. Also, apparently DIALOG still exists??!!


Libraries and Learning Links of the Week is published every week by Hugh Rundle.

Subscribe by following @fedi@lllotw.hugh.run on the fediverse or sign up for email below.