LLM error rates

 

I worked on LLMs, and now I got opinions. Today, let’s talk about when LLMs make mistakes.

On AI Slop

You’ve already heard of LLM mistakes, because you’ve seen them in the news.

[…]

LLMs could be used to summarize sources. Something that’s fairly obvious in my journal club is that many researchers are just citing papers they found on Google, and can’t always be bothered to actually read the things. So, an LLM could read the things. Clearly, this is a task with some error tolerance: we don’t fire researchers who inaccurately summarize their sources, we just complain about them in journal clubs. Could researchers do better with LLM assistance? Or would they become overreliant and do even worse? We don’t know until we test it.

I can’t tell you what LLMs will be useful for; I’m only here to help you think about the question. If you expect LLMs to have god-like reasoning skills, or to magically know things they were never taught, it’s not going to work. If you expect them to perform well on a task that realistically requires 99+% accuracy, they don’t do that. But if the LLM is taking over a task that would otherwise be done by humans, remember that humans are also extremely prone to error, so there must already be some tolerance for error. In this case the LLM doesn’t need to be perfect, it just needs to do better than our sorry human asses.
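If you want to make "does better than humans" concrete, one way is to measure both error rates on the same task and check whether the gap is bigger than noise. Here’s a minimal sketch in Python using a two-proportion z-test; the function name and every count in it are hypothetical placeholders, not results from any real evaluation.

```python
# A minimal sketch of "does the LLM beat the human baseline?" on one task.
# All counts below are made up; plug in numbers from your own evaluation.
from math import sqrt
from statistics import NormalDist

def error_rate_comparison(llm_errors, llm_trials, human_errors, human_trials):
    """Two-proportion z-test: is the LLM's error rate lower than the humans'?"""
    p_llm = llm_errors / llm_trials
    p_human = human_errors / human_trials
    # Pooled proportion under the null hypothesis of equal error rates.
    p_pool = (llm_errors + human_errors) / (llm_trials + human_trials)
    se = sqrt(p_pool * (1 - p_pool) * (1 / llm_trials + 1 / human_trials))
    z = (p_llm - p_human) / se
    # One-sided p-value for "the LLM's error rate is lower".
    p_value = NormalDist().cdf(z)
    return p_llm, p_human, p_value

# Hypothetical: 12 bad summaries out of 200 for the LLM,
# 31 bad summaries out of 200 for unassisted researchers.
print(error_rate_comparison(12, 200, 31, 200))
```

In practice the hard part isn’t the arithmetic, it’s the numerator: deciding what counts as an error, and counting it the same way for humans and LLMs.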

Or maybe the LLM doesn’t need to be better than humans, it just needs to be cheaper. Sorry to say, LLMs often aren’t competing with humans on quality, but on price. —freethoughtblogs.com
