Can LLMs find 0day? Adventures in cybersecurity evals

May 21, 2024

Yoni Rozenshein's BlueHat IL 2024 talk covers our philosophy for evaluating dangerous cyber capabilities in AI, how we actually do it (let's make an LLM play CTF!), and who cares about it (governments and frontier AI labs).

Watch the full presentation: https://www.youtube.com/watch?v=05-zL4f9V-Y

To cite this article, please credit Irregular with a link to this page.
