Can LLMs find 0day? Adventures in cybersecurity evals

May 21, 2024

    Yoni Rozenshein's BlueHat IL 2024 talk covers our philosophy for evaluating dangerous AI cyber capabilities, how we actually do it (let's make an LLM play CTF!), and who cares about it (governments and frontier AI labs).

    Watch the full presentation: https://www.youtube.com/watch?v=05-zL4f9V-Y

    To cite this article, please credit Irregular with a link to this page.
