Benchmarking AI models: Metrics, evaluations & leaderboards

Measuring whether your AI security systems are actually getting better.

Presented by Hack The Box

HACK THE BOX WEBINAR

25 March 2026

4 PM GMT / 11 AM EST

Online

Free

Limited Spaces Available

Overview

As organizations experiment with AI for security, a key question arises: how do we measure an AI agent’s skills and improvements?

This webinar dives deeper into benchmarking and standard evaluations for AI in cyber operations. We will discuss which performance metrics actually matter: Accuracy in detecting threats? Success rate in exploiting vulnerabilities? Speed of response? Mean time between failures?
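As a rough illustration of how the metrics named above could be computed, here is a minimal Python sketch. It is not HTB's methodology: the `Run` record, its fields, and all function names are hypothetical, and the definitions used (fraction of correct predictions for accuracy, total operating time divided by failure count for mean time between failures) are the common textbook ones.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """One hypothetical evaluation run of an agent on one challenge."""
    solved: bool     # did the agent complete the task?
    seconds: float   # wall-clock time the run took


def success_rate(runs):
    """Fraction of runs in which the agent solved the challenge."""
    return sum(r.solved for r in runs) / len(runs)


def mean_time_to_solve(runs):
    """Average duration of successful runs, or None if nothing was solved."""
    times = [r.seconds for r in runs if r.solved]
    return mean(times) if times else None


def detection_accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def mean_time_between_failures(operating_seconds, failure_count):
    """Total operating time divided by failures, or None if there were none."""
    return operating_seconds / failure_count if failure_count else None
```

Which of these matters most depends on the use case: a detection agent is usually judged on accuracy, an offensive agent on success rate and time to solve.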

The session will highlight the approach and methodology behind HTB’s AI Range, which provides board‑ready scorecards and leaderboards for comparing AI models on common security problems. For instance, we will walk through a sample leaderboard of AI agents tasked with OWASP Top 10 web application vulnerabilities, illustrating how the main foundation models stack up in terms of vulnerabilities found, time taken, and more. With accurate comparisons in place, attendees will learn how to prove whether a security AI tool is effective or improving over time, which is crucial for justifying investments in AI.
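To make the leaderboard idea concrete, the sketch below ranks agents by vulnerabilities found, breaking ties by time taken. The ranking rule, the `Result` fields, and the agent names and numbers are all illustrative placeholders, not HTB AI Range data or its actual scoring.

```python
from typing import NamedTuple


class Result(NamedTuple):
    """Hypothetical per-agent summary for one benchmark."""
    agent: str
    vulns_found: int
    minutes: float


def leaderboard(results):
    """Rank by most vulnerabilities found, then by least time taken."""
    return sorted(results, key=lambda r: (-r.vulns_found, r.minutes))


# Placeholder values, for illustration only.
rows = [
    Result("agent-a", 7, 42.0),
    Result("agent-b", 9, 55.0),
    Result("agent-c", 7, 30.0),
]

for rank, r in enumerate(leaderboard(rows), start=1):
    print(f"{rank}. {r.agent}: {r.vulns_found} found in {r.minutes:.0f} min")
```

A real scorecard would likely weight more dimensions, but a fixed, documented ordering rule is what keeps comparisons repeatable over time.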

Key Takeaways:

  • Learn which KPIs and metrics are most useful for evaluating AI in security, and the depth of telemetry that HTB AI Range can provide with its benchmarks.
  • See how the HTB methodology can drive improvement by benchmarking AI agents on a continuously updated pool of challenges.
  • Get an update on industry efforts toward standardized AI security evaluations, so you can align your team’s testing with broader best practices and speak the same performance language as your peers and competitors.

Speakers

Giacomo Bertollo

Strategic Product Marketing Manager, AI Solutions @ Hack The Box

Niko Maroulis

VP Artificial Intelligence @ Hack The Box
