ModelBench Bend Tutorial

add-a-sut.md

There are several ways to add a new SUT (system under test) to ModelBench; this example shows the easiest. It assumes that you want to create your own SUT -- a process that is ...

GitHub

Run safety benchmarks against AI models and view detailed reports showing how well they performed.

This is an MLCommons project, part of the AI Risk & Reliability Working Group. The project is at an early stage. You can see sample benchmarks here and our 0.5 white paper here. This project now ...


