| Name | GMTKN55 | Folmsbee | TorsionNet206 | Wiggle150 | Overall |
|---|---|---|---|---|---|
| UMA Small 1.1 (OMol task) | 4.38 | 0.21 | 0.14 | 0.90 | 1.52 |
| OMol25's eSEN Conserving Small | 4.40 | 0.21 | 0.14 | 0.91 | 1.52 |
| UMA Medium 1.1 (OMol task) | 4.65 | 0.21 | 0.14 | 0.88 | 1.60 |
| OrbMol (Orb-v3 Conservative OMol) | 4.66 | 0.22 | 0.15 | 0.94 | 1.61 |
| B97-3c | 10.16 | 0.30 | 0.35 | 2.32 | 3.46 |
| AIMNet2 (ωB97M-D3, new) | 14.66 | 0.54 | 0.39 | 2.35 | 4.96 |
| AIMNet2-NSE | 17.32 | 0.55 | 0.41 | 3.03 | 5.83 |
| Prescient's StrainRelief MACE | 19.55 | 0.57 | 0.61 | 4.46 | 6.64 |
| GFN2-xTB | 18.65 | 0.72 | 0.73 | 14.60 | 6.82 |
| Orb-v3 (Conservative Inf. OMat) | 21.33 | 0.88 | 0.97 | 7.70 | 7.53 |
| eSEN-OAM | 22.16 | 0.84 | 0.69 | 7.05 | 7.67 |
| MACE-MP-0b2(Large)-D3BJ | 26.76 | 0.81 | 1.12 | 14.60 | 9.49 |

The "Overall" score is a weighted average of the component benchmarks. We first assigned each benchmark a difficulty score, computed as the ratio of the GFN2-xTB and B3LYP MAEs on that benchmark. Each benchmark's weight is the product of its difficulty score and the number of systems it contains, and the "Overall" score is the average of the benchmark results under these weights. The resulting weights, which sum to 1, are:

| Benchmark | Weight |
|---|---|
| GMTKN55 | 0.31 |
| Folmsbee | 0.38 |
| TorsionNet206 | 0.27 |
| Wiggle150 | 0.04 |
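
As a rough check on the weighting described above, the sketch below recomputes an "Overall" score from the listed weights. It assumes the numeric columns of the results table follow the same order as the weight table (GMTKN55, Folmsbee, TorsionNet206, Wiggle150), which the source does not state explicitly; the function name `overall_score` is hypothetical.

```python
# Benchmark weights as listed in the weight table.
WEIGHTS = {
    "GMTKN55": 0.31,
    "Folmsbee": 0.38,
    "TorsionNet206": 0.27,
    "Wiggle150": 0.04,
}


def overall_score(errors, weights=WEIGHTS):
    """Return the weighted average of per-benchmark errors."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * errors[name] for name in weights)
    return weighted_sum / total_weight


# B97-3c row, assuming the column order described above.
b97_3c = {
    "GMTKN55": 10.16,
    "Folmsbee": 0.30,
    "TorsionNet206": 0.35,
    "Wiggle150": 2.32,
}
score = overall_score(b97_3c)
print(f"B97-3c overall: {score:.2f}")
```

The result lands within about 0.01 of the 3.46 reported for B97-3c; a gap of that size is what one would expect from weights rounded to two decimals.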

| Name | ||||
|---|---|---|---|---|
| AIMNet2 (ωB97M-D3, new) | 25 | 12.88 | 21 | 0.16 |
| AIMNet2-NSE | 24 | 100.88 | 14 | 0.67 |
| eSEN-OAM | 24 | 109.46 | 19 | 0.21 |
| OMol25's eSEN Conserving Small | 24 | 100.88 | 18 | 0.25 |
| UMA Small 1.1 (OMol task) | 24 | 101.21 | 17 | 0.38 |
| OrbMol (Orb-v3 Conservative OMol) | 14 | 88.71 | 9 | 0.50 |

| Name | X23b Lattice Energies | X23b Cell Volumes | Overall |
|---|---|---|---|
| r²SCAN-3c | 0.97 | 3.68 | 1.71 |
| UMA Small 1.1 (OMol task) | 1.43 | 2.70 | 1.78 |
| UMA Medium 1.1 (OMol task) | 2.23 | 3.02 | 2.45 |
| UMA Small 1.1 (OMC task) | 1.57 | 4.79 | 2.46 |
| Egret-1 | 2.61 | 4.38 | 3.09 |
| MACE-MP-0b2(Large)-D3BJ | 3.47 | 3.03 | 3.35 |
| GFN2-xTB | 5.38 | 7.76 | 6.03 |
| UMA Medium 1.1 (OMC task) | 9.21 | 8.78 | 9.09 |
| AIMNet2 (ωB97M-D3, new) | 9.32 | 19.41 | 12.10 |
| eSEN-OAM | 9.16 | 32.35 | 15.54 |
| Orb-v3 (Conservative Inf. OMat) | 26.79 | 12.30 | 22.80 |

The "Overall" score is a weighted average of the component benchmarks. We first assigned each benchmark a difficulty score, computed as the ratio of the GFN2-xTB and r²SCAN-3c MAEs on that benchmark. Each benchmark's weight is the product of its difficulty score and the number of systems it contains, and the "Overall" score is the average of the benchmark results under these weights. The resulting weights, which sum to 1, are:

| Benchmark | Weight |
|---|---|
| X23b Lattice Energies | 0.72 |
| X23b Cell Volumes | 0.28 |
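
One way to write the weighting down, as a sketch rather than the authors' exact formula: let $d_i$ be the difficulty score of benchmark $i$, $n_i$ its number of systems, and $e_i$ a method's error on it. The symbols, the direction of the MAE ratio (GFN2-xTB over the reference method), and the normalization are assumptions; the source only says that the ratio is taken "between" the two methods and that a weighted average is computed.

$$
d_i = \frac{\mathrm{MAE}_i^{\text{GFN2-xTB}}}{\mathrm{MAE}_i^{\text{ref}}},
\qquad
w_i = \frac{d_i\, n_i}{\sum_j d_j\, n_j},
\qquad
\text{Overall} = \sum_i w_i\, e_i
$$

Taking the r²SCAN-3c row and assuming its first value is the lattice-energy error and its second the cell-volume error:

$$
0.72 \times 0.97 + 0.28 \times 3.68 = 0.6984 + 1.0304 = 1.7288 \approx 1.73
$$

This is close to the listed 1.71; the small gap is consistent with the weights being rounded to two decimals.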

| Name | Time per step (s) | Speed (ns/day) |
|---|---|---|
| AIMNet2 (ωB97M-D3, new) | 0.02 | 4.67 |
| AIMNet2-NSE | 0.02 | 4.52 |
| OrbMol (Orb-v3 Conservative OMol) | 0.03 | 2.51 |
| OMol25's eSEN Conserving Small | 0.10 | 0.85 |
| UMA Small 1.1 (OMol task) | 0.12 | 0.71 |
| MACE-MP-0b2(Large)-D3BJ | 0.13 | 0.68 |
| eSEN-OAM | 0.41 | 0.21 |
| UMA Medium 1.1 (OMol task) | 0.55 | 0.16 |

This benchmark measures the speed of molecular dynamics (MD) simulations of tacrolimus (126 atoms), run through ASE with a 1 fs timestep at 300 K for 50 steps. All calculations were run on A10G GPUs through Modal.
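
The two numeric columns in the speed table appear to be wall-clock seconds per MD step and simulated nanoseconds per day. That reading is an inference from the numbers and the 1 fs timestep, not something the source states. Under that assumption, the conversion can be sketched as follows; `ns_per_day` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def ns_per_day(seconds_per_step, timestep_fs=1.0):
    """Convert wall-clock time per MD step into simulated ns per day."""
    steps_per_day = SECONDS_PER_DAY / seconds_per_step
    return steps_per_day * timestep_fs / FS_PER_NS


# Values taken from the first numeric column of the table (assumed s/step).
for seconds_per_step in (0.10, 0.41, 0.55):
    print(f"{seconds_per_step:.2f} s/step -> {ns_per_day(seconds_per_step):.2f} ns/day")
```

For the slower models the converted values agree with the second column to within rounding. For the fastest entries (0.02 to 0.03 s/step) the two-decimal step time is too coarse for the conversion to reproduce the listed throughput closely.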