Run #65

qwen3:14b unknown · Ollama · started 2026-05-12 11:39:32
14.8B Q4_K_M ctx 41k qwen3 🧠 reasoning 🔧 tools
aborted
reaper:worker_dead
Current adapter lm_eval_harness.ifeval
Samples 2819 / 2819 (100%)
Errors 0
Last heartbeat 14:40:48
Finished 2026-05-12 14:41:13
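
The timestamps above are enough to work out how long the run lasted and how stale the last heartbeat was when the run was marked as finished. The following is a minimal sketch of that arithmetic, not part of the benchmark tool itself. It assumes the heartbeat, which is shown only as a time of day, falls on the same date as the finish timestamp; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the run header.
started = datetime.fromisoformat("2026-05-12 11:39:32")
finished = datetime.fromisoformat("2026-05-12 14:41:13")

# Assumption: the heartbeat is displayed without a date, so we reuse
# the date of the finish timestamp.
heartbeat_time = datetime.strptime("14:40:48", "%H:%M:%S").time()
last_heartbeat = datetime.combine(finished.date(), heartbeat_time)

run_duration = finished - started          # total wall-clock time of the run
heartbeat_gap = finished - last_heartbeat  # silence before the run was closed

print(f"run duration: {run_duration}")                          # 3:01:41
print(f"heartbeat gap: {heartbeat_gap.total_seconds():.0f} s")  # 25 s
```

Under these assumptions the run lasted about three hours, and the worker had been silent for 25 seconds when the run was closed with status reaper:worker_dead.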

Strengths & Weaknesses

Based on this run's pass rates

Strengths

No sub-benchmarks in the "good" range.

Weaknesses

  • humaneval (0%)
  • GSM8K, grade-school math (1.6%)

Telemetry

GPU utilization (%)
VRAM (MB)

Snapshots

Configuration
7 fields
{
    "name": "LM-Eval ALL",
    "provider_id": null,
    "model_id": null,
    "benchmarks": [
        {
            "adapter_key": "lm_eval_harness",
            "sub_benchmarks": [
                "gsm8k",
                "humaneval",
                "ifeval"
            ],
            "threshold_override": null
        }
    ],
    "tags": [],
    "notes": null,
    "model": {
        "base_name": "qwen3:14b",
        "quantization": "unknown",
        "format": "gguf",
        "source_url": null,
        "build_notes": "self-compiled llmama with TurboQuant3 for weights",
        "checksum": null
    }
}
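
The adapter shown in the run header (lm_eval_harness.ifeval) looks like the adapter_key joined to a sub-benchmark name with a dot. As a hedged illustration, the sketch below derives such names from the benchmarks part of the configuration. The naming rule is an assumption inferred from this page, not a documented format, and the JSON is trimmed to the fields it needs.

```python
import json

# Trimmed copy of the "benchmarks" part of the configuration above.
config_json = """
{
    "name": "LM-Eval ALL",
    "benchmarks": [
        {
            "adapter_key": "lm_eval_harness",
            "sub_benchmarks": ["gsm8k", "humaneval", "ifeval"],
            "threshold_override": null
        }
    ]
}
"""

config = json.loads(config_json)

# Assumption: a full adapter name is "<adapter_key>.<sub_benchmark>".
adapter_names = [
    f"{bench['adapter_key']}.{sub}"
    for bench in config["benchmarks"]
    for sub in bench["sub_benchmarks"]
]

print(adapter_names)
```

If that rule holds, ifeval is the last of the three configured sub-benchmarks, which matches it being the current adapter when the run ended.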
Provider
7 fields
{
    "name": "Ollama",
    "type": "ollama",
    "endpoint_url": "http://100.64.0.4:11434/",
    "api_key_env_var": null,
    "sampling_params": [],
    "provider_specific": [],
    "telemetry_sample_interval_ms": 1000
}
Hardware
1 field
[
    {
        "name": "kim",
        "hostname": "100.64.0.4",
        "gpu_description": "RTX 5080 16GB",
        "cpu": "Ryzen 9800 X3D",
        "ram": "64GB DDR5",
        "storage": "1TB+4TB SSD",
        "network": null,
        "notes": null
    }
]
System
6 fields
{
    "php_version": "8.4.21",
    "os": "Linux",
    "os_release": "6.8.0-111-generic",
    "symfony_version": "7.4.10",
    "provider_version_hint": null,
    "recorded_at": "2026-05-12T11:39:32+02:00"
}

Log directory

/home/webuser/htdocs/llmbench.mandarin.dev/dev/app/var/logs/runs/65