Samples · lm_eval_harness.gsm8k
Run #75 · Adapter v1.0.0+humaneval-removed+gen-kwargs-pairing · 200/1319 samples shown · Score 90.4%
AI Evaluation
Generated 2026-05-13 21:38 · claude-sonnet-4-6
Summary
The model Qwen3-Coder-Next achieves a pass rate of 92.5% (score 90.4%) on GSM8K, a solid but not outstanding result for multi-step grade-school math.
Strengths
- Easy and medium-difficulty arithmetic problems are solved reliably, with a clean solution path.
- Unit conversions and linear multi-step problems (groceries, pool-filling costs, percentages) are handled consistently.
- Zero errors (errors=0): the model never aborts or returns invalid output.
Weaknesses
- Problems with indirect or implicit references are misinterpreted, e.g. "running 10% faster" is treated as a time reduction via division by 1.1 rather than as a direct subtraction.
- Off-by-one errors on inclusive time spans (e.g. the Gene quilt-block problem: 12 instead of 11 years).
- Ambiguous problem statements invite over-analysis, which leads the model to introduce incorrect relations in some cases (e.g. Lylah's salary).
- Probability problems: the model computes correctly but misreads the question (relative instead of absolute difference).
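The three interpretation pitfalls listed above can be made concrete with a small sketch. All numbers are hypothetical and chosen only for illustration; they are not taken from the GSM8K items, and the helper names are assumptions introduced here.

```python
from fractions import Fraction

# Hypothetical helpers; the inputs used below are illustrative only
# and do not come from the benchmark items.

def time_if_speed_up(base_time, pct):
    """Reading A: speed rises by pct percent, so time is divided by (1 + pct/100)."""
    return base_time / (1 + Fraction(pct, 100))

def time_if_time_cut(base_time, pct):
    """Reading B: pct percent is subtracted directly from the time."""
    return base_time * (1 - Fraction(pct, 100))

def years_exclusive(start, end):
    """Elapsed years between two dates (end minus start)."""
    return end - start

def years_inclusive(start, end):
    """Number of calendar years counted when both endpoints are included."""
    return end - start + 1

def absolute_difference(p_a, p_b):
    """Difference in probability, i.e. in percentage points."""
    return p_a - p_b

def relative_difference(p_a, p_b):
    """Difference expressed as a fraction of p_b."""
    return (p_a - p_b) / p_b

# The two "10% faster" readings disagree on the same input.
reading_a = time_if_speed_up(110, 10)   # 110 / 1.1 = 100
reading_b = time_if_time_cut(110, 10)   # 110 * 0.9 = 99

# Inclusive and exclusive counts differ by exactly one.
span_exclusive = years_exclusive(2010, 2020)  # 10
span_inclusive = years_inclusive(2010, 2020)  # 11

# Same pair of probabilities, two different "differences".
abs_diff = absolute_difference(Fraction(1, 2), Fraction(1, 4))  # 1/4
rel_diff = relative_difference(Fraction(1, 2), Fraction(1, 4))  # 1
```

Which reading a benchmark item expects depends on its reference solution, so a mathematically defensible reading can still be scored as failed.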
Notable Observations
Recurring pattern: on problems that call for a single, short answer, the model produces lengthy alternative lines of reasoning and misses the simple result being asked for. This points to a tendency toward over-answering (verbosity bias).
Recommendation
Lower the sampling temperature (e.g. to 0.0, i.e. greedy decoding) to keep the model from pursuing speculative alternative paths on clear-cut numeric problems and to push the pass rate toward 95% or higher.
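As a rough illustration of why a lower temperature suppresses alternative paths, the following minimal sketch applies temperature scaling to a softmax over next-token scores. The logits are hypothetical, and the function names are assumptions; this is not the harness's or the model's actual decoding code.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use greedy_pick for T = 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Greedy decoding: always choose the index of the highest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical scores for three candidate next tokens.
example_logits = [2.0, 1.5, 0.5]

probs_default = softmax_with_temperature(example_logits, 1.0)
probs_cold = softmax_with_temperature(example_logits, 0.1)
greedy_index = greedy_pick(example_logits)
```

At temperature 1.0 the lower-ranked candidates keep a substantial share of the probability mass; at 0.1 nearly all of it sits on the top candidate, which is the behavior greedy decoding makes exact. The parameter names used to request this in a given evaluation setup depend on the backend.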
Overview
1319 samples
Distribution
[Score histogram, x-axis from 0.0 to 1.0]
| Question ID | Status | Score | Prompt | Latency | Tokens/s | TTFT | |
|---|---|---|---|---|---|---|---|
| 1000 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Rose bou… | — | — | — | ||
| 1001 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Thomas h… | — | — | — | ||
| 1002 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Nala fou… | — | — | — | ||
| 1003 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John's g… | — | — | — | ||
| 1004 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ray buys… | — | — | — | ||
| 1005 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Calvin a… | — | — | — | ||
| 1006 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jacob ca… | — | — | — | ||
| 1007 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Lidia bo… | — | — | — | ||
| 1008 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The aver… | — | — | — | ||
| 1009 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A man de… | — | — | — | ||
| 1010 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jason is… | — | — | — | ||
| 1011 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Rebecca … | — | — | — | ||
| 1012 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The Peri… | — | — | — | ||
| 1013 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Quinn ca… | — | — | — | ||
| 1014 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A footba… | — | — | — | ||
| 1015 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Nancy bu… | — | — | — | ||
| 1016 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ian is l… | — | — | — | ||
| 1017 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Carrie i… | — | — | — | ||
| 1018 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Carla is… | — | — | — | ||
| 1019 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A farmer… | — | — | — | ||
| 1020 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Emma buy… | — | — | — | ||
| 1021 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tom orig… | — | — | — | ||
| 1022 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ken crea… | — | — | — | ||
| 1023 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A pie sh… | — | — | — | ||
| 1024 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A car us… | — | — | — | ||
| 1025 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: It\u2019… | — | — | — | ||
| 1026 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Milly is… | — | — | — | ||
| 1027 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Barney c… | — | — | — | ||
| 1028 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mr. Bodh… | — | — | — | ||
| 1029 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Daragh h… | — | — | — | ||
| 1030 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Christia… | — | — | — | ||
| 1031 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Josie's … | — | — | — | ||
| 1032 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: On a far… | — | — | — | ||
| 1033 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A baker … | — | — | — | ||
| 1034 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The age … | — | — | — | ||
| 1035 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Kobe and… | — | — | — | ||
| 1036 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 1037 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James ca… | — | — | — | ||
| 1038 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A bag of… | — | — | — | ||
| 1039 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Last yea… | — | — | — | ||
| 1040 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The tota… | — | — | — | ||
| 1041 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: It takes… | — | — | — | ||
| 1042 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Kobe and… | — | — | — | ||
| 1043 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Lilith i… | — | — | — | ||
| 1044 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Nine hun… | — | — | — | ||
| 1045 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Daria is… | — | — | — | ||
| 1046 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Megan pa… | — | — | — | ||
| 1047 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Every ye… | — | — | — | ||
| 1048 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John cli… | — | — | — | ||
| 1049 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Michelle… | — | — | — | ||
| 1050 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Joseph a… | — | — | — | ||
| 1051 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: After a … | — | — | — | ||
| 1052 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: LaKeisha… | — | — | — | ||
| 1053 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: One of t… | — | — | — | ||
| 1054 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Brenda v… | — | — | — | ||
| 1055 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Chantal … | — | — | — | ||
| 1056 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Celina e… | — | — | — | ||
| 1057 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mr. Mitc… | — | — | — | ||
| 1058 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In 5 yea… | — | — | — | ||
| 1059 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Smaug th… | — | — | — | ||
| 1060 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Martha h… | — | — | — | ||
| 1061 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: If Sally… | — | — | — | ||
| 1062 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: At a gym… | — | — | — | ||
| 1063 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Matt is … | — | — | — | ||
| 1064 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Two-thir… | — | — | — | ||
| 1065 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Janice h… | — | — | — | ||
| 1066 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Phoebe e… | — | — | — | ||
| 1067 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: It takes… | — | — | — | ||
| 1068 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John sco… | — | — | — | ||
| 1069 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: After sh… | — | — | — | ||
| 1070 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A crayon… | — | — | — | ||
| 1071 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: At a CD … | — | — | — | ||
| 1072 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John buy… | — | — | — | ||
| 1073 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Christin… | — | — | — | ||
| 1074 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In five … | — | — | — | ||
| 1075 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Uncle Ju… | — | — | — | ||
| 1076 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Marianne… | — | — | — | ||
| 1077 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Helen cu… | — | — | — | ||
| 1078 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jackie w… | — | — | — | ||
| 1079 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Hattie a… | — | — | — | ||
| 1080 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Pima inv… | — | — | — | ||
| 1081 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Beatrice… | — | — | — | ||
| 1082 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Samara a… | — | — | — | ||
| 1083 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A train … | — | — | — | ||
| 1084 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 1085 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: On a roa… | — | — | — | ||
| 1086 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Austin h… | — | — | — | ||
| 1087 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 1088 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mark buy… | — | — | — | ||
| 1089 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Archie i… | — | — | — | ||
| 1090 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John buy… | — | — | — | ||
| 1091 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Wilson g… | — | — | — | ||
| 1092 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Thomas h… | — | — | — | ||
| 1093 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 1094 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A craft … | — | — | — | ||
| 1095 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Isabella… | — | — | — | ||
| 1096 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: At the B… | — | — | — | ||
| 1097 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: 20 birds… | — | — | — | ||
| 1098 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James sp… | — | — | — | ||
| 1099 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Neeley b… | — | — | — | ||
| 1100 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jill spe… | — | — | — | ||
| 1101 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Next yea… | — | — | — | ||
| 1102 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jason's … | — | — | — | ||
| 1103 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 1104 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Amber is… | — | — | — | ||
| 1105 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The cost… | — | — | — | ||
| 1106 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John dec… | — | — | — | ||
| 1107 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Kelsey h… | — | — | — | ||
| 1108 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tom rent… | — | — | — | ||
| 1109 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Wanda we… | — | — | — | ||
| 1110 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A clinic… | — | — | — | ||
| 1111 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Bobby ne… | — | — | — | ||
| 1112 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A school… | — | — | — | ||
| 1113 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Paul is … | — | — | — | ||
| 1114 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ian used… | — | — | — | ||
| 1115 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John goe… | — | — | — | ||
| 1116 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Grace ju… | — | — | — | ||
| 1117 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Victoria… | — | — | — | ||
| 1118 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Bill's r… | — | — | — | ||
| 1119 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Each mem… | — | — | — | ||
| 1120 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Bob orde… | — | — | — | ||
| 1121 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Stormi i… | — | — | — | ||
| 1122 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jason go… | — | — | — | ||
| 1123 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Dino doe… | — | — | — | ||
| 1124 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: After sh… | — | — | — | ||
| 1125 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Carl has… | — | — | — | ||
| 1126 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A hotel … | — | — | — | ||
| 1127 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Hansel h… | — | — | — | ||
| 1128 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Megan is… | — | — | — | ||
| 1129 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mr. Caid… | — | — | — | ||
| 1130 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ittymang… | — | — | — | ||
| 1131 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Building… | — | — | — | ||
| 1132 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mark has… | — | — | — | ||
| 1133 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The bala… | — | — | — | ||
| 1134 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A city h… | — | — | — | ||
| 1135 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Veronica… | — | — | — | ||
| 1136 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Darcy wa… | — | — | — | ||
| 1137 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: An autho… | — | — | — | ||
| 1138 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Kate sav… | — | — | — | ||
| 1139 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Steve ow… | — | — | — | ||
| 1140 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Nurse Mi… | — | — | — | ||
| 1141 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John has… | — | — | — | ||
| 1142 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James do… | — | — | — | ||
| 1143 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The numb… | — | — | — | ||
| 1144 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tom read… | — | — | — | ||
| 1145 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Winston … | — | — | — | ||
| 1146 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James pa… | — | — | — | ||
| 1147 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Borgnine… | — | — | — | ||
| 1148 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Milena i… | — | — | — | ||
| 1149 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Michael\… | — | — | — | ||
| 1150 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Juan bou… | — | — | — | ||
| 1151 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In a com… | — | — | — | ||
| 1152 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: When Mic… | — | — | — | ||
| 1153 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Myrtle\u… | — | — | — | ||
| 1154 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jake has… | — | — | — | ||
| 1155 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Janet ha… | — | — | — | ||
| 1156 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Marcus c… | — | — | — | ||
| 1157 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A school… | — | — | — | ||
| 1158 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John eat… | — | — | — | ||
| 1159 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: During t… | — | — | — | ||
| 1160 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Evan\u20… | — | — | — | ||
| 1161 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mike beg… | — | — | — | ||
| 1162 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The ski … | — | — | — | ||
| 1163 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jimmy is… | — | — | — | ||
| 1164 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Louise i… | — | — | — | ||
| 1165 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: To make … | — | — | — | ||
| 1166 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A class … | — | — | — | ||
| 1167 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There we… | — | — | — | ||
| 1168 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A bag wa… | — | — | — | ||
| 1169 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Vanessa … | — | — | — | ||
| 1170 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ginger l… | — | — | — | ||
| 1171 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jason is… | — | — | — | ||
| 1172 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jerry wa… | — | — | — | ||
| 1173 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Clementi… | — | — | — | ||
| 1174 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Paul nee… | — | — | — | ||
| 1175 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Marnie o… | — | — | — | ||
| 1176 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James bu… | — | — | — | ||
| 1177 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The outd… | — | — | — | ||
| 1178 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A class … | — | — | — | ||
| 1179 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mary Ann… | — | — | — | ||
| 1180 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Willie c… | — | — | — | ||
| 1181 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Last nig… | — | — | — | ||
| 1182 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Care and… | — | — | — | ||
| 1183 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: While ch… | — | — | — | ||
| 1184 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Cora sta… | — | — | — | ||
| 1185 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Stuart i… | — | — | — | ||
| 1186 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John ass… | — | — | — | ||
| 1187 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Frank ha… | — | — | — | ||
| 1188 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ellen is… | — | — | — | ||
| 1189 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Hans boo… | — | — | — | ||
| 1190 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: An apart… | — | — | — | ||
| 1191 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Josh has… | — | — | — | ||
| 1192 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Patricia… | — | — | — | ||
| 1193 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Hershel … | — | — | — | ||
| 1194 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James wa… | — | — | — | ||
| 1195 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mr. Sanc… | — | — | — | ||
| 1196 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Janice a… | — | — | — | ||
| 1197 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Yulia wa… | — | — | — | ||
| 1198 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: To make … | — | — | — | ||
| 1199 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Quentavi… | — | — | — | ||