Samples · lm_eval_harness.gsm8k
Run #75 · Adapter v1.0.0+humaneval-removed+gen-kwargs-pairing · 200/1319 samples shown · Score 90.4%
AI Evaluation
Generated 2026-05-13 21:38 · claude-sonnet-4-6
Summary
The Qwen3-Coder-Next model reaches a pass rate of 92.5% on GSM8K (score 90.4%), a solid but not outstanding result for multi-step grade-school math.
Strengths
- Simple and moderately difficult arithmetic problems are solved reliably, with clean working.
- Unit conversions and linear multi-step problems (groceries, pool-filling costs, percentages) are handled consistently.
- Zero errors (errors=0): the model never aborts or produces invalid output.
Weaknesses
- Problems with indirect or implicit references are misinterpreted; for example, "running 10% faster" is treated as a time reduction via a divisor of 1.1 rather than as a direct subtraction.
- Off-by-one errors with inclusive time spans (e.g. the Gene quilt-block problem: 12 instead of 11 years).
- Ambiguous problem statements invite over-analysis, which sometimes leads the model to introduce incorrect relations (e.g. Lylah's salary).
- Probability problems: the model computes correctly but misreads the question (relative instead of absolute difference).
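Three of these failure modes come down to which arithmetic convention is applied. The following is a minimal Python sketch, not taken from the run itself: the function names and all numbers (a 110-minute baseline, the years 2012 and 2023, probabilities 0.30 and 0.20) are hypothetical and chosen only to show how far the competing readings diverge.

```python
def time_if_speed_rises(base_minutes, percent):
    """Reading A: speed rises by `percent`, so time is divided by (1 + percent/100)."""
    return base_minutes * 100 / (100 + percent)


def time_if_time_is_cut(base_minutes, percent):
    """Reading B: the time itself is reduced by `percent` of the baseline."""
    return base_minutes * (100 - percent) / 100


def years_exclusive(start_year, end_year):
    """Elapsed years between two dates (end minus start)."""
    return end_year - start_year


def years_inclusive(start_year, end_year):
    """Calendar years touched when both endpoints are counted."""
    return end_year - start_year + 1


def absolute_difference(p_a, p_b):
    """Difference in probability, i.e. percentage points when multiplied by 100."""
    return p_a - p_b


def relative_difference(p_a, p_b):
    """Difference expressed as a fraction of the second probability."""
    return (p_a - p_b) / p_b


# Hypothetical inputs, for illustration only.
print(time_if_speed_rises(110, 10), time_if_time_is_cut(110, 10))
print(years_exclusive(2012, 2023), years_inclusive(2012, 2023))
print(absolute_difference(0.30, 0.20), relative_difference(0.30, 0.20))
```

Which reading a given benchmark item expects depends on its reference answer; the sketch only shows that the two readings in each pair yield different results.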
Observations
Recurring pattern: on problems that call for a single, short answer, the model produces lengthy alternative lines of reasoning and misses the simple result being asked for. This points to a tendency to over-answer (verbosity bias).
Recommendation
Lower the sampling temperature (e.g. to 0.0, or use greedy decoding) to keep the model from pursuing speculative alternative paths on clear-cut numeric problems and to push the pass rate further toward 95% and above.
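To illustrate why a lower temperature narrows the choice of next token, here is a minimal, self-contained Python sketch of temperature-scaled softmax. It is an illustration only, not the harness's or the model's actual decoding code; the function name `token_probabilities` and the logits are hypothetical, and temperature 0 is treated here as plain argmax (greedy decoding).

```python
import math


def token_probabilities(logits, temperature):
    """Softmax over logits / temperature; temperature 0 is handled as greedy argmax."""
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token scores for three candidate tokens.
logits = [2.0, 1.5, 0.5]
for t in (1.0, 0.5, 0.0):
    print(t, [round(p, 3) for p in token_probabilities(logits, t)])
```

As the temperature falls, more probability mass moves to the highest-scoring token, and at 0 only that token can be chosen. Whether this actually raises the pass rate on this benchmark would have to be checked with a rerun.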
Overview
1319 samples
Distribution
Score histogram (axis from 0.0 to 1.0)
| Question ID | Status | Score | Prompt | Latency | Tokens/s | TTFT | |
|---|---|---|---|---|---|---|---|
| 800 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Bill get… | — | — | — | ||
| 801 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Danny ma… | — | — | — | ||
| 802 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Katie ha… | — | — | — | ||
| 803 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Joan is … | — | — | — | ||
| 804 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Alexande… | — | — | — | ||
| 805 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Troy mak… | — | — | — | ||
| 806 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In a hou… | — | — | — | ||
| 807 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Randy ha… | — | — | — | ||
| 808 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A 1000 c… | — | — | — | ||
| 809 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Brittany… | — | — | — | ||
| 810 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Aaron is… | — | — | — | ||
| 811 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Marta is… | — | — | — | ||
| 812 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tall Tun… | — | — | — | ||
| 813 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: On the i… | — | — | — | ||
| 814 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Paddy's … | — | — | — | ||
| 815 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A single… | — | — | — | ||
| 816 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Lance ha… | — | — | — | ||
| 817 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A soccer… | — | — | — | ||
| 818 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Marcel g… | — | — | — | ||
| 819 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Laura to… | — | — | — | ||
| 820 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jerry ha… | — | — | — | ||
| 821 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Johnson … | — | — | — | ||
| 822 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A charit… | — | — | — | ||
| 823 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In four … | — | — | — | ||
| 824 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Andy had… | — | — | — | ||
| 825 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Penn ope… | — | — | — | ||
| 826 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Porter e… | — | — | — | ||
| 827 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A flower… | — | — | — | ||
| 828 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Three ti… | — | — | — | ||
| 829 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A blind … | — | — | — | ||
| 830 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: An emplo… | — | — | — | ||
| 831 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The scho… | — | — | — | ||
| 832 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Elsa sta… | — | — | — | ||
| 833 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Hans boo… | — | — | — | ||
| 834 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Alyana h… | — | — | — | ||
| 835 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Matthias… | — | — | — | ||
| 836 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A compan… | — | — | — | ||
| 837 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The area… | — | — | — | ||
| 838 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Annie ha… | — | — | — | ||
| 839 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The pric… | — | — | — | ||
| 840 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: George w… | — | — | — | ||
| 841 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Joe like… | — | — | — | ||
| 842 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Two whit… | — | — | — | ||
| 843 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: 40% of t… | — | — | — | ||
| 844 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Alan bou… | — | — | — | ||
| 845 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Madeline… | — | — | — | ||
| 846 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Vincent'… | — | — | — | ||
| 847 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The numb… | — | — | — | ||
| 848 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Paula wa… | — | — | — | ||
| 849 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Julia ha… | — | — | — | ||
| 850 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Charles … | — | — | — | ||
| 851 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Sandra h… | — | — | — | ||
| 852 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Cristine… | — | — | — | ||
| 853 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John pla… | — | — | — | ||
| 854 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ben want… | — | — | — | ||
| 855 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Bailey b… | — | — | — | ||
| 856 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Marites … | — | — | — | ||
| 857 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The clas… | — | — | — | ||
| 858 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In a cer… | — | — | — | ||
| 859 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John inj… | — | — | — | ||
| 860 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James cr… | — | — | — | ||
| 861 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John bui… | — | — | — | ||
| 862 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: An apple… | — | — | — | ||
| 863 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 864 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: One hund… | — | — | — | ||
| 865 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Martin i… | — | — | — | ||
| 866 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Trent cr… | — | — | — | ||
| 867 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tom deci… | — | — | — | ||
| 868 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James ge… | — | — | — | ||
| 869 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jessica … | — | — | — | ||
| 870 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Susan ha… | — | — | — | ||
| 871 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Avery op… | — | — | — | ||
| 872 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jerry is… | — | — | — | ||
| 873 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tamara i… | — | — | — | ||
| 874 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: At a bus… | — | — | — | ||
| 875 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Candy th… | — | — | — | ||
| 876 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: It takes… | — | — | — | ||
| 877 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Duke was… | — | — | — | ||
| 878 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Michael … | — | — | — | ||
| 879 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 880 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jean nee… | — | — | — | ||
| 881 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tom and … | — | — | — | ||
| 882 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Emmanuel… | — | — | — | ||
| 883 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: While bi… | — | — | — | ||
| 884 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A bag of… | — | — | — | ||
| 885 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Malcolm … | — | — | — | ||
| 886 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In a cer… | — | — | — | ||
| 887 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A mad sc… | — | — | — | ||
| 888 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Hash has… | — | — | — | ||
| 889 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 890 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A buildi… | — | — | — | ||
| 891 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Noah is … | — | — | — | ||
| 892 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Grant sp… | — | — | — | ||
| 893 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Emma tra… | — | — | — | ||
| 894 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A clerk … | — | — | — | ||
| 895 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Elizabet… | — | — | — | ||
| 896 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ali had … | — | — | — | ||
| 897 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Marla ha… | — | — | — | ||
| 898 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: You have… | — | — | — | ||
| 899 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tony has… | — | — | — | ||
| 900 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jason wo… | — | — | — | ||
| 901 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Davante … | — | — | — | ||
| 902 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Elsie ha… | — | — | — | ||
| 903 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A baker … | — | — | — | ||
| 904 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In a gro… | — | — | — | ||
| 905 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Nancy bu… | — | — | — | ||
| 906 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Carla wo… | — | — | — | ||
| 907 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 908 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John sta… | — | — | — | ||
| 909 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Carrie h… | — | — | — | ||
| 910 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mr. John… | — | — | — | ||
| 911 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John pla… | — | — | — | ||
| 912 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Laran ha… | — | — | — | ||
| 913 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mason, N… | — | — | — | ||
| 914 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mr. Caid… | — | — | — | ||
| 915 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Evergree… | — | — | — | ||
| 916 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A blind … | — | — | — | ||
| 917 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Evie is … | — | — | — | ||
| 918 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Adam had… | — | — | — | ||
| 919 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 920 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: There ar… | — | — | — | ||
| 921 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Gavin ha… | — | — | — | ||
| 922 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Elroy de… | — | — | — | ||
| 923 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: At Penny… | — | — | — | ||
| 924 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Before M… | — | — | — | ||
| 925 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Billy is… | — | — | — | ||
| 926 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Anya was… | — | — | — | ||
| 927 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mrs. Ama… | — | — | — | ||
| 928 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Ali, Nad… | — | — | — | ||
| 929 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Lily goe… | — | — | — | ||
| 930 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Thor is … | — | — | — | ||
| 931 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Sherman … | — | — | — | ||
| 932 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John pla… | — | — | — | ||
| 933 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Leila ea… | — | — | — | ||
| 934 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Janina s… | — | — | — | ||
| 935 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Erwin ea… | — | — | — | ||
| 936 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John pai… | — | — | — | ||
| 937 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Anna's m… | — | — | — | ||
| 938 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Phil has… | — | — | — | ||
| 939 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Samantha… | — | — | — | ||
| 940 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: When the… | — | — | — | ||
| 941 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Clyde's … | — | — | — | ||
| 942 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A flower… | — | — | — | ||
| 943 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jorge bo… | — | — | — | ||
| 944 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John buy… | — | — | — | ||
| 945 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Lyanna s… | — | — | — | ||
| 946 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Isabella… | — | — | — | ||
| 947 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: While pr… | — | — | — | ||
| 948 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Nathan b… | — | — | — | ||
| 949 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Three ti… | — | — | — | ||
| 950 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: For an o… | — | — | — | ||
| 951 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mandy is… | — | — | — | ||
| 952 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Stephani… | — | — | — | ||
| 953 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John get… | — | — | — | ||
| 954 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Every da… | — | — | — | ||
| 955 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Tony has… | — | — | — | ||
| 956 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Unique i… | — | — | — | ||
| 957 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jack and… | — | — | — | ||
| 958 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Fred has… | — | — | — | ||
| 959 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mark lov… | — | — | — | ||
| 960 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John buy… | — | — | — | ||
| 961 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: The Lady… | — | — | — | ||
| 962 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A family… | — | — | — | ||
| 963 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: If I rea… | — | — | — | ||
| 964 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Wyatt's … | — | — | — | ||
| 965 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Merry ha… | — | — | — | ||
| 966 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Peter is… | — | — | — | ||
| 967 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Anna's m… | — | — | — | ||
| 968 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Jamie is… | — | — | — | ||
| 969 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Anne is … | — | — | — | ||
| 970 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A park h… | — | — | — | ||
| 971 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: In the f… | — | — | — | ||
| 972 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: At a peo… | — | — | — | ||
| 973 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A cafe h… | — | — | — | ||
| 974 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Kyle has… | — | — | — | ||
| 975 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Annie sp… | — | — | — | ||
| 976 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A dietit… | — | — | — | ||
| 977 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Willow\u… | — | — | — | ||
| 978 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Kamil wa… | — | — | — | ||
| 979 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A conven… | — | — | — | ||
| 980 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Monica i… | — | — | — | ||
| 981 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: A bowl o… | — | — | — | ||
| 982 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Madeline… | — | — | — | ||
| 983 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Gary bou… | — | — | — | ||
| 984 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John car… | — | — | — | ||
| 985 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: John is … | — | — | — | ||
| 986 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Louie se… | — | — | — | ||
| 987 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mr. Maxi… | — | — | — | ||
| 988 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mr. Caid… | — | — | — | ||
| 989 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James ha… | — | — | — | ||
| 990 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Allen or… | — | — | — | ||
| 991 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: On a par… | — | — | — | ||
| 992 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Daisy's … | — | — | — | ||
| 993 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Brenda p… | — | — | — | ||
| 994 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Mitch ha… | — | — | — | ||
| 995 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Roman th… | — | — | — | ||
| 996 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Kyle mak… | — | — | — | ||
| 997 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: James ca… | — | — | — | ||
| 998 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Anya has… | — | — | — | ||
| 999 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Question: Unique i… | — | — | — | ||
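
The Status column of the rows listed on this page (question IDs 800 to 999) can be tallied with a short Python sketch. The failed IDs below are read off the table; the names `FAILED_IDS`, `SHOWN_IDS`, and `pass_rate` are hypothetical. The tally covers only these 200 rows, so it need not match the figures in the summary, which may refer to a different subset or to all 1319 samples.

```python
# Question IDs marked "failed" among the rows listed on this page.
FAILED_IDS = [814, 823, 832, 835, 858, 894, 920, 951, 952, 962, 984]

# The page lists question IDs 800 through 999 inclusive.
SHOWN_IDS = range(800, 1000)


def pass_rate(shown_ids, failed_ids):
    """Fraction of the shown rows whose status is not 'failed'."""
    shown = len(shown_ids)
    failed = len(set(failed_ids) & set(shown_ids))
    return (shown - failed) / shown


rate = pass_rate(SHOWN_IDS, FAILED_IDS)
passed = len(SHOWN_IDS) - len(FAILED_IDS)
print(f"{passed} of {len(SHOWN_IDS)} listed samples passed ({rate:.1%})")
```

For the rows listed here, that works out to 189 of 200 passed.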