Samples · lm_eval_harness.ifeval
AI Evaluation
Generated 2026-05-13 04:00 · claude-sonnet-4-6
Summary
The model achieves a pass rate of 80.2% on the IFEval benchmark, which indicates solid but not flawless adherence to strict formatting instructions. On roughly one fifth of the tasks it fails to meet precise format or content requirements.
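As a rough illustration of how such a pass rate follows from per-sample statuses, here is a minimal sketch. The function name `pass_rate` and the status list are hypothetical; the count of 434 passing samples is back-calculated from 80.2% of 541 and is not stated in the report, which also does not say which IFEval metric the figure refers to.

```python
from collections import Counter


def pass_rate(statuses):
    """Return the share of 'passed' entries as a percentage."""
    counts = Counter(statuses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples")
    return 100.0 * counts["passed"] / total


# Hypothetical status list sized to match the report: 541 samples.
# 434 passing samples is an assumption derived from 0.802 * 541 ≈ 433.9.
statuses = ["passed"] * 434 + ["failed"] * 107
print(f"{pass_rate(statuses):.1f} %")  # prints 80.2 %
```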
Strengths
- Complex multi-part instructions (e.g., section titles with `SECTION X`, double square brackets, repeating the prompt) are followed reliably.
- Linguistic constraints such as a ban on commas or all-lowercase output are met correctly in many cases.
- No technical errors (0 of 541 requests).
Weaknesses
- Exact count requirements are missed: in bullet-point tasks the model returns 6 points instead of 3, and required verbatim repetitions contain disallowed extra characters.
- Strict character exclusions (e.g., no "t" anywhere in the text, no "c") are consistently violated; the model does not sustain such low-level constraints across a whole response.
- Format requirements such as "exactly two responses, separated by `**`" are ignored (only one response, with no separator).
- Length requirements (at least 800 words, wrapped in double quotation marks) are in some cases met only partially, or the output is cut off.
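The failure types listed above are all mechanically checkable. The following is a minimal sketch of such checks, not the benchmark's own implementation; all function names are hypothetical, and the `**` separator is taken as quoted in the report.

```python
import re


def count_bullets(text):
    """Count lines that start with a '* ' or '- ' bullet marker."""
    return len(re.findall(r"^[ \t]*[*-][ \t]+\S", text, flags=re.MULTILINE))


def has_exact_bullets(text, expected):
    """True if the text contains exactly `expected` bullet lines."""
    return count_bullets(text) == expected


def avoids_letter(text, letter):
    """True if `letter` never occurs in the text, ignoring case."""
    return letter.lower() not in text.lower()


def has_exact_parts(text, separator, expected):
    """True if splitting on `separator` yields exactly `expected` non-empty parts."""
    parts = [p for p in text.split(separator) if p.strip()]
    return len(parts) == expected


def meets_quoted_length(text, min_words):
    """True if the text is wrapped in double quotes and has at least `min_words` words."""
    stripped = text.strip()
    quoted = len(stripped) >= 2 and stripped[0] == '"' and stripped[-1] == '"'
    return quoted and len(stripped.split()) >= min_words
```

For instance, a six-bullet answer fails `has_exact_bullets(answer, 3)`, and a single answer without a separator fails `has_exact_parts(answer, "**", 2)`.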
Notable Patterns
The failures cluster into two patterns: (1) character-level constraints (forbidden letters, exact repetition of special symbols) and (2) exact quantity requirements (number of bullets, number of responses). More complex semantic instructions succeed more often than low-level, mechanical formatting rules.
Recommendation
Apply targeted fine-tuning or chain-of-thought prompting specifically for counting and character-level constraints; alternatively, add a systematic constraint verifier as a post-processing layer and evaluate the IFEval subset of character-exclusion tasks separately.
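One way a constraint verifier could sit behind the model as a post-processing layer is a verify-and-retry loop. The sketch below is an assumption about how such a layer might look, not a description of an existing tool: `generate_verified`, `verify`, and the stand-in `fake_model` are hypothetical names, and the retry-with-feedback strategy is only one possible design.

```python
from typing import Callable, List, Tuple

# A check is a (name, predicate) pair; the predicate returns True on success.
Check = Tuple[str, Callable[[str], bool]]


def verify(text: str, checks: List[Check]) -> List[str]:
    """Return the names of all checks that `text` fails."""
    return [name for name, check in checks if not check(text)]


def generate_verified(generate, prompt, checks, max_attempts=3):
    """Call `generate` until all checks pass or attempts run out.

    Returns (text, failed_check_names) for the last attempt.
    """
    feedback = ""
    text, failed = "", [name for name, _ in checks]
    for _ in range(max_attempts):
        text = generate(prompt + feedback)
        failed = verify(text, checks)
        if not failed:
            break
        # Tell the model which constraints the previous answer broke.
        feedback = "\n\nThe previous answer violated: " + ", ".join(failed)
    return text, failed


def fake_model(prompt):
    """Stand-in for a real model: complies only after receiving feedback."""
    return "a dog" if "violated" in prompt else "the cat"


checks = [("no letter t", lambda s: "t" not in s.lower())]
text, failed = generate_verified(fake_model, "Describe a pet.", checks)
```

Returning the list of still-failing checks lets the caller decide whether to accept, log, or discard an answer that never satisfied every constraint.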
Overview
541 samples

| Question ID | Status | Score | Prompt | Latency | Tokens/s | TTFT | |
|---|---|---|---|---|---|---|---|
| 0 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a 300+ word … | — | — | — | ||
| 1 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I am planning a tr… | — | — | — | ||
| 2 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a resume for… | — | — | — | ||
| 3 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an email to … | — | — | — | ||
| 4 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Given the sentence… | — | — | — | ||
| 5 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a dialogue b… | — | — | — | ||
| 6 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a 2 paragrap… | — | — | — | ||
| 7 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write me a resume … | — | — | — | ||
| 8 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a letter to … | — | — | — | ||
| 9 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a long email… | — | — | — | ||
| 10 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a blog post … | — | — | — | ||
| 11 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you help me ma… | — | — | — | ||
| 12 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a story of e… | — | — | — | ||
| 13 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a detailed r… | — | — | — | ||
| 14 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a short blog… | — | — | — | ||
| 15 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Please provide the… | — | — | — | ||
| 16 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What is a name tha… | — | — | — | ||
| 17 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write two jokes ab… | — | — | — | ||
| 18 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Are hamburgers san… | — | — | — | ||
| 19 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"make a tweet for p… | — | — | — | ||
| 20 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a poem about… | — | — | — | ||
| 21 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Given the sentence… | — | — | — | ||
| 22 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a short star… | — | — | — | ||
| 23 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a logic quiz… | — | — | — | ||
| 24 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a 4 section … | — | — | — | ||
| 25 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write the lyrics t… | — | — | — | ||
| 26 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Explain in French … | — | — | — | ||
| 27 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a funny haik… | — | — | — | ||
| 28 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Rewrite the follow… | — | — | — | ||
| 29 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What are the advan… | — | — | — | ||
| 30 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a social med… | — | — | — | ||
| 31 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a rubric for… | — | — | — | ||
| 32 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a riddle for… | — | — | — | ||
| 33 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a template f… | — | — | — | ||
| 34 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Explain why people… | — | — | — | ||
| 35 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you give me tw… | — | — | — | ||
| 36 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What happened when… | — | — | — | ||
| 37 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What sentiments ex… | — | — | — | ||
| 38 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an ad copy f… | — | — | — | ||
| 39 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Which one is a bet… | — | — | — | ||
| 40 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a blog post … | — | — | — | ||
| 41 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a poem about… | — | — | — | ||
| 42 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"How many feet off … | — | — | — | ||
| 43 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an advertise… | — | — | — | ||
| 44 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a funny post… | — | — | — | ||
| 45 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you give me a … | — | — | — | ||
| 46 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write me a templat… | — | — | — | ||
| 47 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"\\\"The man was ar… | — | — | — | ||
| 48 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a poem that'… | — | — | — | ||
| 49 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I have a dime. Wha… | — | — | — | ||
| 50 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a haiku abou… | — | — | — | ||
| 51 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you elaborate … | — | — | — | ||
| 52 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an essay abo… | — | — | — | ||
| 53 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you give me an… | — | — | — | ||
| 54 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Take the text belo… | — | — | — | ||
| 55 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a limerick a… | — | — | — | ||
| 56 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What is the differ… | — | — | — | ||
| 57 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"How did a man name… | — | — | — | ||
| 58 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What is the histor… | — | — | — | ||
| 59 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a short prop… | — | — | — | ||
| 60 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"List exactly 10 po… | — | — | — | ||
| 61 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an outline f… | — | — | — | ||
| 62 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Which of the follo… | — | — | — | ||
| 63 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What are the pros … | — | — | — | ||
| 64 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a funny arti… | — | — | — | ||
| 65 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Create a table wit… | — | — | — | ||
| 66 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I would like to st… | — | — | — | ||
| 67 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a brief biog… | — | — | — | ||
| 68 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"A nucleus is a clu… | — | — | — | ||
| 69 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Compose song lyric… | — | — | — | ||
| 70 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an extravaga… | — | — | — | ||
| 71 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Draft a blog post … | — | — | — | ||
| 72 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Improper use of th… | — | — | — | ||
| 73 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a planning d… | — | — | — | ||
| 74 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a limerick a… | — | — | — | ||
| 75 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write me a funny s… | — | — | — | ||
| 76 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a letter to … | — | — | — | ||
| 77 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a blog post … | — | — | — | ||
| 78 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"List the pros and … | — | — | — | ||
| 79 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I'm interested in … | — | — | — | ||
| 80 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a 30-line po… | — | — | — | ||
| 81 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a joke about… | — | — | — | ||
| 82 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"A colt is 5 feet t… | — | — | — | ||
| 83 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a story abou… | — | — | — | ||
| 84 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an article a… | — | — | — | ||
| 85 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a song that … | — | — | — | ||
| 86 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you rewrite \\… | — | — | — | ||
| 87 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a weird poem… | — | — | — | ||
| 88 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a blog post … | — | — | — | ||
| 89 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a casual, in… | — | — | — | ||
| 90 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a song about… | — | — | — | ||
| 91 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a haiku abou… | — | — | — | ||
| 92 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Generate a forum t… | — | — | — | ||
| 93 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a five line … | — | — | — | ||
| 94 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I am a software en… | — | — | — | ||
| 95 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an essay of … | — | — | — | ||
| 96 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Come up with a pro… | — | — | — | ||
| 97 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a cover lett… | — | — | — | ||
| 98 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Could you tell me … | — | — | — | ||
| 99 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"In this task, repe… | — | — | — | ||
| 100 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a file for a… | — | — | — | ||
| 101 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I want to apply fo… | — | — | — | ||
| 102 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Name a new fashion… | — | — | — | ||
| 103 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a song about… | — | — | — | ||
| 104 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Compose a poem tha… | — | — | — | ||
| 105 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a story for … | — | — | — | ||
| 106 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Criticize this sen… | — | — | — | ||
| 107 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write two limerick… | — | — | — | ||
| 108 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Summarize the foll… | — | — | — | ||
| 109 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Compose a poem all… | — | — | — | ||
| 110 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"First repeat the r… | — | — | — | ||
| 111 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an extremely… | — | — | — | ||
| 112 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What's a good way … | — | — | — | ||
| 113 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a rap about … | — | — | — | ||
| 114 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I really love the … | — | — | — | ||
| 115 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a plot for a… | — | — | — | ||
| 116 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Make the sentence … | — | — | — | ||
| 117 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Create a 5 day iti… | — | — | — | ||
| 118 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Make a rubric for … | — | — | — | ||
| 119 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a short essa… | — | — | — | ||
| 120 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"\\\"Coincidence is… | — | — | — | ||
| 121 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a cover lett… | — | — | — | ||
| 122 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Create a riddle ab… | — | — | — | ||
| 123 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a casual blo… | — | — | — | ||
| 124 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I work in the mark… | — | — | — | ||
| 125 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Generate two alter… | — | — | — | ||
| 126 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Create a resume fo… | — | — | — | ||
| 127 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I'm a 12th grader … | — | — | — | ||
| 128 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you write me a… | — | — | — | ||
| 129 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Generate a busines… | — | — | — | ||
| 130 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a rubric for… | — | — | — | ||
| 131 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you provide a … | — | — | — | ||
| 132 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a funny and … | — | — | — | ||
| 133 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"For the following … | — | — | — | ||
| 134 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Request:\\n 1. Wh… | — | — | — | ||
| 135 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a profession… | — | — | — | ||
| 136 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an interesti… | — | — | — | ||
| 137 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a blog post … | — | — | — | ||
| 138 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a fairy tale… | — | — | — | ||
| 139 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What is the next n… | — | — | — | ||
| 140 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Rewrite the follow… | — | — | — | ||
| 141 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an essay abo… | — | — | — | ||
| 142 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"How to write a goo… | — | — | — | ||
| 143 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a song about… | — | — | — | ||
| 144 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"The Jimenez family… | — | — | — | ||
| 145 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an itinerary… | — | — | — | ||
| 146 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Hallucinate a resu… | — | — | — | ||
| 147 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"A filmmaker is try… | — | — | — | ||
| 148 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Who won the defama… | — | — | — | ||
| 149 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Explain to a group… | — | — | — | ||
| 150 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Gandalf was a wiza… | — | — | — | ||
| 151 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an obviously… | — | — | — | ||
| 152 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I want to write a … | — | — | — | ||
| 153 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you write a po… | — | — | — | ||
| 154 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Explain Generative… | — | — | — | ||
| 155 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I asked a friend a… | — | — | — | ||
| 156 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a song about… | — | — | — | ||
| 157 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Create an English … | — | — | — | ||
| 158 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I was hoping you c… | — | — | — | ||
| 159 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I work for a softw… | — | — | — | ||
| 160 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a very angry… | — | — | — | ||
| 161 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Can you please con… | — | — | — | ||
| 162 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a short arti… | — | — | — | ||
| 163 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a song about… | — | — | — | ||
| 164 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a riddle for… | — | — | — | ||
| 165 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a strange ra… | — | — | — | ||
| 166 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Expand the followi… | — | — | — | ||
| 167 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Create a blog post… | — | — | — | ||
| 168 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"How can I learn to… | — | — | — | ||
| 169 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write an angry twe… | — | — | — | ||
| 170 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What do you think … | — | — | — | ||
| 171 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What has a dome bu… | — | — | — | ||
| 172 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Rewrite the follow… | — | — | — | ||
| 173 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Elaborate on the f… | — | — | — | ||
| 174 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Why star wars is s… | — | — | — | ||
| 175 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a rubric, in… | — | — | — | ||
| 176 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"I'm a new puppy ow… | — | — | — | ||
| 177 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"My brother is tryi… | — | — | — | ||
| 178 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Before you answer … | — | — | — | ||
| 179 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What are the uses … | — | — | — | ||
| 180 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a 100-word a… | — | — | — | ||
| 181 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What are some star… | — | — | — | ||
| 182 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a blog post … | — | — | — | ||
| 183 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a story abou… | — | — | — | ||
| 184 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Generate a summary… | — | — | — | ||
| 185 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a story from… | — | — | — | ||
| 186 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a review of … | — | — | — | ||
| 187 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Invent a funny tag… | — | — | — | ||
| 188 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"What is another wo… | — | — | — | ||
| 189 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a funny song… | — | — | — | ||
| 190 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Are the weather co… | — | — | — | ||
| 191 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a summary of… | — | — | — | ||
| 192 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a joke with … | — | — | — | ||
| 193 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a very long … | — | — | — | ||
| 194 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Titan makes clothi… | — | — | — | ||
| 195 | failed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"A psychologist is … | — | — | — | ||
| 196 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Could you give me … | — | — | — | ||
| 197 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Could you give me … | — | — | — | ||
| 198 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"Write a riddle abo… | — | — | — | ||
| 199 | passed | {…} {"gen_args_0":{"arg_0":["[{\"role\": \"user\", \"content\": \"If you gulped down… | — | — | — | ||