Samples · lm_eval_harness.gsm8k
Run #40 · Adapter v1.0.0+humaneval-unsafe-flag · 200/2638 samples shown
AI evaluation
No AI evaluation available.
Overview
2638 samples
Distribution
Score histogram (score axis from 0.0 to 1.0)
| Question ID | Status | Score | Prompt | Latency | Tokens/s | TTFT | |
|---|---|---|---|---|---|---|---|
| 1200 | failed | {…} {"gen_args_0":{"arg_0":"Question: James is sitting outside, counting how many pe… | — | — | — | ||
| 1201 | failed | {…} {"gen_args_0":{"arg_0":"Question: Josh and Anna were both born on August 17th, b… | — | — | — | ||
| 1202 | failed | {…} {"gen_args_0":{"arg_0":"Question: John takes 3 naps a week. Each nap is 2 hours… | — | — | — | ||
| 1203 | failed | {…} {"gen_args_0":{"arg_0":"Question: Winston has 14 quarters. He then spends half a… | — | — | — | ||
| 1204 | failed | {…} {"gen_args_0":{"arg_0":"Question: Marissa has 4.5 feet of ribbon that she wants … | — | — | — | ||
| 1205 | failed | {…} {"gen_args_0":{"arg_0":"Question: Monika went out for the day and spent some mon… | — | — | — | ||
| 1206 | failed | {…} {"gen_args_0":{"arg_0":"Question: Mikaela was repainting her bathroom. She bough… | — | — | — | ||
| 1207 | failed | {…} {"gen_args_0":{"arg_0":"Question: 1 chocolate bar costs $1.50 and can be broken … | — | — | — | ||
| 1208 | failed | {…} {"gen_args_0":{"arg_0":"Question: Betty has a tray of cookies and a tray of brow… | — | — | — | ||
| 1209 | failed | {…} {"gen_args_0":{"arg_0":"Question: Brian can only hold his breath underwater for … | — | — | — | ||
| 1210 | failed | {…} {"gen_args_0":{"arg_0":"Question: Frank has 7 one-dollar bills, 4 five-dollar bi… | — | — | — | ||
| 1211 | failed | {…} {"gen_args_0":{"arg_0":"Question: Joyce, Michael, Nikki, and Ryn have a favorite… | — | — | — | ||
| 1212 | failed | {…} {"gen_args_0":{"arg_0":"Question: A pad of paper comes with 60 sheets. Evelyn us… | — | — | — | ||
| 1213 | failed | {…} {"gen_args_0":{"arg_0":"Question: Carson needs to mow the lawn and plant some fl… | — | — | — | ||
| 1214 | failed | {…} {"gen_args_0":{"arg_0":"Question: Percy wants to save up for a new PlayStation, … | — | — | — | ||
| 1215 | failed | {…} {"gen_args_0":{"arg_0":"Question: A window has 4 glass panels each. A house has … | — | — | — | ||
| 1216 | failed | {…} {"gen_args_0":{"arg_0":"Question: A 10 meters yarn was cut into 5 equal parts. I… | — | — | — | ||
| 1217 | failed | {…} {"gen_args_0":{"arg_0":"Question: Miggy's mom brought home 3 bags of birthday ha… | — | — | — | ||
| 1218 | failed | {…} {"gen_args_0":{"arg_0":"Question: A bicycle shop owner adds 3 bikes to her stock… | — | — | — | ||
| 1219 | failed | {…} {"gen_args_0":{"arg_0":"Question: On the island of Castor, there are 40 chess pl… | — | — | — | ||
| 1220 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jerry is sweeping up pieces of broken glass in… | — | — | — | ||
| 1221 | failed | {…} {"gen_args_0":{"arg_0":"Question: Paul, a biology teacher, assigns 265 points in… | — | — | — | ||
| 1222 | passed | {…} {"gen_args_0":{"arg_0":"Question: Two bowls are holding marbles, and the first b… | — | — | — | ||
| 1223 | failed | {…} {"gen_args_0":{"arg_0":"Question: Kurt's old refrigerator cost $0.85 a day in el… | — | — | — | ||
| 1224 | passed | {…} {"gen_args_0":{"arg_0":"Question: Mary is chopping up some old furniture to make… | — | — | — | ||
| 1225 | failed | {…} {"gen_args_0":{"arg_0":"Question: Juanita enters a drumming contest. It costs $1… | — | — | — | ||
| 1226 | failed | {…} {"gen_args_0":{"arg_0":"Question: Gunther needs to clean his apartment. It take… | — | — | — | ||
| 1227 | failed | {…} {"gen_args_0":{"arg_0":"Question: A question and answer forum has 200 members. T… | — | — | — | ||
| 1228 | failed | {…} {"gen_args_0":{"arg_0":"Question: A bird is building a nest from twigs. The bird… | — | — | — | ||
| 1229 | failed | {…} {"gen_args_0":{"arg_0":"Question: Sam is serving spaghetti and meatballs for din… | — | — | — | ||
| 1230 | failed | {…} {"gen_args_0":{"arg_0":"Question: Marj has two $20 bills, three $5 bills, and $4… | — | — | — | ||
| 1231 | failed | {…} {"gen_args_0":{"arg_0":"Question: Tina decides to fill a jar with coins. In the … | — | — | — | ||
| 1232 | failed | {…} {"gen_args_0":{"arg_0":"Question: Justice has 3 ferns, 5 palms, and 7 succulent … | — | — | — | ||
| 1233 | failed | {…} {"gen_args_0":{"arg_0":"Question: James builds a 20 story building. The first 1… | — | — | — | ||
| 1234 | failed | {…} {"gen_args_0":{"arg_0":"Question: Two years ago, Jared was twice as old as Tom. … | — | — | — | ||
| 1235 | failed | {…} {"gen_args_0":{"arg_0":"Question: Brandon has been fired from half the businesse… | — | — | — | ||
| 1236 | failed | {…} {"gen_args_0":{"arg_0":"Question: Ali and Sara ate 80 small apples combined. Ali… | — | — | — | ||
| 1237 | failed | {…} {"gen_args_0":{"arg_0":"Question: Dan spent an hour doing 400 work tasks at $0.2… | — | — | — | ||
| 1238 | failed | {…} {"gen_args_0":{"arg_0":"Question: Luther made 12 pancakes for breakfast. His fam… | — | — | — | ||
| 1239 | failed | {…} {"gen_args_0":{"arg_0":"Question: Laura loves to cook. One day she decided to ma… | — | — | — | ||
| 1240 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jackson wants to go on a shopping spree, so hi… | — | — | — | ||
| 1241 | failed | {…} {"gen_args_0":{"arg_0":"Question: Felix is chopping down trees in his backyard. … | — | — | — | ||
| 1242 | failed | {…} {"gen_args_0":{"arg_0":"Question: The Period 1 gym class has 5 fewer than twice … | — | — | — | ||
| 1243 | failed | {…} {"gen_args_0":{"arg_0":"Question: Theresa has 5 more than thrice as many video g… | — | — | — | ||
| 1244 | failed | {…} {"gen_args_0":{"arg_0":"Question: Cassy packs 12 jars of jam in 10 boxes while s… | — | — | — | ||
| 1245 | failed | {…} {"gen_args_0":{"arg_0":"Question: John writes 3 stories every week. Each short … | — | — | — | ||
| 1246 | failed | {…} {"gen_args_0":{"arg_0":"Question: Lorie has 2 pieces of $100 bills. He requested… | — | — | — | ||
| 1247 | failed | {…} {"gen_args_0":{"arg_0":"Question: Five coworkers were talking during the lunch b… | — | — | — | ||
| 1248 | failed | {…} {"gen_args_0":{"arg_0":"Question: Ivan has a bird feeder in his yard that holds … | — | — | — | ||
| 1249 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jean has 3 grandchildren. She buys each grand… | — | — | — | ||
| 1250 | failed | {…} {"gen_args_0":{"arg_0":"Question: John manages to run 15 mph for his whole 5-mil… | — | — | — | ||
| 1251 | failed | {…} {"gen_args_0":{"arg_0":"Question: Bill is trying to count the toddlers at his da… | — | — | — | ||
| 1252 | failed | {…} {"gen_args_0":{"arg_0":"Question: Kimberly went strawberry picking with her fami… | — | — | — | ||
| 1253 | failed | {…} {"gen_args_0":{"arg_0":"Question: There are many fish in the tank. One third of … | — | — | — | ||
| 1254 | failed | {…} {"gen_args_0":{"arg_0":"Question: Dianne runs a store that sells books. 37% of … | — | — | — | ||
| 1255 | failed | {…} {"gen_args_0":{"arg_0":"Question: A company hires employees on a contract basis,… | — | — | — | ||
| 1256 | failed | {…} {"gen_args_0":{"arg_0":"Question: There are 30 students in Ms. Leech's class. Tw… | — | — | — | ||
| 1257 | failed | {…} {"gen_args_0":{"arg_0":"Question: A plane takes off at 6:00 a.m. and flies for 4… | — | — | — | ||
| 1258 | failed | {…} {"gen_args_0":{"arg_0":"Question: Hani said she would do 3 more situps per minut… | — | — | — | ||
| 1259 | failed | {…} {"gen_args_0":{"arg_0":"Question: Ember is half as old as Nate who is 14. When … | — | — | — | ||
| 1260 | passed | {…} {"gen_args_0":{"arg_0":"Question: Emmett does 12 jumping jacks, 8 pushups, and 2… | — | — | — | ||
| 1261 | failed | {…} {"gen_args_0":{"arg_0":"Question: Viviana has five more chocolate chips than Sus… | — | — | — | ||
| 1262 | failed | {…} {"gen_args_0":{"arg_0":"Question: To earn an airplane pilot certificate, Sangita… | — | — | — | ||
| 1263 | failed | {…} {"gen_args_0":{"arg_0":"Question: The journey from Abel's house to Alice's house… | — | — | — | ||
| 1264 | failed | {…} {"gen_args_0":{"arg_0":"Question: John's pool is 5 feet deeper than 2 times Sara… | — | — | — | ||
| 1265 | failed | {…} {"gen_args_0":{"arg_0":"Question: One batch of cookies requires 4 cups of flour … | — | — | — | ||
| 1266 | failed | {…} {"gen_args_0":{"arg_0":"Question: Charlie can make 5350 steps while running on a… | — | — | — | ||
| 1267 | failed | {…} {"gen_args_0":{"arg_0":"Question: Two days ago, Uncle Welly planted 50 roses on … | — | — | — | ||
| 1268 | failed | {…} {"gen_args_0":{"arg_0":"Question: Kim has 4 dozen shirts. She lets her sister h… | — | — | — | ||
| 1269 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jose bought 20,000 square meters of land and n… | — | — | — | ||
| 1270 | failed | {…} {"gen_args_0":{"arg_0":"Question: The battery charge in Mary\u2019s cordless vac… | — | — | — | ||
| 1271 | failed | {…} {"gen_args_0":{"arg_0":"Question: Ahmed is 11 years old and Fouad is 26 years ol… | — | — | — | ||
| 1272 | failed | {…} {"gen_args_0":{"arg_0":"Question: Anika has 4 more than twice the number of penc… | — | — | — | ||
| 1273 | failed | {…} {"gen_args_0":{"arg_0":"Question: Mark constructed a deck that was 30 feet by 40… | — | — | — | ||
| 1274 | failed | {…} {"gen_args_0":{"arg_0":"Question: Hattie and her friend Lorelei are doing a jump… | — | — | — | ||
| 1275 | failed | {…} {"gen_args_0":{"arg_0":"Question: There were three jars of candy in the cabinet.… | — | — | — | ||
| 1276 | failed | {…} {"gen_args_0":{"arg_0":"Question: A bond paper ream has 500 sheets and costs $27… | — | — | — | ||
| 1277 | failed | {…} {"gen_args_0":{"arg_0":"Question: 9 years from now, John will be 3 times as old … | — | — | — | ||
| 1278 | failed | {…} {"gen_args_0":{"arg_0":"Question: A grocery store has 4 kinds of jelly. They sel… | — | — | — | ||
| 1279 | failed | {…} {"gen_args_0":{"arg_0":"Question: Kim drives 30 miles to her friend's house. On… | — | — | — | ||
| 1280 | failed | {…} {"gen_args_0":{"arg_0":"Question: Joe goes camping with his dad on a Friday. Joe… | — | — | — | ||
| 1281 | failed | {…} {"gen_args_0":{"arg_0":"Question: Archie holds the school record for most touchd… | — | — | — | ||
| 1282 | failed | {…} {"gen_args_0":{"arg_0":"Question: A rectangle has a length of 3 inches and a wid… | — | — | — | ||
| 1283 | failed | {…} {"gen_args_0":{"arg_0":"Question: A bag of pistachios has 80 pistachios in it. … | — | — | — | ||
| 1284 | failed | {…} {"gen_args_0":{"arg_0":"Question: A building has 10 floors. It takes 15 seconds … | — | — | — | ||
| 1285 | failed | {…} {"gen_args_0":{"arg_0":"Question: Chris has twelve marbles, and Ryan has twenty-… | — | — | — | ||
| 1286 | failed | {…} {"gen_args_0":{"arg_0":"Question: Kendra made 4 more than five times as many dec… | — | — | — | ||
| 1287 | failed | {…} {"gen_args_0":{"arg_0":"Question: An apple tree produces 40 apples in its first … | — | — | — | ||
| 1288 | failed | {…} {"gen_args_0":{"arg_0":"Question: A tank with a capacity of 8000 gallons is 3\/4… | — | — | — | ||
| 1289 | failed | {…} {"gen_args_0":{"arg_0":"Question: James has 3 fish tanks. 1 of the tanks has 20… | — | — | — | ||
| 1290 | failed | {…} {"gen_args_0":{"arg_0":"Question: Elsa started the day with 40 marbles. At brea… | — | — | — | ||
| 1291 | passed | {…} {"gen_args_0":{"arg_0":"Question: Flies are Betty's frog's favorite food. Every … | — | — | — | ||
| 1292 | failed | {…} {"gen_args_0":{"arg_0":"Question: Celina enjoys hiking in the mountains. Due to … | — | — | — | ||
| 1293 | failed | {…} {"gen_args_0":{"arg_0":"Question: Hannah harvests 5 strawberries daily for the n… | — | — | — | ||
| 1294 | failed | {…} {"gen_args_0":{"arg_0":"Question: An artist spends 30 hours every week painting.… | — | — | — | ||
| 1295 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jon makes 3\/4's the salary that Karen makes. … | — | — | — | ||
| 1296 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jeannie hikes the 12 miles to Mount Overlook a… | — | — | — | ||
| 1297 | failed | {…} {"gen_args_0":{"arg_0":"Question: The bakery has 8 indoor tables and 12 outdoor … | — | — | — | ||
| 1298 | failed | {…} {"gen_args_0":{"arg_0":"Question: The teacher brings in 14 mini-cupcakes and 12 … | — | — | — | ||
| 1299 | failed | {…} {"gen_args_0":{"arg_0":"Question: Sally took 342 pens to her class of 44 student… | — | — | — | ||
| 1300 | failed | {…} {"gen_args_0":{"arg_0":"Question: John's shirt cost 60% more than his pants. Hi… | — | — | — | ||
| 1301 | failed | {…} {"gen_args_0":{"arg_0":"Question: Maria is baking cookies for Sally. Sally says … | — | — | — | ||
| 1302 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jason is climbing a telephone pole next to a t… | — | — | — | ||
| 1303 | failed | {…} {"gen_args_0":{"arg_0":"Question: Each week Carina puts 20 more seashells in a j… | — | — | — | ||
| 1304 | failed | {…} {"gen_args_0":{"arg_0":"Question: Beth is a scuba diver. She is excavating a su… | — | — | — | ||
| 1305 | failed | {…} {"gen_args_0":{"arg_0":"Question: The stadium seats 60,000 fans, but only 75% of… | — | — | — | ||
| 1306 | failed | {…} {"gen_args_0":{"arg_0":"Question: Mary is building a mosaic for her school cafet… | — | — | — | ||
| 1307 | failed | {…} {"gen_args_0":{"arg_0":"Question: A portable battery charger can fully charge a … | — | — | — | ||
| 1308 | failed | {…} {"gen_args_0":{"arg_0":"Question: In ten years, I'll be twice my brother's age. … | — | — | — | ||
| 1309 | failed | {…} {"gen_args_0":{"arg_0":"Question: If Clover goes for a 1.5-mile walk in the morn… | — | — | — | ||
| 1310 | failed | {…} {"gen_args_0":{"arg_0":"Question: A pen is longer than the rubber by 3 centimete… | — | — | — | ||
| 1311 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jake buys 2-pound packages of sausages. He bu… | — | — | — | ||
| 1312 | failed | {…} {"gen_args_0":{"arg_0":"Question: You draw a rectangle that is 7 inches wide. It… | — | — | — | ||
| 1313 | failed | {…} {"gen_args_0":{"arg_0":"Question: OpenAI runs a robotics competition that limits… | — | — | — | ||
| 1314 | failed | {…} {"gen_args_0":{"arg_0":"Question: If Jason eats three potatoes in 20 minutes, ho… | — | — | — | ||
| 1315 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jessica's family is 300 km away from New York.… | — | — | — | ||
| 1316 | failed | {…} {"gen_args_0":{"arg_0":"Question: A package of candy has 3 servings with 120 cal… | — | — | — | ||
| 1317 | failed | {…} {"gen_args_0":{"arg_0":"Question: Magdalena has an apple tree on their farm, pro… | — | — | — | ||
| 1318 | failed | {…} {"gen_args_0":{"arg_0":"Question: A car uses 20 gallons of gas to travel 400 mil… | — | — | — | ||
| 0 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jen and Tyler are gymnasts practicing flips. J… | — | — | — | ||
| 1 | failed | {…} {"gen_args_0":{"arg_0":"Question: Steve finds 100 gold bars while visiting Orego… | — | — | — | ||
| 2 | failed | {…} {"gen_args_0":{"arg_0":"Question: Tom can type 90 words a minute. A page is 450… | — | — | — | ||
| 3 | failed | {…} {"gen_args_0":{"arg_0":"Question: Half of Jerome's money was $43. He gave $8 to … | — | — | — | ||
| 4 | failed | {…} {"gen_args_0":{"arg_0":"Question: Surfers enjoy going to the Rip Curl Myrtle Bea… | — | — | — | ||
| 5 | failed | {…} {"gen_args_0":{"arg_0":"Question: Ivan had $10 and spent 1\/5 of it on cupcakes.… | — | — | — | ||
| 6 | failed | {…} {"gen_args_0":{"arg_0":"Question: Stella wanted to buy a new dress for the upcom… | — | — | — | ||
| 7 | failed | {…} {"gen_args_0":{"arg_0":"Question: Ravi can jump higher than anyone in the class.… | — | — | — | ||
| 8 | failed | {…} {"gen_args_0":{"arg_0":"Question: Goldie makes $5 an hour for pet-sitting. Last … | — | — | — | ||
| 9 | failed | {…} {"gen_args_0":{"arg_0":"Question: James listens to super-fast music. It is 200 … | — | — | — | ||
| 10 | failed | {…} {"gen_args_0":{"arg_0":"Question: Yves and his siblings ordered pizza and asked … | — | — | — | ||
| 11 | failed | {…} {"gen_args_0":{"arg_0":"Question: James binges on Cheezits and eats 3 bags that … | — | — | — | ||
| 12 | failed | {…} {"gen_args_0":{"arg_0":"Question: It will take Richard and Sarah 3 years to save… | — | — | — | ||
| 13 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jerome is taking a 150-mile bicycle trip. He w… | — | — | — | ||
| 14 | failed | {…} {"gen_args_0":{"arg_0":"Question: James rents his car out for $20 an hour. He r… | — | — | — | ||
| 15 | failed | {…} {"gen_args_0":{"arg_0":"Question: Tamara is 3 times Kim's height less 4 inches. … | — | — | — | ||
| 16 | failed | {…} {"gen_args_0":{"arg_0":"Question: There are 1250 pairs of shoes in the warehouse… | — | — | — | ||
| 17 | failed | {…} {"gen_args_0":{"arg_0":"Question: Bekah had to read 408 pages for history class.… | — | — | — | ||
| 18 | failed | {…} {"gen_args_0":{"arg_0":"Question: A beadshop earns a third of its profit on Mond… | — | — | — | ||
| 19 | failed | {…} {"gen_args_0":{"arg_0":"Question: Peter needs 80 ounces of soda for his party. H… | — | — | — | ||
| 20 | failed | {…} {"gen_args_0":{"arg_0":"Question: Abel leaves for a vacation destination 1000 mi… | — | — | — | ||
| 21 | failed | {…} {"gen_args_0":{"arg_0":"Question: Aida has twice as many dolls as Sophie, and So… | — | — | — | ||
| 22 | failed | {…} {"gen_args_0":{"arg_0":"Question: Samantha bought a crate of 30 eggs for $5. If … | — | — | — | ||
| 23 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jake agrees to work part of his debt off. He … | — | — | — | ||
| 24 | failed | {…} {"gen_args_0":{"arg_0":"Question: Ronald can grill 15 hamburgers per session on … | — | — | — | ||
| 25 | failed | {…} {"gen_args_0":{"arg_0":"Question: Erik's dog can run 24 miles per hour. It is ch… | — | — | — | ||
| 26 | failed | {…} {"gen_args_0":{"arg_0":"Question: Since 1989, Lily has treated herself to 1 hydr… | — | — | — | ||
| 27 | failed | {…} {"gen_args_0":{"arg_0":"Question: Ines had $20 in her purse. She bought 3 pounds… | — | — | — | ||
| 28 | failed | {…} {"gen_args_0":{"arg_0":"Question: Eliza can iron a blouse in 15 minutes and a dr… | — | — | — | ||
| 29 | failed | {…} {"gen_args_0":{"arg_0":"Question: Francis and Kiera had breakfast at a cafe. Muf… | — | — | — | ||
| 30 | failed | {…} {"gen_args_0":{"arg_0":"Question: A laboratory needs flasks, test tubes, and saf… | — | — | — | ||
| 31 | failed | {…} {"gen_args_0":{"arg_0":"Question: Rob planned on spending three hours reading in… | — | — | — | ||
| 32 | failed | {…} {"gen_args_0":{"arg_0":"Question: Pete walks backwards three times faster than S… | — | — | — | ||
| 33 | failed | {…} {"gen_args_0":{"arg_0":"Question: A ship left a port and headed due west, having… | — | — | — | ||
| 34 | failed | {…} {"gen_args_0":{"arg_0":"Question: A fruit stand is selling apples for $2 each. E… | — | — | — | ||
| 35 | failed | {…} {"gen_args_0":{"arg_0":"Question: Luther made 12 pancakes for breakfast. His fam… | — | — | — | ||
| 36 | failed | {…} {"gen_args_0":{"arg_0":"Question: Andy bakes and sells birthday cakes. To make t… | — | — | — | ||
| 37 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jake and Penny are hunting snakes. Jake's snak… | — | — | — | ||
| 38 | failed | {…} {"gen_args_0":{"arg_0":"Question: Greg's PPO algorithm obtained 90% of the possi… | — | — | — | ||
| 39 | failed | {…} {"gen_args_0":{"arg_0":"Question: Federal guidelines recommend eating at least 2… | — | — | — | ||
| 40 | failed | {…} {"gen_args_0":{"arg_0":"Question: A pound of strawberries costs $2.20 and a poun… | — | — | — | ||
| 41 | failed | {…} {"gen_args_0":{"arg_0":"Question: Lindsey bought 2 exercise bands to intensify h… | — | — | — | ||
| 42 | passed | {…} {"gen_args_0":{"arg_0":"Question: Janice gave all three dozens of pebbles from h… | — | — | — | ||
| 43 | failed | {…} {"gen_args_0":{"arg_0":"Question: Alex is on a cross-country bike trip. After st… | — | — | — | ||
| 44 | failed | {…} {"gen_args_0":{"arg_0":"Question: Karen packs peanut butter sandwiches in her da… | — | — | — | ||
| 45 | failed | {…} {"gen_args_0":{"arg_0":"Question: Danai is decorating her house for Halloween. S… | — | — | — | ||
| 46 | failed | {…} {"gen_args_0":{"arg_0":"Question: Alexander Bought 22 more joggers than Tyson. C… | — | — | — | ||
| 47 | failed | {…} {"gen_args_0":{"arg_0":"Question: Janet makes $20 per hour at work. She works 52… | — | — | — | ||
| 48 | passed | {…} {"gen_args_0":{"arg_0":"Question: Anthony and his friend Leonel read about the i… | — | — | — | ||
| 49 | failed | {…} {"gen_args_0":{"arg_0":"Question: Carla is dividing up insurance claims among 3 … | — | — | — | ||
| 50 | failed | {…} {"gen_args_0":{"arg_0":"Question: One batch of cookies requires 4 cups of flour … | — | — | — | ||
| 51 | failed | {…} {"gen_args_0":{"arg_0":"Question: Belle eats 4 dog biscuits and 2 rawhide bones … | — | — | — | ||
| 52 | failed | {…} {"gen_args_0":{"arg_0":"Question: If Sally had $20 less, she would have $80. If … | — | — | — | ||
| 53 | failed | {…} {"gen_args_0":{"arg_0":"Question: Manny is making lasagna for dinner with his fo… | — | — | — | ||
| 54 | failed | {…} {"gen_args_0":{"arg_0":"Question: Pete's memory card can hold 3,000 pictures of … | — | — | — | ||
| 55 | failed | {…} {"gen_args_0":{"arg_0":"Question: Fabian is shopping at a nearby supermarket. He… | — | — | — | ||
| 56 | failed | {…} {"gen_args_0":{"arg_0":"Question: At the zoo, there are 5 different types of ani… | — | — | — | ||
| 57 | failed | {…} {"gen_args_0":{"arg_0":"Question: Mr. Maximilian has a rental building that he c… | — | — | — | ||
| 58 | failed | {…} {"gen_args_0":{"arg_0":"Question: In a 50-question test with two marks for each … | — | — | — | ||
| 59 | failed | {…} {"gen_args_0":{"arg_0":"Question: Andy gets a cavity for every 4 candy canes he … | — | — | — | ||
| 60 | failed | {…} {"gen_args_0":{"arg_0":"Question: A school bus has 4 columns and 10 rows of seat… | — | — | — | ||
| 61 | failed | {…} {"gen_args_0":{"arg_0":"Question: Mark hires a singer for 3 hours at $15 an hour… | — | — | — | ||
| 62 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jerry has an interesting novel he borrowed fro… | — | — | — | ||
| 63 | failed | {…} {"gen_args_0":{"arg_0":"Question: Chuck can ride the merry-go-round 5 times long… | — | — | — | ||
| 64 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jordan is a hockey goalie. In the first period… | — | — | — | ||
| 65 | failed | {…} {"gen_args_0":{"arg_0":"Question: Kate has to fill 52 balloons for the party. Ea… | — | — | — | ||
| 66 | failed | {…} {"gen_args_0":{"arg_0":"Question: Kingsley's teacher instructed her to find four… | — | — | — | ||
| 67 | failed | {…} {"gen_args_0":{"arg_0":"Question: Rebecca bought 22 items for her camping trip, … | — | — | — | ||
| 68 | failed | {…} {"gen_args_0":{"arg_0":"Question: Carly had 42 lollipops to share with her frien… | — | — | — | ||
| 69 | failed | {…} {"gen_args_0":{"arg_0":"Question: A car uses 20 gallons of gas to travel 400 mil… | — | — | — | ||
| 70 | failed | {…} {"gen_args_0":{"arg_0":"Question: In a city, the number of people living per cub… | — | — | — | ||
| 71 | failed | {…} {"gen_args_0":{"arg_0":"Question: There are 322 voters in District 1. District 2… | — | — | — | ||
| 72 | failed | {…} {"gen_args_0":{"arg_0":"Question: Bob and Jim decide to skip rocks. Bob can ski… | — | — | — | ||
| 73 | failed | {…} {"gen_args_0":{"arg_0":"Question: Camille goes to the Ice Cream Palace with her … | — | — | — | ||
| 74 | failed | {…} {"gen_args_0":{"arg_0":"Question: Three blue chips are in a jar which is 10% of … | — | — | — | ||
| 75 | failed | {…} {"gen_args_0":{"arg_0":"Question: Jill bought 5 packs of red bouncy balls and 4 … | — | — | — | ||
| 76 | failed | {…} {"gen_args_0":{"arg_0":"Question: Marj has two $20 bills, three $5 bills, and $4… | — | — | — | ||
| 77 | failed | {…} {"gen_args_0":{"arg_0":"Question: Bernie loves eating chocolate. He buys two cho… | — | — | — | ||
| 78 | failed | {…} {"gen_args_0":{"arg_0":"Question: At football tryouts, the coach wanted to see w… | — | — | — | ||
| 79 | failed | {…} {"gen_args_0":{"arg_0":"Question: Tim used to run 3 times a week but decided to … | — | — | — | ||
| 80 | failed | {…} {"gen_args_0":{"arg_0":"Question: It rained twice as much on Tuesday as Monday. … | — | — | — | ||