Samples · bfcl.single_turn

Run #75 · Adapter v1.0.0+v4-layout-aliases+reasoning · 200/707 samples shown · Score 49.1%

AI Evaluation

Generated 2026-05-13 21:38 · claude-sonnet-4-6

Summary

The model reaches a pass rate of ~49%, so slightly more than half of the samples fail. This is well below a practical threshold for single-turn function calling.

Strengths

  • No errors (0 of 1390 calls); the model always produces structured output.
  • Arguments and values are mostly passed correctly in substance (correct numbers, correct parameter names).

Weaknesses

  • Dominant pattern: the model replaces dots in function names with underscores (`math.factorial` → `math_factorial`), which systematically leads to `wrong_func_name` failures.
  • Type errors occur occasionally (e.g. `interval` passed as a flat array instead of an array of arrays), which points to a weak grasp of the schema for nested types.
  • Optional parameters are often omitted even though the expected output would accept them as an empty string; this could cause further silent failures.
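
The dot-to-underscore pattern can be made visible with a small diagnostic. The following is a minimal sketch, not part of the benchmark's own scoring: `build_name_map` and `restore_name` are hypothetical helper names, and the sketch assumes the only rewrite the model applies is replacing `.` with `_`.

```python
def build_name_map(schema_names):
    """Map underscore-sanitized names back to the original schema names.

    Keys that two different schema names collapse to are dropped,
    because the original name cannot be recovered unambiguously.
    """
    name_map = {}
    ambiguous = set()
    for name in schema_names:
        key = name.replace(".", "_")
        if key in name_map and name_map[key] != name:
            ambiguous.add(key)
        name_map[key] = name
    for key in ambiguous:
        del name_map[key]
    return name_map


def restore_name(predicted, schema_names):
    """Return the schema name for a predicted name, if it can be recovered."""
    if predicted in schema_names:
        return predicted  # already an exact match
    return build_name_map(schema_names).get(predicted, predicted)


# Hypothetical schema containing the name quoted in the report.
print(restore_name("math_factorial", ["math.factorial"]))
```

Re-scoring with restored names would show how many `wrong_func_name` failures are caused by naming alone; it should be reported separately from the official score.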

Notable patterns

Almost all visible failures stem from a single, consistent error: dot notation in function names is not used. This suggests that the model either saw few examples with namespace separators (`.`) during training or that the system prompt / tool schema does not specify this convention clearly enough. The type error for `interval` is a separate but equally reproducible problem with array-of-floats schemas.
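
The `interval` problem is a nesting mismatch, and it can be detected by comparing list depths. This is a rough sketch under assumptions: the parameter schema is taken to be JSON-Schema-like (`"type": "array"` with `"items"`), the helper names are hypothetical, and the example values are made up for illustration.

```python
def schema_depth(schema):
    """Count how many nested array levels a parameter schema declares."""
    depth = 0
    while schema.get("type") == "array":
        depth += 1
        schema = schema.get("items", {})
    return depth


def nesting_depth(value):
    """Count how many list levels wrap the first element of a value."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        if not value:
            break  # an empty list gives no further information
        value = value[0]
    return depth


# Hypothetical schema: an array of arrays of numbers.
interval_schema = {
    "type": "array",
    "items": {"type": "array", "items": {"type": "number"}},
}
flat_value = [1.0, 3.0]      # shape the report describes as wrong
nested_value = [[1.0, 3.0]]  # shape the schema asks for

print(schema_depth(interval_schema), nesting_depth(flat_value), nesting_depth(nested_value))
```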

Recommendation

Extend the tool-schema prompt with an explicit note that function names must be copied exactly from the schema, including dot notation; alternatively, add few-shot examples with dot namespaces to the system prompt and then re-run the benchmark.
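
One way the recommended hint could be generated from the tool schema is sketched below. The function name `naming_hint` and the wording of the hint are assumptions for illustration; the actual prompt layout of the adapter is not shown in this report.

```python
def naming_hint(schema_names, max_examples=3):
    """Build a system-prompt note that asks for exact, dotted function names.

    Returns an empty string when no schema name contains a dot,
    so the prompt is left unchanged for those samples.
    """
    dotted = [name for name in schema_names if "." in name]
    if not dotted:
        return ""
    examples = ", ".join(f"`{name}`" for name in dotted[:max_examples])
    return (
        "Use function names exactly as given in the tool schema, "
        f"including dots (for example {examples}). "
        "Do not replace '.' with '_'."
    )


print(naming_hint(["math.factorial"]))
```

Appending the hint only when dotted names are present keeps the change narrow, which makes the effect easier to attribute when the benchmark is re-run.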

Overview

707 samples
Distribution: 707
Score histogram
0 – 0.1: 707
0.1 – 0.2: 0
0.2 – 0.3: 0
0.3 – 0.4: 0
0.4 – 0.5: 0
0.5 – 0.6: 0
0.6 – 0.7: 0
0.7 – 0.8: 0
0.8 – 0.9: 0
0.9 – 1: 0
Filters: Status · Score threshold: Score < 0.5
Question ID Status Score Prompt Latency Tokens/s TTFT
simple_python_1 failed 0% {…} [[{"role":"user","content":"Calculate the factorial of 5 using math functions."}…
simple_python_2 failed 0% {…} [[{"role":"user","content":"Calculate the hypotenuse of a right triangle given t…
simple_python_3 failed 0% {…} [[{"role":"user","content":"Find the roots of a quadratic equation with coeffici…
simple_python_8 failed 0% {…} [[{"role":"user","content":"What's the area of a circle with a radius of 10?"}]]
simple_python_9 failed 0% {…} [[{"role":"user","content":"Calculate the area of a circle with a radius of 5 un…
simple_python_12 failed 0% {…} [[{"role":"user","content":"Calculate the circumference of a circle with radius …
simple_python_13 failed 0% {…} [[{"role":"user","content":"Calculate the area under the curve y=x^2 from x=1 to…
simple_python_16 failed 0% {…} [[{"role":"user","content":"Calculate the derivative of the function 2x^2 at x =…
simple_python_18 failed 0% {…} [[{"role":"user","content":"Find the prime factors of the number 123456."}]]
simple_python_19 failed 0% {…} [[{"role":"user","content":"Calculate the greatest common divisor of two numbers…
simple_python_20 failed 0% {…} [[{"role":"user","content":"Find the highest common factor of 36 and 24."}]]
simple_python_21 failed 0% {…} [[{"role":"user","content":"Find the Greatest Common Divisor (GCD) of two number…
simple_python_22 failed 0% {…} [[{"role":"user","content":"Calculate the greatest common divisor of two given n…
simple_python_24 failed 0% {…} [[{"role":"user","content":"Find the greatest common divisor (GCD) of 12 and 18"…
simple_python_30 failed 0% {…} [[{"role":"user","content":"What is the final velocity of a vehicle that started…
simple_python_35 failed 0% {…} [[{"role":"user","content":"Find an all vegan restaurant in New York that opens …
simple_python_37 failed 0% {…} [[{"role":"user","content":"Find the estimated travel time by car from San Franc…
simple_python_45 failed 0% {…} [[{"role":"user","content":"Calculate the energy (in Joules) absorbed or release…
simple_python_50 failed 0% {…} [[{"role":"user","content":"What is the change in entropy in Joules per Kelvin o…
simple_python_51 failed 0% {…} [[{"role":"user","content":"Calculate the entropy change for a certain process g…
simple_python_55 failed 0% {…} [[{"role":"user","content":"Find me detailed information about the structure of …
simple_python_56 failed 0% {…} [[{"role":"user","content":"What are the names of proteins found in the plasma m…
simple_python_58 failed 0% {…} [[{"role":"user","content":"What is the function of ATP synthase in mitochondria…
simple_python_60 failed 0% {…} [[{"role":"user","content":"Find the type of gene mutation based on SNP (Single …
simple_python_63 failed 0% {…} [[{"role":"user","content":"Find out how genetically similar a human and a chimp…
simple_python_66 failed 0% {…} [[{"role":"user","content":"Get me data on average precipitation in the Amazon r…
simple_python_69 failed 0% {…} [[{"role":"user","content":"Find out the population and species of turtles in Mi…
simple_python_76 failed 0% {…} [[{"role":"user","content":"Get me the predictions of the evolutionary rate for …
simple_python_77 failed 0% {…} [[{"role":"user","content":"Find a nearby restaurant that serves vegan food in L…
simple_python_81 failed 0% {…} [[{"role":"user","content":"Find the fastest route from San Francisco to Los Ang…
simple_python_82 failed 0% {…} [[{"role":"user","content":"Calculate the average of list of integers [12, 15, 1…
simple_python_85 failed 0% {…} [[{"role":"user","content":"What's the approximate distance between Boston, MA, …
simple_python_86 failed 0% {…} [[{"role":"user","content":"Find the shortest distance between two cities, New Y…
simple_python_87 failed 0% {…} [[{"role":"user","content":"Sort the list [5, 3, 4, 1, 2] in ascending order."}]…
simple_python_90 failed 0% {…} [[{"role":"user","content":"Retrieve Personal Info and Job History data of a spe…
simple_python_92 failed 0% {…} [[{"role":"user","content":"Find all movies starring Leonardo DiCaprio in the ye…
simple_python_96 failed 0% {…} [[{"role":"user","content":"Find records in database in user table where age is …
simple_python_97 failed 0% {…} [[{"role":"user","content":"Calculate the factorial of the number 5"}]]
simple_python_103 failed 0% {…} [[{"role":"user","content":"Calculate the area under the curve y=3x^2 + 2x - 4, …
simple_python_104 failed 0% {…} [[{"role":"user","content":"Calculate the area of a triangle with base 6 and hei…
simple_python_105 failed 0% {…} [[{"role":"user","content":"Calculate the power of 3 raised to the power 4."}]]
simple_python_109 failed 0% {…} [[{"role":"user","content":"Generate a random forest model with 100 trees and a …
simple_python_111 failed 0% {…} [[{"role":"user","content":"Generate a random number from a normal distribution …
simple_python_113 failed 0% {…} [[{"role":"user","content":"What's the probability of rolling a six on a six-sid…
simple_python_114 failed 0% {…} [[{"role":"user","content":"Find the probability of getting exactly 5 heads in 1…
simple_python_116 failed 0% {…} [[{"role":"user","content":"What's the probability of drawing a king from a well…
simple_python_118 failed 0% {…} [[{"role":"user","content":"Perform a two-sample t-test on my experiment data of…
simple_python_119 failed 0% {…} [[{"role":"user","content":"Perform a hypothesis test for two independent sample…
simple_python_123 failed 0% {…} [[{"role":"user","content":"Perform a two-sample t-test to determine if there is…
simple_python_126 failed 0% {…} [[{"role":"user","content":"What is the coefficient of determination (R-squared)…
simple_python_128 failed 0% {…} [[{"role":"user","content":"What's the quarterly dividend per share of a company…
simple_python_130 failed 0% {…} [[{"role":"user","content":"What's the NPV (Net Present Value) of a series of ca…
simple_python_133 failed 0% {…} [[{"role":"user","content":"Predict the future value of a $5000 investment with …
simple_python_134 failed 0% {…} [[{"role":"user","content":"Predict the total expected profit of stocks XYZ in 5…
simple_python_144 failed 0% {…} [[{"role":"user","content":"Find the market performance of the S&P 500 and the D…
simple_python_148 failed 0% {…} [[{"role":"user","content":"Calculate the future value of an investment with an …
simple_python_156 failed 0% {…} [[{"role":"user","content":"Look up details of a felony crime record for case nu…
simple_python_157 failed 0% {…} [[{"role":"user","content":"Find out if an individual John Doe with a birthday 0…
simple_python_163 failed 0% {…} [[{"role":"user","content":"Provide me with the property records of my house loc…
simple_python_165 failed 0% {…} [[{"role":"user","content":"Retrieve cases from 2020 about theft crimes in Los A…
simple_python_166 failed 0% {…} [[{"role":"user","content":"Find a lawyer specializing in divorce cases and char…
simple_python_167 failed 0% {…} [[{"role":"user","content":"Retrieve the details of a Supreme Court case titled …
simple_python_169 failed 0% {…} [[{"role":"user","content":"Find the details of the court case identified by doc…
simple_python_170 failed 0% {…} [[{"role":"user","content":"Find a historical law case about fraud from 2010 to …
simple_python_172 failed 0% {…} [[{"role":"user","content":"How to obtain the detailed case information of the '…
simple_python_175 failed 0% {…} [[{"role":"user","content":"How many months of experience a Lawyer John Doe has …
simple_python_176 failed 0% {…} [[{"role":"user","content":"Find details of patent lawsuits involving the compan…
simple_python_184 failed 0% {…} [[{"role":"user","content":"I need the details of the lawsuit case with case ID …
simple_python_188 failed 0% {…} [[{"role":"user","content":"What is the humidity level in Miami, Florida in the …
simple_python_193 failed 0% {…} [[{"role":"user","content":"Find the best local nurseries in Toronto with a good…
simple_python_199 failed 0% {…} [[{"role":"user","content":"Find air quality index in San Jose for next three da…
simple_python_200 failed 0% {…} [[{"role":"user","content":"How much CO2 is produced annually by a gas-fueled ca…
simple_python_204 failed 0% {…} [[{"role":"user","content":"Find restaurants near me within 10 miles that offer …
simple_python_206 failed 0% {…} [[{"role":"user","content":"Find the nearest park with a tennis court in London.…
simple_python_208 failed 0% {…} [[{"role":"user","content":"Get me the directions from New York to Los Angeles a…
simple_python_209 failed 0% {…} [[{"role":"user","content":"Locate the nearest public library in Boston, Massach…
simple_python_213 failed 0% {…} [[{"role":"user","content":"Book a direct flight from San Francisco to London fo…
simple_python_214 failed 0% {…} [[{"role":"user","content":"Search for upcoming month rock concerts in New York.…
simple_python_215 failed 0% {…} [[{"role":"user","content":"Give me a brief on movie 'Interstellar'"}]]
simple_python_217 failed 0% {…} [[{"role":"user","content":"Analyze my fMRI data in ~\/data\/myfMRI.nii from a m…
simple_python_218 failed 0% {…} [[{"role":"user","content":"Given patient with id 546382, retrieve their brain M…
simple_python_223 failed 0% {…} [[{"role":"user","content":"Find social behaviors and patterns in a group size o…
simple_python_224 failed 0% {…} [[{"role":"user","content":"Find the most followed person on twitter who tweets …
simple_python_225 failed 0% {…} [[{"role":"user","content":"What is the percentage of population preferring digi…
simple_python_231 failed 0% {…} [[{"role":"user","content":"Provide key war events in German history from 1871 t…
simple_python_232 failed 0% {…} [[{"role":"user","content":"What was the full name king of England in 1800?"}]]
simple_python_233 failed 0% {…} [[{"role":"user","content":"When did the Treaty of Tordesillas take place? Put i…
simple_python_234 failed 0% {…} [[{"role":"user","content":"Find important Wars in European history during the 1…
simple_python_236 failed 0% {…} [[{"role":"user","content":"Get start date on the American Civil War."}]]
simple_python_238 failed 0% {…} [[{"role":"user","content":"Who was the president of the United States during th…
simple_python_239 failed 0% {…} [[{"role":"user","content":"Who was the full name of the president of the United…
simple_python_240 failed 0% {…} [[{"role":"user","content":"Who was the President of the United States in 1940?"…
simple_python_244 failed 0% {…} [[{"role":"user","content":"What year was the law of universal gravitation publi…
simple_python_245 failed 0% {…} [[{"role":"user","content":"Who discovered radium?"}]]
simple_python_246 failed 0% {…} [[{"role":"user","content":"Who discovered Gravity and what was the method used?…
simple_python_247 failed 0% {…} [[{"role":"user","content":"What was Albert Einstein's contribution to science o…
simple_python_248 failed 0% {…} [[{"role":"user","content":"Who invented the theory of relativity and in which y…
simple_python_249 failed 0% {…} [[{"role":"user","content":"Tell me more about Christianity and its history till…
simple_python_255 failed 0% {…} [[{"role":"user","content":"Get the biography and main contributions of Pope Inn…
simple_python_260 failed 0% {…} [[{"role":"user","content":"Calculate how many gallons of paint is required to p…
simple_python_264 failed 0% {…} [[{"role":"user","content":"Find the size of the sculpture with title 'David' by…
simple_python_267 failed 0% {…} [[{"role":"user","content":"Find the top rated modern sculpture exhibition happe…
simple_python_268 failed 0% {…} [[{"role":"user","content":"Find me the sculptures of Michelangelo with material…
simple_python_270 failed 0% {…} [[{"role":"user","content":"Can you give me the height and width of Empire State…
simple_python_273 failed 0% {…} [[{"role":"user","content":"Find out the open hours for the Louvre Museum in Par…
simple_python_275 failed 0% {…} [[{"role":"user","content":"Get the list of top 5 popular artworks at the Metrop…
simple_python_276 failed 0% {…} [[{"role":"user","content":"Get the working hours of Louvre Museum in Paris."}]]
simple_python_279 failed 0% {…} [[{"role":"user","content":"What's the retail price of a Fender American Profess…
simple_python_283 failed 0% {…} [[{"role":"user","content":"Find the price of a used Gibson Les Paul guitar in e…
simple_python_284 failed 0% {…} [[{"role":"user","content":"Get information about the pop concerts in New York f…
simple_python_286 failed 0% {…} [[{"role":"user","content":"Get concert details for the artist Beyonce performin…
simple_python_287 failed 0% {…} [[{"role":"user","content":"Find me a classical concert this weekend in Los Ange…
simple_python_288 failed 0% {…} [[{"role":"user","content":"Get me two tickets for next Eminem concert in New Yo…
simple_python_289 failed 0% {…} [[{"role":"user","content":"Find concerts near me in Seattle that plays jazz mus…
simple_python_290 failed 0% {…} [[{"role":"user","content":"What's the timing and location for The Weeknd's conc…
simple_python_291 failed 0% {…} [[{"role":"user","content":"Generate a melody in C major scale, starting with th…
simple_python_293 failed 0% {…} [[{"role":"user","content":"Create a mix track using notes of C major scale and …
simple_python_294 failed 0% {…} [[{"role":"user","content":"Generate a major chord progression in C key with fou…
simple_python_296 failed 0% {…} [[{"role":"user","content":"Generate a major C scale progression with tempo 80 B…
simple_python_297 failed 0% {…} [[{"role":"user","content":"music.theory.chordProgression(progression=['I', 'V',…
simple_python_298 failed 0% {…} [[{"role":"user","content":"What key signature does C# major have?"}]]
simple_python_300 failed 0% {…} [[{"role":"user","content":"Calculate the duration between two notes of 440Hz an…
simple_python_303 failed 0% {…} [[{"role":"user","content":"Get the player stats of Cristiano Ronaldo in the 201…
simple_python_304 failed 0% {…} [[{"role":"user","content":"Get point and rebound stats for player 'LeBron James…
simple_python_305 failed 0% {…} [[{"role":"user","content":"Calculate the overall goal and assist of soccer play…
simple_python_307 failed 0% {…} [[{"role":"user","content":"Who won the basketball game between Lakers and Clipp…
simple_python_308 failed 0% {…} [[{"role":"user","content":"What are the next five matches for Manchester United…
simple_python_309 failed 0% {…} [[{"role":"user","content":"Find me the record of Tom Brady in the 2020 NFL seas…
simple_python_311 failed 0% {…} [[{"role":"user","content":"Find me the detailed profile of basketball player Le…
simple_python_313 failed 0% {…} [[{"role":"user","content":"What's the total worth in euro of Messi according to…
simple_python_314 failed 0% {…} [[{"role":"user","content":"Find all the major achievements of the footballer Li…
simple_python_320 failed 0% {…} [[{"role":"user","content":"Fetch the basketball league standings, where Golden …
simple_python_322 failed 0% {…} [[{"role":"user","content":"Get the current ranking for Liverpool Football Club …
simple_python_323 failed 0% {…} [[{"role":"user","content":"Who is ranked as the top player in woman tennis?"}]]
simple_python_324 failed 0% {…} [[{"role":"user","content":"Find the score of last game for Los Angeles Lakers i…
simple_python_325 failed 0% {…} [[{"role":"user","content":"Who won the last match between Chicago Bulls and Los…
simple_python_327 failed 0% {…} [[{"role":"user","content":"Give me the schedule of Manchester United for the ne…
simple_python_328 failed 0% {…} [[{"role":"user","content":"Find the rating and player count of the board game '…
simple_python_331 failed 0% {…} [[{"role":"user","content":"Find the top chess players in New York with a rating…
simple_python_332 failed 0% {…} [[{"role":"user","content":"What's the chess classical rating of Magnus Carlsen?…
simple_python_334 failed 0% {…} [[{"role":"user","content":"Check who is the winner in a game of blackjack given…
simple_python_336 failed 0% {…} [[{"role":"user","content":"Shuffle a deck of cards, and draw 3 cards from the t…
simple_python_338 failed 0% {…} [[{"role":"user","content":"What is the probability of drawing a heart card from…
simple_python_339 failed 0% {…} [[{"role":"user","content":"What is the probability of getting a full house in p…
simple_python_340 failed 0% {…} [[{"role":"user","content":"Determine the winner in a Poker game with John havin…
simple_python_341 failed 0% {…} [[{"role":"user","content":"What are the odds of drawing a heart card from a dec…
simple_python_342 failed 0% {…} [[{"role":"user","content":"Find all multi-player games released in 2019 with an…
simple_python_343 failed 0% {…} [[{"role":"user","content":"Fetch player statistics of 'Zelda' on Switch for use…
simple_python_347 failed 0% {…} [[{"role":"user","content":"Get me the details of the last game played by Liverp…
simple_python_349 failed 0% {…} [[{"role":"user","content":"Find the highest score achieved by any player in the…
simple_python_352 failed 0% {…} [[{"role":"user","content":"Get the average user score for the game 'The Legend …
simple_python_355 failed 0% {…} [[{"role":"user","content":"How many calories in the Beef Lasagna Recipe from Fo…
simple_python_356 failed 0% {…} [[{"role":"user","content":"Find me a recipe that serves 2 people, is vegan, and…
simple_python_363 failed 0% {…} [[{"role":"user","content":"Find the closest sushi restaurant with a patio in Bo…
simple_python_365 failed 0% {…} [[{"role":"user","content":"How many ounces in 2 pounds of butter?"}]]
simple_python_366 failed 0% {…} [[{"role":"user","content":"How many teaspoons are in 2 tablespoons for measurem…
simple_python_367 failed 0% {…} [[{"role":"user","content":"Find me a vegan recipe for brownies which prep time …
simple_python_369 failed 0% {…} [[{"role":"user","content":"Find a grocery store near me with organic fruits and…
simple_python_370 failed 0% {…} [[{"role":"user","content":"Order three bottles of olive oil and a five pound ba…
simple_python_371 failed 0% {…} [[{"role":"user","content":"Check the price of tomatoes and lettuce at the Whole…
simple_python_372 failed 0% {…} [[{"role":"user","content":"Find the top five organic bananas brands on the basi…
simple_python_373 failed 0% {…} [[{"role":"user","content":"I want to buy apples, rice, and 12 pack of bottled w…
simple_python_374 failed 0% {…} [[{"role":"user","content":"Check the amount of protein, calories and carbs in a…
simple_python_375 failed 0% {…} [[{"role":"user","content":"Check the total price for three pumpkins and two doz…
simple_python_378 failed 0% {…} [[{"role":"user","content":"Convert time 3pm from New York time zone to London t…
simple_python_381 failed 0% {…} [[{"role":"user","content":"Check if any Hilton Hotel is available for two adult…
simple_python_382 failed 0% {…} [[{"role":"user","content":"Book a single room for two nights at the Hilton Hote…
simple_python_384 failed 0% {…} [[{"role":"user","content":"Book a hotel room for two adults and one child in Pa…
simple_python_385 failed 0% {…} [[{"role":"user","content":"Book a hotel room with king size bed in Los Angeles …
simple_python_388 failed 0% {…} [[{"role":"user","content":"How many Canadian dollars can I get for 500 US dolla…
simple_python_390 failed 0% {…} [[{"role":"user","content":"Convert 150 Euros to Canadian dollars."}]]
simple_python_394 failed 0% {…} [[{"role":"user","content":"Get me the travel distance and duration from the Eif…
simple_python_395 failed 0% {…} [[{"role":"user","content":"Find the nearest parking lot within 2 miles of Centr…
simple_python_396 failed 0% {…} [[{"role":"user","content":"Find a hospital within 5 km radius around Denver, Co…
simple_python_397 failed 0% {…} [[{"role":"user","content":"Find the distance between New York and Boston, accou…
simple_java_0 failed 0% {…} [[{"role":"user","content":"Help me initialize the GIS geometry presentation in …
simple_java_1 failed 0% {…} [[{"role":"user","content":"Help me generate SQL completion proposals for a tabl…
simple_java_2 failed 0% {…} [[{"role":"user","content":"Help me generate the full SQL creation script with a…
simple_java_3 failed 0% {…} [[{"role":"user","content":"Help me resolve a tablespace reference named 'USERSP…
simple_java_4 failed 0% {…} [[{"role":"user","content":"Help me prepare a JDBC statement for a DB2 view name…
simple_java_5 failed 0% {…} [[{"role":"user","content":"Help me initialize a plain text presentation for a r…
simple_java_6 failed 0% {…} [[{"role":"user","content":"Help me update the data in a spreadsheet view within…
simple_java_7 failed 0% {…} [[{"role":"user","content":"Help me copy an NIO resource to a new path '\/backup…
simple_java_8 failed 0% {…} [[{"role":"user","content":"Help me update the contents of a file in the non-blo…
simple_java_9 failed 0% {…} [[{"role":"user","content":"Help me serialize a `MultiPoint` object with 5 point…
simple_java_10 failed 0% {…} [[{"role":"user","content":"Help me update the launcher information in the JNI B…
simple_java_11 failed 0% {…} [[{"role":"user","content":"What is the value of the 'EnableExtensions' property…
simple_java_12 failed 0% {…} [[{"role":"user","content":"Help me change the current schema to 'AnalyticsDB' i…
simple_java_13 failed 0% {…} [[{"role":"user","content":"Help me prepare a JDBC statement to retrieve the pri…
simple_java_14 failed 0% {…} [[{"role":"user","content":"In the SmartRefreshLayout library, help me trigger t…
simple_java_15 failed 0% {…} [[{"role":"user","content":"Help me decode a 9-patch image from an input stream …
simple_java_16 failed 0% {…} [[{"role":"user","content":"Help me create an `InvokePolymorphicNode` for a give…
simple_java_17 failed 0% {…} [[{"role":"user","content":"Help me attach generic type information to a constru…
simple_java_18 failed 0% {…} [[{"role":"user","content":"Help me obtain the third page of role counts with a …
simple_java_19 failed 0% {…} [[{"role":"user","content":"Help me display the personal information page for a …
simple_java_20 failed 0% {…} [[{"role":"user","content":"Help me update the HBase mapping configuration for a…
simple_java_21 failed 0% {…} [[{"role":"user","content":"Help me handle an exception event `ioExceptionEvent`…
simple_java_22 failed 0% {…} [[{"role":"user","content":"Help me update the new status to 2 for a list of pro…
simple_java_23 failed 0% {…} [[{"role":"user","content":"Help me obtain a list of new home products that cont…
simple_java_24 failed 0% {…} [[{"role":"user","content":"Help me change the visibility of product categories …
200 of 707 samples · limit 200