Samples · bfcl.single_turn
KI-Auswertung
Generiert 2026-05-13 21:38 · claude-sonnet-4-6Zusammenfassung
Das Modell erreicht eine Pass-Rate von ~49 %, was einer knappen Mehrheit an Fehlern entspricht. Die Performance liegt damit deutlich unter einem praxistauglichen Schwellenwert für Single-Turn Function Calling.
Stärken
- Keine Errors (0 von 1390 Aufrufen), das Modell produziert stets strukturierte Ausgaben.
- Argumente und Werte werden inhaltlich meist korrekt übergeben (richtige Zahlen, richtige Parameter-Namen).
Schwächen
- Dominantes Muster: Das Modell ersetzt Punkte in Funktionsnamen durch Unterstriche (`math.factorial` → `math_factorial`), was systematisch zu `wrong_func_name`-Fehlern führt.
- Vereinzelt treten Typ-Fehler auf (z. B. `interval` als flaches Array statt Array-of-Arrays), was auf mangelndes Schema-Verständnis bei verschachtelten Typen hindeutet.
- Optionale Parameter werden häufig weggelassen, obwohl sie im Expected-Output als leerer String akzeptiert würden — dies könnte weitere stille Fehler verursachen.
Auffälligkeiten
Fast alle sichtbaren Failures basieren auf einem einzigen, konsistenten Fehler: Punkt-Notation in Funktionsnamen wird nicht verwendet. Dies deutet darauf hin, dass das Modell entweder im Training kaum Beispiele mit Namespace-Trennzeichen (`.`) gesehen hat oder der System-Prompt / Tool-Schema diese Konvention nicht ausreichend vorgibt. Der Typ-Fehler bei `interval` ist ein separates, aber ebenfalls reproduzierbares Problem bei Array-of-Floats-Schemas.
Empfehlung
Den Tool-Schema-Prompt explizit um den Hinweis ergänzen, dass Funktionsnamen exakt — inklusive Punkt-Notation — aus dem Schema übernommen werden müssen; alternativ Few-Shot-Beispiele mit Dot-Namespaces in den System-Prompt integrieren und anschließend den Benchmark erneut auswerten.
Übersicht
707 Samples| Frage-ID | Status | Score | Prompt | Latenz | Tokens/s | TTFT | |
|---|---|---|---|---|---|---|---|
| simple_java_25 | failed | {…} [[{"role":"user","content":"Help me update the sort order of a recommended subje… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_26 | failed | {…} [[{"role":"user","content":"Help me create a callable statement for executing a … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_27 | failed | {…} [[{"role":"user","content":"What are the indices of the two numbers in the array… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_28 | failed | {…} [[{"role":"user","content":"Help me create a scheduled executor service that per… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_29 | failed | {…} [[{"role":"user","content":"Help me test that the 'zipkin.collector.activemq.con… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_30 | failed | {…} [[{"role":"user","content":"Help me asynchronously store the value '42' with the… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_31 | failed | {…} [[{"role":"user","content":"Help me obtain a reactive queue with the name 'taskQ… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_32 | failed | {…} [[{"role":"user","content":"Help me asynchronously attempt to acquire a permit f… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_33 | failed | {…} [[{"role":"user","content":"Help me asynchronously store the value 'John Doe' wi… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_34 | failed | {…} [[{"role":"user","content":"Help me schedule a cleanup task to run after 5 minut… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_35 | failed | {…} [[{"role":"user","content":"Help me perform a bitwise AND operation on Redis key… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_36 | failed | {…} [[{"role":"user","content":"Help me decode a list of alternating key-value objec… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_37 | failed | {…} [[{"role":"user","content":"Help me process a markup text `buildOutput` for a sp… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_38 | failed | {…} [[{"role":"user","content":"Help me create a stubbed source map for a nested doc… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_39 | failed | {…} [[{"role":"user","content":"Help me append the node ID to the StringBuilder `log… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_40 | failed | {…} [[{"role":"user","content":"Help me notify the routing nodes observer that a pre… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_41 | failed | {…} [[{"role":"user","content":"Help me configure an `ObjectParser` instance named `… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_42 | failed | {…} [[{"role":"user","content":"Help me create a term query for a field type `userna… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_43 | failed | {…} [[{"role":"user","content":"Help me create a spy instance for an Elasticsearch t… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_44 | failed | {…} [[{"role":"user","content":"Help me initialize the DES cipher in Java for encryp… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_45 | failed | {…} [[{"role":"user","content":"Help me validate that the environment variable map `… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_46 | failed | {…} [[{"role":"user","content":"Help me validate that the caller-sensitive method ha… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_47 | failed | {…} [[{"role":"user","content":"Help me output a formatted Java constant declaration… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_48 | failed | {…} [[{"role":"user","content":"Help me instantiate a dummy server with SSL encrypti… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_49 | failed | {…} [[{"role":"user","content":"Help me send HTTP response headers with a status cod… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_50 | failed | {…} [[{"role":"user","content":"Help me simulate the deletion of documents matching … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_51 | failed | {…} [[{"role":"user","content":"Help me execute the master operation to gather the u… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_52 | failed | {…} [[{"role":"user","content":"In a Java XML processing context, help me obtain a l… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_53 | failed | {…} [[{"role":"user","content":"Help me create a predicate that determines if a `Joi… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_54 | failed | {…} [[{"role":"user","content":"Help me initiate a shard operation on a searchable s… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_55 | failed | {…} [[{"role":"user","content":"Help me create a new searchable snapshot directory f… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_56 | failed | {…} [[{"role":"user","content":"Help me parse the HTTP response body from an entity … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_57 | failed | {…} [[{"role":"user","content":"Help me determine the boolean value of a configurati… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_58 | failed | {…} [[{"role":"user","content":"Help me serialize a map of data `userProfile` with k… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_59 | failed | {…} [[{"role":"user","content":"Help me truncate the translog for a shard located at… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_60 | failed | {…} [[{"role":"user","content":"In Elasticsearch, help me build a nested query for a… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_61 | failed | {…} [[{"role":"user","content":"Help me create an exponential decay scoring function… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_64 | failed | {…} [[{"role":"user","content":"Help me create a new field type for a date script in… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_65 | failed | {…} [[{"role":"user","content":"Help me generate the XContent with xContentBuilderIn… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_66 | failed | {…} [[{"role":"user","content":"Help me create a child runtime field for a composite… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_67 | failed | {…} [[{"role":"user","content":"Help me generate a DMG setup script for an applicati… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_68 | failed | {…} [[{"role":"user","content":"Help me ensure that the application image directory … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_69 | failed | {…} [[{"role":"user","content":"Help me ensure that the signs of the BigDecimal elem… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_70 | failed | {…} [[{"role":"user","content":"Help me signal the end of an XML element with the qu… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_71 | failed | {…} [[{"role":"user","content":"Help me switch the execution from coroutine with ID … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_72 | failed | {…} [[{"role":"user","content":"Help me append a substring of characters from a char… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_73 | failed | {…} [[{"role":"user","content":"Help me retrieve the encoding information for UTF-8 … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_74 | failed | {…} [[{"role":"user","content":"Help me handle surrogate pairs in XML serialization,… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_75 | failed | {…} [[{"role":"user","content":"Help me determine if the system property 'enableXmlS… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_76 | failed | {…} [[{"role":"user","content":"Help me execute the step method to update the graphi… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_77 | failed | {…} [[{"role":"user","content":"Help me validate that the user-provided password 'P@… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_78 | failed | {…} [[{"role":"user","content":"Help me configure an option parser to require the 'o… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_79 | failed | {…} [[{"role":"user","content":"Help me obtain an InputSource for the entity with a … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_80 | failed | {…} [[{"role":"user","content":"What is the compiled pattern for a failure message i… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_81 | failed | {…} [[{"role":"user","content":"Help me perform a garbage collection test using the … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_82 | failed | {…} [[{"role":"user","content":"Help me execute the `runIt` method to perform a test… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_83 | failed | {…} [[{"role":"user","content":"Help me execute a performance test in Java with 500 … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_85 | failed | {…} [[{"role":"user","content":"Help me execute the `runIt` method to test if a clas… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_86 | failed | {…} [[{"role":"user","content":"In a Java debugging test environment, help me execut… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_87 | failed | {…} [[{"role":"user","content":"Help me create a VMDeathRequest with a suspend polic… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_88 | failed | {…} [[{"role":"user","content":"Help me create a MethodEntryRequest for a specific t… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_89 | failed | {…} [[{"role":"user","content":"Help me execute the test runner `runThis` with argum… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_90 | failed | {…} [[{"role":"user","content":"Help me execute the test that checks for source path… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_91 | failed | {…} [[{"role":"user","content":"Help me execute the 'runIt' method to process comman… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_92 | failed | {…} [[{"role":"user","content":"Help me locate the absolute path to the class file f… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_93 | failed | {…} [[{"role":"user","content":"Help me execute the jar agent with the options 'trac… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_94 | failed | {…} [[{"role":"user","content":"Can I determine if the symbol 'getVersion' is readab… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_95 | failed | {…} [[{"role":"user","content":"Help me execute a generic operation on an inlined ob… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_96 | failed | {…} [[{"role":"user","content":"Help me generate a CodeTree for a call conversion in… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_97 | failed | {…} [[{"role":"user","content":"Help me generate introspection information for a cla… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_98 | failed | {…} [[{"role":"user","content":"What is the probability of a loop condition being tr… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_java_99 | failed | {…} [[{"role":"user","content":"Help me create a delegate library instance for a cus… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_2 | failed | {…} [[{"role":"user","content":"Help me extract the last transaction ID that has a s… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_5 | failed | {…} [[{"role":"user","content":"Given the manageReactState function, which encapsula… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_9 | failed | {…} [[{"role":"user","content":"Help me analyze a JSON payload `responseData` to ver… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_10 | failed | {…} [[{"role":"user","content":"Help me obtain a collection of records from the 'emp… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_11 | failed | {…} [[{"role":"user","content":"Help me sort a list of items myItemList alphabetica… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_12 | failed | {…} [[{"role":"user","content":"Help me implement a 'dataFetch' operation with an AP… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_15 | failed | {…} [[{"role":"user","content":"Help me generate a new ChartSeries with initial sett… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_16 | failed | {…} [[{"role":"user","content":"Help me compute the updated coordinates for a set of… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_18 | failed | {…} [[{"role":"user","content":"What is the final velocity for an object in free fal… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_21 | failed | {…} [[{"role":"user","content":"Help me locate a product in a list of products Produ… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_29 | failed | {…} [[{"role":"user","content":"Help me schedule a sequence of events where 'setupSt… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_32 | failed | {…} [[{"role":"user","content":"Help me process a queue of file watch objects named … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_35 | failed | {…} [[{"role":"user","content":"Help me check if two TypeScript declaration objects,… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_37 | failed | {…} [[{"role":"user","content":"Help me add statements for initializing properties n… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_38 | failed | {…} [[{"role":"user","content":"Help me determine the appropriate directory to monit… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_40 | failed | {…} [[{"role":"user","content":"Help me determine the value to be used for a propert… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_41 | failed | {…} [[{"role":"user","content":"Help me create a queue with a myWorkerFunction that … | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_43 | failed | {…} [[{"role":"user","content":"Help me execute a callback function named 'processRe… | — | — | — | ||
|
Lade Detail …
|
|||||||
| simple_javascript_48 | failed | {…} [[{"role":"user","content":"Help me update the DOM event listeners from an old v… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_0 | failed | {…} [[{"role":"user","content":"Can I find the dimensions and properties of a triang… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_1 | failed | {…} [[{"role":"user","content":"Calculate the area of a triangle, given the lengths … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_2 | failed | {…} [[{"role":"user","content":"What is the capital of Brazil?"}]] | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_3 | failed | {…} [[{"role":"user","content":"Compute the Euclidean distance between two points A(… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_4 | failed | {…} [[{"role":"user","content":"Can you calculate the displacement of a car moving a… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_5 | failed | {…} [[{"role":"user","content":"What is the wind speed and temperature in location g… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_6 | failed | {…} [[{"role":"user","content":"Calculate the capacitance of a parallel plate capaci… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_7 | failed | {…} [[{"role":"user","content":"How to assess the population growth in deer and thei… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_8 | failed | {…} [[{"role":"user","content":"Find a 3 bedroom villa for sale within $300,000 to $… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_10 | failed | {…} [[{"role":"user","content":"I need to delete some columns from my employees data… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_11 | failed | {…} [[{"role":"user","content":"Calculate the roots of a quadratic equation with coe… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_12 | failed | {…} [[{"role":"user","content":"What is the year over year growth rate for company '… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_13 | failed | {…} [[{"role":"user","content":"How much revenue would company XYZ generate if we in… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_14 | failed | {…} [[{"role":"user","content":"Calculate the depreciated value of a property costin… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_15 | failed | {…} [[{"role":"user","content":"How much is the potential of the Solar farm at locat… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_16 | failed | {…} [[{"role":"user","content":"What's the required minimum population size (Ne) for… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_17 | failed | {…} [[{"role":"user","content":"Find the conversion rate from Euro to Dollar at Janu… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_18 | failed | {…} [[{"role":"user","content":"Who were the main participants and what was the loca… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_19 | failed | {…} [[{"role":"user","content":"What are the three great Schism in Christianity hist… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_20 | failed | {…} [[{"role":"user","content":"What is the price to commission a sculpture made of … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_22 | failed | {…} [[{"role":"user","content":"What is the record for the most points scored by a s… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_23 | failed | {…} [[{"role":"user","content":"What are the current stats for basketball player LeB… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_24 | failed | {…} [[{"role":"user","content":"What is the fastest route from London to Edinburgh f… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_25 | failed | {…} [[{"role":"user","content":"What is the cheapest selling price for the game 'Ass… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_26 | failed | {…} [[{"role":"user","content":"Find out the rewards for playing Fortnite on Playsta… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_27 | failed | {…} [[{"role":"user","content":"What is the shortest path from Paris, France to Rome… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_28 | failed | {…} [[{"role":"user","content":"What's the root of quadratic equation with coefficie… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_29 | failed | {…} [[{"role":"user","content":"Find the intersection points of the functions y=3x+2… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_30 | failed | {…} [[{"role":"user","content":"What is the area of a rectangle with length 12 meter… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_31 | failed | {…} [[{"role":"user","content":"What is the area and perimeter of a rectangle with w… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_32 | failed | {…} [[{"role":"user","content":"Calculate the volume of a cone with radius 4 and hei… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_34 | failed | {…} [[{"role":"user","content":"Calculate the Least Common Multiple (LCM) of 18 and … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_36 | failed | {…} [[{"role":"user","content":"Find out how fast an object was going if it started … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_37 | failed | {…} [[{"role":"user","content":"Find the final velocity of an object thrown up at 40… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_38 | failed | {…} [[{"role":"user","content":"Find a book 'The Alchemist' in the library branches … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_39 | failed | {…} [[{"role":"user","content":"Find a ride from New York to Philadelphia with maxim… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_40 | failed | {…} [[{"role":"user","content":"Calculate the strength of magnetic field given dista… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_41 | failed | {…} [[{"role":"user","content":"Calculate the magnetic field at point P using Ampere… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_43 | failed | {…} [[{"role":"user","content":"What is the energy produced by 5 mol of glucose (C6H… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_44 | failed | {…} [[{"role":"user","content":"How much will I weigh on Mars if my weight on Earth … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_45 | failed | {…} [[{"role":"user","content":"Calculate how many years ago was the Ice age?"}]] | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_47 | failed | {…} [[{"role":"user","content":"Calculate the cosine similarity between vector A [3,… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_48 | failed | {…} [[{"role":"user","content":"Find me a pet-friendly library with facilities for d… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_51 | failed | {…} [[{"role":"user","content":"Calculate the probability of rolling a sum of 7 on a… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_53 | failed | {…} [[{"role":"user","content":"Predict the house prices for next 5 years based on i… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_54 | failed | {…} [[{"role":"user","content":"Find out the historical dividend payments of Apple I… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_57 | failed | {…} [[{"role":"user","content":"Can you please calculate the compound interest for a… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_58 | failed | {…} [[{"role":"user","content":"Search for divorce law specialists in Los Angeles"}]… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_61 | failed | {…} [[{"role":"user","content":"Find a Landscape Architect who is experienced 5 year… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_62 | failed | {…} [[{"role":"user","content":"Find me the closest nature park that allows camping … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_64 | failed | {…} [[{"role":"user","content":"Give me the UV index for Tokyo for tomorrow, June01 … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_65 | failed | {…} [[{"role":"user","content":"Find the distance between New York City and Los Ange… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_67 | failed | {…} [[{"role":"user","content":"Translate Hello, how are you? from English to French… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_68 | failed | {…} [[{"role":"user","content":"Can I find a historical fiction book at the New York… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_69 | failed | {…} [[{"role":"user","content":"Determine my personality type based on the five fact… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_70 | failed | {…} [[{"role":"user","content":"Who were the kings of France during the 18th century… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_72 | failed | {…} [[{"role":"user","content":"What was the population of California in 1970?"}]] | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_73 | failed | {…} [[{"role":"user","content":"Who was the founder of Buddhism and where was it ori… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_74 | failed | {…} [[{"role":"user","content":"Find the price of Van Gogh's painting 'Starry Night'… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_75 | failed | {…} [[{"role":"user","content":"Which paint color is currently most popular for livi… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_76 | failed | {…} [[{"role":"user","content":"I want to order a custom bronze sculpture of a horse… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_77 | failed | {…} [[{"role":"user","content":"Search for famous contemporary sculptures in New Yor… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_78 | failed | {…} [[{"role":"user","content":"Get me information about Natural History Museum in L… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_80 | failed | {…} [[{"role":"user","content":"Find a local guitar shop that also offers violin les… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_81 | failed | {…} [[{"role":"user","content":"Book a ticket for the upcoming Eminem concert in New… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_82 | failed | {…} [[{"role":"user","content":"Play a song in C Major key at tempo 120 bpm."}]] | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_83 | failed | {…} [[{"role":"user","content":"How many goals has Lionel Messi scored for Barcelona… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_85 | failed | {…} [[{"role":"user","content":"Get the soccer scores for Real Madrid games in La Li… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_86 | failed | {…} [[{"role":"user","content":"What are some recommended board games for 2 players … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_87 | failed | {…} [[{"role":"user","content":"Find the latest update or patch for the game 'Cyberp… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_88 | failed | {…} [[{"role":"user","content":"Find me the number of active players in the game 'Wo… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_90 | failed | {…} [[{"role":"user","content":"I want a seafood restaurant in Seattle that can acco… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_91 | failed | {…} [[{"role":"user","content":"Can I find a good cooking recipe for apple pie using… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_92 | failed | {…} [[{"role":"user","content":"Get me a list of available vegetarian and gluten-fre… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_93 | failed | {…} [[{"role":"user","content":"Book a deluxe room for 2 nights at the Marriott hote… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_94 | failed | {…} [[{"role":"user","content":"I want to book a suite with queen size bed for 3 nig… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_95 | failed | {…} [[{"role":"user","content":"Convert 200 euros to US dollars using current exchan… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_97 | failed | {…} [[{"role":"user","content":"What's the area of a circle with a radius of 10?"}]] | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_98 | failed | {…} [[{"role":"user","content":"Calculate the circumference of a circle with radius … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_99 | failed | {…} [[{"role":"user","content":"Calculate the derivative of the function 2x^2 at x =… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_100 | failed | {…} [[{"role":"user","content":"Find the highest common factor of 36 and 24."}]] | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_101 | failed | {…} [[{"role":"user","content":"Find the greatest common divisor (GCD) of 12 and 18"… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_109 | failed | {…} [[{"role":"user","content":"What are the names of proteins found in the plasma m… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_110 | failed | {…} [[{"role":"user","content":"Find the type of gene mutation based on SNP (Single … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_114 | failed | {…} [[{"role":"user","content":"Get me the predictions of the evolutionary rate for … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_118 | failed | {…} [[{"role":"user","content":"Find all movies starring Leonardo DiCaprio in the ye… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_119 | failed | {…} [[{"role":"user","content":"Find records in database in user table where age is … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_121 | failed | {…} [[{"role":"user","content":"Calculate the area of a triangle with base 6 and hei… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_122 | failed | {…} [[{"role":"user","content":"Run a linear regression model with predictor variabl… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_124 | failed | {…} [[{"role":"user","content":"What's the probability of drawing a king from a well… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_127 | failed | {…} [[{"role":"user","content":"What's the quarterly dividend per share of a company… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_131 | failed | {…} [[{"role":"user","content":"Find the market performance of the S&P 500 and the D… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_132 | failed | {…} [[{"role":"user","content":"Calculate the future value of an investment with an … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_134 | failed | {…} [[{"role":"user","content":"Look up details of a felony crime record for case nu… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_136 | failed | {…} [[{"role":"user","content":"Provide me the official crime rate of violent crime … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_138 | failed | {…} [[{"role":"user","content":"How to obtain the detailed case information of the R… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_139 | failed | {…} [[{"role":"user","content":"Find details of patent lawsuits involving the compan… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_141 | failed | {…} [[{"role":"user","content":"I need the details of the lawsuit case with case ID … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_142 | failed | {…} [[{"role":"user","content":"What is the humidity level in Miami, Florida in the … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_146 | failed | {…} [[{"role":"user","content":"Find restaurants near me within 10 miles that offer … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_147 | failed | {…} [[{"role":"user","content":"Get me the directions from New York to Los Angeles a… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_149 | failed | {…} [[{"role":"user","content":"Analyze the sentiment of a customer review 'I love t… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_151 | failed | {…} [[{"role":"user","content":"Find the most followed person on twitter who tweets … | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_152 | failed | {…} [[{"role":"user","content":"Provide key war events in German history from 1871 t… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_154 | failed | {…} [[{"role":"user","content":"Who was the full name of the president of the United… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_156 | failed | {…} [[{"role":"user","content":"What was Albert Einstein's contribution to science o… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_158 | failed | {…} [[{"role":"user","content":"Get the biography and main contributions of Pope Inn… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_163 | failed | {…} [[{"role":"user","content":"Get the list of top 5 popular artworks at the Metrop… | — | — | — | ||
|
Lade Detail …
|
|||||||
| multiple_164 | failed | {…} [[{"role":"user","content":"What's the retail price of a Fender American Profess… | — | — | — | ||
|
Lade Detail …
|
|||||||