Samples · bfcl.single_turn

Run #75 · Adapter v1.0.0+v4-layout-aliases+reasoning · 200/707 Samples angezeigt · Score 49.1%
‹ Zurück zum Run-Detail

KI-Auswertung

Generiert 2026-05-13 21:38 · claude-sonnet-4-6

Zusammenfassung

Das Modell erreicht eine Pass-Rate von ~49 %, was einer knappen Mehrheit an Fehlern entspricht. Die Performance liegt damit deutlich unter einem praxistauglichen Schwellenwert für Single-Turn Function Calling.

Stärken

  • Keine Errors (0 von 1390 Aufrufen), das Modell produziert stets strukturierte Ausgaben.
  • Argumente und Werte werden inhaltlich meist korrekt übergeben (richtige Zahlen, richtige Parameter-Namen).

Schwächen

  • Dominantes Muster: Das Modell ersetzt Punkte in Funktionsnamen durch Unterstriche (`math.factorial` → `math_factorial`), was systematisch zu `wrong_func_name`-Fehlern führt.
  • Vereinzelt treten Typ-Fehler auf (z. B. `interval` als flaches Array statt Array-of-Arrays), was auf mangelndes Schema-Verständnis bei verschachtelten Typen hindeutet.
  • Optionale Parameter werden häufig weggelassen, obwohl sie im Expected-Output als leerer String akzeptiert würden — dies könnte weitere stille Fehler verursachen.

Auffälligkeiten

Fast alle sichtbaren Failures basieren auf einem einzigen, konsistenten Fehler: Punkt-Notation in Funktionsnamen wird nicht verwendet. Dies deutet darauf hin, dass das Modell entweder im Training kaum Beispiele mit Namespace-Trennzeichen (`.`) gesehen hat oder der System-Prompt / Tool-Schema diese Konvention nicht ausreichend vorgibt. Der Typ-Fehler bei `interval` ist ein separates, aber ebenfalls reproduzierbares Problem bei Array-of-Floats-Schemas.

Empfehlung

Den Tool-Schema-Prompt explizit um den Hinweis ergänzen, dass Funktionsnamen exakt — inklusive Punkt-Notation — aus dem Schema übernommen werden müssen; alternativ Few-Shot-Beispiele mit Dot-Namespaces in den System-Prompt integrieren und anschließend den Benchmark erneut auswerten.

Übersicht

707 Samples
Verteilung
707
Score-Histogramm
0 – 0.1: 707 0.1 – 0.2: 0 0.2 – 0.3: 0 0.3 – 0.4: 0 0.4 – 0.5: 0 0.5 – 0.6: 0 0.6 – 0.7: 0 0.7 – 0.8: 0 0.8 – 0.9: 0 0.9 – 1: 0
0.0 ────── 1.0
Status Score-Schwelle Score < 0.5
Frage-ID Status Score Prompt Latenz Tokens/s TTFT
simple_java_25 failed 0% {…} [[{"role":"user","content":"Help me update the sort order of a recommended subje…
Lade Detail …
simple_java_26 failed 0% {…} [[{"role":"user","content":"Help me create a callable statement for executing a …
Lade Detail …
simple_java_27 failed 0% {…} [[{"role":"user","content":"What are the indices of the two numbers in the array…
Lade Detail …
simple_java_28 failed 0% {…} [[{"role":"user","content":"Help me create a scheduled executor service that per…
Lade Detail …
simple_java_29 failed 0% {…} [[{"role":"user","content":"Help me test that the 'zipkin.collector.activemq.con…
Lade Detail …
simple_java_30 failed 0% {…} [[{"role":"user","content":"Help me asynchronously store the value '42' with the…
Lade Detail …
simple_java_31 failed 0% {…} [[{"role":"user","content":"Help me obtain a reactive queue with the name 'taskQ…
Lade Detail …
simple_java_32 failed 0% {…} [[{"role":"user","content":"Help me asynchronously attempt to acquire a permit f…
Lade Detail …
simple_java_33 failed 0% {…} [[{"role":"user","content":"Help me asynchronously store the value 'John Doe' wi…
Lade Detail …
simple_java_34 failed 0% {…} [[{"role":"user","content":"Help me schedule a cleanup task to run after 5 minut…
Lade Detail …
simple_java_35 failed 0% {…} [[{"role":"user","content":"Help me perform a bitwise AND operation on Redis key…
Lade Detail …
simple_java_36 failed 0% {…} [[{"role":"user","content":"Help me decode a list of alternating key-value objec…
Lade Detail …
simple_java_37 failed 0% {…} [[{"role":"user","content":"Help me process a markup text `buildOutput` for a sp…
Lade Detail …
simple_java_38 failed 0% {…} [[{"role":"user","content":"Help me create a stubbed source map for a nested doc…
Lade Detail …
simple_java_39 failed 0% {…} [[{"role":"user","content":"Help me append the node ID to the StringBuilder `log…
Lade Detail …
simple_java_40 failed 0% {…} [[{"role":"user","content":"Help me notify the routing nodes observer that a pre…
Lade Detail …
simple_java_41 failed 0% {…} [[{"role":"user","content":"Help me configure an `ObjectParser` instance named `…
Lade Detail …
simple_java_42 failed 0% {…} [[{"role":"user","content":"Help me create a term query for a field type `userna…
Lade Detail …
simple_java_43 failed 0% {…} [[{"role":"user","content":"Help me create a spy instance for an Elasticsearch t…
Lade Detail …
simple_java_44 failed 0% {…} [[{"role":"user","content":"Help me initialize the DES cipher in Java for encryp…
Lade Detail …
simple_java_45 failed 0% {…} [[{"role":"user","content":"Help me validate that the environment variable map `…
Lade Detail …
simple_java_46 failed 0% {…} [[{"role":"user","content":"Help me validate that the caller-sensitive method ha…
Lade Detail …
simple_java_47 failed 0% {…} [[{"role":"user","content":"Help me output a formatted Java constant declaration…
Lade Detail …
simple_java_48 failed 0% {…} [[{"role":"user","content":"Help me instantiate a dummy server with SSL encrypti…
Lade Detail …
simple_java_49 failed 0% {…} [[{"role":"user","content":"Help me send HTTP response headers with a status cod…
Lade Detail …
simple_java_50 failed 0% {…} [[{"role":"user","content":"Help me simulate the deletion of documents matching …
Lade Detail …
simple_java_51 failed 0% {…} [[{"role":"user","content":"Help me execute the master operation to gather the u…
Lade Detail …
simple_java_52 failed 0% {…} [[{"role":"user","content":"In a Java XML processing context, help me obtain a l…
Lade Detail …
simple_java_53 failed 0% {…} [[{"role":"user","content":"Help me create a predicate that determines if a `Joi…
Lade Detail …
simple_java_54 failed 0% {…} [[{"role":"user","content":"Help me initiate a shard operation on a searchable s…
Lade Detail …
simple_java_55 failed 0% {…} [[{"role":"user","content":"Help me create a new searchable snapshot directory f…
Lade Detail …
simple_java_56 failed 0% {…} [[{"role":"user","content":"Help me parse the HTTP response body from an entity …
Lade Detail …
simple_java_57 failed 0% {…} [[{"role":"user","content":"Help me determine the boolean value of a configurati…
Lade Detail …
simple_java_58 failed 0% {…} [[{"role":"user","content":"Help me serialize a map of data `userProfile` with k…
Lade Detail …
simple_java_59 failed 0% {…} [[{"role":"user","content":"Help me truncate the translog for a shard located at…
Lade Detail …
simple_java_60 failed 0% {…} [[{"role":"user","content":"In Elasticsearch, help me build a nested query for a…
Lade Detail …
simple_java_61 failed 0% {…} [[{"role":"user","content":"Help me create an exponential decay scoring function…
Lade Detail …
simple_java_64 failed 0% {…} [[{"role":"user","content":"Help me create a new field type for a date script in…
Lade Detail …
simple_java_65 failed 0% {…} [[{"role":"user","content":"Help me generate the XContent with xContentBuilderIn…
Lade Detail …
simple_java_66 failed 0% {…} [[{"role":"user","content":"Help me create a child runtime field for a composite…
Lade Detail …
simple_java_67 failed 0% {…} [[{"role":"user","content":"Help me generate a DMG setup script for an applicati…
Lade Detail …
simple_java_68 failed 0% {…} [[{"role":"user","content":"Help me ensure that the application image directory …
Lade Detail …
simple_java_69 failed 0% {…} [[{"role":"user","content":"Help me ensure that the signs of the BigDecimal elem…
Lade Detail …
simple_java_70 failed 0% {…} [[{"role":"user","content":"Help me signal the end of an XML element with the qu…
Lade Detail …
simple_java_71 failed 0% {…} [[{"role":"user","content":"Help me switch the execution from coroutine with ID …
Lade Detail …
simple_java_72 failed 0% {…} [[{"role":"user","content":"Help me append a substring of characters from a char…
Lade Detail …
simple_java_73 failed 0% {…} [[{"role":"user","content":"Help me retrieve the encoding information for UTF-8 …
Lade Detail …
simple_java_74 failed 0% {…} [[{"role":"user","content":"Help me handle surrogate pairs in XML serialization,…
Lade Detail …
simple_java_75 failed 0% {…} [[{"role":"user","content":"Help me determine if the system property 'enableXmlS…
Lade Detail …
simple_java_76 failed 0% {…} [[{"role":"user","content":"Help me execute the step method to update the graphi…
Lade Detail …
simple_java_77 failed 0% {…} [[{"role":"user","content":"Help me validate that the user-provided password 'P@…
Lade Detail …
simple_java_78 failed 0% {…} [[{"role":"user","content":"Help me configure an option parser to require the 'o…
Lade Detail …
simple_java_79 failed 0% {…} [[{"role":"user","content":"Help me obtain an InputSource for the entity with a …
Lade Detail …
simple_java_80 failed 0% {…} [[{"role":"user","content":"What is the compiled pattern for a failure message i…
Lade Detail …
simple_java_81 failed 0% {…} [[{"role":"user","content":"Help me perform a garbage collection test using the …
Lade Detail …
simple_java_82 failed 0% {…} [[{"role":"user","content":"Help me execute the `runIt` method to perform a test…
Lade Detail …
simple_java_83 failed 0% {…} [[{"role":"user","content":"Help me execute a performance test in Java with 500 …
Lade Detail …
simple_java_85 failed 0% {…} [[{"role":"user","content":"Help me execute the `runIt` method to test if a clas…
Lade Detail …
simple_java_86 failed 0% {…} [[{"role":"user","content":"In a Java debugging test environment, help me execut…
Lade Detail …
simple_java_87 failed 0% {…} [[{"role":"user","content":"Help me create a VMDeathRequest with a suspend polic…
Lade Detail …
simple_java_88 failed 0% {…} [[{"role":"user","content":"Help me create a MethodEntryRequest for a specific t…
Lade Detail …
simple_java_89 failed 0% {…} [[{"role":"user","content":"Help me execute the test runner `runThis` with argum…
Lade Detail …
simple_java_90 failed 0% {…} [[{"role":"user","content":"Help me execute the test that checks for source path…
Lade Detail …
simple_java_91 failed 0% {…} [[{"role":"user","content":"Help me execute the 'runIt' method to process comman…
Lade Detail …
simple_java_92 failed 0% {…} [[{"role":"user","content":"Help me locate the absolute path to the class file f…
Lade Detail …
simple_java_93 failed 0% {…} [[{"role":"user","content":"Help me execute the jar agent with the options 'trac…
Lade Detail …
simple_java_94 failed 0% {…} [[{"role":"user","content":"Can I determine if the symbol 'getVersion' is readab…
Lade Detail …
simple_java_95 failed 0% {…} [[{"role":"user","content":"Help me execute a generic operation on an inlined ob…
Lade Detail …
simple_java_96 failed 0% {…} [[{"role":"user","content":"Help me generate a CodeTree for a call conversion in…
Lade Detail …
simple_java_97 failed 0% {…} [[{"role":"user","content":"Help me generate introspection information for a cla…
Lade Detail …
simple_java_98 failed 0% {…} [[{"role":"user","content":"What is the probability of a loop condition being tr…
Lade Detail …
simple_java_99 failed 0% {…} [[{"role":"user","content":"Help me create a delegate library instance for a cus…
Lade Detail …
simple_javascript_2 failed 0% {…} [[{"role":"user","content":"Help me extract the last transaction ID that has a s…
Lade Detail …
simple_javascript_5 failed 0% {…} [[{"role":"user","content":"Given the manageReactState function, which encapsula…
Lade Detail …
simple_javascript_9 failed 0% {…} [[{"role":"user","content":"Help me analyze a JSON payload `responseData` to ver…
Lade Detail …
simple_javascript_10 failed 0% {…} [[{"role":"user","content":"Help me obtain a collection of records from the 'emp…
Lade Detail …
simple_javascript_11 failed 0% {…} [[{"role":"user","content":"Help me sort a list of items myItemList alphabetica…
Lade Detail …
simple_javascript_12 failed 0% {…} [[{"role":"user","content":"Help me implement a 'dataFetch' operation with an AP…
Lade Detail …
simple_javascript_15 failed 0% {…} [[{"role":"user","content":"Help me generate a new ChartSeries with initial sett…
Lade Detail …
simple_javascript_16 failed 0% {…} [[{"role":"user","content":"Help me compute the updated coordinates for a set of…
Lade Detail …
simple_javascript_18 failed 0% {…} [[{"role":"user","content":"What is the final velocity for an object in free fal…
Lade Detail …
simple_javascript_21 failed 0% {…} [[{"role":"user","content":"Help me locate a product in a list of products Produ…
Lade Detail …
simple_javascript_29 failed 0% {…} [[{"role":"user","content":"Help me schedule a sequence of events where 'setupSt…
Lade Detail …
simple_javascript_32 failed 0% {…} [[{"role":"user","content":"Help me process a queue of file watch objects named …
Lade Detail …
simple_javascript_35 failed 0% {…} [[{"role":"user","content":"Help me check if two TypeScript declaration objects,…
Lade Detail …
simple_javascript_37 failed 0% {…} [[{"role":"user","content":"Help me add statements for initializing properties n…
Lade Detail …
simple_javascript_38 failed 0% {…} [[{"role":"user","content":"Help me determine the appropriate directory to monit…
Lade Detail …
simple_javascript_40 failed 0% {…} [[{"role":"user","content":"Help me determine the value to be used for a propert…
Lade Detail …
simple_javascript_41 failed 0% {…} [[{"role":"user","content":"Help me create a queue with a myWorkerFunction that …
Lade Detail …
simple_javascript_43 failed 0% {…} [[{"role":"user","content":"Help me execute a callback function named 'processRe…
Lade Detail …
simple_javascript_48 failed 0% {…} [[{"role":"user","content":"Help me update the DOM event listeners from an old v…
Lade Detail …
multiple_0 failed 0% {…} [[{"role":"user","content":"Can I find the dimensions and properties of a triang…
Lade Detail …
multiple_1 failed 0% {…} [[{"role":"user","content":"Calculate the area of a triangle, given the lengths …
Lade Detail …
multiple_2 failed 0% {…} [[{"role":"user","content":"What is the capital of Brazil?"}]]
Lade Detail …
multiple_3 failed 0% {…} [[{"role":"user","content":"Compute the Euclidean distance between two points A(…
Lade Detail …
multiple_4 failed 0% {…} [[{"role":"user","content":"Can you calculate the displacement of a car moving a…
Lade Detail …
multiple_5 failed 0% {…} [[{"role":"user","content":"What is the wind speed and temperature in location g…
Lade Detail …
multiple_6 failed 0% {…} [[{"role":"user","content":"Calculate the capacitance of a parallel plate capaci…
Lade Detail …
multiple_7 failed 0% {…} [[{"role":"user","content":"How to assess the population growth in deer and thei…
Lade Detail …
multiple_8 failed 0% {…} [[{"role":"user","content":"Find a 3 bedroom villa for sale within $300,000 to $…
Lade Detail …
multiple_10 failed 0% {…} [[{"role":"user","content":"I need to delete some columns from my employees data…
Lade Detail …
multiple_11 failed 0% {…} [[{"role":"user","content":"Calculate the roots of a quadratic equation with coe…
Lade Detail …
multiple_12 failed 0% {…} [[{"role":"user","content":"What is the year over year growth rate for company '…
Lade Detail …
multiple_13 failed 0% {…} [[{"role":"user","content":"How much revenue would company XYZ generate if we in…
Lade Detail …
multiple_14 failed 0% {…} [[{"role":"user","content":"Calculate the depreciated value of a property costin…
Lade Detail …
multiple_15 failed 0% {…} [[{"role":"user","content":"How much is the potential of the Solar farm at locat…
Lade Detail …
multiple_16 failed 0% {…} [[{"role":"user","content":"What's the required minimum population size (Ne) for…
Lade Detail …
multiple_17 failed 0% {…} [[{"role":"user","content":"Find the conversion rate from Euro to Dollar at Janu…
Lade Detail …
multiple_18 failed 0% {…} [[{"role":"user","content":"Who were the main participants and what was the loca…
Lade Detail …
multiple_19 failed 0% {…} [[{"role":"user","content":"What are the three great Schism in Christianity hist…
Lade Detail …
multiple_20 failed 0% {…} [[{"role":"user","content":"What is the price to commission a sculpture made of …
Lade Detail …
multiple_22 failed 0% {…} [[{"role":"user","content":"What is the record for the most points scored by a s…
Lade Detail …
multiple_23 failed 0% {…} [[{"role":"user","content":"What are the current stats for basketball player LeB…
Lade Detail …
multiple_24 failed 0% {…} [[{"role":"user","content":"What is the fastest route from London to Edinburgh f…
Lade Detail …
multiple_25 failed 0% {…} [[{"role":"user","content":"What is the cheapest selling price for the game 'Ass…
Lade Detail …
multiple_26 failed 0% {…} [[{"role":"user","content":"Find out the rewards for playing Fortnite on Playsta…
Lade Detail …
multiple_27 failed 0% {…} [[{"role":"user","content":"What is the shortest path from Paris, France to Rome…
Lade Detail …
multiple_28 failed 0% {…} [[{"role":"user","content":"What's the root of quadratic equation with coefficie…
Lade Detail …
multiple_29 failed 0% {…} [[{"role":"user","content":"Find the intersection points of the functions y=3x+2…
Lade Detail …
multiple_30 failed 0% {…} [[{"role":"user","content":"What is the area of a rectangle with length 12 meter…
Lade Detail …
multiple_31 failed 0% {…} [[{"role":"user","content":"What is the area and perimeter of a rectangle with w…
Lade Detail …
multiple_32 failed 0% {…} [[{"role":"user","content":"Calculate the volume of a cone with radius 4 and hei…
Lade Detail …
multiple_34 failed 0% {…} [[{"role":"user","content":"Calculate the Least Common Multiple (LCM) of 18 and …
Lade Detail …
multiple_36 failed 0% {…} [[{"role":"user","content":"Find out how fast an object was going if it started …
Lade Detail …
multiple_37 failed 0% {…} [[{"role":"user","content":"Find the final velocity of an object thrown up at 40…
Lade Detail …
multiple_38 failed 0% {…} [[{"role":"user","content":"Find a book 'The Alchemist' in the library branches …
Lade Detail …
multiple_39 failed 0% {…} [[{"role":"user","content":"Find a ride from New York to Philadelphia with maxim…
Lade Detail …
multiple_40 failed 0% {…} [[{"role":"user","content":"Calculate the strength of magnetic field given dista…
Lade Detail …
multiple_41 failed 0% {…} [[{"role":"user","content":"Calculate the magnetic field at point P using Ampere…
Lade Detail …
multiple_43 failed 0% {…} [[{"role":"user","content":"What is the energy produced by 5 mol of glucose (C6H…
Lade Detail …
multiple_44 failed 0% {…} [[{"role":"user","content":"How much will I weigh on Mars if my weight on Earth …
Lade Detail …
multiple_45 failed 0% {…} [[{"role":"user","content":"Calculate how many years ago was the Ice age?"}]]
Lade Detail …
multiple_47 failed 0% {…} [[{"role":"user","content":"Calculate the cosine similarity between vector A [3,…
Lade Detail …
multiple_48 failed 0% {…} [[{"role":"user","content":"Find me a pet-friendly library with facilities for d…
Lade Detail …
multiple_51 failed 0% {…} [[{"role":"user","content":"Calculate the probability of rolling a sum of 7 on a…
Lade Detail …
multiple_53 failed 0% {…} [[{"role":"user","content":"Predict the house prices for next 5 years based on i…
Lade Detail …
multiple_54 failed 0% {…} [[{"role":"user","content":"Find out the historical dividend payments of Apple I…
Lade Detail …
multiple_57 failed 0% {…} [[{"role":"user","content":"Can you please calculate the compound interest for a…
Lade Detail …
multiple_58 failed 0% {…} [[{"role":"user","content":"Search for divorce law specialists in Los Angeles"}]…
Lade Detail …
multiple_61 failed 0% {…} [[{"role":"user","content":"Find a Landscape Architect who is experienced 5 year…
Lade Detail …
multiple_62 failed 0% {…} [[{"role":"user","content":"Find me the closest nature park that allows camping …
Lade Detail …
multiple_64 failed 0% {…} [[{"role":"user","content":"Give me the UV index for Tokyo for tomorrow, June01 …
Lade Detail …
multiple_65 failed 0% {…} [[{"role":"user","content":"Find the distance between New York City and Los Ange…
Lade Detail …
multiple_67 failed 0% {…} [[{"role":"user","content":"Translate Hello, how are you? from English to French…
Lade Detail …
multiple_68 failed 0% {…} [[{"role":"user","content":"Can I find a historical fiction book at the New York…
Lade Detail …
multiple_69 failed 0% {…} [[{"role":"user","content":"Determine my personality type based on the five fact…
Lade Detail …
multiple_70 failed 0% {…} [[{"role":"user","content":"Who were the kings of France during the 18th century…
Lade Detail …
multiple_72 failed 0% {…} [[{"role":"user","content":"What was the population of California in 1970?"}]]
Lade Detail …
multiple_73 failed 0% {…} [[{"role":"user","content":"Who was the founder of Buddhism and where was it ori…
Lade Detail …
multiple_74 failed 0% {…} [[{"role":"user","content":"Find the price of Van Gogh's painting 'Starry Night'…
Lade Detail …
multiple_75 failed 0% {…} [[{"role":"user","content":"Which paint color is currently most popular for livi…
Lade Detail …
multiple_76 failed 0% {…} [[{"role":"user","content":"I want to order a custom bronze sculpture of a horse…
Lade Detail …
multiple_77 failed 0% {…} [[{"role":"user","content":"Search for famous contemporary sculptures in New Yor…
Lade Detail …
multiple_78 failed 0% {…} [[{"role":"user","content":"Get me information about Natural History Museum in L…
Lade Detail …
multiple_80 failed 0% {…} [[{"role":"user","content":"Find a local guitar shop that also offers violin les…
Lade Detail …
multiple_81 failed 0% {…} [[{"role":"user","content":"Book a ticket for the upcoming Eminem concert in New…
Lade Detail …
multiple_82 failed 0% {…} [[{"role":"user","content":"Play a song in C Major key at tempo 120 bpm."}]]
Lade Detail …
multiple_83 failed 0% {…} [[{"role":"user","content":"How many goals has Lionel Messi scored for Barcelona…
Lade Detail …
multiple_85 failed 0% {…} [[{"role":"user","content":"Get the soccer scores for Real Madrid games in La Li…
Lade Detail …
multiple_86 failed 0% {…} [[{"role":"user","content":"What are some recommended board games for 2 players …
Lade Detail …
multiple_87 failed 0% {…} [[{"role":"user","content":"Find the latest update or patch for the game 'Cyberp…
Lade Detail …
multiple_88 failed 0% {…} [[{"role":"user","content":"Find me the number of active players in the game 'Wo…
Lade Detail …
multiple_90 failed 0% {…} [[{"role":"user","content":"I want a seafood restaurant in Seattle that can acco…
Lade Detail …
multiple_91 failed 0% {…} [[{"role":"user","content":"Can I find a good cooking recipe for apple pie using…
Lade Detail …
multiple_92 failed 0% {…} [[{"role":"user","content":"Get me a list of available vegetarian and gluten-fre…
Lade Detail …
multiple_93 failed 0% {…} [[{"role":"user","content":"Book a deluxe room for 2 nights at the Marriott hote…
Lade Detail …
multiple_94 failed 0% {…} [[{"role":"user","content":"I want to book a suite with queen size bed for 3 nig…
Lade Detail …
multiple_95 failed 0% {…} [[{"role":"user","content":"Convert 200 euros to US dollars using current exchan…
Lade Detail …
multiple_97 failed 0% {…} [[{"role":"user","content":"What's the area of a circle with a radius of 10?"}]]
Lade Detail …
multiple_98 failed 0% {…} [[{"role":"user","content":"Calculate the circumference of a circle with radius …
Lade Detail …
multiple_99 failed 0% {…} [[{"role":"user","content":"Calculate the derivative of the function 2x^2 at x =…
Lade Detail …
multiple_100 failed 0% {…} [[{"role":"user","content":"Find the highest common factor of 36 and 24."}]]
Lade Detail …
multiple_101 failed 0% {…} [[{"role":"user","content":"Find the greatest common divisor (GCD) of 12 and 18"…
Lade Detail …
multiple_109 failed 0% {…} [[{"role":"user","content":"What are the names of proteins found in the plasma m…
Lade Detail …
multiple_110 failed 0% {…} [[{"role":"user","content":"Find the type of gene mutation based on SNP (Single …
Lade Detail …
multiple_114 failed 0% {…} [[{"role":"user","content":"Get me the predictions of the evolutionary rate for …
Lade Detail …
multiple_118 failed 0% {…} [[{"role":"user","content":"Find all movies starring Leonardo DiCaprio in the ye…
Lade Detail …
multiple_119 failed 0% {…} [[{"role":"user","content":"Find records in database in user table where age is …
Lade Detail …
multiple_121 failed 0% {…} [[{"role":"user","content":"Calculate the area of a triangle with base 6 and hei…
Lade Detail …
multiple_122 failed 0% {…} [[{"role":"user","content":"Run a linear regression model with predictor variabl…
Lade Detail …
multiple_124 failed 0% {…} [[{"role":"user","content":"What's the probability of drawing a king from a well…
Lade Detail …
multiple_127 failed 0% {…} [[{"role":"user","content":"What's the quarterly dividend per share of a company…
Lade Detail …
multiple_131 failed 0% {…} [[{"role":"user","content":"Find the market performance of the S&P 500 and the D…
Lade Detail …
multiple_132 failed 0% {…} [[{"role":"user","content":"Calculate the future value of an investment with an …
Lade Detail …
multiple_134 failed 0% {…} [[{"role":"user","content":"Look up details of a felony crime record for case nu…
Lade Detail …
multiple_136 failed 0% {…} [[{"role":"user","content":"Provide me the official crime rate of violent crime …
Lade Detail …
multiple_138 failed 0% {…} [[{"role":"user","content":"How to obtain the detailed case information of the R…
Lade Detail …
multiple_139 failed 0% {…} [[{"role":"user","content":"Find details of patent lawsuits involving the compan…
Lade Detail …
multiple_141 failed 0% {…} [[{"role":"user","content":"I need the details of the lawsuit case with case ID …
Lade Detail …
multiple_142 failed 0% {…} [[{"role":"user","content":"What is the humidity level in Miami, Florida in the …
Lade Detail …
multiple_146 failed 0% {…} [[{"role":"user","content":"Find restaurants near me within 10 miles that offer …
Lade Detail …
multiple_147 failed 0% {…} [[{"role":"user","content":"Get me the directions from New York to Los Angeles a…
Lade Detail …
multiple_149 failed 0% {…} [[{"role":"user","content":"Analyze the sentiment of a customer review 'I love t…
Lade Detail …
multiple_151 failed 0% {…} [[{"role":"user","content":"Find the most followed person on twitter who tweets …
Lade Detail …
multiple_152 failed 0% {…} [[{"role":"user","content":"Provide key war events in German history from 1871 t…
Lade Detail …
multiple_154 failed 0% {…} [[{"role":"user","content":"Who was the full name of the president of the United…
Lade Detail …
multiple_156 failed 0% {…} [[{"role":"user","content":"What was Albert Einstein's contribution to science o…
Lade Detail …
multiple_158 failed 0% {…} [[{"role":"user","content":"Get the biography and main contributions of Pope Inn…
Lade Detail …
multiple_163 failed 0% {…} [[{"role":"user","content":"Get the list of top 5 popular artworks at the Metrop…
Lade Detail …
multiple_164 failed 0% {…} [[{"role":"user","content":"What's the retail price of a Fender American Profess…
Lade Detail …
200 von 707 Samples · Limit 200 ‹ Vorherige Nächste ›