Test AI agents’ API interactions using WireMock Cloud and Custom GPTs

Tom Akehurst
CTO and Co-founder

LLMs are cool, and what’s even cooler is when they can interact with other tools and knowledge bases. Many teams are currently experimenting with or deploying these types of ‘agent’ applications - but if you’ve tried, you probably know that getting an LLM to ‘master’ an API can get tricky.

In this article, we’ll explain how you can use API mocking to build and test AI agents that interact with external services. We’ll then explain how you can use a WireMock mock API and OpenAI’s custom GPT interface to develop and test an agent that can interact with a store catalog API. 

Watch the video or read on for more context and step-by-step instructions:

AI agents depend on APIs

AI agents are tools that can interact with other software through APIs, or with knowledge bases via retrieval-augmented generation (RAG), and use these interactions to complete more complex tasks based on the inputs they receive. In some cases the interaction is bi-directional, with the LLM-powered application also composing and sending requests to perform actions in the external services.

A practical example that you can already see ‘in the wild’ is the use of AI agents for tier-1 customer support. When faced with a user request, these bots can pull information from company knowledge bases to answer the query themselves, escalate issues to human support by creating tickets in a helpdesk system, or perform basic actions on the user’s account (e.g. updating the user’s address).

All of these actions require an interaction with external systems via API calls. As such, making sure your agent knows how to use the APIs it has access to (often described as ‘function calling’ by LLM providers) is a crucial part of developing these types of applications. 

Using mock APIs to eliminate bottlenecks and let you focus on the AI

When developing an AI agent, you'll need to run a lot of tests to ensure it can interact correctly with your APIs. This means verifying that it can formulate proper requests, parse responses accurately, and - crucially - not hallucinate API specifications that don't exist. Anyone who’s worked with LLMs knows that there is a lot of work involved in this stage, accounting for many edge cases and unpredictable behaviors that every LLM is prone to.

For various reasons, you might not always want (or be able) to use your live API for this testing. Maybe the API is still under development, or perhaps you want to isolate specific aspects of the AI's behavior without the variables introduced by a live system. This is where API mocking comes in.

Building mock APIs with WireMock Cloud allows you to simulate the behavior of your actual API without the overhead of a full implementation. With a mock API, you can:

  1. Test edge cases: Throw weird responses at your AI and see how it handles them, without risking your production data.
  2. Simulate errors: See how your AI agent deals with timeouts, rate limiting, or other API hiccups.
  3. Iterate quickly: Change API responses on the fly to test different scenarios without waiting for backend changes.
  4. Develop in parallel: Your AI team can work on integration while the API team is still hammering out the details.
  5. Control your test environment: Ensure consistent responses for reproducible testing.

The main advantage here is that you can focus your efforts on testing the AI-specific parts of what you’re building - such as the model’s ability to ‘understand’ a prompt and decide to make an API call - without getting confused or bogged down by issues related to the API itself.
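To make the ‘simulate errors’ point more concrete, here is a minimal sketch of how error scenarios could be described as stub mappings. It uses the JSON stub format from open-source WireMock (request, response, status, jsonBody, fixedDelayMilliseconds); the /animals path, the error bodies and the error_stub helper are our own assumptions for illustration, not part of the zoo store API shown in this article.

```python
import json


def error_stub(url_path, status, body, delay_ms=None):
    """Build a WireMock-style stub mapping that answers GET url_path with an error."""
    response = {
        "status": status,
        "headers": {"Content-Type": "application/json"},
        "jsonBody": body,
    }
    if delay_ms is not None:
        # A fixed delay lets us see how the agent copes with a slow upstream.
        response["fixedDelayMilliseconds"] = delay_ms
    return {
        "request": {"method": "GET", "urlPath": url_path},
        "response": response,
    }


# Hypothetical endpoint and payloads, chosen only for this example.
rate_limited = error_stub("/animals", 429, {"error": "rate limit exceeded"})
rate_limited["response"]["headers"]["Retry-After"] = "30"

slow_failure = error_stub(
    "/animals", 503, {"error": "service unavailable"}, delay_ms=5000
)

print(json.dumps(rate_limited, indent=2))
print(json.dumps(slow_failure, indent=2))
```

Each printed mapping is a JSON document that could be adapted into a stub for the mock API, so the same prompt can be replayed against a healthy, a rate-limited and a slow backend.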

Example implementation with OpenAI’s Custom GPTs

Let’s now look at a practical example of how we can use API mocking in our AI agent development. We’ll be using WireMock Cloud to create the mocks, and OpenAI’s custom GPTs to simulate our AI agent.

Custom GPTs allow users to create tailored versions of ChatGPT for specific purposes. These custom GPTs can combine instructions, extra knowledge, and various skills. While the main idea behind these custom GPTs is to deploy them within the ChatGPT interface, they also provide a useful no-code interface for testing out agent applications more broadly.

For the purposes of this example, we’ve built a mock API for an imaginary zoo animal store. The API lets you browse through the animal catalog, retrieve information about a specific animal, or make a purchase.

Once we’ve created this API in WireMock Cloud, an OpenAPI specification for it is automatically available there as well:

We want to build a custom GPT that can interact with this API in order to allow the users to perform various actions at our online animal store. 

Switching over to ChatGPT, we click over to “My GPTs” and create a new custom GPT. Note that this requires a paid ChatGPT subscription.

We want to “teach” our GPT to interact with the catalog API. To do this, we will go to “Add Actions”. Here, we can simply paste our OpenAPI schema:

(Note that OpenAI expects the spec to be OpenAPI 3.1.0, while WireMock Cloud generates a 3.1.1 spec. Simply change the version in the spec’s top-level “openapi” field to 3.1.0 after pasting it in and you should be fine.)
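If you expect to re-export the spec often, the version change can be scripted rather than edited by hand. The sketch below assumes the spec has been exported as JSON (a YAML export would need a YAML parser instead); the pin_openapi_version helper and the sample spec are hypothetical and only illustrate the idea.

```python
import json


def pin_openapi_version(spec_text, target="3.1.0"):
    """Return the spec as JSON text with its top-level 'openapi' field set to target."""
    spec = json.loads(spec_text)
    current = str(spec.get("openapi", ""))
    # Only rewrite patch-level differences; anything else needs a human look.
    if not current.startswith("3.1."):
        raise ValueError(f"unexpected OpenAPI version: {current!r}")
    spec["openapi"] = target
    return json.dumps(spec, indent=2)


# Hypothetical, heavily trimmed spec standing in for the exported one.
exported = json.dumps(
    {
        "openapi": "3.1.1",
        "info": {"title": "Zoo Catalog", "version": "1.0.0"},
        "paths": {},
    }
)

patched = pin_openapi_version(exported)
print(patched)
```

The output can then be pasted into the custom GPT’s schema field. Refusing anything outside the 3.1.x line is a deliberate choice here: a different major or minor version probably means more than the version string needs attention.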

From there, we go back and finish configuring our Zoo Catalog GPT, including adding a conversation starter (essentially a default prompt that the user sees when opening the GPT):

We can then test that our agent is working as expected in Preview mode:

Now that we’ve made a few requests, we want to make sure that the agent is using the API correctly. We can do this easily within WireMock Cloud, which shows us a detailed history of the requests made to our mock API:

Using our mock API, we can now continue developing our custom GPT or AI agent. We can update our API specification in WireMock Cloud if we need to do so, and easily replace the OpenAPI spec that the custom GPT is referring to. Importantly, we can make sure our AI agent is working as expected, even when our catalog API is still under development.
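The request-history check described above can also be automated, for example to flag calls to endpoints the agent has ‘hallucinated’. The following is a sketch under stated assumptions: it expects the request log in the shape used by open-source WireMock’s request journal (a “requests” list whose entries contain a “request” object with “method” and “url”), which may differ from what WireMock Cloud exports, and the paths and sample data are hypothetical.

```python
import re
from urllib.parse import urlsplit

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def path_to_regex(template):
    """Turn an OpenAPI path template such as /animals/{id} into a compiled regex."""
    literal_parts = re.split(r"\{[^/{}]+\}", template)
    pattern = "[^/]+".join(re.escape(part) for part in literal_parts)
    return re.compile("^" + pattern + "$")


def unknown_calls(spec, journal):
    """Return (method, path) pairs from the journal that match no operation in the spec."""
    allowed = []
    for template, path_item in spec.get("paths", {}).items():
        for key in path_item:
            # Path items can also hold keys like 'parameters'; keep HTTP methods only.
            if key.lower() in HTTP_METHODS:
                allowed.append((key.upper(), path_to_regex(template)))

    misses = []
    for entry in journal.get("requests", []):
        request = entry["request"]
        method = request["method"].upper()
        path = urlsplit(request["url"]).path  # drop any query string
        if not any(m == method and rx.match(path) for m, rx in allowed):
            misses.append((method, path))
    return misses


# Hypothetical spec paths and recorded requests, for illustration only.
spec = {
    "paths": {
        "/animals": {"get": {}},
        "/animals/{id}": {"get": {}},
        "/purchases": {"post": {}},
    }
}
journal = {
    "requests": [
        {"request": {"method": "GET", "url": "/animals?species=lion"}},
        {"request": {"method": "GET", "url": "/animals/42"}},
        {"request": {"method": "DELETE", "url": "/animals/42"}},
        {"request": {"method": "GET", "url": "/inventory"}},
    ]
}

print(unknown_calls(spec, journal))
```

An empty result means every recorded call matched a documented operation. Anything else points to a request the spec does not describe and is worth a closer look in the request history.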

Try it yourself with real API templates

Ready to get hands-on?

  • You can start instantly with hundreds of ready-made API mocks for many popular services such as Salesforce and Stripe. Use these templates to create a WireMock Cloud mock, then follow the steps outlined above to start testing how your custom GPT will interact with external APIs. 
  • Not using WireMock Cloud? Get a free-forever account.
  • Need help building out a production proof of concept? Get in touch.
