
AI Chatbots for Performance Testing Analysis


Introduction 

As performance testing becomes more data-intensive, analysing the results quickly and accurately is crucial for maintaining optimal website speed and user experience. In this blog, I explore how AI chatbots can support this process by comparing several popular options for their ability to analyse a real set of performance testing results. From ease of use and response times to the depth of insights provided, this comparison reveals which AI tools stand out as reliable partners in performance analysis—and which fall short.

Let’s dive into the findings and see which AI chatbot comes out on top. 

Executive Summary 

This article explores the capabilities of various AI chatbots in analysing performance testing results. By comparing tools like ChatGPT, DeepAI, Gemini, Meta AI, and Claude AI, the analysis highlights their strengths, weaknesses, and overall effectiveness in providing insights and recommendations. 

  • ChatGPT: Developed by OpenAI, it generates human-like text responses and is useful for conversation, content creation, and data analysis. 
  • DeepAI: An AI platform offering tools for generating text, images, and other content, emphasizing accessibility and user-friendliness. 
  • Gemini: Developed by Google DeepMind, it is designed to understand complex queries and provide insightful responses. 
  • Meta AI: A conversational assistant developed by Meta, providing real-time information and answering questions with advanced natural language understanding. 
  • Claude AI: Developed by Anthropic, it focuses on delivering safe, reliable, and conversational support optimized for in-depth reasoning and complex task assistance. 

Summary of Conclusions 

All AI tools provided valuable insights and recommendations, but each had its own set of challenges. ChatGPT and Claude AI stood out for their detailed analysis and useful suggestions, while DeepAI and Meta AI offered solid performance with some limitations. Gemini had some issues with date formats and vague recommendations but still provided useful observations. Overall, these AI tools can serve as reliable partners in performance analysis, especially for small data sets. 

  • ChatGPT Analysis: ChatGPT initially required bribery to provide results but eventually highlighted trends, concerns, and potential improvements in the performance data, providing a decent high-level analysis. 
  • DeepAI Analysis: DeepAI allowed data to be pasted directly and provided detailed observations, trends, areas of concern, and recommendations, impressing the author with its insights. 
  • Gemini Analysis: Gemini provided high-level and specific observations but made an error in date format assumption and incorrectly reported error rates, leading to some concerns about its accuracy. 
  • Meta AI Analysis: Meta AI provided accurate overall trends, areas of concern, and potential improvements, even asking the author questions to refine the analysis further. 
  • Claude AI Analysis: Claude AI offered correct high-level trends, areas of concern, and potential improvements, delivering a concise and accurate analysis. 

 

The exercise in more detail 

Methodology 

I started off by creating a very simple JMeter test running against the Spike website at www.wearespike.co.uk. Nothing complicated, just a few users each hitting the various pages of the site. Once I had my base results, I manually created a few more sets of results and stored them all in an Excel spreadsheet (see below). Each set of results showed progressively slower response times, apart from the final set, which I adjusted to give improved response times. 
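To illustrate the shape of that data (not the real numbers), here is a minimal Python sketch of deriving extra results sets from a baseline. Everything in it is an assumption made for illustration: the page names other than "We Are Spike Website", the response times, the number of sets, and the scaling factors. The sets in this exercise were created by hand, not by a script.

```python
# Hypothetical baseline average response times in milliseconds.
# Only "We Are Spike Website" is a label from the real test; the other
# page names and all numbers are invented for illustration.
BASELINE_MS = {
    "We Are Spike Website": 1200,
    "Services": 400,
    "Contact": 350,
}

# One factor per results set: the baseline, three progressively slower
# sets, then a final set adjusted to be faster again.
FACTORS = [1.00, 1.10, 1.25, 1.40, 0.90]


def derive_result_sets(baseline, factors):
    """Scale every page's baseline time by each factor in turn."""
    return [
        {page: round(ms * factor) for page, ms in baseline.items()}
        for factor in factors
    ]


result_sets = derive_result_sets(BASELINE_MS, FACTORS)
for number, result_set in enumerate(result_sets, start=1):
    print(f"Set {number}: {result_set}")
```

Knowing exactly how the sets were built is what makes the exercise useful: any trend a chatbot reports can be compared against a pattern you already know is there.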

The results set: 

AI Chatbots Evaluated 

At this point, I started exploring the various AI chatbots available. I only wanted to use free ones for this exercise. It’s important that a tool is confident enough in its own ability to give you a decent free offering before it starts to charge. I asked each tool the exact same question and, where possible, didn’t provide any further prompts. 

The prompt I gave each tool was as follows: 

"I'm going to upload you a file with several sets of performance test results over a number of days. These results are from a test I've been running against our company website. I'd like you to analyse the results and provide me with any trends you see, areas of concern, areas to improve etc. Are you comfortable with that? I only want you to consider columns A to N. Can you do this?" 

Almost all the tools I asked were more than happy to help in the first instance but as soon as I uploaded the file or copied and pasted the data, they told me I’d need to upgrade before they would “help me.” Those tools were immediately discounted. Finally, I was left with the following set of willing helpers: 

  • ChatGPT 
  • DeepAI 
  • Gemini 
  • Meta AI 
  • Claude AI 

Let’s take a quick look at how each one did. 

 

ChatGPT 

ChatGPT is an advanced AI language model developed by OpenAI that generates human-like text responses, making it useful for a variety of applications such as conversation, content creation, and data analysis. Its capabilities include understanding context, providing information, and engaging in interactive dialogue. 

Key Findings: 

  • Highlighted the gradual increase in response times over time and the improvement with the final set of metrics. 
  • Raised concerns about the "We Are Spike Website" page consistently recording much higher response times compared to other pages. 
  • Suggested investigating the "We Are Spike Website" page specifically for performance lags. 
  • Provided a decent summary of issues and potential solutions. 

Challenges: 

  • Initially, it was very helpful but then told me I’d gone over my daily free data allowance when I attached the file. 
  • After some persistence and bribery, it finally provided some results. 

 

DeepAI 

DeepAI is an AI platform that offers various tools for generating text, images, and other content, emphasizing accessibility and user-friendliness. Its API allows developers to integrate AI capabilities into applications, facilitating creative and analytical tasks. 

Key Findings: 

  • Provided lower-level observations about the increase in response times for each page. 
  • Highlighted trends such as growing response times and variance in performance. 
  • Raised concerns about the "We Are Spike Website" page and general performance degradation. 
  • Suggested recommendations like investigating performance issues, optimizing content delivery, and monitoring metrics continuously. 

Challenges: 

  • No option to upload a file, but it allowed me to copy and paste the results set straight into the message box. 
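When a tool only accepts pasted text, the spreadsheet has to be flattened first. The following is a rough Python sketch of one way to do that, assuming the results have been exported to CSV. It keeps only columns A to N, as the prompt asked, and joins the cells with tabs. The function name `to_pasteable_text` and the sample data are hypothetical; in this exercise the data was simply copied and pasted by hand.

```python
import csv
import io

LAST_COLUMN = 14  # Spreadsheet column N is the 14th column (A=1 ... N=14).


def to_pasteable_text(csv_text, last_column=LAST_COLUMN):
    """Keep only the first `last_column` columns and join cells with tabs."""
    reader = csv.reader(io.StringIO(csv_text))
    lines = ["\t".join(row[:last_column]) for row in reader if row]
    return "\n".join(lines)


# Tiny made-up export: 16 columns, so the last two should be dropped.
header = ",".join(f"col{i}" for i in range(1, 17))
values = ",".join(str(i) for i in range(1, 17))
sample_csv = header + "\n" + values + "\n"

print(to_pasteable_text(sample_csv))
```

Trimming the columns before pasting also means the tool never sees data you did not want it to consider.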

 

Gemini 

Gemini, developed by Google DeepMind, is an advanced AI chatbot designed to understand complex queries and provide insightful responses, with a focus on integrating deep learning for enhanced contextual comprehension and user interaction. 

Key Findings: 

  • Provided high-level observations about response time, throughput, and error rates. 
  • Highlighted specific observations for different dates. 
  • Raised concerns about increasing response times, error rate spikes, and throughput fluctuation. 
  • Suggested recommendations like investigating specific dates, monitoring response times, and addressing error rate spikes. 

Challenges: 

  • Assumed dates were in American format (MM/DD/YYYY), leading to some incorrect observations. 
  • Some recommendations were vague and obvious. 
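The date problem is easy to reproduce. The short Python sketch below uses a made-up date, and assumes the spreadsheet used day-first dates, to show how the same string lands on two different days depending on the convention. Converting dates to ISO 8601 before sharing the data is one possible way to remove the ambiguity; `to_iso` is a hypothetical helper, not something used in this exercise.

```python
from datetime import datetime

raw = "03/04/2024"  # made-up date, ambiguous between the two conventions

as_uk = datetime.strptime(raw, "%d/%m/%Y")  # day first: 3 April 2024
as_us = datetime.strptime(raw, "%m/%d/%Y")  # month first: 4 March 2024

print(as_uk.date().isoformat())  # 2024-04-03
print(as_us.date().isoformat())  # 2024-03-04


def to_iso(raw_date, source_format="%d/%m/%Y"):
    """Rewrite a date as unambiguous ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(raw_date, source_format).date().isoformat()


print(to_iso("13/04/2024"))  # 2024-04-13
```

Only dates with a day of 12 or lower are ambiguous, so a tool that guesses the wrong convention can misread some rows without any obvious sign that something is off.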

 

Meta AI 

Then came Meta AI, and I was expecting big things... 

Meta AI is a conversational assistant developed by Meta, designed to provide real-time information and answer questions, with advanced natural language understanding and integration across Meta's platforms. 

Key Findings: 

  • Highlighted overall trends such as variability in performance, increased traffic, and consistent error rates. 
  • Raised concerns about the "We Are Spike Website" page, response time spikes, and throughput fluctuations. 
  • Suggested areas for improvement like optimizing resource-intensive pages, implementing caching mechanisms, and reviewing server resource allocation. 

Challenges: 

  • Meta actually flipped the table and asked me some questions. Not sure I was ready for that. They were all perfectly valid though. Meta told me that if I answered these questions, it would give me more detailed information about the results. Rules are rules though, so we ended our discussion there. 

 

Claude AI 

Claude AI, developed by Anthropic, is an AI assistant focused on delivering safe, reliable, and conversational support, optimized for in-depth reasoning and complex task assistance. 

Key Findings: 

  • Provided trend analysis showing consistent increases in performance metrics. 
  • Raised concerns about the "We Are Spike Website" page and maximum response times for certain pages. 
  • Suggested potential improvements like investigating root causes, optimizing server infrastructure, and implementing caching strategies. 

Challenges: 

  • Short and sweet analysis but everything was correct and there were good observations. 

 

Comparison Table 

Conclusion 

All in all, I was very impressed with what I saw from each of the AI tools. It’s certainly not the most comprehensive performance test analysis I’ve ever seen, but it was all generated in a matter of seconds without any fuss... apart from ChatGPT, which was obviously having a bad morning. Granted, this was only a small data set, but I can’t see why it wouldn’t work on much larger results files, and it certainly gives you a good starting point. The errors from Gemini make me a tad nervous, but you should always check the outputs before acting on them. 
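Checking the headline claims does not need much. As a rough sketch, a few lines of Python can confirm whether response times really rose from run to run and which page was slowest in each run. The numbers and the page names other than "We Are Spike Website" are made up, and the helper functions are hypothetical, so treat this as one possible sanity check and not the method used here.

```python
from statistics import mean

# Made-up average response times (ms) per page for each results set, in
# run order. Only the "We Are Spike Website" label comes from the test.
runs = [
    {"We Are Spike Website": 1200, "Services": 400, "Contact": 350},
    {"We Are Spike Website": 1320, "Services": 440, "Contact": 385},
    {"We Are Spike Website": 1500, "Services": 500, "Contact": 440},
    {"We Are Spike Website": 1080, "Services": 360, "Contact": 315},
]


def percent_changes(result_sets):
    """Percentage change in mean response time from each run to the next."""
    means = [mean(run.values()) for run in result_sets]
    return [
        round((after - before) / before * 100, 1)
        for before, after in zip(means, means[1:])
    ]


def slowest_page(result_set):
    """Page with the highest average response time in one results set."""
    return max(result_set, key=result_set.get)


print(percent_changes(runs))
print([slowest_page(run) for run in runs])
```

Positive values in the first list mean a run was slower than the one before it and negative values mean it was faster, which is enough to confirm or refute a claimed trend at a glance.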

I’d be interested to hear your thoughts and experiences. Have I missed any good tools that you can recommend? I’d certainly like to see how much more the paid options could give you and if they are worth the subscription costs. 
