February 12, 2026

AI: Good as a Research Assistant. Bad for Creating GitHub Action Workflows

I've been organizing head-to-head match-ups of various flavors of GitHub Copilot and Cursor, to keep from being bored while sending resumes out into the void.

In the past I have used AI for code completion at MassMutual, code review at SELF, and vibe-coding apps for fun during Christmas break.

The most powerful use I've found with AI? Research Assistant:
  • "Here is a list of toolsets. Describe them. Be brief". 
  • "What are the release dates of these toolsets?" 
  • "Use corporate tech blogs as primary sources".
  • "Cite your sources. Provide links". 

These prompts provide excellent documentation that allows me to deepen my own toolset education. 

When I ask AI to build this study guide, AI is teaching me how I can teach myself. 
When I ask AI to do something for me, the only thing it's teaching me is how to craft a prompt.

See, when AI does eventually screw up, the only thing I can do to "fix" it is try to craft another prompt, hoping that maybe this time it can see and correct its mistake. And if it can't... ask it again? And again? Tenth time is the charm?

Where I find AI fouling up time and time again? Creating CI/CD pipelines using GitHub Actions Workflows running automated tests in Android emulators. 
  • It doesn't realize that GitHub Actions have had new versions released in the past year.
  • It declares you should be using a Mac runner since it is more stable. No, a Linux runner is better! No, a Mac runner! It flip-flops on them between code reviews.
  • If you ask it to shift to a build / test / report stage, it always forgets to upload the artifacts in one stage so they can be downloaded in another.
  • It suggests Intel-based Android emulators to be used on macOS runners, only recognizing how it fouled up if you copy-and-paste the error to it.
  • It will erase and change the comments you placed in the workflow if you do not watch out. 
  • It keeps wanting to go out of the box you placed it in, and rearrange disorganized code that runs elsewhere in the project, and change it to pretty code that has hidden errors in it. 
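For reference, here is a minimal sketch of the artifact hand-off that tends to get forgotten: one job uploads what it built, and a later job downloads it. The job names, artifact name, paths, and build command are hypothetical placeholders, not taken from any workflow mentioned here, and the action versions shown are an assumption, so check each action's releases for the current major version.

```yaml
# Hypothetical sketch: job names, artifact name, paths, and commands
# are illustrative placeholders.
name: Build and Test (sketch)

on: workflow_dispatch

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build the app (placeholder command)
        run: ./gradlew assembleDebug
      - name: Upload the APK so a later job can use it
        uses: actions/upload-artifact@v4
        with:
          name: app-debug-apk
          path: app/build/outputs/apk/debug/

  test:
    needs: build            # run only after the build job succeeds
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Download the APK uploaded by the build job
        uses: actions/download-artifact@v4
        with:
          name: app-debug-apk   # must match the upload name
          path: app/build/outputs/apk/debug/
      - name: Run tests (placeholder command)
        run: echo "run emulator tests here"
```

Each job starts on a fresh runner, so nothing built in one job exists in the next unless it is uploaded and downloaded like this.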

Why does AI get iOS workflows mostly right and Android workflows mostly wrong? Who knows!

All I can say is that if you are going to add AI into your workflow, learn some breathing exercises. You will need them to work through the frustration you are about to face.

Want to see other projects where I have used AI? Check out the list of programming projects on my blog, where I have sample code I have written over the past ten years, at https://www.tjmaher.com/p/programming-projects.html

What do you find AI constantly screws up? How do you fix it? Leave comments below! 

Happy Testing!

-T.J. Maher
Software Engineer in Test

BlueSky | YouTube | LinkedIn | Articles

February 11, 2026

New Project: Cursor creates a Playwright + C-Sharp test framework

It's a head-to-head matchup! Cursor AI versus VS Code + GitHub Copilot, battling to create automated test frameworks using MS Playwright + C#. Who creates the best tests? The best GitHub Actions Workflow? The best README docs? And can it be created using only prompts?

In this corner, GitHub Copilot, with the GitHub project: Login-C-Sharp.

In the other, Cursor AI, with the GitHub project: Cursor-creates-playwright-c-sharp

Let the battle begin!

Happy Testing!

-T.J. Maher
Software Engineer in Test

BlueSky | YouTube | LinkedIn | Articles

February 9, 2026

New project: Creating an automated test framework in Playwright + C# using GitHub Copilot

When I learned that a company I was interviewing with was thinking of pairing Playwright with C#, a language I had never worked with before, to build an automated test framework for their web application, I was inspired.

The Test


Given a website, such as https://the-internet.herokuapp.com/login, can GitHub Copilot examine the website and create, through prompting alone, an automated test framework using C#, NUnit, and Playwright? What if we are using the free version of GPT-4.1?

Presenting a work in progress! 
Want to see what prompts were used for this site? The last section of the README file contains a summary of prompts used to create this project and its documentation, along with the actions Copilot executed. 

Surprisingly, only a few minor manual tweaks of the documentation and code below were needed, such as fixing weird formatting issues in YAML files and moving new text in this README that had been placed incorrectly.

WARNING!


GPT-4.1 has a knowledge cutoff date of two years ago. When creating a workflow, GitHub Copilot did not realize that it was using a deprecated version of actions/upload-artifact (v3), causing the workflow to fail. Caveat emptor!

The Results!


So, how did GitHub Copilot + GPT-4.1 do creating an automation framework? I would say it did so well that it was hideously frustrating when it messed up the simple things.

It's like an eager-to-please junior dev who doesn't completely know the material and doesn't know it isn't reading the latest documentation.

Why would it not know it was implementing out-of-date libraries when creating the GitHub Actions Workflow? It was so sure it had everything correct until I copied and pasted the error I received from the GitHub Actions log files and fed the error back to it.

Why does it not read actual documentation? Why does it skip carefully enumerated steps? And why does it always profusely apologize to me while making the same mistake over and over again?

I feel that it got me 80% there, but it was super frustrating needing to drag it bodily across the finish line.


Happy Testing!

-T.J. Maher
Software Engineer in Test

BlueSky | YouTube | LinkedIn | Articles

February 6, 2026

New Features of Detox Demo: Security Scanning + Android Support + Cross-Platform Builds!

Remember that tiny little two-screen React Native app I created back in December 2025? The one that just had a Login Page and a Secure Area? Well, I may have gone a little overboard adding features to it again.

What started as a simple React Native Login Page demo for my AutomationGuild talk in April 2026 has become... way, way, way too much.

What's new in Detox Demo?

📚 Tools and Technologies Galore!

The project now uses: React Native, Yarn, Detox, Detox CLI, Allure Reports, Snyk, GitHub Actions, GitHub Workflows, GitHub Pages, Metro bundler, CocoaPods, Android Gradle configuration, iPhone simulators and Android emulators, and includes troubleshooting guides for both macOS and Windows.

All open-source. All documented. All completely unnecessary for what is essentially a Login button and a Logout button.

Snyk Security Scanning

Because even a demo app that has hardcoded credentials (yes, tomsmith and SuperSecretPassword! are right there in plain text in credentials.ts) deserves security scanning!

I've added a new security.yml GitHub Actions workflow that:

  • Scans package.json and yarn.lock for vulnerable npm packages
  • Runs Static Application Security Testing (SAST) on the source code
  • Uploads results to GitHub Code Scanning so they appear in the repository's Security tab

It runs on every push to main, every pull request, and you can kick it off manually. 
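As a rough sketch, those three triggers could be written in a workflow's on: block like this. This is an assumption about how such triggers are typically expressed, not a copy of the actual security.yml, and the scanning jobs themselves are left out.

```yaml
# Sketch of the trigger block only; the scanning jobs are not shown.
on:
  push:
    branches: [main]     # every push to main
  pull_request:          # every pull request
  workflow_dispatch:     # manual runs from the Actions tab
```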

Snyk is free for public repositories. If it's free, it's for me, I'll take three. 

🤖 Android Support

The app now runs on Android! I've added:

Run locally on Windows 11 or macOS:

yarn start          # Start Metro in one terminal
yarn detox:android  # Build and test in another

All 5 tests pass:

  • ✅ Secure Area Flow: 2 tests
  • ✅ Login Flow: 3 tests

🪟 Windows 11 Local Development

Since I'm developing on a Windows 11 machine these days, I asked GitHub Copilot to generate a comprehensive Setup for Windows 11 Local Development guide covering:

  • Android SDK installation
  • AVD creation
  • Environment variable setup
  • Troubleshooting common issues

Plus a matching Setup for macOS Local Development guide for MacBook users.

🧹 GitHub Copilot Code Review Fixes

I now run GitHub Copilot's code review feature on the codebase. All the source files created by GitHub Copilot now have a "Created by GitHub Copilot" comment at the top, because credit where credit is due!

And thank you, GitHub Copilot, for the rough draft of this post, for copying my stream-of-consciousness writing style, and for the following suggestion:

What's the most over-engineered demo project YOU'VE ever built? Leave some notes in the comments below! 👇


Happy Testing!

-T.J. Maher
Software Engineer in Test

BlueSky | YouTube | LinkedIn | Articles

February 4, 2026

Creating a GitHub Actions Workflow for Android Detox Testing with GitHub CoPilot? What Could Go Wrong?

Last month, I shared my experience using GitHub Copilot to create a React Native app from scratch to be used in my DetoxDemo project in my article, First Time Using GitHub CoPilot to Create a ReactNative LoginPage app. What Could Go Wrong?

This time, I used GitHub Copilot (Claude Opus 4.5) to create a GitHub Actions CI/CD workflow for running Detox end-to-end tests on Android. While GitHub Copilot is incredibly powerful, it still required significant human guidance to get the workflow passing.

Detox Demo: https://github.com/tjmaher/detox-demo

I had a working GitHub Actions Workflow with ios-regression.yml and asked Copilot to create an Android version that matched. Despite this instruction, I had to repeatedly ask Copilot to compare against the iOS workflow to create the Android workflow, android-regression.yml.

The result? 14 commits, 17 hours, and a lot of lessons learned. Here's the timeline of what went wrong, and what finally worked:

[ View the Pull Request ]

The Stats

Total Commits: 14

Time Span: ~17 hours

  • Started: Feb 3, 2026 at 9:54 PM EST
  • Finally Passed: Feb 4, 2026 at 3:15 PM EST

February 3, 2026

The Facebook Ecosystem: React, React Native, Metro, and Yarn

Whenever attempting to construct a new automation framework from scratch, it can be difficult figuring out which automated testing toolsets should be used. This is why, before I do anything, I research the new tools and technologies used to create the app I will be testing, hoping to see if there are any industry standards already out there. I’ve paired Angular with Protractor, Ruby with Watir and Capybara. What should I pair with a React Native mobile app? Appium, like I did with the Stop & Shop mobile apps? Or is there something else?
Before building an automated testing framework, I had to do some research on the toolsets in the Facebook ecosystem that SELF’s mobile app used: React, React Native, Metro, and Yarn.

GitHub:

January 30, 2026

Hands on Automated Testing with Playwright is the start of a wonderful conversation with the Playwright community!


Butch Mayhew is a Playwright Ambassador, dedicated to helping others, and it shows! I fully recommend this book -- and Butch's many LinkedIn Learning Playwright courses -- for those attempting to understand Playwright. 

The real beauty of the book is that it feels like only the start of a continuing conversation: 
  • Sample code is included: Just like Butch's courses, it provides a GitHub repo chock-full of code examples, where Butch and Faraz walk through the examples chapter by chapter so the reader can see the concepts they explain implemented.
  • Reference links are included: Need to do a deep dive on a topic? The authors have included links to the primary sources, such as Faraz Kelhini's article, Understanding Shadow Dom (2019).
  • QR codes that connect to the Playwright community: Want to connect with the Playwright community at large? See a Playwright community calendar? Scan the QR codes included in the book.

It's no problem if you have never used Playwright before. Readers are walked through installing the toolsets, writing & running their first tests, setting up VS Code, and how to configure Playwright settings. 

The book also walks the reader through chapters on AI-Powered Test Generation using GitHub Copilot and the Playwright Model Context Protocol (MCP), and on generating tests with Playwright's Codegen feature.

Thank you so much for the advance copy, Butch! 

Happy Testing!

-T.J. Maher
Software Engineer in Test

BlueSky | YouTube | LinkedIn | Articles
GitHub repo

January 23, 2026

#OpenToWork: Looking for My Next Test Automation Adventure!

Hey everyone! I'm looking for my next adventure in test automation.

After an incredible but way too short run at SELF ID, building mobile test frameworks for our React Native app, I'm ready to bring my expertise to a new team that values quality, automation, and collaboration.

What I bring to the table:

With a decade of experience as an automation developer, I don't just write tests. Embedded with a development team, I learn about the wants and needs of the stakeholders - the developers, the designers, the business analysts, the business itself - and construct a test automation framework, two-week sprint by two-week sprint, that truly fits their needs. [ See my Programming Projects ]

The quicker the automation framework is stood up, the quicker I can get to the truly fun stuff: Making sure the brand-new untested features fresh off the developer's local machine meet not just the spec, wireframes, requirements, and design, but also those expectations that were discussed but may not have been carefully documented... something that AI will never be able to do.

My most recent project? DetoxDemo (https://github.com/tjmaher/detox-demo) - a complete mobile automation framework showcasing:
  • Mobile test automation with Detox + TypeScript for React Native iOS apps
  • Page Object architecture that keeps tests clean and maintainable
  • Allure Reports integration with visual test results published via CI/CD
  • GitHub Actions workflows with configurable test execution options
On top of my extensive software testing experience, the last ten years have been focused on test automation. I've built automation frameworks from scratch at SELF Id and ThreatStack. I’ve created automated development courses for Test Automation University. I’ve written extensively about testing (check out my blog "Adventures in Automation"), and organized the Ministry of Testing - Boston meetup for years.

What I'm looking for:

Software Engineer in Test or SDET roles where I can design automation strategies, share my testing worldview with developers, and build frameworks that the entire team can use to check their work before merging into main. Remote work is preferred, but I am open to hybrid opportunities in the Boston area or Southeastern Massachusetts.

My toolbox includes: Detox, some Playwright, Selenium WebDriver, Ruby/Capybara, Java, TypeScript/JavaScript, React Native, CI/CD pipelines, and Allure Reports. But more importantly - I know how to research new tools, validate stakeholder needs, and implement solutions that fit your tech stack.

If you know of opportunities, I'd love to connect. Drop me a message or comment below. I'm always happy to chat about testing, automation, or that one flaky test that's driving you crazy.


I will be speaking at Joe Colantonio's TestGuild in April 2026. See you then!

#OpenToWork #SDET #TestAutomation #SoftwareEngineering #QualityEngineering


Happy Testing!

-T.J. Maher
Software Engineer in Test

BlueSky | YouTube | LinkedIn | Articles

AI wants to take over QA? Let it!

Someone on LinkedIn posted: "The smartest engineering leaders I know have been divesting from manual QA for years now and I think that bet is going to pay off big in today's world.

"If your development process still relies heavily on manual QA and your engineers are now using AI-assisted coding tools, you've created a new bottleneck".

My response? Let it! 

I am all in favor of outsourcing the "boring stuff" to automated tests or AI. Who really wants to check for the umpteenth time that the same page in the web app has the correct working functionality on Chrome, Firefox, MS Edge, Mac Safari, and all the various screen sizes? BORING!

Now, it takes a real software tester to make sure the brand-new untested features fresh off the developer's local machine meet not just the spec, wireframes, requirements, and design, but also those expectations that were discussed but may not have been carefully documented. Or that the user experience matches how the designer really wanted the web app to operate. Or that the user experience doesn't change too much when you operate the web app in the wild.

That is all the fun part of the job: realizing during testing that the business requirement or the design was actually unclear, and that you as a tester found a unique edge case.

Let AI do the mindless drudgery. Just as long as I can still focus on the fun stuff! 

Happy Testing!

-T.J. Maher
Software Engineer in Test

BlueSky | YouTube | LinkedIn | Articles

January 22, 2026

DetoxDemo: Now with more GitHub Action Workflow CI/CD Options!

Have you ever known that you probably should have gone to bed hours ago, but you were doing something so fun, you didn't want to stop? I am like that with my toy React Native application, DetoxDemo, which I created as part of my presentation to the AutomationGuild in April 2026.
Late last night, after poring over Wix's Detox Docs for Artifacts, I decided I wanted to implement them in my GitHub Actions CI/CD Workflow.

Want to kick off a job to run all the Login tests in the CI/CD platform using the GitHub Actions workflow? In the DetoxDemo GitHub repo:
  • Go to Actions -> View all Workflows
  • Under the Actions column to the left, select Build & Test iOS
  • Select the [Run workflow] button to see all the choices I set up in the ios-regression.yml configuration file under the on: workflow_dispatch -> inputs
  • Say you are a developer who wants to test out your JIRA-123 branch code before merging: under "Use workflow from", you can choose branch JIRA-123 instead of running against the main branch.
  • Which test suite would you like to run? Login? SecureArea? Default is "all".
  • Which iPhone 16 would you like to run the tests on? Regular iPhone 16, Pro, or Pro Max? Or maybe an iPad Mini, Air, or Pro?
  • What log level? Select any range from the very verbose "trace", to throwing alerts only if things are "fatal". Default is "info".
  • What level of artifacts do you want to capture for logs, screenshots, or videos? All, just failing, or none?
  • Do you want to run performance testing with Detox Instruments? We have that option! I am still looking into how the Wix Incubator's Detox Instruments works with CI/CD.
  • Or you can just scroll down to the bottom and select [Run Workflow] and kick off the default values set up in ios-regression.yml
  • A new "Build & Test iOS" run will be created. Feel free to click into the run to watch it go through the build -> test -> publish-allure-reports -> cleanup stages, where you can see Homebrew, RubyGems, CocoaPods, Node.js, and Applesimutils being configured and run.
  • If you click into the "build" stage, you can see it work through tasks such as "Set up job", "Checkout repository", "Setup Homebrew", "Setup Ruby", "Cache Homebrew and RubyGems", etc. It takes 30 minutes for a Detox-embedded build to be generated.
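For anyone curious how a [Run workflow] form like this is wired up, here is a minimal sketch of on: workflow_dispatch -> inputs using choice inputs. The input names, option lists, defaults, and the echo step are hypothetical, made up for illustration, and are not the actual contents of ios-regression.yml.

```yaml
# Hypothetical sketch: input names, options, and defaults are assumptions.
name: Build & Test iOS (sketch)

on:
  workflow_dispatch:
    inputs:
      test_suite:
        description: "Which test suite to run"
        type: choice
        options: [all, Login, SecureArea]
        default: all
      log_level:
        description: "Detox log level"
        type: choice
        options: [fatal, error, warn, info, verbose, trace]
        default: info

jobs:
  test:
    runs-on: macos-latest
    steps:
      - name: Show the selected options (placeholder for the real test step)
        run: echo "suite=${{ inputs.test_suite }} loglevel=${{ inputs.log_level }}"
```

Each choice input shows up as a dropdown in the [Run workflow] form, and its default value is used when you run the workflow without changing anything.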