What I Learned While Building a macOS App with GPT-4

7 min read · Jun 1, 2023

This April, I got to thinking. There’s been a lot of buzz around GPT-4, a super-smart language model that can write essays, create poetry, generate code… In my previous article, I showed that GPT-4 is great for prototyping. But can it handle building a full-scale macOS app?

Well, the answer is yes, but…

I took on a weekend challenge to build an app just by pasting in code from GPT-4. Here is what I built and what I learned along the way.

General Domain Knowledge is Necessary

At the beginning I was convinced that if I could do this without knowing macOS programming, anyone could start building apps... However, I soon realized that without my 10+ years of experience in app development, it would probably have taken me WAY more time.

Knowing my way around software development made it way easier for me: the terminology, the processes, what makes up software, the basics of coding languages, and the tools used to build stuff. Sure, AI can be a great helper, but be ready to invest a lot of time in learning and experimenting.

If you're about to start building an app with ChatGPT and you have zero knowledge of software development, I highly recommend taking a few days to study the basics first. There are many free resources to get started:

Software Development: www.freecodecamp.org, www.codecademy.com
GitHub: https://docs.github.com/en/get-started/quickstart/hello-world
APIs: https://www.ibm.com/topics/api
UX: https://www.uxbeginner.com/start
MVP Development: Lean Startup by Eric Ries
User Testing and Feedback: www.usability.gov

Good Questions Matter

In school, we’re rewarded for having the right answer, not for asking a good question. This will have to change entirely. Our brains are greatly limited compared to an AI that has absorbed a huge share of the world's written knowledge. With AI by your side and the ability to ask good questions, you’re one step ahead.

“The wise man doesn’t give the right answers, he poses the right questions.” — Claude Levi-Strauss

The magical thing about AI is that you can phrase your prompts almost any way you want. Still, following a few rules makes your prompts more effective and gets better results from the AI:

Give GPT the right context. Send it all the related code, the messages from the log, the programming language used, the target platform, and a detailed description of the intended result.
Be specific and exact. Just like with real developers, a fuzzy and unclear assignment leads to unsatisfying results. An apt description with proper terminology will save you a lot of time.
Don't repeat yourself. I've seen many example prompts on the internet where the author repeats the same command in several different ways. It’s not needed and just muddles the prompt.
Sometimes, it’s best to hit the “New Chat” button. If you've been stuck on a particular problem for too long, it’s better to clear the context and start fresh. The old context tends to get compressed and confuse the AI.
Sample output helps. If you need to generate, let's say, a .json file with certain data, give the AI an example of how you want it formatted. You'll save yourself a few tries.
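To make the rules above concrete, here is a minimal sketch of how such a prompt could be assembled. It's only an illustration: the `PromptContext` class, its field names, and the section labels are my own assumptions, not anything ChatGPT requires.

```python
from dataclasses import dataclass


@dataclass
class PromptContext:
    """What the model should know about the problem (field names are illustrative)."""
    language: str
    platform: str
    goal: str
    code: str = ""
    log: str = ""
    sample_output: str = ""


def build_prompt(ctx: PromptContext) -> str:
    """Assemble one prompt, stating each piece of context exactly once."""
    sections = [
        f"Language: {ctx.language}",
        f"Target platform: {ctx.platform}",
        f"Intended result: {ctx.goal}",
    ]
    # Optional sections are added only when there is something to say.
    if ctx.code:
        sections.append("Relevant code:\n" + ctx.code)
    if ctx.log:
        sections.append("Log output:\n" + ctx.log)
    if ctx.sample_output:
        sections.append("Format the answer like this example:\n" + ctx.sample_output)
    return "\n\n".join(sections)
```

Calling `build_prompt(PromptContext(language="Swift", platform="macOS", goal="Add a dark theme"))` gives a short prompt with just the three mandatory sections; code, logs, and a sample output are appended only when you provide them.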

There is a great post about prompt engineering on the Microsoft Blog, which I highly recommend reading.

Understanding Tokens & Limits

When programming with ChatGPT, it’s important to know its limitations. Tokens play a key role here: they’re the building blocks the GPT model uses to understand text (one token is roughly one word, though not exactly; in English a token is often a bit shorter than a word).

Tokens affect the model’s memory and how much of the past chat it remembers. If a conversation gets too long, it might exceed the token limit, and earlier parts could be lost. Just forget about using version 3.5 and subscribe to GPT-4 right away. The difference is huge.
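If you want a feel for how quickly a conversation eats into the limit, a rough estimate is enough. The sketch below assumes the common rule of thumb of about four characters per token for English text and an assumed budget of 8,000 tokens; it is not a real tokenizer, and `trim_history` only illustrates the idea of early messages dropping out, not how ChatGPT actually manages its context.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb for English; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(messages: list[str], limit: int = 8000) -> bool:
    """Check whether the conversation likely fits. `limit` is an assumed budget."""
    return sum(estimate_tokens(m) for m in messages) <= limit


def trim_history(messages: list[str], limit: int = 8000) -> list[str]:
    """Drop the oldest messages until the rest fits (illustration only)."""
    kept = list(messages)
    while kept and not fits_in_context(kept, limit):
        kept.pop(0)
    return kept
```

Running `fits_in_context` over everything you've pasted so far is a cheap way to tell when it's time to shorten your code or start a new chat.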

Besides that, I came up with a few tricks to deal with the limits:

If ChatGPT stops generating code, simply ask it to continue.
Split your prompt. If your code is too long for one prompt, break it up into 2–3 messages and let ChatGPT know there’s more coming.
Remove all unnecessary parts of your code before pasting it into ChatGPT (e.g. comments, unused methods and styles, etc.).
Ask for pseudocode or a general approach first. This helps ensure that the model understands the task before it writes actual code.
Clarify which part of the code you need to adjust. GPT often keeps rewriting all of the original code, so tell ChatGPT exactly which part needs an update.
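Two of these tricks, removing unnecessary parts and splitting a long prompt, are easy to automate. Here is a small sketch of how that could look. The function names, the 6,000-character budget, and the wording of the "more is coming" note are all my own assumptions; the comment stripping only handles whole-line `//` comments, so trailing comments stay in.

```python
def strip_noise(code: str) -> str:
    """Drop blank lines and whole-line // comments (trailing comments are kept)."""
    kept = [
        line for line in code.splitlines()
        if line.strip() and not line.strip().startswith("//")
    ]
    return "\n".join(kept)


def split_into_messages(code: str, max_chars: int = 6000) -> list[str]:
    """Split code on line boundaries into messages of roughly max_chars each."""
    chunks, current, size = [], [], 0
    for line in code.splitlines():
        # Start a new chunk when adding this line would exceed the budget.
        if current and size + len(line) + 1 > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        chunks.append("\n".join(current))

    total = len(chunks)
    messages = []
    for i, chunk in enumerate(chunks, start=1):
        if i < total:
            note = "More code is coming, just reply OK for now."
        else:
            note = "That was the last part."
        messages.append(f"Part {i} of {total}. {note}\n\n{chunk}")
    return messages
```

You would paste the returned messages into the chat one by one, for example `split_into_messages(strip_noise(source))`, and ask your actual question only after the last part.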

Even with all these tips, tweaking big chunks of code (say, over 1,000 lines) or changes spread across different parts of your app can be tricky. I feel like this is where we hit the real limit of the current version of ChatGPT.

We'll have to wait for tools like vector databases or enhanced context tracking that could make a big difference in processing big chunks of code.

GPT-4: Default · Internet Access · Plugins

Remember that the default GPT-4 model was trained on data up to a certain date (September 2021) and doesn't know anything beyond it.

This is quite limiting in a rapidly developing field such as software engineering. Luckily, just weeks ago, OpenAI introduced a brand new version with access to the internet and the ability to use plugins.

I didn't have this option during my programming challenge, but these new features can help you a lot. For example, you can send the AI a GitHub library you want to use (even if it's brand new) and GPT-4 will study it and tell you how to use it…

What Is the Current GPT-4 Good At?

It’s not all about limitations or issues. The current GPT is fantastic and a huge time-saver. It’s particularly great for:

UIs

Surprisingly, it's quite easy to get a reasonably good user interface just by describing what the app should look like.

I usually approach it like describing the UI to a blind person: explaining top to bottom, left to right, what each element looks like, what the hover effect is, how it looks after interaction, etc.

Adding a dark theme to my app was also a piece of cake.

Small Functional Blocks

AI is a great companion if you need to create just one specific function.

Here’s a tip: have a chat with your AI about the layout and architecture of your app first. Once that’s sorted, ask it to write the individual functions for you one by one.

All Kinds of Time-Saving Scripts

Whenever I need to do repetitive work that would take me more than a few hours, I ask myself: couldn't AI write a Python script that automates this boring task? In about 80% of cases, the answer is yes.

Tweaking Old Code

I found that AI is especially useful when you just need to change the design, restructure a complicated function, or make your code more efficient. It will breathe new life into that old code, making the adjustments smoother than you might expect.

Fixing Bugs

Working with code often means dealing with bugs, and sometimes they’re tricky to spot and fix. Luckily, there is GPT-4. In my experience, it can save hours of bug-hunting even for an experienced developer.

With a little bit of debugging experience and AI on your side, you can fix anything.

Conclusion

As I look back on this adventure, building a macOS app with GPT-4, I'm left with mixed feelings.

While GPT-4 is a fantastic tool, it’s not quite ready to take on the task of building complex software from scratch. Along the way I learned a lot about macOS software development, how the UI is built, the feasibility of various functions, etc.

However, when I showed the code to an experienced Swift developer, his impression was that the code isn't of the best quality and is quite bloated. Considering the limitations of GPT-4 as well, I can't imagine working on more complex apps this way.

Anyway, this doesn’t take away from its potential; rather, it sets a clear path for future growth and refinement. This is one of the first truly usable AIs and it seems like the development has barely started.

So, here’s to a future full of innovation and discovery!

Download HelloAI here!

And for more content like this, follow me on Twitter: @michallangmajer.