OpenAI's DevDay developer conference recently went down, and it did not disappoint in the slightest. The company announced many great things coming to its platform in the near future, both in the developer API and in the more publicly available ChatGPT.
New improved models, cheaper prices all around and a 'create-your-own' GPT system that's soon to be released were just a few big announcements made.
And as someone who is a big proponent of AI, both in the world and in the workplace, I could not be more excited to try a few of these out.
Here are a few of the highlights from that event.
It seems like just yesterday that GPT-4 was introduced to top the already impressive GPT-3.5 model. Just to recap, GPT-4 was introduced in March of 2023 and made generally available to all developers in July of 2023.
And now here we are a few months later and the next generation is among us already.
This is one area where OpenAI has vastly outdone itself, and it's what sets the company apart from the competition. Not only are they constantly iterating on their product, they are doing so at a blistering pace.
GPT-4 Turbo is a more capable model with a knowledge cutoff of April 2023, solving one of the biggest issues the platform has had for some time: outdated knowledge of various libraries. If you're a developer and you use ChatGPT, you know exactly what I mean.
If you were working on a library that was recently modified or updated, then you couldn't really just ask ChatGPT to help you out with your work. You had to train it on the recent updates first.
The promise of more frequent cutoff updates was also hinted at during the conference.
Many things can happen over the course of two years, and the original 2021 cutoff became more of an issue with each passing day.
The new model allows for up to 128K tokens of context, meaning it can store and process the equivalent of 300 pages of a book in a single prompt. Compare this to the previous 8K limit and you can really see the giant leap. This one is huge, because the original limit, while fairly generous, still wasn't enough for users working with long documents or large codebases.
A larger context generally means much more accurate results. Not only is GPT-4 Turbo now available in the API, but it will also be powering ChatGPT moving forward.
Also, another great addition for developers: with reproducible outputs, developers will be able to add a seed parameter to their requests, which will force the model to return consistent results.
This one is huge if you find yourself having to debug constantly, as having different results on every request is akin to chasing a moving target.
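As a rough sketch, the seed would ride along with the usual parameters in a Chat Completions request. The payload below is illustrative only: the model name reflects the Turbo preview announced at DevDay, and the prompt is made up.

```python
import json

# Illustrative Chat Completions request payload (model name and prompt are
# examples). The "seed" parameter asks the API to make a best effort at
# returning the same completion every time the identical request is sent.
payload = {
    "model": "gpt-4-1106-preview",
    "messages": [
        {"role": "user", "content": "List three sorting algorithms."},
    ],
    "temperature": 0,   # low temperature further reduces run-to-run variation
    "seed": 42,         # same seed + same parameters -> consistent results
}

print(json.dumps(payload, indent=2))
```

With the seed pinned, a failing request can be replayed as-is while you debug, instead of producing a different answer on every attempt.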
For developers, this is a big one as well. You now have the option of turning on JSON Mode, which ensures that the model responds with valid JSON through the API.
Definitely big for developers as this will make working with API requests a much simpler process overall.
More features are great, but being able to actually use those features is a whole other thing. Which is why it's great that GPT-4 customers will now get 2x as many tokens per minute as before. And you will also have the ability to request even more through the API console, if your workflow requires it.
The OpenAI API now supports DALL-E 3 for image generation, GPT-4 Turbo with Vision and Text To Speech. This opens the world of possibility even further for developers and companies looking to streamline even more of their work in a variety of ways through the API.
GPT-4 Turbo on the API can now also take images as input, analyzing them and generating content around them.
And with Text To Speech on the API, you can now generate audio in one of six voices that are nearly indistinguishable from a human speaker.
This will make developing accessibility or language learning applications a much more streamlined process compared to the older methods of having to cobble together dozens of libraries, applications and APIs to do the same.
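To make the two new modalities concrete, here are hedged sketches of the request shapes for vision and speech. The model names match what OpenAI announced at launch, but the URL, prompt and voice choice are arbitrary examples.

```python
# GPT-4 Turbo with Vision: an image is passed as one part of the message
# content, alongside a text part asking a question about it.
vision_payload = {
    "model": "gpt-4-vision-preview",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
}

# Text To Speech: pick one of the six built-in voices
# (alloy, echo, fable, onyx, nova, shimmer) and the API returns audio.
tts_payload = {
    "model": "tts-1",
    "voice": "nova",
    "input": "Welcome back! Your lesson resumes at chapter three.",
}
```

A language-learning app, for instance, could send a learner's photo to the vision endpoint for a description, then feed that description to the speech endpoint, all within one provider instead of a patchwork of services.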
And really, this just goes to show OpenAI's commitment to bringing AI to everyone, in every field, and at every level.
One of the biggest challenges that some developers face when building out complex and resource-heavy applications is cost. If something is prohibitively expensive, then odds are only the largest enterprise companies will be able to use those tools.
GPT-4 Turbo, the latest and most advanced model, will actually be cheaper than the original GPT-4 by a factor of 3x for prompt tokens and 2x for completion tokens. Huge, huge news.
For context, that's around 1 cent per 1000 input tokens and 3 cents per 1000 completion tokens. Combining this with increased rate limits and the added modalities mentioned above, developers should now be able to take their ideas to the next level.
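Using just the per-1K rates quoted above, it's easy to sketch a back-of-the-envelope cost estimate for a single request; the token counts in the example are arbitrary.

```python
# GPT-4 Turbo prices quoted at DevDay:
# $0.01 per 1K input (prompt) tokens, $0.03 per 1K completion tokens.
INPUT_PRICE_PER_1K = 0.01
OUTPUT_PRICE_PER_1K = 0.03

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Rough dollar cost of a single request at the quoted rates."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# Filling the entire 128K context and getting a 4K-token answer back:
print(f"${estimate_cost(128_000, 4_000):.2f}")  # $1.40
```

In other words, even a maxed-out 128K-token prompt lands well under two dollars, which is exactly the kind of pricing that puts these tools within reach of indie developers and not just enterprises.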
It wouldn't be an AI conference if ChatGPT wasn't brought up in some capacity. The interface that brought AI to the masses has gotten a few improvements as well.
As of the conference, ChatGPT is now using GPT-4 Turbo, meaning larger prompts, better accuracy and the updated knowledge cutoff of April 2023 for all queries going forward.
One of the quirks that many people complained about in the past was the way users selected a particular model to do their work. You first had to select the proper version of ChatGPT, whether 3.5 or 4.0, and then you further had to select whether you wanted to target DALL-E, browse the web or use a plugin.
If you started a prompt with a web browsing version, but then needed to generate an image, then you pretty much had to start over with a new chat window.
That is no longer an issue, as you can use a single interface and ChatGPT can contextually provide the most appropriate response based on your prompts.
And if you're a daily user of ChatGPT, you might notice a few UI improvements as well, such as new icons and a new section in the left nav showing your most recent GPTs.
Quite possibly the most interesting part of the conference came near the very end, when the latest feature drop was introduced. GPTs are custom-tailored models that pretty much anyone can create using plain language, and that can behave in a very specific way to accomplish a given task. No code required.
You can think of them as pre-generated personalities that anyone can select and prompt accordingly.
One use case shown during the event was a GPT that was created by Canva, in which a user could prompt their digital design idea, and the GPT would then provide the visual Canva result as the response. The user could then navigate to Canva to finish working on their pre-generated design.
Another interesting GPT discussed was Zapier's, the 'if this, then that' no-code tool that integrates thousands of services together. This further broadens the reach that you can have with AI agents. You could integrate your calendar and have ChatGPT notify your co-workers that you'll be late for the next meeting. Technically, that is, if you were to connect those applications to your Zapier account and then to ChatGPT.
Some of the example GPTs displayed on ChatGPT include a "Sticker Whiz", which aims to turn your ideas into die-cut stickers shipped to your door. For now, it seems more like a placeholder for an idea that could eventually come to life.
The idea is there anyway, which oftentimes is the most difficult part. Introducing the concept of autonomous AI agents to the world is no easy task, as mass adoption is really going to be the key to long-term success.
And lastly, you will be able to generate your own GPTs for your own personal use, for your company's use if you're on Enterprise, or for public use if you publish your GPT to the soon-to-launch store.
Though only hinted at, the concept of a GPT Store was also introduced during the conference. Not only will you be able to create custom GPTs for yourself or your organization, but you will also be able to share them with the community as a whole.
While no clear payment model was broken down or explained, the idea of revenue sharing based on GPT popularity and usage was brought up. Which, frankly, makes perfect sense.
Training an AI agent to do something that many people can benefit from probably won't be as simple as writing a two-sentence instruction set. It might require uploading large data sets and managing prompts that are hundreds of pages long. So why not compensate agent developers for building, and really for maintaining, these agents?
If this is the first OpenAI developer conference, I can only imagine what the second or third are going to be like in the near future.
Once GPTs are out in the wild helping people do a myriad of things in their daily lives, and once more applications are able to leverage image generation, text-to-speech capabilities and even image recognition, the future will change drastically.
At this rate, I can only hope that I can keep up.
Walter Guevara is a software engineer, startup founder and programming instructor at a coding bootcamp. He is currently building things that don't yet exist.