Open-source or closed-source AI

With the recent announcements from OpenAI (GPT-4o) and Google (during their Google I/O) showcasing impressive advancements in multimodality, many are wondering whether to stick to closed-source models (OpenAI GPT-4o, Google Gemini, Anthropic's Claude) or invest in open-source alternatives (e.g., Mistral, LLaMa, Falcon).

While I am personally a big proponent of open-source models, let me try to be unbiased in answering the open-source vs. closed-source question. In an enterprise setting, when someone is making decisions around model selection, there are a few criteria to consider: functionality, cost, performance, flexibility, risk, etc.

Let's consider each of them:

Functionality:

Closed-source:

With OpenAI's recent GPT-4o announcement and demonstration, it's easy to see how seamless they have made multimodality: audio, vision, and text, all combined in real time.

While they showcased many examples, the math-problem tutoring demo is the one I like best, because it's advancing how our younger generation will learn new skills.

Similarly, Google showcased many examples around multimodality. Their Project Astra is something that blew my mind, and yes, they specifically mentioned that it was captured in a single take at 1x speed :)

In both of the examples above, multimodality is showcased in real time, and it gives me goosebumps thinking about what the future looks like over the next few months to a year!

Open-source:

While there have been a lot of advancements in the open-source community in the text-only modality (LLaMa 3, Mixtral, and most recently Falcon 2), and in image/video via Stability AI (Stable Video Diffusion), very few open-source models support multimodality, and those that do primarily combine image and text. LLaVA is one such multimodal model: it combines a vision encoder with Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities that mimic the spirit of multimodal GPT-4. There are other variants derived from LLaVA, but this is the closest we can get to GPT-4's multimodal capabilities. We are still missing the audio piece.

Decision:

Looking at the demo videos of GPT-4o and Project Astra, I am sure people have come up with hundreds of novel use cases where all three modalities (audio, text, image) can be applied. Do you need to implement them next week or next month? If so, you certainly want to consider closed-source models. Can you wait until the open-source ecosystem catches up? Then wait.

Cost:

Closed-source:

Along with real-time multimodality, OpenAI also announced a cost reduction. First things first: they made GPT-4o (along with GPT-3.5) completely free for everyone. Bravo! Only about 63% of the world's population knows what ChatGPT is; making it free will certainly increase awareness and adoption among consumers, some of whom may convert into enterprise customers!

Talking about enterprise customers, most of them will use it via the API, and OpenAI dropped the GPT-4o price by 50% ($20 for a 1M-token input/output pair). This is a welcome change! Compared to Claude Opus, which is still at $90 for a 1M-token input/output pair, GPT-4o seems like a steal.

Open-source:

While open-source models may not match closed-source models on performance/accuracy, they certainly beat them on price.

LLaMa 3-70B via Amazon Bedrock will set you back $6.15 per 1M-token input/output pair, while the same model via Groq's cloud offering costs $1.38.

Mixtral 8x7B SMoE via Amazon Bedrock costs $1.15 per 1M-token input/output pair, while the same model via Groq's cloud offering costs $0.48.

That said, most enterprise customers who use any of these open-source models do not use them as-is; instead, they fine-tune the model with their own datasets or, in some cases, even continuously pretrain the entire model on their own data. While it is difficult to guesstimate the price of fine-tuning or pretraining, enterprises need to include it in their total cost.
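To make the comparison concrete, here is a back-of-the-envelope cost calculation using the per-1M-token prices quoted above. The function and workload size are illustrative assumptions; prices are snapshots from the time of writing and will change, so check each provider's pricing page before deciding.

```python
# Illustrative API price snapshots (USD per 1M-token input/output pair),
# taken from the figures quoted in this post. These WILL go stale.
PRICE_PER_1M_TOKENS = {
    "gpt-4o": 20.00,              # OpenAI API
    "claude-3-opus": 90.00,       # Anthropic API
    "llama3-70b-bedrock": 6.15,   # Amazon Bedrock
    "llama3-70b-groq": 1.38,      # Groq cloud
    "mixtral-8x7b-bedrock": 1.15, # Amazon Bedrock
    "mixtral-8x7b-groq": 0.48,    # Groq cloud
}

def monthly_cost(model: str, tokens_per_month: int) -> float:
    """Estimate monthly API spend for a given token volume."""
    return PRICE_PER_1M_TOKENS[model] * tokens_per_month / 1_000_000

if __name__ == "__main__":
    volume = 500_000_000  # hypothetical 500M tokens/month workload
    for model in PRICE_PER_1M_TOKENS:
        print(f"{model:>22}: ${monthly_cost(model, volume):>10,.2f}/month")
```

At that hypothetical volume, the gap compounds quickly: the same workload that costs $45,000/month on Claude Opus runs for $240/month on Mixtral via Groq, before accounting for any fine-tuning or hosting costs on the open-source side.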

Decision:

Open-source models offer substantial cost savings, especially when enterprises have the expertise to fine-tune or pretrain them.

I may be biased here, but at Cerebras we recognized this early and are helping enterprises of all sizes pretrain/fine-tune open-source models so they can deploy them at relatively lower cost and own the IP of the model.

Since we are talking about skillsets (another bias alert, forgive me): because many enterprises lack the skills to fine-tune/pretrain models, I have started offering a specialized course on AI Engineering where I teach software engineers how to customize LLMs so they can apply these skills in their jobs.

Performance:

On performance, open-source models are getting closer to closed-source models. If we go by the LMSYS Arena Leaderboard, which tracks more than 90 models, both open and closed, GPT-4o is leading the pack, but we do see LLaMa 3-70B in the top 10.

LMSys Arena Leaderboard

If we look at the scores OpenAI published for GPT-4o across various benchmarks, compared against other models, we see that the scores of GPT-4o and Llama 3-400B (yet to be released) are close.

OpenAI's benchmark for GPT-4o

Decision:

Evaluate both closed and open-source models to determine which meets your performance and accuracy needs. Open-source models are increasingly competitive.

Flexibility:

When I talk about flexibility, I am considering a few factors: vendor lock-in, the ability to switch from one model to another, pricing, deployment options, etc.

With proprietary models, once you start using them, you are stuck with them unless you put in the effort to migrate to another provider or restart the work with a different model. It's similar to signing up with a cloud provider: if you start on Azure and want to switch to AWS, you will need engineering effort to move.

The same issue exists with open-source models: if you are using Llama today and want to switch to Mistral, it requires engineering effort, such as prompt changes.
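One common way to soften this lock-in is to put a thin abstraction between your application and the model provider. Here is a minimal sketch; the class and function names are hypothetical (in practice you might reach for a library such as LiteLLM or LangChain), and the stub backends stand in for real provider SDK calls.

```python
# A minimal provider-agnostic chat router: the application codes against
# one interface, and switching vendors becomes a configuration change
# rather than a rewrite. Names here are illustrative, not a real library.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ChatBackend:
    name: str
    # Each backend wraps its provider SDK behind the same signature.
    complete: Callable[[str], str]

def make_router(backends: dict, default: str):
    """Return a chat() function routed to a named backend."""
    def chat(prompt: str, model: Optional[str] = None) -> str:
        backend = backends[model or default]
        return backend.complete(prompt)
    return chat

# Stub backends; real ones would call the openai / mistralai clients.
backends = {
    "llama": ChatBackend("llama", lambda p: f"[llama] {p}"),
    "mistral": ChatBackend("mistral", lambda p: f"[mistral] {p}"),
}
chat = make_router(backends, default="llama")
```

This doesn't eliminate migration work (prompts often still need per-model tuning), but it confines the switching cost to one layer instead of spreading provider-specific calls across the codebase.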

On pricing, you can negotiate with the vendor for closed-source models. While open-source models do not carry a license fee, they require hosting on GPUs or similar AI accelerators; depending on your deployment options, you can negotiate with the provider on the hardware cost.

Talking about deployment options, only open-source models allow you to deploy wherever you want: in the cloud or on-premises. With closed-source models, you can only consume their services via SaaS/API.

Decision:

Flexibility needs vary by company. Consider your requirements for vendor independence and deployment options.

Risk:

Closed-source:

Most closed-source model companies provide an indemnification clause, so you are protected from future lawsuits in case the model's output infringes a third party's intellectual property rights. In a way, you are transferring that risk to the model provider.

Since closed-source models are only available via SaaS/API, enterprises cannot own the model's intellectual property. If owning an AI model is a must-have for the company, then closed-source models will not fit their criteria.

Open-source:

Though many open-source model creators publish their code, model weights, and the data they used to train the model, not everyone does, so the onus is on enterprises to do their due diligence. While software has standard open-source licenses such as MIT or Apache 2.0 (Falcon, Mistral), many model creators do not follow those standards and have created their own license terms, such as the Llama license, which restricts enterprises from doing certain things.

Another risk of using open-source models is the future of the model creator. Many model-creator companies have been in business for barely 12-24 months and are operating on venture funding. What if they go out of business? Who will end up maintaining or upgrading the models? Enterprises need to consider this.

One more thing to consider is potential vulnerabilities in the code, which may allow bad actors to exploit the system. Enterprises need to consider how to harden it and/or create appropriate guardrails.
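As a flavor of what a guardrail can look like, here is a toy input filter that blocks prompts matching known prompt-injection phrases before they reach a self-hosted model. The patterns are purely illustrative; production deployments typically layer classifier-based safety models (e.g., Llama Guard) and output checks on top of simple rules like this.

```python
# Toy input guardrail: reject prompts matching a blocklist of
# prompt-injection phrases. Patterns below are illustrative only.
import re

BLOCKLIST = [
    r"ignore (all )?previous instructions",
    r"reveal (the )?system prompt",
]
_PATTERNS = [re.compile(p, re.IGNORECASE) for p in BLOCKLIST]

def is_allowed(prompt: str) -> bool:
    """Return False if the prompt trips any blocklist pattern."""
    return not any(p.search(prompt) for p in _PATTERNS)
```

Rule-based filters like this are cheap and transparent, but easy to evade; they are a first layer, not a complete defense.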

Since many open-source models are offered under permissive licenses, enterprises can modify the model and own the intellectual property of the resulting model.

Decision:

Choose based on your enterprise's risk tolerance and need for IP ownership.

In closing:

Overall, while closed-source models currently offer advanced functionalities, the rapid evolution of open-source models makes them a compelling choice, especially considering cost and flexibility. The AI landscape is still in its early days, and we can expect open-source models to continue bridging the gap with their closed-source counterparts.

What do you think?
