Why local, fine-tuned models are the near future of the AI workflow

For many people, ChatGPT is synonymous with generative AI. It was certainly the platform that captured the world's attention, and even today many users haven't ventured beyond its comfortable chat-based interface. That is a shame, because it means they haven't yet fully grasped the amazing opportunities these technologies bring, or the potential dangers that come with them.

Where are we now with LLMs?

We now see an interesting split in the large language model (LLM) landscape as their developers take one of two paths. The first path is the one most people are familiar with — the development of the largest, most intelligent and most impressive models. 

These models follow GPT-3.5 and GPT-4 (the models powering ChatGPT) and aim to be the absolute best LLM possible. There are many of these, such as Google Gemini, Meta Llama 3 and Mistral. While GPT-4 is still regarded by many as the "king" of the group, the competition is getting tighter daily.

One commonality between these models is that they are huge and require massive amounts of computing and electrical power to function. Because of this, they can only be used through APIs or hosted interfaces.

Dig deeper: 6 ways to use generative AI for your marketing

A new path toward fine-tuned models

But a second path is emerging and creating a lot of intrigue for good reason. Instead of going for the most powerful AI ever, a new breed of models can balance size and speed with power. Simply put, these models take the concepts from the “big” models and squeeze them into much tighter packages. Some of these are smaller incarnations of the big models — such as Llama-3-8B and Gemma (baby sister to Google Gemini). Others have been designed from the ground up to be small, like Microsoft’s Phi-3.

These are all exciting because they can all be run on modest modern hardware. I have run all these (and many more) on my three-year-old MacBook M1 Pro. No need for my data to leave my laptop. No API calls to massive servers. Everything is local.

In addition to all of those benefits, one major element makes these small models special. Unlike the huge models, which are all closed, most smaller models are open. Their weights — the billions of parameters that determine how they behave or “think” — have been released publicly. That means they can be fine-tuned by mere mortals like you or me.

While fine-tuning is accessible, it isn't for the faint-hearted. It still requires technical savvy and access to computing power, but training a typical small model can be done with about $10-$50 of rented cloud GPUs and some patience. And the resulting fine-tuned model can be run anywhere — even on your laptop.

We can now take data we have or some behavior we want and create a new version of the model that exactly replicates that behavior or can reason around the data we supply. This means that the model can know everything about your products, segmentation and scoring scheme, what you look for in a lead or really anything related to your business workflow. The most valuable part is that it can all run on a modest machine completely within your own infrastructure.

Much of the transformational power of LLMs today comes when the technologies are embedded into a workflow, performing a single focused task — think Adobe's Generative Fill or GitHub Copilot. By leveraging the power and value of fine-tuned models, we can specify and develop small models that fit a workflow step and run cheaply and securely within our own infrastructure.

Dig deeper: Microsoft unveils a new small language model

Applying small models to your growth marketing strategy

A common task in many marketing workflows is account scoring: assigning an estimated value or assessment grouping to a new incoming lead. This allows for both prioritization and measuring the health of the current pipeline. Scoring is often simple, based on company size and maybe a salesperson’s estimate of potential. However, with a custom model, we can do much better.

We first need to build a dataset on which to train the new model. We can use our existing sales database with actual sales data, augmented with company descriptions downloaded directly from their websites. Due to the sensitivity of this data, it’s crucial to work locally with a small model instead of sharing it with an external model like ChatGPT.
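As a rough sketch, a training set like this is often assembled as prompt/completion pairs in a JSONL file, which most fine-tuning tools accept in some form. Every field name, record and grading threshold below is an illustrative assumption, not your actual CRM schema:

```python
import json

# Hypothetical CRM records: the description is scraped from the company's
# website, the deal value comes from our own sales history.
sales_records = [
    {"description": "Cloud backup for dental practices.", "deal_value": 48000},
    {"description": "Artisanal candle subscription boxes.", "deal_value": 900},
]

def to_training_example(record):
    """Turn one CRM record into a prompt/completion pair for fine-tuning."""
    # Assumed grading rule for illustration: big historical deals grade "A".
    grade = "A" if record["deal_value"] >= 10000 else "C"
    return {
        "prompt": (
            "Score this lead based on its website description:\n"
            f"{record['description']}\nGrade:"
        ),
        "completion": f" {grade}",
    }

# Write one JSON object per line (the JSONL convention).
with open("lead_scoring_train.jsonl", "w") as f:
    for record in sales_records:
        f.write(json.dumps(to_training_example(record)) + "\n")
```

The point of the exercise is that the labels come from your own outcomes, not from anyone else's guesses, and the raw data never leaves your machine.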

We can train a model so that given a company’s description — the words on their website — it will provide us with an automated score completely based on our internal performance data. This model can be embedded in our workflow so that we get an accurate, immediate assessment of the value of the lead. You are not going to get that using the large public models. 
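Embedded in a pipeline, that step could look something like the minimal sketch below. `query_local_model` is a hypothetical stand-in for whatever local inference runtime you choose (llama.cpp, Ollama and similar tools all expose one); it is stubbed here so the shape of the workflow is clear:

```python
def query_local_model(prompt: str) -> str:
    # Placeholder for a call to a locally hosted fine-tuned model.
    # Stubbed with a fixed reply purely for illustration.
    return " A"

def score_lead(description: str) -> str:
    """Return a letter grade for an incoming lead from its website copy."""
    prompt = (
        "Score this lead based on its website description:\n"
        f"{description}\nGrade:"
    )
    reply = query_local_model(prompt)
    grade = reply.strip()[:1]
    if grade not in {"A", "B", "C", "D"}:
        raise ValueError(f"Unexpected model reply: {reply!r}")
    return grade

print(score_lead("Cloud backup for dental practices."))
```

Because the model runs inside your own infrastructure, this function can sit directly in the lead-intake path with no external API call and no data leaving the building.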

Take a moment to think about what is critical to your marketing workflow and identify the actions, however small, that could be revolutionized by applying some intelligence. Is it your lead scoring, as discussed above? Proposal development? Release scheduling? An hour at a whiteboard will likely surface many places where very focused intelligence can transform your efficiency and competitiveness. The new breed of fine-tuned models is built for exactly these focused tasks.

Given the fast pace of AI development, it would be foolish to declare anything as the “future” of AI. However, if we restrict ourselves to the near future, the most transformational opportunities will likely be those enabled by custom, local, fine-tuned models. These will be the silent components in the most successful business workflows and products.

Dig deeper: Decoding generative AI: Top LLMs and the app ecosystems they support


The post Why local, fine-tuned models are the near future of the AI workflow appeared first on MarTech.