Do you want to integrate AI into your business? Fine-Tuning Won't Cut It | by Max Surkiz | January 2025
Machine learning advice from one CEO to another

Until recently, the “AI business” referred exclusively to companies like OpenAI that develop large language models (LLMs) and related machine learning solutions. Now any business, often a very traditional one, can become an “AI business” by harnessing AI for automation and workflow improvement. But not every company knows where to start.
As the CEO of a technology startup, my goal here is to discuss how you can integrate AI into your business and overcome the biggest hurdle: customizing a third-party LLM into an AI solution that fits your specific needs. As a former CTO who has worked with people from many disciplines, I have set myself the additional goal of putting it in a way that non-engineers can easily understand.
Integrate AI to simplify your business and customize your offerings
Since every business deals with customers or partners, customer-facing roles are universal. These roles involve managing data, whether you're selling tires, running a warehouse, or planning global travel like I do. Quick and accurate responses are essential: you must deliver the right information fast, drawing on the most appropriate resources from within your business and from the wider market. That means dealing with large amounts of data.
This is where AI excels. It is always “on the job,” processing data and running calculations quickly. AI embedded in business operations takes different forms, from “visible” AI assistants like conversational chatbots (the main focus of this article) to “invisible” ones like the silent filters that power e-commerce websites, including ranking algorithms and recommendation systems.
Consider the travel industry. A customer wants to book a trip to Europe and wants to know:
- the best flight deals
- the best time to travel for good weather
- which cities have museums with Renaissance art
- which hotels offer vegetarian options and a nearby tennis court
Before AI, answering these questions meant processing each one separately and collating the results by hand. Now, with an AI-powered solution, my team and I can handle all these requests simultaneously and at lightning speed. This isn't just about my business: the same is true for almost every industry. If you want to cut costs and strengthen your operations, the switch to AI is inevitable.
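To make the “all constraints at once” idea concrete, here is a minimal sketch with invented sample data and a hypothetical `match_trips` helper. It shows one service applying every customer requirement in a single pass instead of four separate lookups:

```python
# Minimal sketch: answering several trip constraints in one pass.
# All data and helper names here are hypothetical, for illustration only.

trips = [
    {"city": "Florence", "flight_usd": 540, "month": "May",
     "renaissance_museum": True, "veg_hotel_tennis": True},
    {"city": "Oslo", "flight_usd": 420, "month": "May",
     "renaissance_museum": False, "veg_hotel_tennis": True},
    {"city": "Rome", "flight_usd": 610, "month": "November",
     "renaissance_museum": True, "veg_hotel_tennis": False},
]

def match_trips(trips, max_flight_usd, months, need_renaissance, need_veg_tennis):
    """Apply every customer constraint in a single filtering pass."""
    return [
        t["city"] for t in trips
        if t["flight_usd"] <= max_flight_usd
        and t["month"] in months
        and (not need_renaissance or t["renaissance_museum"])
        and (not need_veg_tennis or t["veg_hotel_tennis"])
    ]

print(match_trips(trips, 600, {"May", "June"}, True, True))  # → ['Florence']
```

A real AI assistant would translate the customer's free-text question into structured constraints like these before filtering; the filtering step itself is the easy part.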
Fine-tune your AI model to focus on specific business needs
You might be asking yourself, “This sounds great, but how do I integrate AI into my own operations?” Fortunately, today's market offers a variety of commercially available LLMs for different preferences and target areas: ChatGPT, Claude, Grok, Gemini, Mistral, ERNIE, and YandexGPT, just to name a few. Once you've found one you like – an open model such as Llama, say – the next step is to fine-tune it properly.
In short, fine-tuning is the process of adapting a pre-trained AI model from an upstream provider, such as Meta, to a specific downstream application, i.e., your business. It means taking a model and “tweaking” it to fit more narrowly defined needs. Fine-tuning doesn't actually add more data; instead, you assign greater “weights” to certain parts of the existing dataset, effectively telling the AI model, “This is what matters.”
Let's say you run a bar and want to create an AI assistant to help bartenders mix cocktails or train new employees. The word “punch” will appear in your raw AI model, but it has a few general meanings. However, in your case, “punch” refers specifically to a mixed drink. Therefore, fine-tuning will instruct your model to ignore MMA references when it encounters the word “punch.”
Use RAG to bring in the latest data
That said, even a well-tuned model is not enough, because most businesses need fresh data on a regular basis. Say you're building an AI assistant for a dental practice. During fine-tuning, you taught the model that “bridge” means a dental restoration, not a public structure or a card game. So far, so good. But how do you get your AI assistant to incorporate information from a research paper published last week? You need to feed new data into your AI model, a process known as retrieval-augmented generation (RAG).
RAG involves taking data from an external source, beyond the pre-trained LLM you are using, and updating your AI solution with this new information. Say you're building an AI assistant for financial consulting or research, serving professional analysts. Your AI chatbot needs the latest quarterly statements. That specific, newly released data becomes your RAG source.
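Here is a minimal sketch of the RAG flow: retrieve the most relevant external snippets, then prepend them to the prompt so the LLM answers from fresh data. Keyword overlap stands in for a real embedding model, and the document snippets are invented placeholders:

```python
# Minimal RAG sketch. Keyword-overlap ranking stands in for a real
# embedding model; the documents below are invented for illustration.

documents = [
    "Q3 2024 statement: derivative contract payments rose 12 percent",
    "Offshore project funds reached 40 million USD in the latest quarter",
    "Company picnic scheduled for July, RSVP by Friday",
]

def retrieve(query, docs, k=2):
    """Rank documents by word overlap with the query; return the top k."""
    q_words = set(query.lower().split())
    return sorted(docs,
                  key=lambda d: len(q_words & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, docs):
    """Prepend retrieved context so the LLM answers from fresh data."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("what were the derivative contract payments", documents)
```

In production, the retrieval step would use a proper search index or embedding store, but the shape of the pipeline – retrieve, then generate – stays the same.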
It's important to note that RAG does not eliminate the need for fine-tuning. True, RAG without fine-tuning can work for a narrow Q&A application that relies exclusively on external data, for example an AI chatbot that serves NBA statistics from previous seasons. Conversely, a well-tuned AI chatbot can suffice without RAG for tasks like PDF summarization, which are usually far less domain-specific. In most cases, though, a customer-facing AI chatbot, or a robust AI assistant tailored to your team's needs, will require a combination of both.
Where vectors fall short in extracting RAG data
The biggest challenge for anyone adopting RAG is configuring the data source efficiently. When a user submits a query, your domain-specific AI chat engine retrieves information from the data source. The relevance of that information depends on what data you extracted during pre-processing. So while RAG will always supply your AI chatbot with external data, the quality of its responses depends on how you prepared that data.
Preparing your external data source means extracting the right information, not feeding your model insufficient or conflicting information that could compromise the accuracy of the AI assistant's output. Returning to the fintech example: if you care about parameters such as funds invested in offshore projects or monthly payments on derivative contracts, you should not mix unrelated data, such as social security payments, into your RAG source.
If you ask ML developers how they achieve this, most will point to the “vector” method. Although vectors are useful, they have two major problems: the multi-stage process is complex, and it often fails to deliver high accuracy.
Vector routing is a technical, non-linguistic method: it relies on complex tooling to break large documents into smaller chunks. This often, if not always, results in the loss of subtle semantic relationships and a reduced understanding of linguistic context.
Let's say you're involved in an automotive supply chain, and you need some statistics on tire sales in the Pacific Northwest. Your data source – the latest industry reports – contains national data. Because of how vectors work, you can end up removing irrelevant data, like New England statistics. Otherwise, you may end up extracting data that is related but not specific to the destination, such as hubcap sales. In other words, your extracted data may be relevant but not accurate. The performance of your AI assistant will be affected accordingly when it receives this data between user queries, leading to erroneous or incomplete answers.
Create information maps for better RAG navigation
Fortunately, there is now a newer, more direct method – information maps – already in use at respected technology companies such as CLOVA X and Trustbit*. Information maps reduce RAG contamination during data extraction, resulting in more systematic retrieval during live user queries.
A business information map is like a road map. Just as a detailed map makes for a smoother journey, an information map improves data extraction by charting all the relevant information. This is done with the help of domain experts, internal or external, who know the specifics of your industry.
Once you've developed these “must-knows” for your business structure, compiling a knowledge map ensures that your updated AI assistant will refer to this map when searching for answers. For example, to prepare for RAG's LLM specific to the oil industry, domain experts can identify the molecular differences between the newest synthetic diesel and traditional petroleum diesel. With this information map, RAG's extraction process is more streamlined, improving the accuracy and relevance of the Q&A chatbot during real-time data retrieval.
Most importantly, unlike vector-based RAG systems, which simply store data as numbers and cannot learn or adapt, information mapping allows for continuous, human-in-the-loop improvement. Think of it as a dynamic, programmable system that gets better with feedback the more you use it – like actors who refine their act based on audience response so that each show is better than the last. Your AI system's capabilities can keep evolving as business demands change and new benchmarks are set.
Key takeaways
If your business aims to streamline workflows and optimize processes with industry-leading AI, it's important to go beyond fine-tuning.
As we've seen, with few exceptions, a strong AI assistant, whether serving customers or employees, cannot function effectively without new data from RAG. In order to ensure high-quality data extraction and efficient use of RAG, companies should create domain-specific information maps instead of relying on ubiquitous numerical vector databases.
While this article may not answer all of your questions, I hope it points you in the right direction. I encourage you to discuss these strategies with your colleagues and work out your next steps.
*“How We Build Better RAG Systems With Knowledge Maps,” Trustbit, accessed 1 Nov. 2024