How to use GPT-5 effectively

GPT-5 is OpenAI's latest model, with powerful and helpful features. The model has various parameters and options to choose from, which you should select carefully to optimize GPT-5's performance for your application.
In this article, I'll go over the different options you have when using GPT-5 and help you choose the settings that work best for your use case. I will discuss the input modalities you can use, the features GPT-5 offers, such as tools and file uploads, and the parameters you can set for the model.
This article is not sponsored by OpenAI; it is simply a summary of my experience with GPT-5 and how to use the model successfully.
Why you should use GPT-5
GPT-5 is a very powerful model that you can use for various tasks: for example, as a chatbot assistant or to extract important metadata from documents. However, GPT-5 also has many different modes and settings, which you can read more about in OpenAI's GPT-5 guide. I'll discuss how to navigate all of these options and make the most of GPT-5 for your use case.
Multimodal skills
GPT-5 is a multimodal model, meaning you can input text, images, and audio, and the model will output text. You can also mix input modalities, for example by providing an image together with a question about it, and getting an answer back. Text input is, of course, expected from an LLM, but the ability to input images and audio is much more powerful.
As I've discussed in previous articles, VLMs are very powerful in their ability to understand images, which often works better than running OCR on an image and then interpreting the extracted text. The same concept applies to audio. You can, for example, send an audio clip directly and analyze not only the words in the clip, but also the tone, speed, and so on. Multimodal understanding gives you a deeper understanding of the data you are analyzing.
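As a minimal sketch of a mixed image-and-text request: the helper below encodes a local image and pairs it with a question in one user message. The file name is a placeholder, and the `input_text`/`input_image` content types reflect my understanding of OpenAI's Responses API multimodal input format, so check the current docs before relying on the exact shape.

```python
import base64

def image_part(path: str) -> dict:
    """Read a local image and wrap it as an input_image content part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return {"type": "input_image", "image_url": f"data:image/jpeg;base64,{b64}"}

def build_input(question: str, image_path: str) -> list:
    """Combine a text question and an image into a single user message."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": question},
                image_part(image_path),
            ],
        }
    ]

if __name__ == "__main__":
    from openai import OpenAI

    client = OpenAI()
    response = client.responses.create(
        model="gpt-5",
        input=build_input("What is shown in this picture?", "photo.jpg"),
    )
    print(response.output_text)
```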
Tools
Tools are another powerful feature. You can define tools that the model can use at runtime, which turns GPT-5 into an agent. An example of a simple tool is the get_weather() function:
def get_weather(city: str):
    return "Sunny"
You can provide your custom tools to the model, along with a definition of the function and its parameters:
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get today's weather.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city you want the weather for",
                },
            },
            "required": ["city"],
        },
    },
]
It is important to write detailed and descriptive information in your function definitions, including the function's description and its parameters.
You can define many different tools for your model, but it is important to remember the main principles of good tool definitions:
- Tools are well described
- Tools don't overlap
- Make it clear to the model when it should use each function. Ambiguity makes tool use ineffective
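To show how a tool definition connects to actual code, here is a rough sketch of the dispatch side of function calling. The registry and dispatch helper are generic; the response-handling part in the main block assumes the Responses API returns function calls as output items with `name` and `arguments` fields (with `arguments` as a JSON string), which you should verify against OpenAI's current function-calling docs.

```python
import json

def get_weather(city: str) -> str:
    # Toy implementation; a real tool would call a weather service.
    return "Sunny"

# Map tool names (as given in the tool definitions) to Python functions
TOOL_REGISTRY = {"get_weather": get_weather}

# Same tool definition as shown earlier in the article
TOOLS = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get today's weather.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city you want the weather for",
                },
            },
            "required": ["city"],
        },
    },
]

def execute_tool_call(name: str, arguments: str) -> str:
    """Dispatch a model-requested function call to local code.
    The model sends `arguments` as a JSON string."""
    kwargs = json.loads(arguments)
    return TOOL_REGISTRY[name](**kwargs)

if __name__ == "__main__":
    from openai import OpenAI

    client = OpenAI()
    response = client.responses.create(
        model="gpt-5",
        input=[{"role": "user", "content": "What's the weather in Oslo?"}],
        tools=TOOLS,
    )
    for item in response.output:
        if item.type == "function_call":
            print(execute_tool_call(item.name, item.arguments))
```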
Parameters
There are three main parameters you should care about when using GPT-5:
- Reasoning effort
- Verbosity
- Structured output
Now I will explain the different parameters and how to approach their selection.
Reasoning effort
Reasoning effort is a parameter where you choose between minimal, low, medium, and high.
Minimal reasoning effort essentially makes GPT-5 a non-thinking model and should be used for simple tasks where you need quick answers. For example, you can use minimal reasoning effort in a chat system where questions are easy to answer and users expect quick responses.
The more difficult your task, the more reasoning effort you should use, although you have to keep in mind the cost and latency of more thinking. Reasoning tokens are billed as output tokens, which at the time of writing cost 10 USD per million tokens for GPT-5.
I usually test the model starting with the lowest reasoning effort. When I see the model struggle to give high-quality answers, I go up one level of reasoning, first from minimal -> low. I continue to test the model and see how well it performs. You should strive to use the lowest reasoning effort that gives acceptable quality.
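This escalation workflow can be sketched as a small helper. Note that `is_acceptable` here is a hypothetical, application-specific quality check you would supply yourself, not part of any API.

```python
# Reasoning effort levels in increasing order
EFFORT_LEVELS = ["minimal", "low", "medium", "high"]

def next_effort(current: str):
    """Return the next-higher reasoning effort, or None if already at the top."""
    idx = EFFORT_LEVELS.index(current)
    return EFFORT_LEVELS[idx + 1] if idx + 1 < len(EFFORT_LEVELS) else None

def answer_with_escalation(client, messages, is_acceptable):
    """Retry at increasing reasoning effort until the quality check passes.
    `is_acceptable` is a hypothetical, application-specific function."""
    effort = "minimal"
    while effort is not None:
        response = client.responses.create(
            model="gpt-5",
            input=messages,
            reasoning={"effort": effort},
        )
        if is_acceptable(response.output_text):
            return response.output_text
        effort = next_effort(effort)
    return None  # even "high" effort did not pass the check
```

In production you would pin the effort level you settled on during testing rather than escalate per request, since each retry adds cost and latency.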
You can set the reasoning effort with:
client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "reasoning": {"effort": "medium"},  # can be: minimal, low, medium, high
}
client.responses.create(**request_params)
Verbosity
Verbosity is another important parameter, and you can choose between low, medium, and high.
Verbosity controls how many output tokens (reasoning tokens excluded) the model should generate. The default is medium, which OpenAI says is essentially what their previous models used.
If you want the model to generate longer and more detailed answers, you should set verbosity to high. However, I usually find myself choosing between low and medium verbosity:
- For chat applications, medium verbosity usually works well
- For extraction purposes, where you only want specific information, such as a date from a document, I recommend low verbosity. This helps ensure that the model responds only with the output I want (the date), without additional reasoning and context.
You can set the verbosity with:
client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "text": {"verbosity": "medium"},  # can be: low, medium, high
}
client.responses.create(**request_params)
Structured output
Structured output is a powerful setting you can use to ensure that GPT-5 responds in JSON format. This is useful when you want to extract specific datapoints, and no other text, such as a date from a document. It ensures that the model responds with a valid JSON object, which you can then parse. All the metadata extraction I do uses structured outputs, because they are very helpful for ensuring consistency. You can enable structured output by adding the "text" key to the request params, as below.
client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "text": {"format": {"type": "json_object"}},
}
client.responses.create(**request_params)
Be sure to mention "JSON" in your prompt; otherwise, you will get an error when using structured output.
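Putting this together, a minimal extraction sketch might look like the following. The prompt wording, the `date` field, and the placeholder document text are my own examples, not from OpenAI's documentation.

```python
import json

# The prompt explicitly mentions JSON, as required by the json_object format.
PROMPT = (
    "Extract the date from the document below. "
    'Respond with a JSON object of the form {"date": "YYYY-MM-DD"}.'
)

def parse_date_response(raw: str) -> str:
    """Parse the model's JSON reply and return the extracted date."""
    data = json.loads(raw)
    return data["date"]

if __name__ == "__main__":
    from openai import OpenAI

    client = OpenAI()
    document_text = "Invoice issued on 2024-05-01 by Example Corp."  # placeholder
    response = client.responses.create(
        model="gpt-5",
        input=[{"role": "user", "content": PROMPT + "\n\n" + document_text}],
        text={"format": {"type": "json_object"}},
    )
    print(parse_date_response(response.output_text))
```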
Uploading a file
File uploading is another powerful feature available with GPT-5. I discussed the model's multimodal capabilities earlier; however, in some cases it is useful to upload a document directly and have OpenAI parse it. For example, if you haven't OCRed the document or extracted its images yet, you can instead upload the document directly to OpenAI and ask questions about it. From experience, uploading files is simple and fast, and you usually get answers quickly, especially at lower reasoning efforts.
If you need quick responses from documents and don't have time to run OCR first, file upload is a powerful feature to use.
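As a sketch of the upload flow: the file name below is a placeholder, and the `input_file` content type and `user_data` purpose reflect my understanding of the Files and Responses APIs, so verify them against OpenAI's file-input documentation.

```python
def file_question(file_id: str, question: str) -> list:
    """Build a user message that pairs an uploaded file with a question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "input_file", "file_id": file_id},
                {"type": "input_text", "text": question},
            ],
        }
    ]

if __name__ == "__main__":
    from openai import OpenAI

    client = OpenAI()
    # Upload the raw document first; it can then be referenced by id.
    with open("report.pdf", "rb") as f:  # placeholder file name
        uploaded = client.files.create(file=f, purpose="user_data")
    response = client.responses.create(
        model="gpt-5",
        input=file_question(uploaded.id, "What is this document about?"),
    )
    print(response.output_text)
```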
Downsides of GPT-5
GPT-5 also has downsides. A major limitation I noticed during implementation is that OpenAI does not share the reasoning tokens when using the model. You can only get a summary of the model's thinking.
This is very limiting in live applications, because if you want to use higher reasoning effort (medium or high), you cannot stream any reasoning information from GPT-5 to the user while the model thinks, which makes for a poor user experience. The alternative is to use low reasoning effort, which can lead to lower-quality output. Other frontier model providers, such as Anthropic and Google (Gemini), both make reasoning tokens available.
There has also been a lot of discussion about GPT-5 being less creative than its predecessors, although this is not a big problem for the applications I work on, because creativity is usually not a requirement for how I use GPT-5.
Conclusion
In this article, I have provided an overview of GPT-5's different parameters and options, and how you can use the model effectively. When used the right way, GPT-5 is a very powerful model, although it naturally comes with downsides; the main one, in my opinion, is that OpenAI does not share the reasoning tokens. When working on LLM applications, I always recommend having backup models available from other frontier model providers. You can, for example, have GPT-5 as your main model, but fall back to Gemini 2.5 Pro from Google if it fails.
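The fallback pattern can be sketched as a small wrapper. The `call_gpt5` and `call_gemini` functions below are hypothetical stand-ins for your own provider-specific client calls.

```python
def with_fallback(primary, fallback):
    """Return a callable that tries `primary` first and runs `fallback`
    if the primary provider raises any exception."""
    def run(*args, **kwargs):
        try:
            return primary(*args, **kwargs)
        except Exception:
            return fallback(*args, **kwargs)
    return run

# Hypothetical provider wrappers: each would contain the real API call.
def call_gpt5(messages):
    raise NotImplementedError("wrap your OpenAI client call here")

def call_gemini(messages):
    raise NotImplementedError("wrap your Google Gemini client call here")

generate = with_fallback(call_gpt5, call_gemini)
```

A production version would catch only transient errors (timeouts, rate limits, server errors) rather than every exception, so that genuine bugs still surface.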
👉 Find me in the community:
📩 Subscribe to my newsletter
🧑💻 Get in touch
🔗 LinkedIn
🐦 X / Twitter
✍️ Medium