
Meridian, a new marketing mix modeling library | By Benjamin Etienne | Feb 2025

Let us now dive into the Meridian library with some data. The first step is to install meridian with pip or poetry: pip install google-meridian or poetry add google-meridian

We will then load the data and start exploring the columns we are interested in.

import pandas as pd

raw_df = pd.read_csv("https://raw.githubusercontent.com/sibylhe/mmm_stan/main/data.csv")

As control variables, we will use all the holiday features available in the dataset. Our KPI will be sales, and the time granularity will be weekly.

Next, we will choose our media variables. Meridian makes a distinction between media data and media spend:

  • Media data (or “execution”): contains exposure metrics per channel and time span (such as impressions per week). Media values must not contain negative values.
  • Media spend: contains the amount spent per channel and time span. The media data and the media spend must have the same dimensions.
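As a quick illustration of that dimension requirement (a minimal numpy sketch with made-up values, not Meridian API code):

```python
import numpy as np

n_weeks, n_channels = 104, 5  # two years of weekly data, 5 channels

# Hypothetical execution data (e.g. impressions) and spend data
media = np.random.default_rng(0).uniform(0, 1e6, size=(n_weeks, n_channels))
spend = np.random.default_rng(1).uniform(0, 1e4, size=(n_weeks, n_channels))

# Same (time, channel) dimensions, and no negative values in media
assert media.shape == spend.shape
assert (media >= 0).all()
```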

When should you use spend vs. execution data?

It is often recommended to use exposure (execution) metrics as direct inputs into the model, as they represent how media activity has actually reached consumers. However, nobody plans a budget using execution data. If you are using the MMM to optimize budget allocation, my advice is to use the data you control, which means spend.

Loading the data

In our case, we will only use the spend from 5 channels: newspaper, radio, TV, social media and online display.

# 1. control variables
CONTROL_COLS = [col for col in raw_df.columns if 'hldy_' in col]

# 2. media variables
spends_mapping = {
    "mdsp_nsp": "Newspaper",
    "mdsp_audtr": "Radio",
    "mdsp_vidtr": "TV",
    "mdsp_so": "Social Media",
    "mdsp_on": "Online Display",
}
MEDIA_COLS = list(spends_mapping.keys())

# 3. sales variable
SALES_COL = "sales"

# 4. date column
DATE_COL = "wk_strt_dt"

# .copy() avoids pandas' SettingWithCopyWarning when modifying the column below
data_df = raw_df[[DATE_COL, SALES_COL, *MEDIA_COLS, *CONTROL_COLS]].copy()
data_df[DATE_COL] = pd.to_datetime(data_df[DATE_COL])

We now need to map our columns to the data schema so that Meridian can understand them. The CoordToColumns object will help us do that; it requires the following mandatory fields:

  • time : the time column (usually a date, at a daily or weekly granularity).
  • controls : the control variables.
  • kpi : the response we want the model to predict. In our case, we will set it to revenue since we want to predict sales.
  • media : the media execution data (impressions, clicks, etc.), or spend if we have no execution data. In our case, we will use spend.
  • media_spend : the media spend.

Several other parameters can be used, namely the geo parameter if we have several groups (geographies for example), as well as population , reach and frequency . Details about these are beyond the scope of this article, but the documentation can be found here.

So we can create our column mappings:

from meridian.data import load

coord_to_columns = load.CoordToColumns(
    time=DATE_COL,
    controls=CONTROL_COLS,
    kpi=SALES_COL,
    media=MEDIA_COLS,
    media_spend=MEDIA_COLS,
)

Next, we will use our DataFrame and the column mappings to create a data object to be used by the model.

loader = load.DataFrameDataLoader(
    df=data_df,
    kpi_type='revenue',
    coord_to_columns=coord_to_columns,
    media_to_channel=spends_mapping,
    media_spend_to_channel=spends_mapping,
)
data = loader.load()

Exploring the data

Sales

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
data_df.set_index(DATE_COL)[SALES_COL].plot(ax=ax)
ax.set(title="Sales", xlabel="date", ylabel="sales")
fig.tight_layout()

There seems to be a nice yearly seasonality with peaks around Christmas. The trend is overall constant, oscillating between 50 and 150M.

Media spend breakdown

fig, ax = plt.subplots(5, figsize=(20, 30))

for axis, channel in zip(ax, MEDIA_COLS):
    data_df.set_index(DATE_COL)[channel].plot(ax=axis)
    axis.legend(title="Channel", fontsize=12)
    axis.set(title=spends_mapping[channel], xlabel="Date", ylabel="Spend")
fig.tight_layout()

We notice a clear downward trend for newspaper, which correlates with the rising trend of social media. Spend also seems to increase around or just before Christmas.

Specifying the model

Creating a model and choosing the right parameters can be somewhat complex, as there are many options available. I will share my findings here, but feel free to experiment yourself.

The first step is to choose the priors of the model. We will use the PriorDistribution class, which allows us to define distributions for several variables. You can change the priors of almost any model parameter (mu, tau, gamma, beta, etc.), but for now we will only focus on the beta_m media coefficients, since we only use spend. You could choose roi_m or mroi_m instead, but you would need to adapt the code below to use a different prior.
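To build intuition for what a HalfNormal(0.2) prior implies for the coefficients, you can sample from it with plain numpy (a half-normal draw is just the absolute value of a centered normal draw; this is an illustrative sketch, not Meridian code):

```python
import numpy as np

rng = np.random.default_rng(42)
scale = 0.2

# A HalfNormal(scale) sample is |N(0, scale)|
samples = np.abs(rng.normal(loc=0.0, scale=scale, size=100_000))

# Mean is scale * sqrt(2/pi) ~ 0.16, and ~95% of the mass sits below ~0.39
print(samples.mean(), np.quantile(samples, 0.95))
```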

import tensorflow_probability as tfp
from meridian import constants
from meridian.model import prior_distribution

prior = prior_distribution.PriorDistribution(
    beta_m=tfp.distributions.HalfNormal(
        0.2,
        name=constants.BETA_M,
    ),
    # If you want to use the ROI approach instead of the coefficient approach:
    # roi_m=tfp.distributions.HalfNormal(
    #     0.2,
    #     name=constants.ROI_M,
    # ),
)

When defining the model specification, you will be able to set:

  • priors (cf. above).
  • max_lag : the maximum number of lag periods (≥ 0) to include in the adstock calculation. I recommend choosing a value between 2 and 6.
  • paid_media_prior_type : if you chose to model the beta_m coefficients, select coefficient . Otherwise, select roi or mroi .
  • knots : Meridian applies automatic seasonality adjustment through a time-varying intercept approach, controlled by the knots value. You can set it to 1 (constant intercept, no seasonality modelling), or to any number lower than the data length. A low value can lead to a low baseline, while a high value can lead to overfitting and a baseline that eats everything. I recommend setting it to 10% of the number of data points.

It is also possible to define a train-test split to avoid overfitting via the holdout_id parameter. I will not cover it here, but it is a best practice to use such a split for model selection.
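For illustration, such a holdout could be sketched as a boolean mask flagging the most recent weeks as test data (check the Meridian documentation for the exact type holdout_id expects; the 208-week length here is made up):

```python
import numpy as np

n_times = 208  # number of weekly observations, for illustration
holdout_frac = 0.1

# Boolean mask: True marks the held-out (test) weeks
holdout_id = np.zeros(n_times, dtype=bool)
holdout_id[-int(holdout_frac * n_times):] = True  # hold out the last 10% of weeks

print(holdout_id.sum())  # 20 weeks held out
```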

In code:

from meridian.model import spec
from meridian.model import model

model_spec = spec.ModelSpec(
    prior=prior,
    max_lag=6,
    knots=int(0.1 * len(data_df)),
    paid_media_prior_type='coefficient',
)
mmm = model.Meridian(input_data=data, model_spec=model_spec)

Running the model

Model sampling can be slow if you have a large number of data points and variables. I recommend starting with 2 chains, and leaving the default number of samples:

mmm.sample_prior(500)
mmm.sample_posterior(n_chains=2, n_adapt=500, n_burnin=500, n_keep=1000)

Model diagnostics

Once the model has finished running, we will perform a series of checks to ensure that we can use it with confidence.

  1. R-hat

An R-hat close to 1.0 indicates convergence. R-hat < 1.2 indicates approximate convergence, which is a reasonable threshold for many problems.

A lack of convergence usually has one of two culprits. Either the model is poorly suited to the data, which can come from the likelihood (model specification) or from the priors. Or there was not enough warm-up, meaning n_adapt + n_burnin was not large enough.
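To make the diagnostic concrete, here is a minimal Gelman-Rubin R-hat computation on synthetic chains (numpy only; Meridian computes this for you, this is just to show what the statistic measures):

```python
import numpy as np

def rhat(chains: np.ndarray) -> float:
    """Basic Gelman-Rubin R-hat for chains of shape (n_chains, n_samples)."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    between = n * chain_means.var(ddof=1)         # between-chain variance
    within = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_est = (n - 1) / n * within + between / n  # pooled variance estimate
    return float(np.sqrt(var_est / within))

rng = np.random.default_rng(0)
good = rng.normal(0, 1, size=(2, 1000))        # both chains target N(0, 1)
bad = np.stack([rng.normal(0, 1, 1000),
                rng.normal(3, 1, 1000)])       # chains stuck in different regions

print(rhat(good))  # close to 1.0 -> converged
print(rhat(bad))   # well above 1.2 -> not converged
```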

from meridian.analysis import visualizer

model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.plot_rhat_boxplot()

We see that all R-hat values are below 1.02, indicating no divergence or problem during training.

2. Model trace

The model trace contains the sample values from the chains. A nice trace is when the two posterior distributions (as we have 2 chains) for a given parameter overlap. In the plot below, you can see the blue and black lines overlapping:

3. Prior vs posterior distributions

To know whether our model has learned during fitting, we compare the prior vs posterior distributions. If they perfectly overlap, it means that our model has not shifted its prior distributions, so it probably has not learned anything, or that the priors were misspecified. To make sure our model has learned, we would like to see a slight shift in the distributions:

We clearly see that the priors and posteriors do not overlap. For TV and social media for example, the orange half-normal priors have shifted to the blue posterior distributions.

4. R2 and model fit

Finally, we will use metrics to evaluate our model fit. You are probably familiar with metrics such as R2, MAPE, etc., so let's have a look at those values:

model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.predictive_accuracy_table()

Obviously, an R2 of 0.54 is not great at all. We could improve it by adding more knots to the baseline, adding more data to the model, or playing with the priors to try to capture more information.
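For reference, the metrics in that table can be computed by hand; here is an illustrative numpy sketch with made-up actuals and predictions:

```python
import numpy as np

y_true = np.array([120.0, 90.0, 150.0, 110.0, 95.0])  # hypothetical actual sales
y_pred = np.array([118.0, 101.0, 140.0, 105.0, 99.0])  # hypothetical model predictions

# R2: share of variance explained by the model
ss_res = ((y_true - y_pred) ** 2).sum()
ss_tot = ((y_true - y_true.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot

# MAPE: mean absolute percentage error
mape = (np.abs(y_true - y_pred) / y_true).mean()

print(round(r2, 2), round(mape, 3))  # 0.88 0.059
```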

Now let's plot the model fit:

model_fit = visualizer.ModelFit(mmm)
model_fit.plot_model_fit()

Media contributions to sales

Remember that one of the goals of an MMM is to give you the media contributions to your sales. This is what we look at with a waterfall chart:

media_summary = visualizer.MediaSummary(mmm)
media_summary.plot_contribution_waterfall_chart()

We usually expect a baseline between 60 and 80%. Keep in mind that this value can be very sensitive and depends on the model specification and parameters. I encourage you to play with different knots values and priors to see the impact they can have on the model.

Spend versus contribution

The spend versus contribution chart compares how spend and revenue (or KPI) are split between channels. The green bars highlight the return on investment (ROI) of each channel.

media_summary.plot_roi_bar_chart()

We see that the highest ROI comes from social media, followed by TV. But this is also where the uncertainty interval is the largest. An MMM does not give an exact answer: it gives you values and the uncertainty associated with them. My opinion here is that the uncertainty intervals are very large. Maybe we should use more sampling steps or add more variables to the model.
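Those uncertainty intervals are just summaries of the posterior samples. Given hypothetical posterior draws of a channel's ROI (the lognormal here is made up for illustration), a 90% credible interval is a one-liner:

```python
import numpy as np

rng = np.random.default_rng(7)
roi_samples = rng.lognormal(mean=1.0, sigma=0.4, size=4000)  # hypothetical posterior draws

# 90% credible interval: 5th and 95th percentiles of the posterior samples
lo, hi = np.percentile(roi_samples, [5, 95])
print(np.median(roi_samples), lo, hi)
```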

Optimizing our budget

Remember that one of the goals of an MMM is to propose an optimal allocation of spend to maximize revenue. This can be done by first looking at what we call response curves. Response curves describe the relationship between media spend and revenue.

We can see that:

  1. Revenue increases as media spend increases.
  2. For some channels, such as newspaper, growth is slower, meaning a 2x increase in spend will not translate into a 2x increase in revenue.
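This diminishing-returns shape can be reproduced with a simple saturating function (a toy stand-in for the actual response curves; the parameters are made up):

```python
import numpy as np

def response(spend, a=100.0, b=0.02):
    """Toy saturating response curve: revenue grows with spend but flattens out."""
    return a * (1 - np.exp(-b * spend))

r1 = response(50.0)   # revenue at 50 units of spend
r2 = response(100.0)  # revenue at 2x the spend

print(round(r2 / r1, 2))  # 1.37 -> doubling spend does not double revenue
```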

The goal of the optimization is to take those curves and navigate them to find the best combination of values that maximizes our sales equation. We know that sales = f(media, control, baseline), and we are trying to find the media* values that maximize our function.

We can choose between several optimization problems, for example:

  • How can I reach the same sales level with a smaller budget?
  • Given the same budget, what maximum revenue can I get?
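To see what such an optimization does under the hood, here is a toy brute-force allocator that splits a fixed budget between two hypothetical channels to maximize total response (illustrative only; the curve parameters are made up and Meridian's optimizer is far more sophisticated):

```python
import numpy as np

def response(spend, a, b):
    """Toy saturating response curve per channel."""
    return a * (1 - np.exp(-b * spend))

budget = 100.0
# Hypothetical (a, b) curve parameters for two channels
params = {"TV": (120.0, 0.015), "Social": (80.0, 0.04)}

# Brute-force search over splits of the fixed budget
best_split, best_rev = None, -np.inf
for tv_spend in np.linspace(0, budget, 1001):
    social_spend = budget - tv_spend
    rev = response(tv_spend, *params["TV"]) + response(social_spend, *params["Social"])
    if rev > best_rev:
        best_split, best_rev = (tv_spend, social_spend), rev

print(best_split, round(best_rev, 1))  # interior optimum: neither channel gets everything
```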

Let's use Meridian to optimize our budget and maximize sales given a fixed budget. We will use the default parameters here, but it is possible to set constraints on each channel to narrow the search space.

from meridian.analysis import optimizer

budget_optimizer = optimizer.BudgetOptimizer(mmm)
optimization_results = budget_optimizer.optimize()

# Plot the response curves before and after
optimization_results.plot_response_curves()

We see that the optimizer recommends decreasing newspaper, online display and radio spend, in favor of social media and TV.

How does this translate in terms of revenue?

A 3% increase in revenue just by rearranging our budget! This conclusion should however be taken with caution. First, replaying the past is easy. You have no guarantee that your baseline sales (60%) would behave the same next year. Think of COVID. Second, our model does not account for interactions between channels. What we used here is a simple additive model, but some approaches use a log-log model to account for interactions between variables. Third, there is uncertainty in our response curves that the optimizer does not handle, as it only takes the average response curve of each channel. Response curves with uncertainty look like the picture below, and optimizing under uncertainty becomes much more complex:

However, it still gives you an idea of where you are probably over- or under-spending.

MMM is a complex but powerful tool that can unlock insights from your sales data, help you understand your media efficiency, and assist you in budget planning. Modern approaches relying on Bayesian inference provide nice features such as adstock and saturation modelling, incorporation of geographic-level data, uncertainty quantification and optimization capabilities.
