Meridian: Google's new marketing mix model | By Benjamin Etienne | Feb, 2025
Let us now get our hands on the Meridian library with real data. The first step is to install Meridian with pip or poetry: pip install google-meridian
or poetry add google-meridian
We will then fetch the data and start exploring the columns we are interested in.
import pandas as pd

raw_df = pd.read_csv("https://raw.githubusercontent.com/sibylhe/mmm_stan/main/data.csv")
As control variables, we will use all the holiday dummy variables available in the dataset. Our KPI will be sales, and the time granularity will be weekly.
Next, we will choose our media variables. Meridian makes a distinction between media data and media spend:
- Media data (or “execution”): contains an exposure metric per channel and time period (impressions per week, for example). Media values must not contain negative values.
- Media spend: contains the spend per channel and time period. The media data and the media spend must have the same dimensions.
When should you use spend vs. execution metrics?
It is usually recommended to use exposure metrics as direct inputs into the model, as they represent how media activity actually reached consumers. However, nobody plans a budget in terms of exposure data. If you are using the MMM for budget optimization purposes, my advice is to use the data you actually control, which means spend.
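To make the distinction concrete, here is a tiny, hypothetical illustration (the column names are invented) of the two kinds of inputs side by side: one value per time period for both, hence matching dimensions.

import pandas as pd

# Hypothetical example: an execution metric (impressions) and the spend for
# one channel must have one value per time period, i.e. the same dimensions.
example = pd.DataFrame({
    "week":           ["2024-01-01", "2024-01-08", "2024-01-15"],
    "tv_impressions": [1_200_000, 900_000, 1_500_000],  # execution metric
    "tv_spend":       [60_000, 45_000, 75_000],         # spend
})
print(example)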
Loading the data
In our case, we will only use the spend from 5 channels: Newspaper, Radio, TV, Social Media, and Online Display.
# 1. control variables
CONTROL_COLS = [col for col in raw_df.columns if 'hldy_' in col]

# 2. media variables
spends_mapping = {
"mdsp_nsp": "Newspaper",
"mdsp_audtr": "Radio",
"mdsp_vidtr": "TV",
"mdsp_so": "Social Media",
"mdsp_on": "Online Display",
}
MEDIA_COLS = list(spends_mapping.keys())
# 3. sales variables
SALES_COL = "sales"
# 4. Date column
DATE_COL = "wk_strt_dt"
data_df = raw_df[[DATE_COL, SALES_COL, *MEDIA_COLS, *CONTROL_COLS]].copy()  # copy to avoid SettingWithCopyWarning
data_df[DATE_COL] = pd.to_datetime(data_df[DATE_COL])
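Since media values must not contain negative values (cf. above), a quick sanity check on the data is worthwhile before going further:

# Quick sanity checks: no missing values, and no negative media spend.
assert data_df.isna().sum().sum() == 0, "found missing values"
assert (data_df[MEDIA_COLS] >= 0).all().all(), "found negative media spend"
print(data_df.shape)  # (n_weeks, n_columns)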
We now need to map our columns into a schema Meridian can understand. The CoordToColumns object will help us do that, and requires some mandatory information:
- time: the time column (usually a daily, weekly, or monthly granularity)
- controls: the control variables
- kpi: the response we want the model to predict. In our case, we will give it the value revenue, as we want to predict sales.
- media: the media execution data (impressions, clicks, etc.), or the spend if we have no execution data. In our case, we will put the spend.
- media_spend: the media spend.
Several other parameters can be used, namely the geo parameter if we have several groups (geographies, for example), as well as population, reach, and frequency. Details about these are out of scope for this article, but the documentation can be found here.
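For reference, a geo-level mapping could look like the sketch below. This is only an illustration under assumptions: our dataset is national-level only, so the geo and population columns here are hypothetical.

from meridian.data import load

# Hypothetical geo-level mapping: the 'dma' and 'population' columns do not
# exist in our dataset and are shown for illustration only.
coord_to_columns_geo = load.CoordToColumns(
    time=DATE_COL,
    geo="dma",                # hypothetical geography column
    population="population",  # hypothetical population column
    controls=CONTROL_COLS,
    kpi=SALES_COL,
    media=MEDIA_COLS,
    media_spend=MEDIA_COLS,
)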
Back to our case, we can now create our column mappings:
from meridian.data import load

coord_to_columns = load.CoordToColumns(
time=DATE_COL,
controls=CONTROL_COLS,
kpi=SALES_COL,
media=MEDIA_COLS,
media_spend=MEDIA_COLS,
)
Next, we will use our dataframe and the column mappings to create the data object that will be used by the model.
loader = load.DataFrameDataLoader(
df=data_df,
kpi_type='revenue',
coord_to_columns=coord_to_columns,
media_to_channel=spends_mapping,
media_spend_to_channel=spends_mapping
)
data = loader.load()
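Before moving on, we can sanity-check the loaded object. If I read the documentation correctly, Meridian's input data stores the arrays as xarray DataArrays, so printing a few shapes confirms the mapping worked (attribute names below are my assumption):

# Attribute names assumed from the Meridian documentation.
print(data.kpi.shape)          # (n_geos, n_times) -- a single national geo here
print(data.media_spend.shape)  # (n_geos, n_times, n_media_channels)
print(data.controls.shape)     # (n_geos, n_times, n_controls)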
Visualizing the data
Sales
import matplotlib.pyplot as plt

# COLORS is assumed to be a list of plot colors; we fall back to matplotlib's default palette
COLORS = plt.rcParams["axes.prop_cycle"].by_key()["color"]

fig, ax = plt.subplots()
data_df.set_index("wk_strt_dt")[SALES_COL].plot(color=COLORS[1], ax=ax)
ax.set(title="Sales", xlabel='date', ylabel="sales");
fig.tight_layout();
There seems to be a nice yearly seasonality with high peaks at Christmas. The trend is otherwise constant, oscillating between roughly 50 and 150M.
Media spend breakdown
fig, ax = plt.subplots(5, figsize=(20, 30))

for axis, channel in zip(ax, MEDIA_COLS):
data_df.set_index("wk_strt_dt")[channel].plot(ax=axis, color=COLORS[1])
axis.legend(title="Channel", fontsize=12)
axis.set(title=spends_mapping[channel], xlabel="Date", ylabel="Spend");
fig.tight_layout()
We can see a clear declining trend for Newspaper, correlated with an increasing trend for Social Media. Spend also seems to peak at or just before Christmas.
Specifying the model
Creating the model and choosing the right parameters can be complex, as there are many options available. I will share my findings here, but feel free to explore by yourself.
The first step is to choose our priors. We will use the PriorDistribution class, which allows us to define priors on several variables. You can change the priors of almost any model parameter (mu, tau, gamma, beta, etc.). If you are using spend only, my advice is to set the prior on beta_m. You could also choose roi_m or mroi_m, but you would need to adapt the code to use a different prior.
import tensorflow_probability as tfp
from meridian import constants
from meridian.model import prior_distribution

prior = prior_distribution.PriorDistribution(
    beta_m=tfp.distributions.HalfNormal(
        0.2,
        name=constants.BETA_M,
    ),
    # If you want to use the ROI approach instead of the coefficient approach:
    # roi_m=tfp.distributions.HalfNormal(
    #     0.2,
    #     name=constants.ROI_M,
    # ),
)
When defining the model specification, you will be able to set:
- the priors (cf. above).
- max_lag: the maximum number of lag periods (≥ 0) to include in the adstock calculation. I recommend choosing a value between 2 and 6.
- paid_media_prior_type: if you choose to model the beta_m coefficients, select 'coefficient'. Otherwise, select 'roi' or 'mroi'.
- knots: Meridian applies automatic seasonality adjustment through a time-varying intercept approach, controlled by the knots value. You can set it to 1 (constant intercept, no seasonality modelling), or to a given number that must be lower than the length of the data. A value that is too low could lead to a low baseline, while a value that is too high could lead to overfitting and a baseline that eats everything. My recommendation is to set it to 10% of the number of data points.
It is also possible to define a train-test split to avoid overfitting via the holdout_id parameter. I won't cover it here, but it is a best practice to have this split in place for model selection; a minimal sketch follows.
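As an illustration, a time-based holdout mask could be built with numpy. This is only a sketch: I am assuming here that holdout_id accepts a 1-D boolean mask over the time periods, so check the ModelSpec documentation for the exact expected shape.

import numpy as np

# Hold out the last 10% of weeks for out-of-sample evaluation.
# Assumption: holdout_id accepts a 1-D boolean mask over time periods.
n_times = len(data_df)
holdout_id = np.zeros(n_times, dtype=bool)
holdout_id[-int(0.1 * n_times):] = True

# It would then be passed to the model spec, e.g.:
# spec.ModelSpec(..., holdout_id=holdout_id)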
In code:
from meridian.model import spec
from meridian.model import model

model_spec = spec.ModelSpec(
prior=prior,
max_lag=6,
knots=int(0.1*len(data_df)),
paid_media_prior_type='coefficient',
)
mmm = model.Meridian(input_data=data, model_spec=model_spec)
Running the model
Model fitting can be slow if you have a large number of data points and variables. I recommend starting with 2 chains and leaving the default number of samples:
mmm.sample_prior(500)
mmm.sample_posterior(n_chains=2, n_adapt=500, n_burnin=500, n_keep=1000)
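Since sampling is the expensive step, it is worth saving the fitted model to disk. The save/load helpers below are the ones used in the Meridian demo notebooks; double-check the names against your installed version.

from meridian.model import model

# Persist the fitted model so we don't have to re-run the sampling
# (helper names taken from the Meridian demo notebooks).
model.save_mmm(mmm, "saved_mmm.pkl")

# ...and reload it later:
mmm = model.load_mmm("saved_mmm.pkl")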
Model diagnostics
Once the model has run, we will perform a series of checks to ensure that we can use it with confidence.
1. R-hat
An R-hat close to 1.0 indicates convergence. R-hat < 1.2 indicates approximate convergence and is a reasonable threshold for many problems.
A lack of convergence usually has one of two culprits: either the model is badly mis-specified for the data, which can be at the likelihood level (model specification) or at the prior level; or there is not enough burn-in, meaning n_adapt + n_burnin is not large enough.
from meridian.analysis import visualizer

model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.plot_rhat_boxplot()
We see that all R-hat values are below 1.02, indicating no convergence problem during training.
2. Model trace
The model trace contains the sampled values from the chains. A healthy trace is when the two posterior distributions (as we have 2 chains) for a given parameter overlap. In the figure below, we can see the blue and black lines overlapping nicely on the left-hand side:
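Such a trace can also be produced directly with arviz. My assumption, based on the library's documentation, is that Meridian exposes its chains as an arviz InferenceData object via the inference_data attribute, and that the media coefficients live under the name beta_m:

import arviz as az

# Trace plot of the media coefficients: well-mixed, overlapping chains
# indicate convergence. "beta_m" is the assumed internal parameter name.
az.plot_trace(mmm.inference_data, var_names=["beta_m"], compact=False)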
3. Prior vs posterior distributions
To know whether our model has learned during fitting, we compare the prior vs posterior distributions. If they perfectly overlap, it means our model has not shifted its prior distributions, so it probably did not learn anything, or the priors were mis-specified. To make sure our model has learned, we would like to see a shift between the two distributions, as in the figure below:
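As a minimal sketch of how such a comparison could be drawn, assuming the inference data exposes both a prior and a posterior group and that the media coefficients live under the (assumed) name beta_m, pooled across channels for simplicity:

import arviz as az
import matplotlib.pyplot as plt

idata = mmm.inference_data  # assumed to contain 'prior' and 'posterior' groups

fig, ax = plt.subplots()
# Pool the samples across chains/draws/channels for a single overview plot
az.plot_dist(idata.prior["beta_m"].values.flatten(), color="C1", label="prior", ax=ax)
az.plot_dist(idata.posterior["beta_m"].values.flatten(), color="C0", label="posterior", ax=ax)
ax.set(title="Prior vs posterior distribution of beta_m")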
We clearly see that the priors and posteriors differ. For TV and Social Media, for instance, the orange HalfNormal prior has shifted towards the blue, approximately normal posterior distribution.
4. R2 and model fit
Finally, we use metrics to evaluate our model fit. You are probably familiar with metrics such as R2, MAPE, etc., so let us have a look at those values:
model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.predictive_accuracy_table()
Frankly, an R2 of 0.54 is not great at all. We could improve it by adding more knots to the baseline, or more features to the model, or by playing with the priors to try to capture more information.
Let us now plot the model fit:
model_fit = visualizer.ModelFit(mmm)
model_fit.plot_model_fit()
Media contributions to sales
Remember that one of the goals of MMM is to give you your media contributions vs your sales. This is what we look at with the waterfall chart:
media_summary = visualizer.MediaSummary(mmm)
media_summary.plot_contribution_waterfall_chart()
We generally expect the baseline to lie between 60 and 80%. Keep in mind that this value can be very sensitive to the model specification and parameters. I encourage you to play with different knots values and priors and see the impact it can have on the model.
Spend vs contributions
The spend versus contribution chart compares the spend and the revenue (or KPI) split between channels. The green bars highlight the return on investment (ROI) of each channel.
media_summary.plot_roi_bar_chart()
We see that the highest ROI seems to come from Social Media, followed by TV. But this is also where the uncertainty interval is the largest. MMM does not give an exact answer: it gives you values together with the uncertainty associated with them. In my opinion, the uncertainty intervals here are too large. Maybe we should use more sampling steps or add more variables to the model.
Optimizing our budget
Remember that one of the goals of MMM is to propose an optimized allocation of spend. This can first be done by looking at what we call response curves. Response curves describe the relationship between marketing spend and revenue, and can be plotted as shown below.
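The curves can be obtained from the fitted model. The snippet below goes through the MediaEffects visualizer, which is what the Meridian demo notebooks use for this purpose:

# Plot the fitted spend-to-revenue response curve of each channel
media_effects = visualizer.MediaEffects(mmm)
media_effects.plot_response_curves()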
From these curves, we can see that:
- revenue increases as media spend increases
- for some channels, like Newspaper, growth is slower, meaning a 2x increase in spend will not translate into a 2x increase in revenue.
The goal of the optimization will be to take those curves and navigate them to find the best combination of spends that maximizes our sales. We know that sales = f(media, control, baseline), and we are trying to find the media* that maximizes our function.
We can choose between several optimization problems, for example:
- How can I reach the same sales level with a minimal budget?
- Given the same budget, what is the maximum revenue I can achieve?
Let us use Meridian to optimize our budget and maximize sales at constant budget (the second scenario). We will use the default parameters here, but it is possible to set constraints on the spend of each channel to restrict the search space, as shown a bit further below.
from meridian.analysis import optimizer

budget_optimizer = optimizer.BudgetOptimizer(mmm)
optimization_results = budget_optimizer.optimize()
# Plot the response curves before and after
optimization_results.plot_response_curves()
We see that the optimizer recommends decreasing Newspaper, Online Display, and Radio spend, in favor of Social Media and TV.
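As mentioned above, the search space can be narrowed with per-channel spend constraints. A sketch, with parameter names taken from the Meridian documentation (verify them against your installed version):

# Constrain each channel's optimized spend to +/-20% of its historical spend
# (parameter names assumed from the Meridian documentation).
constrained_results = budget_optimizer.optimize(
    spend_constraint_lower=0.2,  # at most -20% vs. historical spend
    spend_constraint_upper=0.2,  # at most +20% vs. historical spend
)
constrained_results.plot_response_curves()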
How does this translate into revenue?
A 3% increase in revenue just by rearranging our budget! This conclusion should of course be taken with caution. First, the past does not simply repeat itself: you have no guarantee that your baseline sales (60%) will behave the same next year. Think of COVID. Second, our model does not account for interactions between channels. What we have used here is a simple additive model, but some approaches use a log-log model to account for interactions between variables. Third, there is uncertainty in our response curves which the optimizer does not handle, as it only uses the average response curve of each channel. Response curves with uncertainty look like the picture below, and optimizing under that uncertainty becomes much more complicated:
However, it still gives you an idea of where you are potentially over- or under-spending.
MMM is a complex yet powerful tool that can uncover insights from your marketing data, help you understand your marketing efficiency, and assist you in budget planning. Modern approaches built on Bayesian inference provide nice features such as adstock and saturation modelling, the incorporation of priors, uncertainty quantification, and budget optimization. Happy coding!