
Method of Moments Estimation for Gaussian Mixture Models


Audio processing is one of the most important application domains of digital signal processing (DSP) and machine learning. Modeling acoustic environments is an essential step in developing digital audio processing systems such as source localization, speech enhancement, acoustic echo cancellation, etc.

Acoustic environments are filled with background noise that can come from many sources. For example, when you sit in a coffee shop, walk down the street, or drive your car, you hear sounds that can be considered interference or background noise. Such interference does not generally follow a single statistical model, and therefore a mixture of models can be useful in modeling it.

Such statistical models can also be useful in classifying acoustic environments, since in each case the background noise can be characterized by the audio sources it contains and how strongly each one is present.

Another use of such models is in simulating the noise of acoustic environments, which helps in designing and testing DSP and machine learning algorithms that target specific audio problems.

A simple statistical model that can be useful in such scenarios is the Gaussian mixture model (GMM), in which each of the different noise sources is assumed to follow a specific Gaussian distribution. All distributions can be assumed to have zero mean and still be accurate enough for this application, as shown later in this article.

Each Gaussian distribution in the GMM contributes to the overall background noise with a different weight. For example, one noise source may be present in the background most of the time, while other sources occur less frequently (a gust of wind, a passing car, etc.). All of this should be accounted for in our statistical model.

An example of simulated GMM data (generated over time) is shown in the figure below, in which most samples come from the lower-variance Gaussian while occasional bursts come from the higher-variance one; both components are zero mean.

In some cases, depending on the application, it can be the other way around, with the higher-variance component appearing more often (as will be shown in the last example of this article). The Python code used to generate and plot the GMM data will also be shown later in this article.

Putting this in formal terms, let us assume that the background noise signal is collected (using a high-quality microphone, for example) and modeled as independent and identically distributed (i.i.d.) samples that follow a GMM, as shown below.
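
In explicit form, with two zero-mean Gaussian components of variances σ1² and σ2² and mixing parameter p1, each sample x_n is drawn i.i.d. from the mixture density

x_n \sim f(x) = p_1 \, \mathcal{N}\!\left(x; 0, \sigma_1^2\right) + \left(1 - p_1\right) \mathcal{N}\!\left(x; 0, \sigma_2^2\right), \qquad n = 1, \dots, N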

The modeling problem thus boils down to estimating the model parameters (i.e., p1, σ1², and σ2²) from the observed i.i.d. data. In this article, we will use the method of moments (MoM) estimator for that purpose.

To keep things simple, let us assume that the noise variances (σ1² and σ2²) are known and that only the mixing parameter (p1) is to be estimated. The MoM estimator can be used to estimate more than one parameter (i.e., p1, σ1², and σ2²), as shown in Chapter 9 of the book “Fundamentals of Statistical Signal Processing: Estimation Theory” by Steven Kay. However, in this example, we will assume that only p1 is unknown and to be estimated.

Since both Gaussians in the GMM are zero mean, we will start with the second moment and try to express the unknown parameter p1 in terms of it.
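
The second moment of the mixture is a weighted sum of the component variances; solving for p1 and replacing the true second moment with the sample second moment gives the MoM estimator (written out here to match the code used later):

E\!\left[x_n^2\right] = p_1 \sigma_1^2 + \left(1 - p_1\right) \sigma_2^2
\quad\Longrightarrow\quad
p_1 = \frac{E\!\left[x_n^2\right] - \sigma_2^2}{\sigma_1^2 - \sigma_2^2}

\hat{p}_1 = \frac{\frac{1}{N}\sum_{n=1}^{N} x_n^2 \;-\; \sigma_2^2}{\sigma_1^2 - \sigma_2^2}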

Note that another easy way to find the moments of a random variable (e.g., the second or higher moments) is to use the moment generating function (MGF). A good probability textbook that covers such topics, and much more, is “Introduction to Probability for Data Science” by Stanley H. Chan.
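
As a quick illustration of that approach (a standard result, not specific to this article), for a zero-mean Gaussian X with variance σ², differentiating the MGF at t = 0 gives the moments used in this article:

M_X(t) = E\!\left[e^{tX}\right] = e^{\sigma^2 t^2 / 2},
\qquad
E\!\left[X^2\right] = \left.\frac{d^2 M_X(t)}{dt^2}\right|_{t=0} = \sigma^2,
\qquad
E\!\left[X^4\right] = \left.\frac{d^4 M_X(t)}{dt^4}\right|_{t=0} = 3\sigma^4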

Before proceeding any further, we would like to assess the quality of this estimator in terms of its fundamental properties, such as bias, variance, and consistency, and we will verify these later with the Python example.

Starting with the estimator bias, we can show that the above estimator of p1 is unbiased, as follows.
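
Taking the expectation of the estimator and using the second moment of the mixture derived above:

E\!\left[\hat{p}_1\right]
= \frac{\frac{1}{N}\sum_{n=1}^{N} E\!\left[x_n^2\right] - \sigma_2^2}{\sigma_1^2 - \sigma_2^2}
= \frac{p_1 \sigma_1^2 + \left(1 - p_1\right)\sigma_2^2 - \sigma_2^2}{\sigma_1^2 - \sigma_2^2}
= \frac{p_1\left(\sigma_1^2 - \sigma_2^2\right)}{\sigma_1^2 - \sigma_2^2}
= p_1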

We can then proceed to derive the variance of our estimator, as follows.
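
Since the samples are i.i.d., the variance of the sample second moment is Var(x_n²)/N, and Var(x_n²) = E[x_n⁴] − (E[x_n²])², where the fourth moment of the mixture follows from the Gaussian fourth moments noted above (E[x_n⁴] = 3p1σ1⁴ + 3(1 − p1)σ2⁴). Putting this together:

\mathrm{Var}\!\left(\hat{p}_1\right)
= \frac{\mathrm{Var}\!\left(x_n^2\right)}{N\left(\sigma_1^2 - \sigma_2^2\right)^2}
= \frac{3 p_1 \sigma_1^4 + 3\left(1 - p_1\right)\sigma_2^4 - \left(p_1 \sigma_1^2 + \left(1 - p_1\right)\sigma_2^2\right)^2}{N\left(\sigma_1^2 - \sigma_2^2\right)^2}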

It is also clear from the above analysis that the estimator is consistent, since it is unbiased and its variance decreases as the sample size (N) increases. We will also use the above variance of the p1 estimator later when checking the Monte Carlo simulation results.

Now let's move to the actual Python code and do some fun stuff!

First, we generate our GMM data, with zero means and standard deviations equal to 2 and 10, respectively, as shown in the code below. In this example, the mixing parameter is p1 = 0.2 and the sample size equals 1000.

# Import the Python libraries that we will need in this GMM example
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# GMM data generation
mu = 0 # both gaussians in GMM are zero mean
sigma_1 = 2 # std dev of the first gaussian
sigma_2 = 10 # std dev of the second gaussian
norm_params = np.array([[mu, sigma_1],
                        [mu, sigma_2]])
sample_size = 1000
p1 = 0.2 # probability that the data point comes from first gaussian
mixing_prob = [p1, (1-p1)]
# A stream of indices from which to choose the component
GMM_idx = np.random.choice(len(mixing_prob), size=sample_size, replace=True, 
                p=mixing_prob)
# GMM_data is the GMM sample data
GMM_data = np.fromiter((stats.norm.rvs(*(norm_params[i])) for i in GMM_idx),
                   dtype=np.float64)
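
The time-domain view described earlier (occasional high-variance bursts on top of the low-variance background, or the reverse depending on p1) can be visualized by plotting the samples against their index. The snippet below is a minimal sketch added for illustration; it was not part of the original listing.

# Illustration only: plot the generated GMM samples against their index
# to visualize how often the high-variance component appears
fig0, ax0 = plt.subplots()
ax0.plot(GMM_data, linewidth=0.8)
ax0.set_title("Simulated GMM noise samples", fontsize=14, loc="left")
ax0.set_xlabel("Sample index")
ax0.set_ylabel("Amplitude")
ax0.grid()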

We then plot the histogram of the generated data against the true probability density function, as shown in the figure below. The figure shows the contribution of both Gaussian components to the overall GMM density, each scaled by its corresponding mixing probability.

The Python code used to generate the above figure is shown below.

x1 = np.linspace(GMM_data.min(), GMM_data.max(), sample_size)
y1 = np.zeros_like(x1)

# GMM probability distribution
for (l, s), w in zip(norm_params, mixing_prob):
    y1 += stats.norm.pdf(x1, loc=l, scale=s) * w

# Define the plot colors (hex values assumed; the original color constants
# were not defined in the snippet)
GRAY9 = "#d9d9d9"
GREEN1 = "#2ca02c"
ORANGE1 = "#ff7f0e"
BLUE2 = "#1f77b4"

# Plot the GMM probability distribution versus the data histogram
fig1, ax = plt.subplots()
ax.hist(GMM_data, bins=50, density=True, label="GMM data histogram",
        color=GRAY9)
ax.plot(x1, p1*stats.norm(loc=mu, scale=sigma_1).pdf(x1),
        label="p1 × first PDF", color=GREEN1, linewidth=3.0)
ax.plot(x1, (1-p1)*stats.norm(loc=mu, scale=sigma_2).pdf(x1),
        label="(1-p1) × second PDF", color=ORANGE1, linewidth=3.0)
ax.plot(x1, y1, label="GMM distribution (PDF)", color=BLUE2, linewidth=3.0)

ax.set_title("Data histogram vs. true distribution", fontsize=14, loc="left")
ax.set_xlabel('Data value')
ax.set_ylabel('Probability')
ax.legend()
ax.grid()

Then, we compute the MoM estimate of the mixing parameter p1, derived earlier from the sample second moment.

The Python code that implements the above estimator using our GMM sample data is shown below.

# Estimate the mixing parameter p1 from the sample data using the MoM estimator
p1_hat = ((sum(pow(x, 2) for x in GMM_data) / len(GMM_data) - pow(sigma_2, 2))
          / (pow(sigma_1, 2) - pow(sigma_2, 2)))

To properly assess this estimator, we run a Monte Carlo simulation by generating many realizations of the GMM data and estimating p1 for each realization, as shown in the Python code below.

# Monte Carlo simulation of the MoM estimator
num_monte_carlo_iterations = 500
p1_est = np.zeros((num_monte_carlo_iterations, 1))

sample_size = 1000
p1 = 0.2 # probability that the data point comes from the first gaussian
mixing_prob = [p1, (1-p1)]

for iteration in range(num_monte_carlo_iterations):
    # Draw a fresh stream of component indices for each realization
    GMM_idx = np.random.choice(len(mixing_prob), size=sample_size, replace=True,
                               p=mixing_prob)
    # Generate the GMM samples for this realization
    sample_data = np.fromiter((stats.norm.rvs(*(norm_params[i])) for i in GMM_idx),
                              dtype=np.float64)
    # MoM estimate of p1 from this realization
    p1_est[iteration] = ((sum(pow(x, 2) for x in sample_data) / len(sample_data)
                          - pow(sigma_2, 2)) / (pow(sigma_1, 2) - pow(sigma_2, 2)))

Then, we compute the sample mean and sample variance of our estimates and compare them with the theoretical results derived earlier, as shown below.

p1_est_mean = np.mean(p1_est)
p1_est_var = np.sum((p1_est - p1_est_mean)**2) / num_monte_carlo_iterations
p1_theoretical_var_num = (3*p1*pow(sigma_1, 4) + 3*(1-p1)*pow(sigma_2, 4)
                          - pow(p1*pow(sigma_1, 2) + (1-p1)*pow(sigma_2, 2), 2))
p1_theoretical_var_den = sample_size*pow(sigma_1**2 - sigma_2**2, 2)
p1_theoretical_var = p1_theoretical_var_num/p1_theoretical_var_den
print('Sample variance of MoM estimator of p1 = %.6f' % p1_est_var)
print('Theoretical variance of MoM estimator of p1 = %.6f' % p1_theoretical_var)
print('Mean of MoM estimator of p1 = %.6f' % p1_est_mean)

# Below are the results of the above code
Sample variance of MoM estimator of p1 = 0.001876
Theoretical variance of MoM estimator of p1 = 0.001897
Mean of MoM estimator of p1 = 0.205141

We can see from the above results that the mean of the p1 estimates, 0.2051, is very close to the true parameter p1 = 0.2, and it would get even closer to the true value as the sample size increases. Hence, we have numerically shown that the estimator is unbiased and consistent, as confirmed by the theoretical results derived earlier.

In addition, the sample variance of the p1 estimator (0.001876) is very close to the theoretical variance (0.001897), which is also great.

It's always a happy moment when theory matches practice!

All images in this article, unless otherwise noted, are by the author.
