Machine Learning

Machine Learning “Advent Calendar” Day 2: k-NN Classifier in Excel

After the k-NN Regressor and the concept of prediction based on distance, we now look at the k-NN Classifier.

The goal is the same, but classification allows us to introduce several useful variations, such as Radius Nearest Neighbors, Nearest Centroids, multi-class prediction, and probabilistic distance models, and this is beautiful.

So we will start with the basic k-NN Classifier, and then discuss how it can be improved.

You can use this Excel / Google Sheets file while reading this article to better follow all the explanations.

k-nn classifier in Excel – image by Author

The Titanic Survival Dataset

We will use the Titanic Survival Dataset, a classic example where each row describes a passenger with characteristics such as class, sex, age, and fare, and the goal is to predict whether the passenger survived.

Titanic Survival Dataset – image by Author – CC0: Public Domain license

The k-NN classification principle

The k-NN Classifier is so similar to the k-NN Regressor that I almost wrote one single article to describe both.

In fact, when we look for the k nearest neighbors, we don't use the y value at all, let alone its nature.

But there are still some interesting specifics about how classifiers (binary or multi-class) are built, and about how the features can be handled differently.

We start with the binary classification task, then move on to multi-class classification.

One continuous feature for binary classification

So, very quickly, we can do the same exercise as before for one continuous feature, with this data.

For the y value, we often use 0 and 1 to encode the two classes. But as you will see, this can be a source of confusion.

k-nn classifier in Excel – One continuous feature – image by Author

Now, think about it: 0 and 1 are also numbers, right? Therefore, we can apply exactly the same process as we did with the k-NN Regressor.

That's right. Nothing changes in the computation, as you can see in the next screenshot. And you can try changing the value of the new observation.

k-NN Classifier in Excel – Prediction with one continuous feature – image by Author

The only difference is how we interpret the result. When we take the “average” of the neighbors' y values, this number is understood as the probability that the new observation belongs to class 1.

So in fact, the “average” is not the right interpretation; it is better understood as a class 1 probability.

And we can manually create this curve, to show how the prediction changes over a range of x values.

Traditionally, to avoid ending up with a 50 percent tie, we choose an odd value of k so that a majority vote can always decide.
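A minimal Python sketch of this computation, using invented toy values rather than the actual Titanic data: the probability of class 1 is simply the average of the neighbors' 0/1 labels.

```python
# Sketch of a binary k-NN classifier with one continuous feature,
# mirroring the Excel computation (toy data, not the Titanic values).
def knn_predict_proba(x_train, y_train, x_new, k=3):
    """Return the fraction of the k nearest neighbors that are in class 1."""
    # Sort training points by distance to the new observation.
    neighbors = sorted(zip(x_train, y_train), key=lambda p: abs(p[0] - x_new))
    k_labels = [label for _, label in neighbors[:k]]
    # The "average" of 0/1 labels is the probability of class 1.
    return sum(k_labels) / k

x_train = [2.0, 3.0, 4.0, 10.0, 11.0, 12.0]
y_train = [0, 0, 0, 1, 1, 1]

proba = knn_predict_proba(x_train, y_train, 3.5, k=3)
predicted_class = 1 if proba >= 0.5 else 0
print(proba, predicted_class)  # 0.0 0 — all three nearest neighbors are class 0
```

With an odd k, the fraction can never be exactly 0.5, so the majority vote always decides.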

k-NN Classifier in Excel – Predictions for one continuous feature – image by Author

Two continuous features for binary classification

If we have two features, the procedure is also almost the same as for the k-NN Regressor.

k-nn classifier in Excel – Two continuous features – image by Author

One feature for multi-class classification

Now, let's take an example with a y target that has three classes.

Then we can see that we cannot use the idea of “average” anymore, because the number that represents the class is a label, not really a number. We should rather call them “class 0”, “class 1”, and “class 2”.
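With labels instead of numbers, the “average” is replaced by a majority vote among the k nearest neighbors. A minimal sketch with invented toy data:

```python
from collections import Counter

# Sketch of multi-class k-NN: instead of averaging labels, we count
# votes among the k nearest neighbors (toy one-feature data).
def knn_classify(x_train, y_train, x_new, k=3):
    neighbors = sorted(zip(x_train, y_train), key=lambda p: abs(p[0] - x_new))
    votes = Counter(label for _, label in neighbors[:k])
    # most_common(1) returns the label with the most votes.
    return votes.most_common(1)[0][0]

x_train = [1.0, 1.5, 2.0, 5.0, 5.5, 6.0, 9.0, 9.5, 10.0]
y_train = ["class 0"] * 3 + ["class 1"] * 3 + ["class 2"] * 3

print(knn_classify(x_train, y_train, 5.2, k=3))  # class 1 (all 3 neighbors agree)
```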

k-NN Classifier in Excel – Multi-Class Classifier – image by Author

From k-NN to Nearest Centroids

When k becomes very large

Now, let's make k big. How big? As big as possible.

Remember, we did the same exercise with the k-NN Regressor, and the conclusion was that if k equals the total number of observations in the training data, then the k-NN Regressor is simply a mean estimator: it predicts the average of all y values.

For the k-NN Classifier, it is almost the same. If k equals the total number of observations, then for each class, the prediction is the fraction of that class within the whole training data.

Some people, from a Bayesian perspective, call these fractions the class priors!

But this does not help us much to discriminate new observations, because these priors are the same at every point.
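This limit case can be sketched in a couple of lines, with invented labels: when k equals the number of training points, the predicted probabilities are just the class fractions, regardless of the new observation.

```python
from collections import Counter

# When k = n, the predicted probabilities reduce to the class
# fractions (the "priors"), identical for every new point (toy labels).
y_train = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
n = len(y_train)
priors = {label: count / n for label, count in Counter(y_train).items()}
print(priors)  # {0: 0.3, 1: 0.7}
```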

Creating the centroids

So let's take one step at a time.

Within each class, we can gather all the feature values x that belong to that class, and compute their average.

These per-class averages are what we call centroids.

What can we do with these centroids?

We can use them to classify new observations.

Instead of computing the distances from every training point to the new point, we simply measure the distance to each centroid and assign the closest class.
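Here is a minimal sketch of this rule on invented toy data (scikit-learn implements the same idea as `NearestCentroid`):

```python
import numpy as np

# Sketch of a nearest centroid classifier (toy one-feature data).
def fit_centroids(X, y):
    """One centroid per class: the mean of the feature values in that class."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}

def predict(centroids, x_new):
    """Assign the class of the closest centroid."""
    return min(centroids, key=lambda label: np.linalg.norm(x_new - centroids[label]))

X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 0, 1, 1, 1])

centroids = fit_centroids(X, y)           # {0: [2.0], 1: [11.0]}
print(predict(centroids, np.array([4.0])))  # 0 — closer to the class-0 centroid
```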

With the Titanic Survival Dataset, we can start with one feature, age, and compute the centroids of the two classes: passengers who survived and passengers who did not.

k-NN Classifier in Excel – Nearest Centroids – image by Author

Now, it is also possible to use several continuous features.

For example, we can use the two features age and fare.

k-NN Classifier in Excel – Nearest Centroids – image by Author

And we can discuss some important properties of this model:

  • Scale matters, as we discussed before for the k-NN Regressor.
  • Missing values are not a problem here: each coordinate of a centroid is computed from the available (non-empty) values.
  • We have moved from the most “complex” model (in the sense that the actual model is all the training data, so we have to keep all of it) to one of the simplest (we only keep the centroid values for our model).
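The missing-value point can be sketched as follows, assuming NumPy and invented toy values: each centroid coordinate averages only the entries that are actually present.

```python
import numpy as np

# Each centroid coordinate is averaged over the available (non-NaN)
# values only (toy numbers, not the Titanic values).
X = np.array([
    [22.0, 7.25],
    [np.nan, 71.28],   # age missing: ignored for the age coordinate
    [26.0, np.nan],    # fare missing: ignored for the fare coordinate
])
centroid = np.nanmean(X, axis=0)
print(centroid)  # [24.0, 39.265]
```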

From highly nonlinear to purely linear

But now, can you think of one big drawback?

Where the basic k-NN Classifier is highly nonlinear, the nearest centroid method is strictly linear.

In this 1D example, the two centroids are simply the mean x values of class 0 and class 1. Because these are just two points, the decision boundary is simply the midpoint between them.

So instead of a piecewise boundary that depends on the exact location of many training points (as in k-NN), we get a single cutoff that depends only on two numbers.

This shows how the nearest centroid classifier compresses all the training data into a very simple and straightforward rule.

k-NN Classifier in Excel – Nearest Centroids – image by Author

A note on regression: why centroids don't work

Now, this kind of simplification is not possible with the k-NN Regressor. Why?

In classification, each class forms a distinct group, so computing the average feature vector for each class makes sense, and this gives us the class centroids.

But in regression, the target y is continuous. There are no discrete groups, no natural boundaries, so there is no logical way to define a “class centroid”.

A continuous target can take arbitrarily many values, so we cannot group the observations by their y value to obtain centroids.

The only “centroid” in regression would be the global mean, which corresponds to the case k = n in the k-NN Regressor.

And that model is too simple to be helpful.

In short, the nearest centroid classifier is a natural simplification for classification, but it has no exact equivalent for regression.

Some mathematical developments

What else can we do with the basic K-NN Classifier?

Mean and variance

For the nearest centroid, we used the simplest statistic: the mean. A natural reflex in math is to add the variance as well.

So, now, the distance is no longer Euclidean, but the Mahalanobis distance. Using this distance, we obtain probabilities based on the distributions described by the mean and variance of each class.
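A hedged 1D sketch of the idea, with invented per-class means and variances: in one dimension, the Mahalanobis distance reduces to the distance measured in units of each class's standard deviation.

```python
import numpy as np

# Score each class by a variance-scaled (Mahalanobis-style) distance,
# assuming one feature and invented per-class means and variances.
classes = {
    0: {"mean": 2.0, "var": 1.0},
    1: {"mean": 11.0, "var": 9.0},   # class 1 is more spread out
}

def mahalanobis_1d(x_new, mean, var):
    # In 1D, the Mahalanobis distance is just |x - mean| / std.
    return abs(x_new - mean) / np.sqrt(var)

def classify(x_new):
    return min(classes, key=lambda c: mahalanobis_1d(x_new, **classes[c]))

# The midpoint 6.5 is equidistant in Euclidean terms, but class 1's
# larger variance pulls the decision toward it.
print(classify(6.5))  # 1
```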

Handling categorical features

For categorical features, we cannot compute means or variances. With the k-NN Regressor, we saw that one-hot or label encoding is possible. But the scale matters and is not easy to determine.

Here, we can do something equally meaningful, in terms of probabilities: we can compute the frequency of each category within each class.

These frequencies work well as probabilities, describing how likely each category is within each class.

This concept is directly linked to models such as Categorical Naive Bayes, where each class is described by a frequency distribution over the categories.
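A minimal sketch of these per-class category frequencies, with invented values loosely inspired by the Titanic “sex” column:

```python
from collections import Counter

# Frequency of each category of a feature, computed within one class
# (toy data: the values are invented, not the real Titanic counts).
sex =      ["female", "male", "male", "female", "male", "female"]
survived = [1,        0,      0,      1,        1,      0]

def class_frequencies(feature, labels, target_class):
    values = [f for f, y in zip(feature, labels) if y == target_class]
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

print(class_frequencies(sex, survived, 1))  # female: 2/3, male: 1/3
```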

Weighted voting

Another variation is to introduce weights, so that the nearest neighbors count more than the farther ones. In scikit-learn, there is a “weights” argument that allows us to do so.
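To make the effect concrete, here is a hand-rolled sketch of inverse-distance weighting on invented toy data, mirroring what scikit-learn's `weights="distance"` option does in `KNeighborsClassifier`:

```python
# Distance-weighted voting: each of the k nearest neighbors votes with
# weight 1/distance instead of 1 (toy one-feature data).
def weighted_knn(x_train, y_train, x_new, k=3):
    neighbors = sorted(zip(x_train, y_train), key=lambda p: abs(p[0] - x_new))[:k]
    votes = {}
    for x, label in neighbors:
        d = abs(x - x_new)
        # Guard against a zero distance (exact match with a training point).
        weight = 1.0 / d if d > 0 else float("inf")
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)

x_train = [1.0, 2.0, 3.0, 4.2, 10.0]
y_train = [0, 0, 0, 1, 1]

# With plain voting, the k=3 neighbors of 4.0 would give class 0 (two
# votes to one); with distance weighting, the very close class-1 point
# at 4.2 outvotes the two farther class-0 points.
print(weighted_knn(x_train, y_train, 4.0, k=3))  # 1
```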

We can also switch from “k” neighbors to a fixed radius around the new observation, resulting in a radius-based classifier.

Radius Nearest Neighbors

Sometimes, we see a picture like the following used to explain the k-NN Classifier. But in fact, with a radius like this, it depicts Radius Nearest Neighbors more than k nearest neighbors.

One benefit is control over the neighborhood. It is especially interesting when the distance has a concrete meaning, such as spatial distance.

Radius Nearest Neighbors Classifier – image by Author

But the drawback is that you have to know a suitable radius beforehand.
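A minimal sketch of radius-based voting on invented toy data (scikit-learn implements this as `RadiusNeighborsClassifier`), including the failure mode of an ill-chosen radius:

```python
from collections import Counter

# Radius Nearest Neighbors: vote among all training points within a
# fixed radius of the new observation (toy one-feature data).
def radius_classify(x_train, y_train, x_new, radius):
    inside = [label for x, label in zip(x_train, y_train) if abs(x - x_new) <= radius]
    if not inside:
        # The drawback in action: a too-small radius can leave no neighbors.
        raise ValueError("no training point within the radius")
    return Counter(inside).most_common(1)[0][0]

x_train = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
y_train = [0, 0, 0, 1, 1, 1]

print(radius_classify(x_train, y_train, 2.5, radius=1.0))  # 0 — 2.0 and 3.0 vote
```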

By the way, this idea of Radius Nearest Neighbors also exists for regression.

A variety of derived models

All these small changes give different models, each trying to improve the basic idea of comparing neighbors: a more refined definition of distance, a control parameter that focuses on local neighbors, or a global summary of the classes.

We will not look at all these models here. I just can't help myself from slowing down when a slight variation naturally leads to another point of view.

For now, think of this as a preview of the models we'll be exploring later this month.

Variations and extensions of the k-NN Classifier – image by Author

Conclusion

In this article, we explored the k-NN Classifier from its most basic form to several extensions.

The central idea doesn't really change: new observations are classified based on how similar they are to the training data.

But this simple idea can take many different forms.

For continuous features, similarity is based on geometric distance.
For categorical features, we instead look at how frequently each category appears within each class.

When k becomes very large, all data points collapse into a few summary statistics, which naturally leads to the Nearest Centroids classifier.

Understanding this family of distance-based and frequency-based models shows that many machine learning models are just different ways of answering the same question:

Which class does this new observation belong to?

In the following articles, we will continue to explore distance-based models, which can be understood as global measures of similarity between observations and classes.
