Confusion Matrix Made Simple: Accuracy, Precision, Recall and F1-Score

When we work with classification algorithms in machine learning, such as logistic regression, k-nearest neighbors, or support vector classifiers, we do not evaluate them with regression metrics.
Instead, we generate a confusion matrix and, based on that confusion matrix, a classification report.
In this blog, we will understand what a confusion matrix is, how to calculate accuracy, precision, recall, and F1-score, and how to choose the right metric based on the characteristics of the data.
To understand the confusion matrix and classification metrics, let us use the Wisconsin Breast Cancer dataset.
This dataset contains 569 rows, and each row provides information about various tumor features and the diagnosis: benign (non-cancerous) or malignant (cancerous).
Now let's build a classification model on this data to classify tumors based on their features.
We will use logistic regression to train the model on this data.
Code:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns
import matplotlib.pyplot as plt
# Load the dataset
column_names = [
"id", "diagnosis", "radius_mean", "texture_mean", "perimeter_mean", "area_mean", "smoothness_mean",
"compactness_mean", "concavity_mean", "concave_points_mean", "symmetry_mean", "fractal_dimension_mean",
"radius_se", "texture_se", "perimeter_se", "area_se", "smoothness_se", "compactness_se", "concavity_se",
"concave_points_se", "symmetry_se", "fractal_dimension_se", "radius_worst", "texture_worst",
"perimeter_worst", "area_worst", "smoothness_worst", "compactness_worst", "concavity_worst",
"concave_points_worst", "symmetry_worst", "fractal_dimension_worst"
]
df = pd.read_csv("C:/wdbc.data", header=None, names=column_names)
# Drop ID column
df = df.drop(columns=["id"])
# Encode target: M=1 (malignant), B=0 (benign)
df["diagnosis"] = df["diagnosis"].map({"M": 1, "B": 0})
# Split features and target
X = df.drop(columns=["diagnosis"])
y = df["diagnosis"]
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)
# Scale the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Train logistic regression
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
# Confusion Matrix and Classification Report
conf_matrix = confusion_matrix(y_test, y_pred, labels=[1, 0]) # 1 = Malignant, 0 = Benign
report = classification_report(y_test, y_pred, labels=[1, 0], target_names=["Malignant", "Benign"])
# Display results
print("Confusion Matrix:n", conf_matrix)
print("nClassification Report:n", report)
# Plot Confusion Matrix
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Purples", xticklabels=["Malignant", "Benign"], yticklabels=["Malignant", "Benign"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.tight_layout()
plt.show()
Here, after fitting logistic regression on the data, we generated a confusion matrix and a classification report to evaluate the model.
First, let's understand the confusion matrix.
From the above matrix:
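With labels=[1, 0] in the code above, the rows of the matrix are the actual classes and the columns are the predicted classes, both in the order Malignant then Benign, so the matrix reads:

                   Predicted Malignant   Predicted Benign
Actual Malignant            60                    4
Actual Benign                1                  106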
'60' represents the correctly predicted malignant tumors; we refer to these as “True Positives”.
'4' represents the malignant tumors incorrectly predicted as benign; we refer to these as “False Negatives”.
'1' represents the benign tumor incorrectly predicted as malignant; we refer to it as a “False Positive”.
'106' represents the correctly predicted benign tumors; we refer to these as “True Negatives”.
Let us now see what we can infer from these values.
For that, let's look at the classification report.

From the above classification report, we can say that:
For Malignant:
– Precision is 0.98, meaning that when the model predicts a tumor as malignant, it is correct 98% of the time.
– Recall is 0.94, which means the model correctly identified 94% of all malignant tumors.
– F1-score is 0.96, which balances precision and recall.
For Benign:
– Precision is 0.96, which means that when the model predicts a tumor as benign, it is correct 96% of the time.
– Recall is 0.99, meaning the model correctly identified 99% of all benign tumors.
– F1-score is 0.98.
From the report we can see that the accuracy of the model is 97%.
The report also includes a macro average and a weighted average; let's look at how these are calculated.
Macro average
The macro average computes each metric (precision, recall, and F1-score) for both classes and gives each class equal weight, no matter how many samples it contains.
We use the macro average when we want to know the model's performance across all classes while ignoring class imbalance.
In this data:
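Using the rounded per-class values above, the macro averages work out roughly as follows:

Macro precision = (0.98 + 0.96) / 2 = 0.97
Macro recall    = (0.94 + 0.99) / 2 ≈ 0.97
Macro F1-score  = (0.96 + 0.98) / 2 = 0.97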

Weighted average
The weighted average also computes each metric across all classes, but gives more weight to the class with more samples.
In the above code we used test_size=0.3, meaning we set aside 30% of the data for testing, so 171 of the 569 samples form the test set.
The confusion matrix and the classification report are based on this test set.
Of the 171 samples in the test set, 64 are malignant tumors and 107 are benign tumors.
Now let's see how the weighted average is calculated for each metric.
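Again using the rounded per-class values, the weighted averages work out roughly as follows:

Weighted precision = (64 × 0.98 + 107 × 0.96) / 171 ≈ 0.97
Weighted recall    = (64 × 0.94 + 107 × 0.99) / 171 ≈ 0.97
Weighted F1-score  = (64 × 0.96 + 107 × 0.98) / 171 ≈ 0.97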

The weighted average gives us a more realistic measure of performance when we have imbalanced datasets.
We have now seen what each term in the classification report means and how the macro and weighted averages are calculated.
Next, let's see how the confusion matrix is used to build the classification report.
The classification report contains different metrics such as accuracy, precision, recall, and F1-score, and these metrics are calculated from the confusion matrix values.
From the confusion matrix we have:
True positives (TP) = 60
False Negatives (FN) = 4
False Positives (FP) = 1
True Negatives (TN) = 106
Now let's calculate the classification metrics using these values.
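A minimal sketch of the same calculations in Python, reusing conf_matrix from the code above (since it was built with labels=[1, 0], ravel() returns TP, FN, FP, TN in that order):

TP, FN, FP, TN = conf_matrix.ravel()  # 60, 4, 1, 106
accuracy = (TP + TN) / (TP + TN + FP + FN)  # (60 + 106) / 171 ≈ 0.97
precision = TP / (TP + FP)  # 60 / 61 ≈ 0.98
recall = TP / (TP + FN)  # 60 / 64 ≈ 0.94
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.96
print(accuracy, precision, recall, f1)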

This is how the classification metrics are derived from the confusion matrix.
But why do we need four different metrics instead of a single metric like accuracy? It is because different metrics reveal different strengths and weaknesses of a classifier depending on the context of the data.
Now let's go back to the Wisconsin breast cancer dataset we used here.
When we applied the logistic regression model to this data, we got a high accuracy of 97%, which might make us think the model works well.
But look at another metric, recall, which is 0.94 for this model: of all the malignant tumors, the model identified only 94% of them.
Here the model has missed 6% of the malignant cases.
In real-world situations, especially in healthcare applications such as cancer detection, missing a malignant case can delay diagnosis and treatment.
From this we understand that even with an accuracy of 97%, we need to look deeper into the data context by examining the other metrics.
So what can we do? If recall were 1.0, every malignant tumor would be identified, but pushing recall toward 1.0 can reduce the model's precision.
When the model wrongly flags benign tumors as malignant, it causes unnecessary anxiety and may lead to additional tests or treatment.
Here we should aim to raise recall while keeping precision reasonable.
We can do this by changing the threshold the classifier uses to separate the classes.
Most classifiers use a default threshold of 0.5; if we change it to 0.3, we are telling the model that even if it is only 30% sure, it should classify the sample as malignant.
Let us now use a custom threshold of 0.3.
Code:
# Train logistic regression
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)
# Predict probabilities
y_probs = model.predict_proba(X_test)[:, 1]
# Apply custom threshold
threshold = 0.3
y_pred_custom = (y_probs >= threshold).astype(int)
# Classification Report
report = classification_report(y_test, y_pred_custom, target_names=["Benign", "Malignant"])
print("Classification Report:\n", report)
# Confusion Matrix
conf_matrix = confusion_matrix(y_test, y_pred_custom, labels=[1, 0])
# Plot Confusion Matrix
plt.figure(figsize=(6, 4))
sns.heatmap(
conf_matrix,
annot=True,
fmt="d",
cmap="Purples",
xticklabels=["Malignant", "Benign"],
yticklabels=["Malignant", "Benign"]
)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix (Threshold = 0.3)")
plt.tight_layout()
plt.show()
Here we applied a custom threshold of 0.3 and generated the confusion matrix and classification report.

Classification Report:

Here, the accuracy increased to 98%, recall for the malignant class rose to 97%, and precision stayed about the same.
We just discussed that precision might drop when we try to improve recall, but here it did not; whether it does depends on the dataset, how well separated the classes are, and the threshold we choose.
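To see how this trade-off behaves on a given dataset before committing to a threshold, one option is to plot precision and recall across all thresholds. A minimal sketch, reusing y_test and y_probs from the code above:

from sklearn.metrics import precision_recall_curve
# Precision and recall for every candidate threshold on the predicted probabilities
precisions, recalls, thresholds = precision_recall_curve(y_test, y_probs)
plt.plot(thresholds, precisions[:-1], label="Precision")
plt.plot(thresholds, recalls[:-1], label="Recall")
plt.xlabel("Threshold")
plt.ylabel("Score")
plt.title("Precision and Recall vs. Threshold")
plt.legend()
plt.tight_layout()
plt.show()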
For medical datasets, maximizing recall is often preferred over precision or accuracy.
For datasets such as spam detection or fraud detection, we would instead prioritize precision, using the same approach as above of tuning the threshold to balance precision and recall.
We use the F1-score when the data is imbalanced and we care about both precision and recall, that is, when neither false positives nor false negatives can be ignored.
Data source
Wisconsin Breast Cancer Dataset
Wolberg, W., Mangasarian, O., Street, N., & Street, W. (1993). Breast Cancer Wisconsin (Diagnostic) [Dataset]. UCI Machine Learning Repository.
This dataset is licensed under the Creative Commons Attribution 4.0 (CC BY 4.0) license and is free to use for commercial or educational purposes, provided the original source is credited.
In this blog we discussed the confusion matrix and how it is used to calculate metrics such as accuracy, precision, recall, and F1-score.
We also looked at when to prioritize each metric, using the Wisconsin Breast Cancer dataset as an example of a case where recall matters most.
I hope you found this blog useful for understanding the confusion matrix and classification metrics.
Thanks for reading.



