Machine Learning

MNIST database: How I beat 99% accuracy using a fraction of the data | by Chris Lettieri | Jan, 2025

Notice how the selected samples capture different writing styles and edge cases.

In some clusters, such as Clusters 1, 3, and 8, the farthest samples show a variety of unusual writing styles.

Cluster 6 is a lovely example, showing how hard some of these images are: even a human would struggle to guess the digit. But you can still see how it could end up in a cluster whose centroid looks like an 8.

Recent research on neural scaling laws helps explain why data pruning with the "farthest-from-centroid" method works, especially on the MNIST database.

Data Redundancy

Many training examples in large datasets are highly redundant.

Think about MNIST: how many nearly identical 7s do we really need? The key to data pruning is not having many examples; it is having the right examples.

Selection strategy vs. dataset size

A key insight from the paper above is that the best data selection strategy depends on your dataset size:

  • With abundant data: select hard, diverse examples (those farthest from the cluster centroids).
  • With scarce data: select easy, typical examples (those closest to the cluster centroids).
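As a toy sketch of that rule of thumb (the cutoff value below is purely illustrative, not something from the paper):

```python
def choose_selection_strategy(n_examples: int, abundant_cutoff: int = 10_000) -> str:
    """Pick a pruning strategy based on how much data is available.

    The cutoff is an illustrative placeholder; the paper describes regimes
    ("abundant" vs. "scarce"), not a fixed threshold.
    """
    if n_examples >= abundant_cutoff:
        return "farthest_from_centroid"  # keep hard, diverse examples
    return "closest_to_centroid"         # keep easy, typical examples
```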

This explains why our farthest-from-centroid approach works so well.

With 60,000 training examples, we are in the abundant-data regime, where hard, diverse examples prove most valuable.

Inspiration and goals

I was inspired by two recent papers (and the fact that I used to be a data engineer):

Both examine how data selection strategies can be used to train models on less data.

Analysis

I used LeNet-5 as my model architecture.
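For reference, here is a minimal PyTorch sketch of a LeNet-5-style network for 28×28 MNIST inputs; the exact layer sizes and activations in my repo may differ.

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    """LeNet-5-style CNN for 28x28 MNIST digits (a sketch, not the exact repo code)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),  # 28x28 -> 28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                            # -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),            # -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                            # -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),
            nn.Tanh(),
            nn.Linear(120, 84),
            nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```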

Then, using one of the strategies below, I prune the MNIST training data and train the model. Evaluation is always done against the full test set.

Due to compute constraints, I only ran 5 trials of each experiment.
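Putting the protocol together, the experiment loop looks roughly like this; prune, train_lenet5, and evaluate are placeholder names for illustration, not the actual functions in my repo:

```python
import numpy as np

def run_experiment(x_train, y_train, x_test, y_test,
                   strategy, keep_frac=0.5, n_trials=5):
    """Prune the training set, train LeNet-5, and evaluate on the full test set."""
    accuracies = []
    for seed in range(n_trials):
        # Select a subset of the training data with the chosen strategy (placeholder helper).
        keep_idx = prune(x_train, y_train, strategy=strategy,
                         keep_frac=keep_frac, seed=seed)
        model = train_lenet5(x_train[keep_idx], y_train[keep_idx], seed=seed)
        # Accuracy is always measured on the full 10,000-image test set.
        accuracies.append(evaluate(model, x_test, y_test))
    return float(np.median(accuracies))
```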

The full code and results can be found here on GitHub.

Strategy #1: Baseline, Full Dataset

  • Standard LeNet-5 architecture
  • Trained on 100% of the training data

Strategy #2: Random Sampling

  • We randomly sampled a percentage of images from the training data
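A minimal sketch of that random baseline, assuming the standard 60,000-image MNIST training set:

```python
import numpy as np

def random_subset(n_total: int, keep_frac: float, seed: int = 0) -> np.ndarray:
    """Uniformly sample training indices without replacement."""
    rng = np.random.default_rng(seed)
    n_keep = int(keep_frac * n_total)
    return rng.choice(n_total, size=n_keep, replace=False)

# Example: keep 50% of the 60,000 MNIST training images.
keep_idx = random_subset(60_000, keep_frac=0.5)
```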

Strategy #3: K-Means Clustering with Different Selection Strategies

Here's how this worked:

  1. Preprocess the images with PCA to reduce dimensionality. Each image goes from 784 values (28×28 pixels) down to just 50 components. PCA does this while keeping the most important patterns and discarding noise.
  2. Cluster with k-means. The number of clusters was set to 50 or 500 in different runs; my poor CPU could not handle more than 500 clusters across all the tests.
  3. Then I tested different ways of choosing the final data points (see the sketch after this list):
  • Closest-to-centroid, which represents the "typical" examples.
  • Farthest-from-centroid, which are more likely to be edge cases.
  • Random from each cluster: select randomly within each group.
An example of the clustering-based selection. Image by author.
  • PCA reduced both noise and clustering time. At first I clustered on the raw images; accuracy and compute both improved with PCA, so I kept it for the full experiments.
  • I switched from standard k-means to MiniBatchKMeans for better speed. The standard algorithm was too slow on my CPU to run all the tests.
  • Setting up a proper test harness was key. The harness supports YAML configs and automatic saving of results, and having o1 write my visualization code made life much easier.
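Putting the pipeline together, here is a minimal sketch of the clustering-based selection; parameter values are illustrative and the actual code lives in the GitHub repo:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import MiniBatchKMeans

def cluster_select(x_train, keep_frac=0.5, n_clusters=50, mode="farthest", seed=0):
    """Select a subset of MNIST images via PCA + MiniBatchKMeans.

    mode: "closest" (typical examples), "farthest" (edge cases),
    or "random" (random within each cluster).
    """
    flat = x_train.reshape(len(x_train), -1)                    # 28x28 -> 784
    emb = PCA(n_components=50, random_state=seed).fit_transform(flat)

    km = MiniBatchKMeans(n_clusters=n_clusters, random_state=seed)
    labels = km.fit_predict(emb)
    dists = np.linalg.norm(emb - km.cluster_centers_[labels], axis=1)

    rng = np.random.default_rng(seed)
    keep = []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        if len(idx) == 0:
            continue
        n_keep = max(1, int(keep_frac * len(idx)))
        order = np.argsort(dists[idx])                          # near -> far
        if mode == "closest":
            keep.extend(idx[order[:n_keep]])
        elif mode == "farthest":
            keep.extend(idx[order[-n_keep:]])
        else:                                                   # "random"
            keep.extend(rng.choice(idx, size=n_keep, replace=False))
    return np.array(keep)
```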

Median accuracy and training time

Here are the median results, comparing our LeNet-5 baseline trained on the full dataset against the different strategies, each using 50% of the data.

Median runtime results. Image by author.
Median accuracy. Image by author.

Accuracy vs. total runtime

The charts below show the results of my four strategies compared to the baseline in red.

Accuracy of the data-pruning methods. Image by author.
Median runtime of the data-pruning methods. Image by author.

Key findings across the many runs:

  • Farthest-from-centroid consistently outperformed the other selection methods.
  • There is clearly a sweet spot between runtime and model accuracy that is worth tuning for your own workloads. More work needs to be done here.

I'm still surprised that a pruned dataset can deliver acceptable results when you are careful about which examples you keep.

Future directions

  1. Test this on my second brain. I want to fine-tune an LLM on my personal notes and see whether data pruning gives faster, cheaper training.
  2. Explore other kinds of embeddings. I could try training an autoencoder to embed the images instead of using PCA.
  3. Test this on larger, more complex datasets (CIFAR-10, ImageNet).
  4. Explore how model architecture affects the effectiveness of data pruning.

These findings suggest that we need to rethink our assumptions about data:

  1. More data is not always better; a well-chosen subset can match the full dataset.
  2. Smarter selection strategies can improve results.
  3. The right strategy depends on your starting dataset size.

As people start raising alarms about running out of training data, I cannot help but wonder whether smaller, better-curated datasets are actually the key to useful, efficient models.

I intend to keep exploring this space, so please reach out if you find this exciting; it is always fun to connect and chat more 🙂
