Missing Data in Time Series? Machine Learning Techniques (Part 2) | by Sara Nóbrega | January, 2025
Use clustering algorithms to handle missing time series data
(If you haven't read Part 1, check it out here.)
Missing data in time series analysis is a persistent problem.
As we explored in Part 1, simple imputation methods, or even models based on linear regression and decision trees, can take us far.
But what if you need to handle more subtle patterns and capture the fine-grained dynamics in complex time series data?
In this article, we will explore K-Nearest Neighbors (KNN). Its strength is that it makes few assumptions about the relationships in your data, including nonlinear ones, which makes it a flexible and robust option for filling in missing values.
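To make the idea concrete before touching any time series, here is a minimal sketch of KNN-based imputation on a tiny toy matrix. It assumes scikit-learn's `KNNImputer` as the implementation, and the matrix and the choice of `n_neighbors=2` are invented purely for illustration. Each missing entry is filled with the average of that column taken from the two rows most similar on the values both rows have observed.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy matrix (made up for illustration): rows are samples, columns are features.
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

# n_neighbors=2 is an arbitrary choice for this sketch, not a recommendation.
imputer = KNNImputer(n_neighbors=2, weights="uniform")
X_filled = imputer.fit_transform(X)

print(X_filled)
# [[1.  2.  4. ]
#  [3.  4.  3. ]
#  [5.5 6.  5. ]
#  [8.  8.  7. ]]
```

Note that observed values are left untouched. Only the two gaps are filled, each from its two nearest rows.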
We will use the same simulated power generation dataset you already saw in Part 1, with 10% of the values removed at random.
We'll impute the missing data using a dataset you can easily generate yourself, so you can follow along and apply each technique as we walk through the process step by step!
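If you don't have the Part 1 data at hand, a dataset along these lines can be sketched as follows. This is not necessarily the exact generator from Part 1: the column names (`energy`, `energy_missing`), the date range, the daily sine-shaped pattern, and the noise level are all assumptions chosen for illustration. The only detail taken from the text is that 10% of the values are removed at random.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # fixed seed so the example is reproducible

# Hypothetical setup: 30 days of hourly readings.
timestamps = pd.date_range("2023-01-01", periods=30 * 24, freq=pd.Timedelta(hours=1))
hours = timestamps.hour.to_numpy()

# Assumed shape: a daily cycle that peaks at midday and is zero at night, plus noise.
daily_cycle = 100 * np.clip(np.sin(np.pi * (hours - 6) / 12), 0, None)
energy = np.clip(daily_cycle + rng.normal(0, 5, size=len(timestamps)), 0, None)

# Remove exactly 10% of the values at randomly chosen positions.
n_missing = int(0.10 * len(energy))
missing_idx = rng.choice(len(energy), size=n_missing, replace=False)
energy_missing = energy.copy()
energy_missing[missing_idx] = np.nan

df = pd.DataFrame(
    {"energy": energy, "energy_missing": energy_missing},
    index=timestamps,
)

print(df["energy_missing"].isna().mean())  # fraction of missing values
```

Keeping the complete `energy` column next to `energy_missing` makes it possible to compare imputed values against the ground truth later on.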