I originally prepared this tutorial for students at our Faculty of Electrical Engineering as part of the Machine Learning course. I gave a presentation about AI in smart grids and also prepared a machine learning use case on clustering smart meter data. I also get a lot of messages on LinkedIn asking how to perform smart meter clustering, so I decided to explain it in this blog.

The dataset, together with a Jupyter notebook, is available here.

Let’s dive in…

There is a massive smart meter roll-out all over the world. Energy companies are replacing old induction energy meters with the new so-called smart meters. What is a smart meter? A smart meter is a device that measures household energy consumption (usually in 15 min intervals) and helps to gain better insights into consumer behavior and grid operating states.

One of the most exciting questions in the energy industry right now is what energy companies can learn about your behavior at home by analyzing smart metering data. This information can help them with grid planning, targeting consumers for demand response and energy efficiency programs, providing better energy forecasts, etc.

In this tutorial, you will analyze consumer behavior using clustering.

## What is clustering?

Clustering is an unsupervised machine learning technique that enables grouping (clustering) similar observations together. In the example below, you can identify three smooth daily profiles out of a large group of unstructured daily profiles. The result corresponds to typical daily human behaviors.

## Understanding household behavior

First, let’s try to understand the data we are dealing with. In the analysis that follows, we use smart metering data with an hourly resolution for 100 consumers, restricted to wintertime.

### How does household behavior look on a daily basis?

At first, it seems that individual consumers behave randomly on a daily basis. By analyzing multiple consumers together, at the level of the transformer station or at some higher level in the network, the shape of typical daily load profiles can be identified. The figure below shows the daily load profile for one household, followed by daily load profiles for 10 and 200 households. We can see that as more load profiles are added together, the shapes become smoother.
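The smoothing effect can be reproduced with a small simulation. This is only a sketch on synthetic data: the shared daily shape `base`, the gamma-distributed noise, and the `roughness` measure are all assumptions made for illustration and are not taken from the dataset used in this tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
hours = np.arange(24)

# Hypothetical shared daily shape in kW: a morning peak and a larger evening peak.
base = (0.3
        + 0.4 * np.exp(-((hours - 7) ** 2) / 8)
        + 0.7 * np.exp(-((hours - 19) ** 2) / 10))


def simulate_households(n):
    """Return an (n, 24) array of spiky synthetic daily profiles."""
    noise = rng.gamma(shape=1.0, scale=1.0, size=(n, 24))  # mean 1, heavy spikes
    return base * noise


def roughness(profile):
    """Mean absolute hour-to-hour change, relative to the mean load."""
    return np.mean(np.abs(np.diff(profile))) / profile.mean()


# Sum the profiles of 1, 10 and 200 households and measure how jagged the sum is.
results = {}
for n in (1, 10, 200):
    aggregate = simulate_households(n).sum(axis=0)
    results[n] = roughness(aggregate)
    print(f"{n:>3} households: roughness = {results[n]:.3f}")
```

The individual spikes largely cancel out in the sum, so the aggregate profile of many households stays much closer to the shared shape than the profile of a single household.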

Since we want to examine the habits of individual consumer groups, we need to combine only their load profiles. For example, we may want to combine the daily load profiles of only one certain type of household.

### Are you able to distinguish your grandma’s behavior from yours using machine learning?

Below is an example of a suburban area in Slovenia. Every symbol represents one household. Using clustering techniques, you can distinguish families, elderly people, students, etc. (The symbols on the image do not represent actual consumer behavior in that particular area and are there for illustration purposes only.) The idea of the image below is to show how utilities can leverage smart meter data for spatial examination of consumer behavior.

### How to cluster consumers together?

There are various types of clustering techniques that can be applied, depending on your final goal. As mentioned before, clustering means grouping similar objects (observations) together. When dealing with smart metering data, we have a demand time series (in kW) for every consumer.

What are observations in this case? How to model this?

Every consumer can be represented in three main ways:

- with multiple daily load profiles (task 1),
- with one mean daily load profile (task 2),
- with one mean weekly load profile (task 3).

A clustering algorithm assigns every observation in a dataset to a particular cluster. When clustering smart meter data, you have to decide what will represent an observation in your dataset. As mentioned above, this can be a daily profile, a mean daily profile or a mean weekly profile. Keep in mind that representing consumers with mean profiles averages out detailed daily behavior, but the final results are easier to interpret.

Using mathematical notation, clustering can be written as follows. Let X ∈ R^(m×p) denote the dataset matrix, where m is the number of observations and p is the number of features. The clustering algorithm returns a vector l ∈ R^(m×1) of labels that connects every observation with its corresponding cluster. The cluster centroids can then be calculated by taking the mean of all the observations in each cluster separately.
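As a minimal sketch of this notation, the snippet below builds a synthetic matrix X, obtains the label vector from k-means in sklearn, and computes the centroids by hand as per-cluster means. The data, the sizes `m`, `p`, `k` and the three load levels are made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
m, p, k = 300, 24, 3  # observations, features, clusters (illustrative values)

# Synthetic X with m rows and p columns: three groups of noisy flat profiles.
X = np.vstack([
    rng.normal(loc=level, scale=0.1, size=(m // k, p))
    for level in (0.5, 1.5, 3.0)
])

kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # label vector of length m, values 0 .. k-1

# Centroids computed by hand: the mean of all observations in each cluster.
centroids = np.vstack([X[labels == c].mean(axis=0) for c in range(k)])

print(labels.shape, centroids.shape)
```

For k-means, the fitted estimator also exposes the centroids directly as `kmeans.cluster_centers_`; computing them from the labels is shown here only to mirror the definition.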

### Which clustering algorithm is the best for time series clustering?

I think there is no simple answer to this question and I can only talk about what works well from my experience with energy data.

My suggestion:

- When dealing with hourly resolution data, simple k-means works very well (this is what I use in this tutorial).
- When dealing with higher resolution data, such as 15-minute data, use k-means with dynamic time warping (DTW) as the distance measure, or k-Shape.

k-means with DTW as the distance measure and k-Shape are both implemented in the tslearn Python library. A great explanation of why DTW is an appropriate distance measure for time series data is available here. For a mathematical explanation, see this paper.
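To give some intuition for why DTW suits time series, here is a small sketch that implements a common textbook formulation of DTW with NumPy and compares it with the Euclidean distance for two identical peaks that are one hour apart. The function `dtw_distance` and the synthetic 15-minute profiles are written for this illustration; this is not the tslearn implementation.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook dynamic time warping with squared point-wise costs."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible alignments.
            cost[i, j] = d + min(cost[i - 1, j],      # repeat b[j-1]
                                 cost[i, j - 1],      # repeat a[i-1]
                                 cost[i - 1, j - 1])  # advance both
    return np.sqrt(cost[n, m])


t = np.arange(96)                           # one day at 15-minute resolution
peak = np.exp(-((t - 40) ** 2) / 20.0)      # synthetic consumption peak
shifted = np.exp(-((t - 44) ** 2) / 20.0)   # the same peak one hour later

euclid = np.linalg.norm(peak - shifted)
dtw = dtw_distance(peak, shifted)
print(f"Euclidean: {euclid:.3f}, DTW: {dtw:.3f}")
```

The Euclidean distance compares the two profiles hour by hour and penalizes the time shift, while DTW is allowed to stretch the time axis and therefore treats the two profiles as much more similar. At higher resolutions such small shifts are more frequent, which is one way to read the suggestion above.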

The k-Shape algorithm was proposed here. If you want to understand the underlying concepts, read the original paper, in which the authors also provide a comprehensive literature review of time series clustering.

In this tutorial I used simple k-means implemented in sklearn.

### How to determine the right number of clusters?

k-means (with Euclidean or DTW distances) and k-Shape both need the number of clusters as an input parameter.

I think that the visual examination of the results is still the most convenient approach for determining the right number of clusters.

You can also plot the cost as a function of the number of clusters, as I did in the Jupyter notebook, and try to identify “the elbow”. In many cases, however, it is not clear where the elbow is, which makes it harder to choose the number of clusters. You can watch Andrew Ng's explanation here.
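A minimal sketch of the elbow approach with sklearn is shown below. It uses synthetic data with three clearly separated groups, so the elbow is much easier to see than it typically is with real smart meter data. The cost is taken to be the `inertia_` attribute of the fitted k-means model (the within-cluster sum of squared distances).

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Synthetic daily profiles: three groups of 50 consumers around different levels.
X = np.vstack([
    rng.normal(loc=level, scale=0.2, size=(50, 24))
    for level in (0.5, 1.5, 3.0)
])

ks = list(range(1, 9))
costs = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    costs.append(km.inertia_)  # within-cluster sum of squared distances

for k, cost in zip(ks, costs):
    print(f"k = {k}: cost = {cost:.1f}")
```

Plotting `ks` against `costs` (for example with matplotlib) gives the elbow curve. In this synthetic case the cost drops sharply up to k = 3 and only slowly afterwards.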

Next, the tutorial is divided into three tasks, depending on how each consumer is represented.

The examples that follow analyze wintertime only.

## Task 1: Clustering daily profiles

If every consumer has m hourly measurements and there are i consumers, then the dataset matrix X ∈ R^(j×24) has j = (m/24)·i observations. This model has 24 features, which correspond to the 24 hourly values of each day, so every row represents the data of one consumer for one day. The clustering algorithm assigns a cluster to every observation in the dataset matrix X.
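Building this matrix can be sketched with NumPy as follows. The readings are synthetic, and the sketch assumes that every series starts at midnight and has no missing hours, which real smart meter data may not satisfy. The names `hourly` and `consumer_id` are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_consumers, n_days = 5, 14   # i consumers, m = 24 * n_days readings each
m = 24 * n_days

# Synthetic hourly readings in kW, one row per consumer.
hourly = rng.random((n_consumers, m))

# Cut every consumer's series into days and stack all days as rows of X.
X = hourly.reshape(n_consumers * n_days, 24)   # j = (m / 24) * i rows

# Remember which consumer each daily profile belongs to.
consumer_id = np.repeat(np.arange(n_consumers), n_days)

print(X.shape)  # (j, 24)
```

Keeping `consumer_id` next to X makes it possible to check afterwards how the daily profiles of a single consumer are spread over the clusters.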

Cluster centroids are shown below, so that the shape of different clusters can be compared.

The figure below shows all profiles in each cluster, which can be used for a detailed evaluation of the results.

## Task 2: Clustering mean daily profiles

If every consumer has m hourly measurements and there are i consumers, a mean daily profile has to be calculated for each consumer. The resulting matrix X has i observations (one per consumer) and 24 columns (one for every hour of the day), such that X ∈ R^(i×24). The clustering algorithm assigns a cluster to every observation (consumer) in the dataset matrix X.
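One possible way to compute the mean daily profiles with pandas is sketched below. The readings, dates and column names are synthetic and only serve as an illustration. The sketch keeps working days only, in line with the working day / weekend split used in this task.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Four weeks of hourly timestamps starting on a Monday (illustrative dates).
index = pd.date_range("2019-01-07", periods=24 * 28, freq=pd.Timedelta(hours=1))

# Synthetic readings in kW, one column per consumer.
df = pd.DataFrame(
    rng.random((len(index), 5)),
    index=index,
    columns=[f"consumer_{c}" for c in range(5)],
)

# Keep working days only (Monday = 0 ... Friday = 4).
working = df[df.index.dayofweek < 5]

# Average over all working days for each hour of the day,
# then transpose so that every consumer becomes one row.
X = working.groupby(working.index.hour).mean().T

print(X.shape)  # (i, 24)
```

Replacing the filter with `df.index.dayofweek >= 5` gives the corresponding weekend profiles.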

Cluster centroids are shown below, so that the shapes of different clusters can be compared. Here the data was separated into working days and weekends. The results below are shown only for working days; weekends are also analyzed in the provided Jupyter notebook.

The figure below shows all profiles in each cluster, which can be used for a detailed evaluation of the results.

## Task 3: Clustering mean weekly profiles

If every consumer has m hourly measurements and there are i consumers, a mean weekly profile has to be calculated for each consumer. The resulting matrix X has i observations (one per consumer) and 24·7 = 168 columns, such that X ∈ R^(i×168). The clustering algorithm assigns a cluster to every observation (consumer) in the dataset matrix X.
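The mean weekly profiles can be sketched with NumPy in a similar way. The data is again synthetic, and the sketch assumes that every series starts on Monday at 00:00 and covers a whole number of weeks without gaps.

```python
import numpy as np

rng = np.random.default_rng(0)
n_consumers, n_weeks = 5, 8
hours_per_week = 24 * 7  # 168

# Synthetic hourly readings in kW, one row per consumer.
hourly = rng.random((n_consumers, n_weeks * hours_per_week))

# Split every series into weeks, then average across the weeks.
weekly = hourly.reshape(n_consumers, n_weeks, hours_per_week)
X = weekly.mean(axis=1)

print(X.shape)  # (i, 168)
```

Column 0 of X then holds the mean consumption on Mondays between 00:00 and 01:00, and column 167 the mean consumption in the last hour of Sunday.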

The figure below shows all profiles in each cluster, which can be used for a detailed evaluation of the results.

## Conclusion

In this blog I explained how to apply clustering to smart meter data. As you can see, the results depend on how you represent each consumer (with daily, mean daily or mean weekly profiles).

Which type of data representation do I suggest?

It depends on your application. Keep in mind that using mean profiles averages out the differences between individual days or weeks, but the results are easier to interpret. I suggest trying all three types, at least to better understand the data. In the end, you can still choose the one that best fits your final goal.

Further, you can try using k-means with DTW as the distance measure or the k-Shape algorithm, as mentioned above. Just use the same code I provided and change the algorithm. Let me know in the comments below if you tried it.

If you find this blog useful, please share it with others and let me know your thoughts on LinkedIn!

If you want to connect with other experts working in this field, join my LinkedIn group AI in Smart Grids where I post about this topic!
