Performs k-means clustering on spatial or non-spatial data and selects the optimal number of clusters using either the within-cluster sum of squares (WSS) or the average silhouette width.

Usage

optimal_kmeans_cluster(
  data,
  spatial = TRUE,
  coords = NULL,
  max_cluster = 2L,
  method = c("wss", "silhouette")
)

Arguments

data

The input data for clustering. It can be an 'sf' object, a 'SpatialPoints' object, or a data frame with coordinates.

spatial

Logical indicating whether the input data is spatial (default is TRUE). If FALSE, the data is treated as non-spatial and clustering is performed on the columns named in coords.

coords

The column names of the coordinates in the data (required if spatial is set to FALSE).

max_cluster

The maximum number of clusters to consider (default is 2L).

method

The method used to determine the optimal number of clusters: either "wss" (within-cluster sum of squares) or "silhouette" (average silhouette width).
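
To make the two selection criteria concrete, here is a minimal sketch of how they could be computed for non-spatial input. It is not this function's internal code: it uses stats::kmeans() and cluster::silhouette() directly, and the toy data frame, its column names x and y, and the variables ks, wss, avg_sil and best_k_sil are all assumptions made for illustration.

```r
# Illustrative sketch only; not the package's implementation.
library(cluster)  # provides silhouette()

set.seed(42)
# Hypothetical non-spatial input: a data frame with two coordinate columns
# forming three well-separated groups of 50 points each.
df <- data.frame(
  x = c(rnorm(50, 0, 0.5), rnorm(50, 5, 0.5), rnorm(50, 10, 0.5)),
  y = c(rnorm(50, 0, 0.5), rnorm(50, 5, 0.5), rnorm(50, 10, 0.5))
)
coords <- c("x", "y")            # column names, as the coords argument expects
xy <- as.matrix(df[, coords])    # numeric matrix passed to kmeans()

max_cluster <- 6L
ks <- 2:max_cluster              # candidate numbers of clusters

# Fit k-means once per candidate k; nstart reduces sensitivity to the
# random initial centers.
fits <- lapply(ks, function(k) kmeans(xy, centers = k, nstart = 25))

# "wss": total within-cluster sum of squares for each k. The optimum is
# usually read off the "elbow" of the wss-versus-k curve.
wss <- vapply(fits, function(f) f$tot.withinss, numeric(1))

# "silhouette": average silhouette width for each k; larger is better.
d <- dist(xy)
avg_sil <- vapply(fits, function(f) {
  mean(silhouette(f$cluster, d)[, "sil_width"])
}, numeric(1))

best_k_sil <- ks[which.max(avg_sil)]
```

The silhouette criterion gives a single best k directly (the maximum of avg_sil), whereas WSS always decreases as k grows, so it is typically judged visually from a plot.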

Value

A list containing the clustering results (data frame) and the plot of the selected method (ggplot object).

Examples

if (FALSE) {
data("landcover")
okc <- optimal_kmeans_cluster(data = landcover, spatial = TRUE, coords = NULL,
                              max_cluster = 15, method = "wss")
# inspect the clustering results
okc$data
}