# Cluster Sampling

A sampling method in which the study population is divided into externally homogeneous but internally heterogeneous groups called clusters.

## What is Cluster Sampling?

In statistics, cluster sampling is a sampling method in which the entire population of the study is divided into externally homogeneous but internally heterogeneous groups called clusters. In other words, the clusters resemble one another, while the members within each cluster are diverse. Essentially, each cluster is a mini-representation of the entire population.

After the clusters are identified, some of them are chosen using simple random sampling, while the others remain unrepresented in the study. The researcher must then choose an appropriate method to sample the elements from each selected cluster.

### Primary Sampling Methods

There are primarily two methods of sampling the elements in the cluster sampling method: one-stage and two-stage.

In one-stage sampling, all elements in each selected cluster are sampled. In two-stage sampling, simple random sampling is applied within each selected cluster to draw a subsample of its elements.
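The two variants can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `cluster_sample`, its parameters, and the `schools` data are hypothetical, and the sketch assumes the population is already grouped into clusters and that a fixed subsample size is wanted in the second stage.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Draw a one-stage or two-stage cluster sample (illustrative sketch).

    clusters: dict mapping a cluster label to a list of its elements.
    n_clusters: number of clusters to select by simple random sampling.
    per_cluster: None for one-stage sampling (keep every element of each
        selected cluster); an integer for two-stage sampling (simple random
        subsample of at most that size within each selected cluster).
    """
    rng = random.Random(seed)
    # Stage 1: simple random sample of cluster labels, without replacement.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for label in chosen:
        elements = list(clusters[label])
        if per_cluster is None:
            # One-stage: every element of the selected cluster is sampled.
            sample[label] = elements
        else:
            # Stage 2: simple random subsample within the selected cluster.
            k = min(per_cluster, len(elements))
            sample[label] = rng.sample(elements, k)
    return sample

# Hypothetical population: 10 schools (clusters) with 20 students each.
schools = {
    f"school_{i}": [f"student_{i}_{j}" for j in range(20)] for i in range(10)
}

one_stage = cluster_sample(schools, n_clusters=3, seed=1)
two_stage = cluster_sample(schools, n_clusters=3, per_cluster=5, seed=1)
```

Here `one_stage` holds all 20 students from each of 3 randomly chosen schools, while `two_stage` holds 5 randomly chosen students from each of those schools. The 7 unselected schools contribute nothing in either case.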

The cluster method must not be confused with stratified sampling. In stratified sampling, the population is divided into mutually exclusive groups that are externally heterogeneous but internally homogeneous.

For example, in stratified sampling, a researcher may divide the population into two groups: males and females. In cluster sampling, by contrast, the clusters resemble one another, while the members within each cluster differ from one another.
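The contrast can be made concrete with a small sketch. Everything here is hypothetical and chosen only for illustration: the helper names, the toy population of 100 people, and the use of `sex` as the stratification variable and `district` as the clustering variable. The point is that a stratified sample draws from every group, whereas a cluster sample takes whole groups and skips the rest.

```python
import random
from collections import defaultdict

def group_by(population, key):
    """Group records by the value stored under `key`."""
    groups = defaultdict(list)
    for person in population:
        groups[person[key]].append(person)
    return groups

def stratified_sample(population, key, per_stratum, rng):
    """Draw a simple random sample from EVERY stratum."""
    sample = []
    for _label, members in sorted(group_by(population, key).items()):
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

def one_stage_cluster_sample(population, key, n_clusters, rng):
    """Select a few clusters at random and keep ALL of their members."""
    groups = group_by(population, key)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [person for label in chosen for person in groups[label]]

rng = random.Random(0)
# Toy population: every district contains both sexes, so each cluster is
# internally mixed, while each stratum (sex) is internally uniform.
population = [
    {"id": i, "sex": "F" if i % 2 else "M", "district": f"D{i % 5}"}
    for i in range(100)
]

by_sex = stratified_sample(population, "sex", per_stratum=10, rng=rng)
by_district = one_stage_cluster_sample(population, "district", n_clusters=2, rng=rng)
```

In this sketch `by_sex` contains 10 people from each sex, so both strata are represented. `by_district` contains everyone from 2 of the 5 districts and nobody from the other 3, yet it still includes both sexes because each district is internally mixed.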

The cluster method comes with a number of advantages over simple random sampling and stratified sampling. The advantages include:

#### 1. Requires fewer resources

Since cluster sampling selects only certain groups from the entire population, the method requires fewer resources for the sampling process. It is therefore generally cheaper than simple random or stratified sampling, as it incurs lower administrative and travel expenses.

#### 2. More feasible

The division of the entire population into clusters that resemble one another increases the feasibility of the sampling. Additionally, since each cluster represents the entire population, more subjects can be included in the study.

Despite its benefits, this method still comes with a few drawbacks, including:

#### 1. Biased samples

The method is prone to bias. If the clusters meant to represent the entire population were formed on the basis of a biased judgment, the inferences drawn about the entire population will be biased as well.

#### 2. High sampling error

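Generally, samples drawn using the cluster method are prone to higher sampling error than samples of the same size drawn using other sampling methods. Elements within the same cluster often resemble one another more than the ideal of a fully heterogeneous cluster assumes, so each additional element from an already selected cluster adds less new information than an element drawn independently from the whole population.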
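One common way to quantify this loss of precision, not covered in the text above, is the design effect, often approximated as 1 + (m - 1) × ICC, where m is the number of elements sampled per cluster and ICC is the intraclass correlation (how alike elements in the same cluster are). The sketch below assumes equal-sized clusters; the function names and the example figures are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters.

    cluster_size: number of elements sampled per cluster (m).
    icc: intraclass correlation, the similarity of elements within a cluster.
    A value of 1.0 means no loss of precision relative to simple random
    sampling; larger values mean a larger variance for the same sample size.
    """
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_total, cluster_size, icc):
    """Size of a simple random sample with roughly the same precision."""
    return n_total / design_effect(cluster_size, icc)

# Hypothetical figures: 1,000 respondents, 20 per cluster, ICC of 0.05.
deff = design_effect(cluster_size=20, icc=0.05)
n_eff = effective_sample_size(n_total=1000, cluster_size=20, icc=0.05)
```

Under these assumed figures the design effect is about 1.95, so the 1,000 clustered respondents carry roughly the information of 513 respondents selected by simple random sampling. When the ICC is zero, meaning the clusters are as internally diverse as the population itself, the design effect is 1 and nothing is lost.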