Horizontal scaling means changing how many instances of an application are running, not changing how big each instance is. Therefore, the best description is C: adding or removing instances of the same application to meet demand. In Kubernetes, “instances” typically correspond to Pod replicas managed by a controller such as a Deployment. Scaling horizontally increases or decreases the replica count; adding replicas raises total throughput and resilience by distributing load across more Pods.
Option A is about cluster/node scaling (adding or removing nodes), which is infrastructure scaling typically handled by a cluster autoscaler in cloud environments. Node scaling can enable more Pods to be scheduled, but it’s not the definition of horizontal application scaling itself. Option D describes vertical scaling: adding or removing CPU or memory for a given instance (Pod/container) by changing requests/limits or using the Vertical Pod Autoscaler (VPA). Option B is vague and not the standard definition.
Horizontal scaling is a core cloud-native pattern because it improves availability and elasticity. If one Pod fails, other replicas continue serving traffic. In Kubernetes, scaling can be manual (kubectl scale deployment ... --replicas=N) or automatic using the Horizontal Pod Autoscaler (HPA). HPA adjusts replicas based on observed metrics like CPU utilization, memory, or custom/external metrics (for example, request rate or queue length). This creates responsive systems that can handle variable traffic.
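As a rough illustration of how the HPA arrives at a replica count, the sketch below applies the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The function name, parameter names, and replica bounds are hypothetical; the 0.1 tolerance mirrors the documented default, and the real controller also handles details this sketch ignores (unready Pods, missing metrics, stabilization windows).

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified sketch of the HPA scaling rule (hypothetical helper)."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count to avoid flapping.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (minReplicas / maxReplicas on an HPA).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas averaging 90% CPU against a 60% target scale out.
    print(desired_replicas(4, current_metric=90, target_metric=60))
    # The same 4 replicas averaging 30% CPU scale in.
    print(desired_replicas(4, current_metric=30, target_metric=60))
```

Because the rule is proportional, a metric at twice the target roughly doubles the replica count, subject to the upper bound.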
From an architecture perspective, designing for horizontal scaling often means ensuring your application is stateless (or manages state externally), uses idempotent request handling, and supports multiple concurrent instances. Stateful workloads can also scale horizontally, but usually with additional constraints (StatefulSets, sharding, quorum membership, stable identity).
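To make the stateless and idempotent points concrete, here is a minimal sketch of a request handler that keeps all state in an external store and uses a request ID to make retries safe. Every name here (`ExternalStore`, `handle_deposit`, `request_id`) is hypothetical, and the in-memory dict only stands in for a shared service such as a database or cache; a real store would also need an atomic check-and-set, which this sketch omits.

```python
class ExternalStore:
    """Stand-in for a shared external store (hypothetical).

    In a real deployment this would be a database or cache that every
    replica talks to, so no replica holds state of its own.
    """

    def __init__(self):
        self.balance = 0
        self.results = {}  # request_id -> result already returned


def handle_deposit(store, request_id, amount):
    """Apply a deposit once per request_id, no matter which replica runs it."""
    if request_id in store.results:
        # Retry or duplicate delivery: return the earlier result unchanged.
        return store.results[request_id]
    store.balance += amount
    store.results[request_id] = store.balance
    return store.balance


if __name__ == "__main__":
    shared = ExternalStore()
    # The same request landing on two different replicas is applied once.
    print(handle_deposit(shared, "req-1", 50))
    print(handle_deposit(shared, "req-1", 50))
```

Since the handler reads and writes only the shared store, any replica can serve any request, which is what lets a load balancer spread traffic freely across Pods.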
So the correct choice is C.