Tasks run in parallel via slots.
Correct. An executor has one or more "slots", where the number of slots equals spark.executor.cores / spark.task.cpus. With the executor's resources divided into slots, each task occupies one slot, so multiple tasks can be executed in parallel.
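As a rough illustration (not part of the original answer), the sketch below computes the slot count from the two configuration values; the specific numbers (4 cores per executor, 1 CPU per task) are assumptions chosen for the example.

import org.apache.spark.SparkConf

object SlotsPerExecutor {
  def main(args: Array[String]): Unit = {
    // Hypothetical settings; real values depend on how the job is submitted.
    val conf = new SparkConf()
      .set("spark.executor.cores", "4") // cores available to each executor
      .set("spark.task.cpus", "1")      // cores a single task claims

    val executorCores = conf.get("spark.executor.cores").toInt
    val taskCpus      = conf.get("spark.task.cpus").toInt

    // Number of tasks an executor can run at the same time.
    val slotsPerExecutor = executorCores / taskCpus
    println(s"Slots per executor: $slotsPerExecutor") // 4 / 1 = 4
  }
}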
A slot is another name for an executor.
No, a slot is part of an executor.
An executor runs on a single core.
No, an executor can occupy multiple cores. The number of cores per executor is set by the spark.executor.cores option.
There must be more slots than tasks.
No. Slots simply process tasks as they become available. One could imagine a single slot handling multiple tasks, processing them one at a time. Granted, this is the opposite of what Spark is meant for: distributed data processing across multiple cores and machines, with many tasks running in parallel (see the sketch below).
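To make the "more tasks than slots" case concrete, here is a small sketch with hypothetical numbers (8 executors, 4 slots each, a 200-task stage); it only illustrates the arithmetic of how a stage is processed in waves when slots are scarce.

object TaskWaves {
  def main(args: Array[String]): Unit = {
    // All figures below are assumptions for illustration only.
    val numExecutors     = 8    // e.g. spark.executor.instances
    val slotsPerExecutor = 4    // spark.executor.cores / spark.task.cpus
    val tasksInStage     = 200

    val totalSlots = numExecutors * slotsPerExecutor                     // 32 tasks run at once
    val waves      = math.ceil(tasksInStage.toDouble / totalSlots).toInt // 200 / 32 -> 7 waves

    println(s"Total slots: $totalSlots, waves needed: $waves")
  }
}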
There must be fewer executors than tasks.
No, there is no such requirement.
More info: Spark Architecture | Distributed Systems Architecture (https://bit.ly/3x4MZZt)