Caching means storing a copy of a partition on an executor, so it can be accessed quicker by subsequent operations, instead of having to be recalculated. cache() is a lazily-evaluated method of the
DataFrame. Since count() is an action (while filter() is not), it triggers the caching process.
Static notebook | Dynamic notebook: See test 2, QUESTION NO: 20 (Databricks import instructions)
Contribute your Thoughts:
Chosen Answer:
This is a voting comment (?). You can switch to a simple comment. It is better to Upvote an existing comment if you don't have anything to add.
Submit