pyspark.RDD.top
RDD.top(num: int, key: Optional[Callable[[T], S]] = None) → List[T]
- Get the top N elements from an RDD.
- New in version 1.0.0.
- Parameters
- num : int
- the number of top elements to return (N)
- key : function, optional
- a function used to generate the key by which elements are compared
 
- Returns
- list
- the top N elements 
 
- Notes
- This method should only be used if the resulting array is expected to be small, as all the data is loaded into the driver’s memory.
- It returns the list sorted in descending order.
- Examples
>>> sc.parallelize([10, 4, 2, 12, 3]).top(1)
[12]
>>> sc.parallelize([2, 3, 4, 5, 6], 2).top(2)
[6, 5]
>>> sc.parallelize([10, 4, 2, 12, 3]).top(3, key=str)
[4, 3, 2]
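The `key=str` result can look surprising, so a plain-Python model of the same semantics may help. The sketch below is only an illustration and needs no Spark installation: `local_top` is a hypothetical helper (not part of PySpark), and the partitions are assumed to be ordinary in-memory lists. It keeps the `num` largest elements of each partition, then merges the candidates, which is one reasonable way to compute a distributed top N.

```python
import heapq
from functools import reduce


def local_top(partitions, num, key=None):
    """Hypothetical stand-in for RDD.top over a list of in-memory partitions."""
    # Step 1: keep only the num largest elements of each partition.
    per_partition = [heapq.nlargest(num, part, key=key) for part in partitions]

    # Step 2: merge candidate lists pairwise, again keeping the num largest.
    def merge(a, b):
        return heapq.nlargest(num, a + b, key=key)

    # heapq.nlargest returns its result sorted in descending order.
    return reduce(merge, per_partition, [])


print(local_top([[10, 4], [2, 12, 3]], 1))           # [12]
print(local_top([[2, 3], [4, 5, 6]], 2))             # [6, 5]
print(local_top([[10, 4], [2, 12, 3]], 3, key=str))  # [4, 3, 2]
```

With `key=str` the elements are compared as strings, so "4" > "3" > "2" > "12" > "10", which is why 12 and 10 drop out of the top three. The returned elements are the original integers, not their string keys.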