pyspark.sql.GroupedData
class pyspark.sql.GroupedData(jgd, df)

A set of methods for aggregations on a DataFrame, created by DataFrame.groupBy().

New in version 1.3.

Methods

agg(*exprs)
    Computes aggregates and returns the result as a DataFrame.

apply(udf)
    An alias of pyspark.sql.GroupedData.applyInPandas(); however, it takes a pyspark.sql.functions.pandas_udf() whereas pyspark.sql.GroupedData.applyInPandas() takes a Python native function.

applyInPandas(func, schema)
    Maps each group of the current DataFrame using a pandas UDF and returns the result as a DataFrame.

avg(*cols)
    Computes the average value for each numeric column for each group.

cogroup(other)
    Cogroups this group with another group so that cogrouped operations can be run on them.

count()
    Counts the number of records for each group.

max(*cols)
    Computes the maximum value for each numeric column for each group.

mean(*cols)
    Computes the average value for each numeric column for each group.

min(*cols)
    Computes the minimum value for each numeric column for each group.

pivot(pivot_col[, values])
    Pivots a column of the current DataFrame and performs the specified aggregation.

sum(*cols)
    Computes the sum for each numeric column for each group.