pyspark.pandas.DataFrame.to_records
DataFrame.to_records(index: bool = True, column_dtypes: Union[str, numpy.dtype, pandas.core.dtypes.base.ExtensionDtype, Dict[Union[Any, Tuple[Any, …]], Union[str, numpy.dtype, pandas.core.dtypes.base.ExtensionDtype]], None] = None, index_dtypes: Union[str, numpy.dtype, pandas.core.dtypes.base.ExtensionDtype, Dict[Union[Any, Tuple[Any, …]], Union[str, numpy.dtype, pandas.core.dtypes.base.ExtensionDtype]], None] = None) → numpy.recarray
- Convert DataFrame to a NumPy record array.
- Index will be included as the first field of the record array if requested.
- Note: This method should only be used if the resulting NumPy ndarray is expected to be small, as all the data is loaded into the driver’s memory.
- Parameters
- index : bool, default True
- Include index in resulting record array, stored in ‘index’ field or using the index label, if set. 
- column_dtypes : str, type, dict, default None
- If a string or type, the data type to store all columns. If a dictionary, a mapping of column names and indices (zero-indexed) to specific data types. 
- index_dtypes : str, type, dict, default None
- If a string or type, the data type to store all index levels. If a dictionary, a mapping of index level names and indices (zero-indexed) to specific data types. This mapping is applied only if index=True. 
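The dictionary forms of column_dtypes and index_dtypes described above can be sketched as follows. This is a minimal illustration written against plain pandas so that it runs without a Spark session; it assumes the pandas-on-Spark method behaves the same way as pandas' to_records on the same data, and the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# Assumption: plain pandas stands in for pyspark.pandas here.
df = pd.DataFrame({"A": [1, 2], "B": [0.5, 0.75]}, index=["a", "b"])

# Dictionary keyed by column name: only column "A" is converted.
by_name = df.to_records(column_dtypes={"A": "int32"})

# Dictionary keyed by zero-indexed column position: 1 refers to column "B".
by_position = df.to_records(column_dtypes={1: "float32"})

# Dictionary keyed by zero-indexed index level: 0 refers to the only level.
by_level = df.to_records(index_dtypes={0: "S2"})

print(by_name.dtype)
print(by_position.dtype)
print(by_level.dtype)
```

Columns and index levels that do not appear in a dictionary keep their original data types.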
 
- Returns
- numpy.recarray
- NumPy ndarray with the DataFrame labels as fields and each row of the DataFrame as entries. 
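As a rough illustration of what "labels as fields" and "rows as entries" mean for the returned object, the sketch below reads fields and rows from a small record array. It uses plain pandas as a stand-in, on the assumption that the pandas-on-Spark method returns an equivalent numpy.recarray.

```python
import numpy as np
import pandas as pd

# Assumption: plain pandas stands in for pyspark.pandas here.
df = pd.DataFrame({"A": [1, 2], "B": [0.5, 0.75]}, index=["a", "b"])
rec = df.to_records()

# The index and the column labels become the field names.
field_names = rec.dtype.names

# A field can be read as an attribute or by name.
col_a = rec.A
col_b = rec["B"]

# Each entry is one row of the DataFrame, index value first.
first_row = rec[0]

print(field_names)
print(col_a, col_b)
print(first_row)
```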
 
- See also
- DataFrame.from_records
- Convert structured or record ndarray to DataFrame. 
- numpy.recarray
- An ndarray that allows field access using attributes, analogous to typed columns in a spreadsheet. 
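The relationship between to_records and from_records can be sketched as a round trip. The example uses plain pandas as a stand-in and assumes the pandas-on-Spark counterparts behave the same way; passing index="index" to from_records is one way to restore the original index from the 'index' field.

```python
import pandas as pd

# Assumption: plain pandas stands in for pyspark.pandas here.
df = pd.DataFrame({"A": [1, 2], "B": [0.5, 0.75]}, index=["a", "b"])

# DataFrame -> record array, with the index stored in the 'index' field.
rec = df.to_records()

# Record array -> DataFrame, using the 'index' field as the index again.
restored = pd.DataFrame.from_records(rec, index="index")

print(restored)
```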
- Examples

>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({'A': [1, 2], 'B': [0.5, 0.75]},
...                   index=['a', 'b'])
>>> df
   A     B
a  1  0.50
b  2  0.75

>>> df.to_records()
rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
          dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])

The index can be excluded from the record array:

>>> df.to_records(index=False)
rec.array([(1, 0.5 ), (2, 0.75)],
          dtype=[('A', '<i8'), ('B', '<f8')])

Specification of dtype for columns is new in pandas 0.24.0. Data types can be specified for the columns:

>>> df.to_records(column_dtypes={"A": "int32"})
rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
          dtype=[('index', 'O'), ('A', '<i4'), ('B', '<f8')])

Specification of dtype for index is new in pandas 0.24.0. Data types can also be specified for the index:

>>> df.to_records(index_dtypes="<S2")
rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
          dtype=[('index', 'S2'), ('A', '<i8'), ('B', '<f8')])
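Because all data is loaded into the driver's memory, one possible precaution is to bound the number of rows before converting. The helper below, to_records_bounded, is hypothetical and not part of pandas or pandas-on-Spark; the sketch uses plain pandas as a stand-in, and with a pandas-on-Spark DataFrame the len() call would presumably trigger a count job.

```python
import pandas as pd


def to_records_bounded(df, max_rows=10_000, **kwargs):
    """Hypothetical helper: convert only if df has at most max_rows rows."""
    n_rows = len(df)
    if n_rows > max_rows:
        raise ValueError(
            f"refusing to convert {n_rows} rows; limit is {max_rows}"
        )
    # Remaining keyword arguments are passed through to to_records.
    return df.to_records(**kwargs)


# Assumption: plain pandas stands in for pyspark.pandas here.
df = pd.DataFrame({"A": range(5), "B": [x / 2 for x in range(5)]})

# Take a small slice first, then convert it without the index.
small = to_records_bounded(df.head(3), max_rows=3, index=False)
print(small)
```

The row limit is an arbitrary choice for illustration; a suitable value depends on the row width and the memory available to the driver.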