pyspark.pandas.DataFrame.first

DataFrame.first(offset: Union[str, pandas._libs.tslibs.offsets.DateOffset]) → pyspark.pandas.frame.DataFrame
- Select the first periods of time series data based on a date offset.
- When a DataFrame has dates as its index, this function selects the first few rows based on a date offset.

- Parameters
- offset : str or DateOffset
- The offset length of the data that will be selected. For instance, ‘3D’ will display all the rows having their index within the first 3 days. 
 
- Returns
- DataFrame
- A subset of the caller. 
 
- Raises
- TypeError
- If the index is not a DatetimeIndex.
 
- Examples

>>> index = pd.date_range('2018-04-09', periods=4, freq='2D')
>>> psdf = ps.DataFrame({'A': [1, 2, 3, 4]}, index=index)
>>> psdf
            A
2018-04-09  1
2018-04-11  2
2018-04-13  3
2018-04-15  4

Get the rows for the first 3 days:

>>> psdf.first('3D')
            A
2018-04-09  1
2018-04-11  2

Notice that data for the first 3 calendar days was returned, not the first 3 observed days in the dataset, and therefore data for 2018-04-13 was not returned.
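The calendar-window behaviour in the example above can be illustrated with a minimal standard-library sketch. It is not the pyspark.pandas implementation: the helper name `first_window` is hypothetical, the offset is assumed to be a fixed number of days, the rows are assumed to be sorted by date, and the end of the window is treated as exclusive.

```python
from datetime import date, timedelta

# Stand-in for the example frame: (index date, value of column A).
rows = [
    (date(2018, 4, 9), 1),
    (date(2018, 4, 11), 2),
    (date(2018, 4, 13), 3),
    (date(2018, 4, 15), 4),
]


def first_window(rows, days):
    """Keep rows dated within `days` calendar days of the first row.

    Hypothetical helper for illustration only. Assumes `rows` is sorted
    by date and that the window end is exclusive.
    """
    if not rows:
        return []
    # The window is measured from the first date, not counted in rows.
    cutoff = rows[0][0] + timedelta(days=days)
    return [(d, v) for d, v in rows if d < cutoff]


selected = first_window(rows, 3)
print(selected)
```

With a 3-day window starting on 2018-04-09, the cutoff is 2018-04-12, so only the rows for 2018-04-09 and 2018-04-11 are kept, even though the third observed row exists in the data.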