pyspark.sql.functions.log

pyspark.sql.functions.log(arg1: Union[ColumnOrName, float], arg2: Optional[ColumnOrName] = None) → pyspark.sql.column.Column

Returns the logarithm of the second argument, using the first argument as the base.

If only one argument is given, returns the natural logarithm of that argument.
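The argument convention can be modeled in plain Python as a rough sketch. The helper `log_reference` below is hypothetical and is not part of PySpark; it works on Python floats rather than columns and only illustrates which argument is the base.

```python
import math


def log_reference(arg1, arg2=None):
    """Hypothetical scalar model of the argument convention (not Spark code)."""
    if arg2 is None:
        # Single argument: natural logarithm of arg1.
        return math.log(arg1)
    # Two arguments: arg1 is the base, arg2 is the value (change of base).
    return math.log(arg2) / math.log(arg1)


log2_of_4 = log_reference(2.0, 4.0)  # base 2, value 4
ln_of_4 = log_reference(4.0)         # natural logarithm of 4
print(log2_of_4, ln_of_4)
```

Note that the base comes first here, which is the reverse of Python's own `math.log(x, base)`.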

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
arg1 : Column, str or float

Base of the logarithm when arg2 is given; otherwise the number whose natural logarithm (base e) is taken.

arg2 : Column, str or float

Number to calculate the logarithm of.

Returns
Column

Logarithm of the given value.

Examples

>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT * FROM VALUES (1), (2), (4) AS t(value)")
>>> df.select(sf.log(2.0, df.value).alias('log2_value')).show()
+----------+
|log2_value|
+----------+
|       0.0|
|       1.0|
|       2.0|
+----------+

With a single argument, the natural logarithm is computed:

>>> df.select(sf.log(df.value).alias('ln_value')).show()
+------------------+
|          ln_value|
+------------------+
|               0.0|
|0.6931471805599453|
|1.3862943611198906|
+------------------+