pyspark.sql.functions.array_union
pyspark.sql.functions.array_union(col1: ColumnOrName, col2: ColumnOrName) → pyspark.sql.column.Column
Collection function: returns an array of the elements in the union of col1 and col2, without duplicates.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
    col1 : Column or str
        name of column containing the first array.
    col2 : Column or str
        name of column containing the second array.
Returns
    Column
        an array of values in the union of the two arrays.

Examples

>>> from pyspark.sql import Row
>>> from pyspark.sql.functions import array_union
>>> df = spark.createDataFrame([Row(c1=["b", "a", "c"], c2=["c", "d", "a", "f"])])
>>> df.select(array_union(df.c1, df.c2)).collect()
[Row(array_union(c1, c2)=['b', 'a', 'c', 'd', 'f'])]
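
The result in the example keeps the elements of c1 in their original order, followed by the elements of c2 that have not already appeared. The pure-Python sketch below mirrors that observed behaviour on ordinary lists so it can be tried without a Spark session. The helper name array_union_sketch is hypothetical and is not part of PySpark; it illustrates the semantics only and makes no claim about how Spark implements the function internally.

```python
from typing import Hashable, List, Sequence


def array_union_sketch(a: Sequence[Hashable], b: Sequence[Hashable]) -> List[Hashable]:
    """Order-preserving union of two sequences, without duplicates.

    Hypothetical helper for illustration; not part of PySpark.
    """
    seen = set()
    result = []
    # Walk the first sequence, then the second, keeping only the
    # first occurrence of each element.
    for item in list(a) + list(b):
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Same inputs as the documented example above.
print(array_union_sketch(["b", "a", "c"], ["c", "d", "a", "f"]))
# ['b', 'a', 'c', 'd', 'f']
```

The printed list matches the array returned in the Row from the documented example.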