Below are different ways to add new columns to a DataFrame in PySpark (lit and col are imported from pyspark.sql.functions):
- withColumn and lit (constant column)
  - df.withColumn("NewColumnName", lit("default value for new column"))
- withColumn and col (derived column)
  - df.withColumn("NewColumnName", col("Column1") * col("Column2"))
- select
  - df.select(lit("default column value").alias("NewColumnName"), col("Column1"), col("Column2"))