Friday, November 18, 2022

How to filter a DataFrame using PySpark

 There are several ways to filter the rows of a DataFrame. In the examples below, col is imported from pyspark.sql.functions:

  1. filter
    1. df.filter(df.ColumnName == VALUE)
    2. df.filter(col("ColumnName") == VALUE)
    3. df.filter((col("ColumnName1") == VALUE) | (col("ColumnName2") == VALUE))
    4. df.filter((col("ColumnName1") == VALUE) & (col("ColumnName2") == VALUE))
    5. df.filter(col("ColumnName") != VALUE)
