In Spark, a column expression (e.g. current_date()) does not produce a result on its own. It is evaluated only once it is placed into a DataFrame as a column and the DataFrame is asked to show or return its data.
Consider the following examples:
spark.range(1) - creates a one-row DataFrame
.select(F.current_date()) - selects a column built with the function current_date
.show() - prints the DataFrame
from pyspark.sql import functions as F
spark.range(1).select(F.current_date()).show()
# +--------------+
# |current_date()|
# +--------------+
# |    2023-08-04|
# +--------------+
spark.sql("select current_date()") - creates both the DataFrame and the column from a SQL expression
.show() - prints the DataFrame
spark.sql("select current_date()").show()
# +--------------+
# |current_date()|
# +--------------+
# |    2023-08-04|
# +--------------+
.head() - returns the DataFrame's first row (as a pyspark.sql.types.Row object)
[0] - accesses the first element ("column") of that row
spark.sql("select current_date()").head()[0]
# datetime.date(2023, 8, 4)
In Databricks, display(df) should also work, but it takes a DataFrame rather than a column expression, so you must create the DataFrame first, e.g.:
display(spark.sql("select current_date()"))