In PySpark, a window defines a set of rows related to the current row, enabling calculations such as running totals or rankings across those rows. Windows support complex analysis by allowing computations over partitions of the data while preserving each individual row.
Depending on the operation you want to compute, you need to define a frame for the window. A frame specifies the range of rows used in each computation. If you don't define one, a default frame is applied.
Which default frame is used depends on whether ordering is defined. When ordering is not defined, an unbounded window frame
(rowFrame, unboundedPreceding, unboundedFollowing)
is used by default, so every computation sees the entire partition. When ordering is defined, a growing window frame
(rangeFrame, unboundedPreceding, currentRow)
is used by default, so each computation sees the partition from its start up to the current row, including any peer rows that tie on the ordering expression.
This can lead to unexpected results if the default frame is not what you intended. To avoid confusion and guarantee the results you expect, define the frame explicitly whenever you use a window function.