Static Code Analysis in JuliaHub

At JuliaHub, we are committed to helping you write high-quality, secure, and efficient Julia code. That's why we've integrated static code analysis capabilities directly into our platform.

What is Static Code Analysis?

Static code analysis is like having an intelligent proofreader for your software. It's a method of examining your code for potential issues, bugs, and vulnerabilities without actually running it. By doing so, it helps you identify and address problems early in the development cycle, much like catching a typo before a document is published.

Benefits of Static Code Analysis

Improved Security and Code Quality: Proactively identifying vulnerabilities and errors leads to more secure and reliable software.
Early Issue Detection: Catching problems during development saves significant time and effort compared to discovering them during testing or after deployment.
Resource Efficiency: By reducing the need for extensive runtime testing, static analysis conserves computational resources and developer time.
Enforcing Standards: It helps ensure your code adheres to established coding standards and best practices, resulting in more consistent and maintainable codebases.

In essence, static code analysis is an invaluable tool for enhancing the overall quality, security, and efficiency of your software development process.

Semgrep in JuliaHub

JuliaHub uses Semgrep, a fast, open-source static analysis tool. Semgrep supports multiple programming languages, including Julia, and lets you create and reuse rules that match specific patterns or "anti-patterns" in your code. It operates primarily at the syntactic level, matching rules against the parsed structure of your code rather than executing it.
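
To make the idea of a rule more concrete, here is a minimal sketch of what a Semgrep rule targeting Julia might look like. The rule id, message, and pattern are hypothetical and are not taken from the JuliaHub ruleset; the sketch also assumes that Semgrep's Julia parser accepts the pattern as written.

```yaml
rules:
  - id: example-lowercase-global-const   # hypothetical rule id
    languages: [julia]
    severity: WARNING
    message: Global constant `$NAME` starts with a lowercase letter.
    patterns:
      # Match any constant definition and bind its name to $NAME.
      - pattern: const $NAME = $VALUE
      # Keep only matches whose name begins with a lowercase letter.
      - metavariable-regex:
          metavariable: $NAME
          regex: "^[a-z]"
```

Saved under a file name of your choice (say, rule.yaml), a rule like this could be tried locally with the Semgrep CLI, for example: semgrep --config rule.yaml example.jl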

How Semgrep is Used in JuliaHub

Semgrep is integrated into JuliaHub in two key ways to provide comprehensive code analysis:

On-demand for Projects: As a developer, you have the flexibility to initiate a Semgrep scan on your project code whenever you need to check its quality and security.
Periodically for Packages: Semgrep automatically runs on packages served on JuliaHub, ensuring continuous monitoring and maintenance of code quality for the broader Julia ecosystem.

Analysis Categories

Semgrep's analysis on JuliaHub covers a wide range of important categories, including:

Security: Identifying potential security vulnerabilities.
Correctness: Highlighting logic errors or incorrect implementations.
Best Practices: Ensuring adherence to recommended coding standards.
Math: Checking for common mathematical pitfalls.
Formatting: Assisting with consistent code style.

Issue Severity Levels

Semgrep reports issues at different levels of severity to help you prioritize your attention:

Error: A serious rule violation that requires immediate attention.
Warning: A notable rule violation that points to an area for improvement.
Note: A minor rule violation or an opportunity to enhance your code.
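
As a rough illustration of how these levels can drive prioritization, the following Julia sketch sorts a handful of findings so that errors come first. The findings, rule names, and field names are made up for the example and do not reflect a JuliaHub export format.

```julia
# Hypothetical findings; rule names and fields are illustrative only.
findings = [
    (rule = "example-lowercase-const", severity = "Warning", file = "example.jl", line = 3),
    (rule = "example-unused-import", severity = "Note", file = "utils.jl", line = 1),
    (rule = "example-hardcoded-secret", severity = "Error", file = "config.jl", line = 12),
]

# Lower rank means higher priority.
const SEVERITY_RANK = Dict("Error" => 1, "Warning" => 2, "Note" => 3)

# Order the findings from most to least severe.
prioritized = sort(findings; by = f -> SEVERITY_RANK[f.severity])

for f in prioritized
    println(rpad(f.severity, 8), f.file, ":", f.line, "  ", f.rule)
end
```

Mapping each level to a numeric rank keeps the ordering explicit and easy to adjust if a team decides to treat some levels differently.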

Comprehensive Ruleset

JuliaHub utilizes a robust set of Semgrep rules for its static analysis, designed to cover a broad spectrum of potential issues:

60+ Public and Proprietary Rules: This extensive collection includes 23 rules specifically targeting Common Weakness Enumerations (CWEs), addressing a wide array of potential security vulnerabilities.
Support for Custom Rulesets: For even greater flexibility, you can extend the existing rules with your own tailored rules to align with specific project requirements and internal coding practices.

These rules encompass a variety of critical areas, such as:

AWS-related vulnerabilities
HTTP/WebSockets interactions
JSON Web Token (JWT) handling
Julia Language-specific constructs
MbedTLS/OpenSSL usage
Operating System interactions
SQL queries
XML parsing

You can explore the public ruleset that JuliaHub utilizes by visiting: https://github.com/JuliaComputing/semgrep-rules-julia

Tutorial

Static Analysis of Projects

Let's start by creating a simple project.

Go to https://nightly.juliahub.dev/ui/Projects.

Click Create Project -> Generic Project, enter "Hello Scan" as the project name, then click Next -> Create.

Launch the project in the VSCode IDE.

Create a file called example.jl and paste the following code:

module Example

const pi = 3.14

export hello, domath

"""
    hello(who::String)

Return "Hello, \$who".
"""
hello(who::String) = "Hello, $who"

"""
    domath(x::Number)

Return x + 5.
"""
domath(x::Number) = x + 5

end

On the project page, open Static Analysis and click Start Scan.

Once the scan finishes, the scan results are displayed. The report shows one issue: a global variable whose name is lowercase.

To check the exact location of the issue, go to Locations and click the eye icon, which displays more details about the issue.

The flagged line is const pi = 3.14. Rename the constant so that the line reads const PI = 3.14, and save the file.
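
For reference, a condensed sketch of the module after the rename might look like the following. The docstrings are omitted for brevity, and the comment about the naming convention is an interpretation of the reported issue rather than the text of the rule itself.

```julia
module Example

# Uppercase name for the global constant, which is what the reported issue asks for.
const PI = 3.14

export hello, domath

# Return a greeting for `who`.
hello(who::String) = "Hello, $who"

# Return x + 5.
domath(x::Number) = x + 5

end
```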

Now run the scan again. It should show no issues.

Note: Only users with the Owner or Editor role on a project can run a scan.

Static Analysis of Registry

Once a registry is configured for static analysis, a scan is run for the latest version of every package in the registry. Users can view the results of the scan on the package page.

Figure: Viewing scan results for a package in a registry

Users can also browse all scans at the registry level. For example, to see results for a test registry called JuliaTeamDemoRegistry, go to Registries -> JuliaTeamDemoRegistry and click Code Scan.

You should see a histogram of severity counts over time.

Figure: Histogram of severity counts for a registry

Below this, you can see the scan results for all packages in the registry.

Figure: Detailed scan results for all packages in a registry