VS Code extension

A key tool for leveraging JuliaHub capabilities is the JuliaHub VSCode extension which provides a seamless way to execute julia scripts in the cloud directly from a local IDE. This tutorial highlights the key steps and features to best take advantage of its capabilities.

Getting started

To use JuliaHub through the VSCode extension, an account must first have been created on JuliaHub.

The Julia language support must also have been added to VSCode, as documented in the Julia extension documentation.

Then, the JuliaHub extension can be added (search for JuliaHub in the View > Extensions menu). After its installation, the JuliaHub extension logo should appear in the left menu panel. juliahub

Code development is made locally like for any other project. First clone the tutorial repo:

 git clone https://github.com/JuliaComputing/JuliaHubVSCodeExtensionTutorial.jl

Then open the tutorial folder in VSCode and ensure that the working directory points to the tutorial's root folder (using pwd() in Julia's REPL). To activate the project, press ] key in the REPL to use the package mode, then type the following command:

pkg> activate .

Accessing DataSets and Private Registries

The JuliaHub IDEs automatically configure your session to use JuliaHub as a DataSet store and have any private registries available. When using a local IDE, you must take a few additional steps to connect these features.

First, ensure that the Package Server is pointing to JuliaHub or your local instance. This may be configured either as an environment variable (JULIA_PKG_SERVER) or as the Julia: Package Server setting in the Julia VS Code extension. You may need to restart Julia for this to take affect, and confirm using Pkg.pkg_server():

julia> using Pkg

julia> Pkg.pkg_server()
"https://juliahub.com"

Note that if you have a dedicated company JuliaHub like example.juliahub.com, you want to use that here.

Registries

To access private registries, you need to add them:

julia> using Pkg

julia> Pkg.Registry.add()

Note that if you see a Warning: could not download https://juliahub.com/registries, you need to authenticate into JuliaHub. Do that with the JuliaHub package:

julia> Pkg.add("JuliaHub")

julia> using JuliaHub

julia> JuliaHub.authenticate()
JuliaHub.Authentication("https://juliahub.com", "user", *****)

DataSets

To connect to the JuliaHub DataSets, you must use both DataSets and the JuliaHubData packages:

julia> Pkg.add(["DataSets","JuliaHubData"])

julia> JuliaHubData.connect_juliahub_data_project()
JuliaHubData.JuliaHubDataProject [https://juliahub.com]:
  📄 # ...

If you see an error that JuliaHubData is not found in the project, manifest or registry, you must ensure you've added the private registries.

Executing a script

The code of this tutorial performs a simple simulation to estimate the value of π. The idea is to randomly generate 2-dimensional points over a square and then assess whether those points belongs or not to the inscribed circle (figure generated with plot-simul.jl).

simulating pi

The value of π can be inferred based on the following relationship between the areas of the square and circle:

\[\begin{aligned} \frac{n_○}{n_□} \approx \frac{A_○}{A_□} = \frac{\pi r^2}{4*r^2} = \frac{\pi}{4} \end{aligned}\]

The associated program to perform this estimation is in src/VSCodeExtension.jl:

function estimate_pi_single(n)
    n > 0 || throw(ArgumentError("number of iterations must be > 0, got $n"))
    incircle = 0
    for i = 1:n
        x, y = rand() * 2 - 1, rand() * 2 - 1
        if x^2 + y^2 <= 1
            incircle += 1
        end
    end
    return 4 * incircle / n
end

To use that code, a launcher script ./script_single.jl is included in the main folder and is adapted for a job launched on JuliaHub:

using VSCodeExtension
using JSON3

n = 1_000_000_000

stats = @timed begin
    estimate_pi_single(n)
end

@info "Finished computation. π estimate: " stats[:value]

results = Dict(
    :pi => stats[:value],
    :num_trials => n,
    :compute_time => stats[:time]
)

open("results.json", "w") do io
    JSON3.pretty(io, results)
end

ENV["RESULTS"] = JSON3.pretty(results)
ENV["RESULTS_FILE"] = "results.json"

The script can be submitted after configuring the JuliaHub extension menu: juliahub-menu-single

  • Package server: should already be set to default https://juliahub.com.
  • Bundle directory: refers to the path of the project or package (where Project.toml and Manifest.toml can be found). If not specified, the script will be executed in standalone, making the dependency packages or source code not available.
  • Julia script: the script to be executed.

Accessing results

The above script is like any other valid one for running locally, except for two environment variables with a special interpretation that get defined at the end:

  • RESULTS: an environment that accepts a json string (using for example JSON.json or JSON3.write). Values of that string are displayed in the in the logs accessible from the Actions menu associated with each job found in the Job List section.
  • RESULTS_FILE: a string path to a file containing the desired results. This results file is also accessible from the JuliaHub menu in the VSCode extension (Actions > Download results). The results file can also be downloaded on the online platform under the Compute > Job List section.

If multiple result files are desired, you can simply specify a directory as your RESULTS_FILE. JuliaHub will tar up everything that appears in that directory. After downloading it, the resulting tar file can then be unpacked using tar -xvf <tarball-file-name>

path_results = "$(@__DIR__)/results"
mkdir(path_results)
ENV["RESULTS_FILE"] = path_results
# ...save some files in path_results...

Distributed job

While the π simulation represents a trivial use case as it could have been easily run locally on any laptop, other jobs require higher compute resources. JuliaHub makes it seamless to execute computations on large clusters, thanks notably to the Distributed.jl functionalities.

using Distributed
function estimate_pi_distributed(n)
    n > 0 || throw(ArgumentError("number of iterations must be > 0, got $n"))
    incircle = @distributed (+) for i in 1:n
        x, y = rand() * 2 - 1, rand() * 2 - 1
        Int(x^2 + y^2 <= 1)
    end
    return 4 * incircle / n
end

To execute the job on a cluster, a distributed CPU job must be selected instead of the previous single-process CPU job. This will bring new options in the launcher configuration for the number of nodes (up to 100) and whether a Julia process should be launcher for each node or each vCPU.

By launching the script ./script_distributed.jl with the same configuration as the above, but this time using a distributed job with 10 nodes, the simulation completes in 1.35s, which is a 6.6 folds improvement over the previous single-process that took 8.94s.

Logging

As some jobs can take a long time to complete, being able to track their progress is a desirable feature. Such monitoring capabilities are made available simply through the usage of the common macros @info and @warn.

Live logs can be accessed by clicking 'Actions' -> 'Show logs' in JuliaHub's Job List section:

log-access

For example, in the context of an iterative job such as a simulation or the training of a model, the iteration iter and current evaluation measure metric can be logged by adding the following code within the loop:

@info "progress logging:" iter = i metric = stats[:value]

By displaying the desired tracked items using the above format, the variables iter and metric will be made available in the job logs, which has a built-in plotting capability. The plot below was captured from the logs of the script script-logging.jl and shows the estimate of π against the iteration id for various number of trials (ranging from 1K to 1G):

vscode-logging