Jobs
JuliaHub.jl can be used both to submit new jobs and to inspect running or finished jobs.
- Submitting batch jobs
- Query, extend, kill
- Waiting on jobs
- Accessing job outputs
- Opening ports on batch jobs
Submitting batch jobs
A common use case for this package is to programmatically submit Julia scripts as batch jobs to JuliaHub, to start non-interactive workloads. In a nutshell, these are Julia scripts, together with an optional Julia environment, that get executed on the allocated hardware.
The easiest way to start a batch job is to submit a single Julia script, which can optionally include a Julia environment with the job. However, for more complex jobs (with multiple input files, for example), appbundles are likely more suitable.
Script jobs
The simplest job one can submit is a humble Julia script, together with an optional Julia environment (i.e. Project.toml, Manifest.toml, and/or Artifacts.toml). These jobs can be created with the JuliaHub.@script_str string macro, for inline instantiation:
JuliaHub.submit_job(
    JuliaHub.script"""
    @warn "Hello World!"
    """,
)
JuliaHub.Job: jr-xf4tslavut (Completed)
submitted: 2023-03-15T07:56:50.974+00:00
started: 2023-03-15T07:56:51.251+00:00
finished: 2023-03-15T07:56:59.000+00:00
files:
- code.jl (input; 3 bytes)
- code.jl (source; 3 bytes)
- Project.toml (project; 244 bytes)
- Manifest.toml (project; 9056 bytes)
outputs: "{}"
Alternatively, they can be created with the script function, which can load the Julia code from a script file:
JuliaHub.submit_job(
    JuliaHub.script("myscript.jl"),
)
JuliaHub.Job: jr-xf4tslavut (Completed)
submitted: 2023-03-15T07:56:50.974+00:00
started: 2023-03-15T07:56:51.251+00:00
finished: 2023-03-15T07:56:59.000+00:00
files:
- code.jl (input; 3 bytes)
- code.jl (source; 3 bytes)
- Project.toml (project; 244 bytes)
- Manifest.toml (project; 9056 bytes)
outputs: "{}"
The string macro also picks up the currently running environment (i.e. Project.toml, Manifest.toml, and Artifacts.toml files), which then gets instantiated on JuliaHub when the script is started. If necessary, this can be disabled by appending the noenv suffix to the string macro.
JuliaHub.script"""
@warn "Hello World!"
"""noenv
JuliaHub.BatchJob:
code = """
@warn "Hello World!"
"""
With the script function, you can also specify a path to a directory containing the Julia package environment, if necessary.
If an environment is passed with the job, it gets instantiated on the JuliaHub node, and the script is run in that environment. As such, any packages that are not available in the package registries or added via public Git URLs will not work. If that is the case, appbundles can be used instead to submit jobs that include private or local dependencies.
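As a minimal sketch of passing a separate environment directory to the script function: the file layout below is hypothetical, and the keyword name project_directory is an assumption that should be checked against the script reference for your JuliaHub.jl version. The dryrun option is used so that nothing is actually submitted.

```julia
using JuliaHub

# Hypothetical layout: the script lives in analysis/run.jl, and the
# Project.toml / Manifest.toml to use live in envs/analysis/.
# ASSUMPTION: the keyword is called `project_directory`.
app = JuliaHub.script(
    joinpath("analysis", "run.jl");
    project_directory = joinpath("envs", "analysis"),
)

# Inspect the resulting configuration without submitting the job.
JuliaHub.submit_job(app; dryrun = true)
```

Running this requires an authenticated JuliaHub session and the files named above to exist locally.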
Appbundles
A more advanced way of submitting a batch job is as an appbundle, which "bundles up" a whole directory and submits it together with the script. The Julia environment in the directory is also immediately added into the bundle.
An appbundle can be constructed with the appbundle function, which takes as arguments the path to the directory to be bundled up, and a script within that directory. This is meant to be used for project directories where you have your Julia environment in the top level of the directory or repository.
For example, suppose you have a script at the top level of your project directory, then you can submit a bundle as follows:
JuliaHub.submit_job(
    JuliaHub.appbundle(@__DIR__, "script.jl"),
    ncpu = 4, memory = 16,
)
JuliaHub.Job: jr-xf4tslavut (Completed)
submitted: 2023-03-15T07:56:50.974+00:00
started: 2023-03-15T07:56:51.251+00:00
finished: 2023-03-15T07:56:59.000+00:00
files:
- code.jl (input; 3 bytes)
- code.jl (source; 3 bytes)
- Project.toml (project; 244 bytes)
- Manifest.toml (project; 9056 bytes)
outputs: "{}"
The bundler looks for a Julia environment (i.e. Project.toml, Manifest.toml, and/or Artifacts.toml files) at the root of the directory. If the environment does not exist (i.e. the files are missing), one is created. When the job starts on JuliaHub, this environment is instantiated.
A key feature of the appbundle is that development dependencies of the environment (i.e. packages added with pkg> develop or Pkg.develop()) are also bundled up into the archive that gets submitted to JuliaHub (including any current, uncommitted changes). Registered packages are installed by the package manager during the standard environment instantiation, and their source code is not included in the bundle directly.
When the JuliaHub job starts, the working directory is set to the root of the unpacked appbundle directory. Keep this in mind especially when launching a script that is not at the root itself and that tries to open other files from the appbundle (e.g. with open). You can still use @__DIR__ to load files relative to the script, and include calls also work as expected (i.e. relative to the script file).
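The difference between the two base paths can be sketched as follows. The file names are hypothetical: the snippet imagines a script stored at scripts/run.jl inside the bundle that wants to read data/input.csv from the bundle root.

```julia
# Contents of a hypothetical scripts/run.jl inside an appbundle.
# On JuliaHub, pwd() is the root of the unpacked bundle, while @__DIR__ is
# the directory that contains this script file.
path_from_root = joinpath(pwd(), "data", "input.csv")
path_from_script = normpath(joinpath(@__DIR__, "..", "data", "input.csv"))

println("relative to the bundle root: ", path_from_root)
println("relative to this script:     ", path_from_script)
```

Building paths from @__DIR__ keeps the script working both locally and on JuliaHub, regardless of the directory it is launched from.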
Finally, a .juliabundleignore file can be used to exclude certain directories, by adding the relevant globs, similar to how .gitignore files work. In addition, .git directories are also automatically excluded from the bundle.
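As an illustration, the following sketch writes a .juliabundleignore file into a project directory before bundling. The patterns and the temporary directory are placeholders; the exact glob syntax supported by the bundler is assumed here to follow the .gitignore conventions mentioned above.

```julia
# Hypothetical patterns: a raw data directory, a scratch directory, and log files.
ignore_patterns = ["data/raw/", "scratch/", "*.log"]

# A temporary directory stands in for the real project directory.
project_dir = mktempdir()
ignore_path = joinpath(project_dir, ".juliabundleignore")

# One pattern per line, as in a .gitignore file.
write(ignore_path, join(ignore_patterns, "\n") * "\n")
print(read(ignore_path, String))
```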
Examining job configuration
The dryrun option to submit_job can be used to inspect the full job workload configuration that would be submitted to JuliaHub.
JuliaHub.submit_job(
    JuliaHub.script"""
    println("hello world")
    """,
    ncpu = 4, memory = 8,
    env = Dict("ARG" => "value"),
    dryrun = true
)
JuliaHub.WorkloadConfig:
application:
JuliaHub.BatchJob:
code = """
println("hello world")
"""
sha256(project_toml) = 62aca0c4b58726ab88c7beaa448e4ca3d51ba68c2d4f9c244b22e09dfe2919d1
sha256(manifest_toml) = 8a45e28aaeac067142b495ff5d7037cb795afe1c4ef7277c25359f2c45b73a1d
compute:
JuliaHub.ComputeConfig
Node: 3.5 GHz Intel Xeon Platinum 8375C
- GPU: no
- vCores: 4
- Memory: 16 Gb
- Price: 0.33 $/hr
Process per node: true
Number of nodes: 1
timelimit = 1 hour,
env:
ARG: value
Query, extend, kill
The package has functions that can be used to interact with running and past jobs. The jobs function can be used to list jobs, returning an array of Job objects.
julia> js = JuliaHub.jobs(limit=3)
3-element Vector{JuliaHub.Job}:
JuliaHub.job("jr-eezd3arpcj")
JuliaHub.job("jr-novcmdtiz6")
JuliaHub.job("jr-3eka6z321p")
julia> js[1]
JuliaHub.Job: jr-eezd3arpcj (Completed)
submitted: 2023-03-15T07:56:50.974+00:00
started: 2023-03-15T07:56:51.251+00:00
finished: 2023-03-15T07:56:59.000+00:00
files:
- code.jl (input; 3 bytes)
- code.jl (source; 3 bytes)
- Project.toml (project; 244 bytes)
- Manifest.toml (project; 9056 bytes)
outputs: "{}"
If you know the name of the job, you can also query the job directly with job.
julia> job = JuliaHub.job("jr-eezd3arpcj")
JuliaHub.Job: jr-eezd3arpcj (Completed)
submitted: 2023-03-15T07:56:50.974+00:00
started: 2023-03-15T07:56:51.251+00:00
finished: 2023-03-15T07:56:59.000+00:00
files:
- code.jl (input; 3 bytes)
- code.jl (source; 3 bytes)
- Project.toml (project; 244 bytes)
- Manifest.toml (project; 9056 bytes)
- outdir.tar.gz (result; 632143 bytes)
outputs: "{\"result_variable\": 1234, \"another_result\": \"value\"}\n"
julia> job.status
"Completed"
julia> JuliaHub.isdone(job)
true
Similarly, the kill_job function can be used to stop a running job, and the extend_job function can be used to extend the job's time limit.
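A short sketch of how these two functions might be combined with isdone. The job name is the one used in the examples above; passing the extension as a Dates.Period and the fact that both functions return an updated Job object are assumptions to verify against the extend_job and kill_job reference.

```julia
using JuliaHub
using Dates

job = JuliaHub.job("jr-eezd3arpcj")

if !JuliaHub.isdone(job)
    # ASSUMPTION: the extension can be given as a Dates.Period.
    job = JuliaHub.extend_job(job, Dates.Hour(2))

    # To stop the job instead of extending it:
    # job = JuliaHub.kill_job(job)
end
```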
Waiting on jobs
A common pattern in a script is to submit one or more jobs, and then wait until the jobs complete, to then process their outputs. isdone can be used to see if a job has completed.
julia> job = JuliaHub.job("jr-novcmdtiz6")
JuliaHub.Job: jr-novcmdtiz6 (Running)
submitted: 2023-03-15T07:56:50.974+00:00
started: 2023-03-15T07:56:51.251+00:00
finished: 2023-03-15T07:56:59.000+00:00
files:
- code.jl (input; 3 bytes)
- code.jl (source; 3 bytes)
- Project.toml (project; 244 bytes)
- Manifest.toml (project; 9056 bytes)
outputs: "{}"
julia> JuliaHub.isdone(job)
false
The wait_job function also provides a convenient way for a script to wait for a job to finish.
julia> job = JuliaHub.wait_job("jr-novcmdtiz6")
JuliaHub.Job: jr-novcmdtiz6 (Completed)
submitted: 2023-03-15T07:56:50.974+00:00
started: 2023-03-15T07:56:51.251+00:00
finished: 2023-03-15T07:56:59.000+00:00
files:
- code.jl (input; 3 bytes)
- code.jl (source; 3 bytes)
- Project.toml (project; 244 bytes)
- Manifest.toml (project; 9056 bytes)
outputs: "{}"
julia> JuliaHub.isdone(job)
true
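The submit-then-wait pattern described above could look roughly like this for several jobs. It is a sketch only: myscript.jl and the SEED environment variable are placeholders, and it assumes that wait_job accepts a Job object as well as a job name and returns the refreshed Job.

```julia
using JuliaHub

# Submit three runs of the same script, each with a different (hypothetical)
# SEED environment variable.
submitted = [
    JuliaHub.submit_job(
        JuliaHub.script("myscript.jl");
        env = Dict("SEED" => string(seed)),
    )
    for seed in 1:3
]

# Block until every job has finished, keeping the updated Job objects.
finished = [JuliaHub.wait_job(job) for job in submitted]

for job in finished
    println(job.status, ": ", job.results)
end
```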
Accessing job outputs
There are two ways a JuliaHub job can store outputs that are directly related to a specific job[1]:
- Small, simple outputs can be stored by setting the ENV["RESULTS"] environment variable. Conventionally, this is set to a JSON object, which then acts as a dictionary of key-value pairs.
- Files or directories can be uploaded by setting ENV["RESULTS_FILE"] to a local file path on the job. Note that directories are combined into a single tarball when uploaded.
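Inside the job script, the two mechanisms might be used as in the sketch below. The variable names and the output file are illustrative, and the JSON string is assembled by hand only to keep the snippet free of dependencies; a JSON package would normally be used for anything less trivial.

```julia
# Hypothetical computation performed by the job.
user_param = 2
output_value = user_param^2

# Small outputs: store a JSON object in the RESULTS environment variable.
ENV["RESULTS"] = """{"user_param": $(user_param), "output_value": $(output_value)}"""

# Larger outputs: write files into a directory and point RESULTS_FILE at it.
# A directory is uploaded as a single tarball.
outdir = mktempdir()
write(joinpath(outdir, "summary.txt"), "output_value = $(output_value)\n")
ENV["RESULTS_FILE"] = outdir
```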
The values set via the RESULTS environment variable can be accessed with the .results field of a Job object:
julia> job.results
"{\"user_param\": 2, \"output_value\": 4}\n"
As the .results string is often a JSON object, you can use the JSON.jl or JSON3.jl packages to easily parse it. For example:
julia> import JSON
julia> JSON.parse(job.results)
Dict{String, Any} with 2 entries:
"user_param" => 2
"output_value" => 4
When it comes to job result files, they can all be accessed via the .files field.
julia> job.files
4-element Vector{JuliaHub.JobFile}:
JuliaHub.job_file(JuliaHub.job("jr-novcmdtiz6"), :input, "code.jl")
JuliaHub.job_file(JuliaHub.job("jr-novcmdtiz6"), :source, "code.jl")
JuliaHub.job_file(JuliaHub.job("jr-novcmdtiz6"), :project, "Project.toml")
JuliaHub.job_file(JuliaHub.job("jr-novcmdtiz6"), :project, "Manifest.toml")
The job_files function can be used to filter down to specific file types.
julia> JuliaHub.job_files(job, :result)
JuliaHub.JobFile[]
And if you know the name of the file, you can also use the job_file function to get the JobFile object for that particular file directly.
julia> jobfile = JuliaHub.job_file(job, :result, "outdir.tar.gz")
To actually fetch the contents of a file, you can use the download_job_file function on the JobFile objects.
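For instance, the result tarball from the earlier job could be saved locally as sketched below. This assumes the two-argument form of download_job_file that takes a JobFile and a destination path; the local path is arbitrary.

```julia
using JuliaHub

job = JuliaHub.job("jr-eezd3arpcj")
jobfile = JuliaHub.job_file(job, :result, "outdir.tar.gz")

# ASSUMPTION: download_job_file(file, path) writes the file to `path`.
local_path = joinpath(mktempdir(), "outdir.tar.gz")
JuliaHub.download_job_file(jobfile, local_path)

println("downloaded ", filesize(local_path), " bytes to ", local_path)
```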
Opening ports on batch jobs
If supported for a given product and user, you can expose a single port on the job that serves an HTTP server, so that HTTP requests can be made to the job from the outside. This could be used to run "interactive" jobs that respond to user inputs, or to poll the job for data.
For example, the following job would run a simple Oxygen.jl-based server that exposes a simple API at the / path.
job = JuliaHub.submit_job(
    JuliaHub.script"""
    using Oxygen, HTTP
    PORT = parse(Int, ENV["PORT"])
    @get "/" function(req::HTTP.Request)
        return "success"
    end
    serve(; host="0.0.0.0", port = PORT)
    """,
    expose = 8080,
)
JuliaHub.Job: jr-xf4tslavut (Completed)
submitted: 2023-03-15T07:56:50.974+00:00
started: 2023-03-15T07:56:51.251+00:00
finished: 2023-03-15T07:56:59.000+00:00
hostname: afyux.launch.juliahub.app
files:
- code.jl (input; 3 bytes)
- code.jl (source; 3 bytes)
- Project.toml (project; 244 bytes)
- Manifest.toml (project; 9056 bytes)
outputs: "{}"
Note that, unlike a usual batch job, this job has a .hostname property, which points to the DNS hostname that can be used to access the server exposed by the job (see also the relevant reference section).
Once the job has started and the Oxygen-based server has started serving the page, you can perform HTTP.jl requests against the job with the JuliaHub.request function, which is a thin wrapper around the HTTP.request function that sets up the necessary authentication headers and constructs the full URL.
julia> JuliaHub.request(job, "GET", "/")
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 7

success"""
When the job is starting up or if the HTTP server in the job is not running, you can expect a 502 Bad Gateway HTTP response from the job domain.
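One way to cope with the 502 responses during startup is a small retry loop. The helper below is hypothetical and not part of JuliaHub.jl; it accepts any zero-argument function that returns an object with a .status field. The commented usage assumes that JuliaHub.request forwards keyword arguments such as status_exception to HTTP.request, which should be confirmed in the reference.

```julia
# Retry a request for as long as the job answers with 502 Bad Gateway.
# `send_request` is any zero-argument function returning an object with a
# `.status` field (such as an HTTP.Response).
function poll_until_ready(send_request; attempts::Integer = 10, delay::Real = 5)
    for attempt in 1:attempts
        response = send_request()
        # Anything other than 502 means the server inside the job has answered.
        response.status == 502 || return response
        attempt < attempts && sleep(delay)
    end
    error("job still returned 502 Bad Gateway after $(attempts) attempts")
end

# Usage against a running job (requires JuliaHub.jl and an authenticated session):
# response = poll_until_ready() do
#     JuliaHub.request(job, "GET", "/"; status_exception = false)
# end
```

Taking the request as a function keeps the helper independent of JuliaHub.jl, so it can be exercised locally with a stand-in before being pointed at a real job.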
If the server can serve an HTML page, then you can also access the job in the browser. The web UI will also have a "Connect" link, like for other interactive applications.
Jobs that expose ports may be priced differently per hour than batch jobs that do not open ports.
[1] You can also, for example, upload datasets. In that case, however, the resulting data is not, strictly speaking, related to a specific job.