Neptune comes with a simple queuing mechanism that can be used for remote job execution.
NOTE: To execute a Neptune job on a remote infrastructure, you need shared storage. This can be any storage that can be mounted as a file system on your operating system. Both the enqueuing environment and the execution environment must have access to the shared storage, which is used to store snapshots of your code.
When you want to execute a job on a remote infrastructure, use the `neptune enqueue` command.
In `neptune.yaml`, you should set the base path of Neptune storage to the shared storage.
When you enqueue an experiment, it is created in the `queued` state.
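A minimal sketch of such a configuration is shown below. The key name `storage` and the mount path are assumptions for illustration; check your Neptune version's configuration reference for the exact key.

```yaml
# neptune.yaml (sketch; the key name is an assumption)
# Point Neptune storage at a path that is mounted
# identically in both the enqueuing and execution environments.
storage: /mnt/shared/neptune/storage
```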
```
neptune enqueue

> Job enqueued, id: 78b7ba83-9bb8-4405-b0b7-793fae2b566b
>
> To browse the job, follow:
> https://YOUR_NEPTUNE_IP:YOUR_NEPTUNE_PORT/#dashboard/job/78b7ba83-9bb8-4405-b0b7-793fae2b566b
```
Then, on a remote host, you can run `neptune exec`:

```
neptune exec 78b7ba83-9bb8-4405-b0b7-793fae2b566b
```
To avoid logging in to the remote execution environment every time you want to execute a job, you can run a simple worker script that executes enqueued jobs automatically. To do that, use `neptune exec` with the `--resources` option.
Example of a worker script:
```bash
#!/bin/bash
while true; do
    neptune exec --resources gpu scikit-learn tensorflow
    sleep 1m
done
```
This script will run forever, executing jobs that were enqueued with requirements that are a subset of the resources declared in the worker's `--resources` option. For example, a job enqueued with `--requirements gpu tensorflow` will be executed by this worker script, but a job enqueued with `--requirements gpu keras` will not, because the worker did not declare `keras` as a resource.
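The subset rule above can be sketched in shell. This is an illustrative check only, not Neptune's actual matching implementation; the `matches` function is a hypothetical helper:

```shell
#!/bin/bash
# matches WORKER_RESOURCES JOB_REQUIREMENTS
# Succeeds (exit 0) only if every job requirement appears
# in the space-separated list of worker resources.
matches() {
    local worker="$1" requirements="$2" req
    for req in $requirements; do
        # Pad with spaces so we match whole words, not substrings.
        [[ " $worker " == *" $req "* ]] || return 1
    done
    return 0
}

worker="gpu scikit-learn tensorflow"
matches "$worker" "gpu tensorflow" && echo "executed"   # subset: runs
matches "$worker" "gpu keras"      || echo "skipped"    # keras not declared
```

Running the script prints `executed` followed by `skipped`, mirroring the two enqueue examples above.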