Neptune Architecture

Diagram of Neptune's architecture


Neptune Components

Neptune is a machine learning platform consisting of the following components:

Neptune Server stores jobs’ metadata in a database and makes it available for retrieval. It exposes two interfaces: a REST API for storing and loading jobs’ metadata, and a WebSocket API for real-time updates.
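The split between the two interfaces can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `MetadataStore` and its methods `save`, `load`, and `subscribe` are hypothetical names chosen for illustration, not Neptune Server's actual API. `save` and `load` stand in for the request/response REST path, while `subscribe` stands in for the push-style WebSocket path.

```python
from typing import Callable, Dict, List


class MetadataStore:
    """Toy in-memory stand-in for a metadata server (hypothetical, not Neptune's API)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, dict] = {}
        self._subscribers: List[Callable[[str, dict], None]] = []

    def save(self, job_id: str, metadata: dict) -> None:
        # Request/response path (REST-like): store a copy of the metadata...
        self._jobs[job_id] = dict(metadata)
        # ...then push the update to every subscriber (WebSocket-like).
        for notify in self._subscribers:
            notify(job_id, dict(metadata))

    def load(self, job_id: str) -> dict:
        # Request/response path (REST-like): return a copy of the stored metadata.
        return dict(self._jobs[job_id])

    def subscribe(self, callback: Callable[[str, dict], None]) -> None:
        # Push path: the callback is invoked on every later save().
        self._subscribers.append(callback)
```

In this model, a client that only needs a job's current state calls `load`, while a client that wants live updates registers a callback once and is notified on every change.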

Neptune Web UI is a web-based application for browsing and managing jobs executed with Neptune. It displays information about jobs, including real-time charts, channels’ values, execution status, and job parameters. It also lets the user manage registered jobs and browse the history of all jobs executed using Neptune. Neptune Web UI uses both the REST API and the WebSocket API exposed by Neptune Server.

Neptune CLI is a command-line interface for managing and displaying information about Neptune jobs. Among other things, it lets the user enqueue, run, and abort jobs. The CLI is also responsible for saving a snapshot of the job’s source code on Shared Storage before execution.

Neptune Client Library is a programming library that is imported and used in a job’s code. It exposes an API that lets the user communicate with Neptune from the job’s code. Neptune Client Library communicates with Neptune Server via the REST API and the WebSocket API. Currently, there are Client Libraries for Python 2.7 and 3.5, R, and Java.

External Components

The architecture diagram also contains external components:

User’s Desktop represents the machine where the user develops code and enqueues jobs. This machine must have both Neptune CLI and Neptune Client Library installed.

Compute Node represents the machine where Neptune jobs are run. It can be any machine with Neptune CLI and Neptune Client Library installed. Neptune can communicate with multiple Compute Nodes running at the same time. In many cases, the Compute Node is the same machine as User’s Desktop.

Shared Storage represents the storage layer shared between User’s Desktop and Compute Nodes. Neptune supports any storage that can be mounted on the local filesystem, for example NFS or HDFS. Shared Storage is used to save and retrieve a job’s source code, logs, and any data generated during job execution.
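Because Shared Storage is mounted on the local filesystem, a source code snapshot can be taken with ordinary file operations. The following is a minimal sketch, not Neptune CLI's actual implementation: the function name `snapshot_source` and the directory layout `<shared_root>/<job_id>/source` are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def snapshot_source(source_dir: Path, shared_root: Path, job_id: str) -> Path:
    """Copy a job's source tree to shared storage and return the snapshot path.

    The layout <shared_root>/<job_id>/source is a hypothetical convention.
    """
    target = shared_root / job_id / "source"
    # copytree creates missing parent directories and raises FileExistsError
    # if the target already exists, so an existing snapshot is never overwritten.
    shutil.copytree(source_dir, target)
    return target
```

A Compute Node that mounts the same storage can then read the snapshot from the returned path, regardless of which machine wrote it.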

Execution of a Neptune Job

Let’s track the flow of a Neptune job from the moment it is enqueued on User’s Desktop until its execution on a Compute Node is completed.

  1. The user writes Python code to be executed as a Neptune job.

  2. The user enqueues the job on User’s Desktop using Neptune CLI. Neptune CLI sends the job’s metadata to Neptune Server and saves the job’s source code on Shared Storage. From now on, the user can access the new job’s dashboard in a web browser, using Neptune Web UI.

  3. The user starts the previously enqueued job on Compute Node using Neptune CLI. Neptune CLI retrieves the job’s metadata from Neptune Server and the job’s source code from Shared Storage, spawns the job process and notifies Neptune Server that the job has started.

  4. During job execution, Neptune Client Library sends real-time job data to Neptune Server. All updates can be viewed live in the user’s browser.

  5. When the job is finished, the user gets a notification in the browser. The job’s logs and the generated data are saved on Shared Storage.
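The steps above imply a simple job lifecycle, which can be sketched as a small state machine. The state names and the transition table below are assumptions inferred from this description (including the CLI's ability to abort jobs); they are not taken from Neptune's code.

```python
from enum import Enum


class JobState(Enum):
    QUEUED = "queued"      # step 2: enqueued from User's Desktop
    RUNNING = "running"    # step 3: started on a Compute Node
    FINISHED = "finished"  # step 5: execution completed
    ABORTED = "aborted"    # stopped by the user via the CLI


# Allowed transitions (illustrative assumption, not Neptune's actual rules).
TRANSITIONS = {
    JobState.QUEUED: {JobState.RUNNING, JobState.ABORTED},
    JobState.RUNNING: {JobState.FINISHED, JobState.ABORTED},
    JobState.FINISHED: set(),
    JobState.ABORTED: set(),
}


def advance(current: JobState, target: JobState) -> JobState:
    """Return the new state, or raise ValueError if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Modelling the lifecycle this way makes invalid sequences explicit: for example, a finished job cannot be started again without being enqueued as a new job.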