Flower Species Prediction

Overview

In this example we will show you how easy it is to integrate a TensorFlow experiment with Neptune.

We will adapt the Quick Start from the TensorFlow library to show the ease of integration between the two tools. The example consists of a single Python file and uses TensorFlow to train and evaluate a simple neural network that predicts flower species (iris).

Integrating the code with the Neptune Client Library will allow us to run the code as a Neptune job, and to browse the experiment’s results in the Web UI. Neptune will store and display all metrics and graphs generated by TensorFlow in real time.

Dataset Information

Dataset: Iris.

Dataset size: 150 samples (120 in the training set and 30 in the test set).

Dataset description: The data set consists of 50 samples from each of three species of iris (Iris setosa, Iris virginica, and Iris versicolor). Each row contains the following data for each flower sample: sepal length, sepal width, petal length, petal width, and flower species. The flower species are represented as integers with 0 denoting Iris setosa, 1 denoting Iris versicolor, and 2 denoting Iris virginica.

Business purpose: Predict species of flowers.

Dataset credits: Ronald Fisher (1936) “The use of multiple measurements in taxonomic problems”.

From left to right: Iris setosa, Iris versicolor, and Iris virginica

Let’s Start Editing the Code!

To run the code from this example, you need to have Neptune and TensorFlow installed.

If you want to download the code that is ready to run in Neptune, it’s available on GitHub.

To create the Neptune job step by step, follow this example.

Create a directory named flower-species-prediction. Below you can see the full code for the neural network classifier obtained from the TensorFlow documentation. Save this code in a file named main.py in the created flower-species-prediction directory.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np

# Data sets
IRIS_TRAINING = "iris_training.csv"
IRIS_TEST = "iris_test.csv"

# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TRAINING,
    target_dtype=np.int,
    features_dtype=np.float32)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TEST,
    target_dtype=np.int,
    features_dtype=np.float32)

# Specify that all features have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]

# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3,
                                            model_dir="/tmp/iris_model")

# Fit model.
classifier.fit(x=training_set.data,
               y=training_set.target,
               steps=2000)

# Evaluate accuracy.
accuracy_score = classifier.evaluate(x=test_set.data,
                                     y=test_set.target)["accuracy"]
print('Accuracy: {0:f}'.format(accuracy_score))

# Classify two new flower samples.
new_samples = np.array(
    [[6.4, 3.2, 4.5, 1.5], [5.8, 3.1, 5.0, 1.7]], dtype=float)
y = list(classifier.predict(new_samples, as_iterable=True))
print('Predictions: {}'.format([p for p in y]))
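The predicted values are class indices. To interpret them as species names, you can map them back using the encoding described in the dataset section. The helper below is a small illustrative sketch, not part of the original example:

# Illustrative helper: map predicted class indices back to species names.
# The index-to-species encoding follows the dataset description above.
SPECIES = ['Iris setosa', 'Iris versicolor', 'Iris virginica']

predicted_names = [SPECIES[int(class_index)] for class_index in y]
print('Predicted species: {}'.format(predicted_names))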

To see the charts and graphs generated by TensorFlow in Neptune, all you need to do is add a few lines of code right after the imports:

from deepsense import neptune
context = neptune.Context()
context.integrate_with_tensorflow()

The first line imports the Neptune client library. The second creates a Neptune Context. The call to context.integrate_with_tensorflow() extends TensorFlow so that it communicates with Neptune.
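After this change, the top of main.py should look roughly like this (only the three Neptune lines are new; everything else is unchanged):

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np

from deepsense import neptune

# Create the Neptune context and enable the TensorFlow integration.
context = neptune.Context()
context.integrate_with_tensorflow()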

Once we have the source file, we need to prepare a short configuration file describing the job:

neptune.yaml

name: Iris
description: Flower Species Prediction
project: Iris
parameters:
  - name: data_dir
    description: Directory with input data
    type: string
    required: false
    default: data/
  - name: model_dir
    description: Directory to save model parameters, graph, etc.
    type: string
    required: false
    default: /tmp/iris_model

The configuration file contains the job’s name, its description, and the project that the job belongs to. We also define additional parameters to make our job more generic. The parameter data_dir allows users to change the location of the input datasets. The parameter model_dir represents the directory where data generated by TensorFlow will be stored. Save the configuration in a neptune.yaml file in the flower-species-prediction directory.

Now we can use the defined parameters in the code. Let’s replace the declarations of the paths to the datasets:

IRIS_TRAINING = "iris_training.csv"
IRIS_TEST = "iris_test.csv"

with

IRIS_TRAINING = context.params.data_dir + "iris_training.csv"
IRIS_TEST = context.params.data_dir + "iris_test.csv"

and the classifier declaration:

classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3,
                                            model_dir="/tmp/iris_model")

with

classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3,
                                            model_dir=context.params.model_dir)
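Note that building the paths with + relies on data_dir ending with a slash, which the default value data/ does. If you prefer not to depend on that, you could join the paths explicitly; this is just an optional sketch, not part of the original example:

import os

# Join the directory and file names regardless of a trailing slash in data_dir.
IRIS_TRAINING = os.path.join(context.params.data_dir, "iris_training.csv")
IRIS_TEST = os.path.join(context.params.data_dir, "iris_test.csv")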

The last thing needed to run our code is the datasets. Let’s download the training and test datasets into a data subdirectory of the directory where you will run this example.
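If you do not have the CSV files yet, they can be fetched programmatically. The sketch below assumes they are still hosted at download.tensorflow.org, the location used by the original TensorFlow tutorial, and saves them into data/:

import os
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

DATA_DIR = "data"
# Assumed download location, taken from the original TensorFlow tutorial.
BASE_URL = "http://download.tensorflow.org/data/"

if not os.path.isdir(DATA_DIR):
    os.makedirs(DATA_DIR)

for filename in ("iris_training.csv", "iris_test.csv"):
    destination = os.path.join(DATA_DIR, filename)
    if not os.path.isfile(destination):
        urlretrieve(BASE_URL + filename, destination)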

Your directory structure should look like this:

├── data
│   ├── iris_test.csv
│   └── iris_training.csv
├── main.py
└── neptune.yaml

Let’s Run the Job!

Now we can run the job using the neptune run command. From the flower-species-prediction directory, run:

$ neptune run

The job is now running. It will take a few seconds to complete.

The command’s output will contain a link to the job’s dashboard in the Web UI.

> Job enqueued, id:
>
> To browse the job, follow:
> https://[your Neptune IP address]/#dashboard/job/f458797d-1579-4f90-bf4c-b29efcef3377
>

The job should finish after a few seconds and report accuracy of 0.966667.

Browse the Job’s Dashboard

Monitoring metrics reported by TensorFlow

Let’s follow the link displayed by neptune run and see the job’s dashboard.

Metrics reported by TensorFlow

We can see all the metrics reported by TensorFlow, displayed as charts. We can also go to the TensorFlow tab and see the graphs:

Graph registered by TensorFlow

Summary

We saw how easy it is to integrate TensorFlow code with Neptune. Thanks to this, we could monitor the execution of our experiment in real time, with access to all the metrics and graphs reported by TensorFlow, and take advantage of the features provided by both tools.