Red Wine Quality prediction using AzureML, AKS with TensorFlow Keras
What are we trying to do
Predict the quality of red wine with the TensorFlow Keras deep learning framework, given attributes such as fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, and alcohol.
We divide our approach into 2 major blocks:
- Building the Model in Azure ML
- Inference from the Model in Azure ML
Building the model in Azure ML has the following steps:
- Create the Azure ML workspace
- Upload data into the Azure ML Workspace
- Create the code folder
- Create the Compute Cluster
- Create the Model
- Create the Compute Environment
- Create the Estimator
- Create the Experiment and Run
- Register the Model
Inferencing from the model in Azure ML has the following steps:
- Create the Inference Script
- Create the Inference Dependencies
- Create the Inference Config
- Create the Inference Clusters
- Deploy the Model in the Inference Cluster
- Get the predictions
Please read the other post, Red Wine Quality prediction using AzureML, AKS, first. That post used classical machine learning techniques rather than deep learning. Here we accomplish the same task with the deep learning framework Keras. Most steps are unchanged from the machine learning method, so I highlight only the ones that change, which keeps this post easy to follow.
Step | Change / No Change |
---|---|
Create the Azure ML workspace | No Change |
Upload data into the Azure ML Workspace | No Change |
Create the code folder | No Change |
Create the Compute Cluster | No Change |
Create the Model | Change |
Create the Compute Environment | Change |
Create the Estimator | Change |
Create the Experiment and Run | No Change |
Register the Model | No Change |
Create the Model
The model that we create here makes use of a repeatable block made up of
- Dense Layer
- Dropout
- BatchNormalization
This block is repeated 3 times. The last layer is a Dense layer with a single neuron.
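As a rough check on the size of this network, the parameter count can be worked out by hand. The sketch below assumes the 11 input attributes listed at the top of this post and the 512 units per hidden layer used in train.py. It relies on the usual Keras bookkeeping: a Dense layer has one weight per input-output pair plus one bias per output, BatchNormalization keeps four values per feature (two of them trainable), and Dropout has no parameters. The helper names are made up for illustration.

```python
def dense_params(n_in, n_out):
    # Weight matrix plus one bias per output unit
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    # gamma, beta, moving mean, moving variance
    return 4 * n_features


n_inputs, units, n_blocks = 11, 512, 3

total = 0
n_in = n_inputs
for _ in range(n_blocks):
    # Dense + Dropout + BatchNormalization; Dropout adds no parameters
    total += dense_params(n_in, units) + batchnorm_params(units)
    n_in = units

# Final Dense layer with a single neuron
total += dense_params(n_in, 1)

# The moving mean and variance of each BatchNormalization layer are not trained
non_trainable = n_blocks * 2 * units
trainable = total - non_trainable

print(total, trainable)  # 538113 535041
```

Under these assumptions, almost all of the parameters sit in the two 512-to-512 Dense layers.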
%%writefile $folder_training_script/train.py
import argparse
import os
import numpy as np
import pandas as pd
import glob
from azureml.core import Run
# from utils import load_data
# let the user pass in one parameter: the folder where the dataset is mounted or downloaded
parser = argparse.ArgumentParser()
parser.add_argument('--data-folder', type=str, dest='data_folder', help='data folder mounting point')
args = parser.parse_args()
###
data_folder = os.path.join(args.data_folder, 'winedata')
print('Data folder:', data_folder)
red_wine = pd.read_csv(os.path.join(data_folder, 'winequality_red.csv'))
from tensorflow import keras
from tensorflow.keras import layers, callbacks
# Create training and validation splits
df_train = red_wine.sample(frac=0.7, random_state=0)
df_valid = red_wine.drop(df_train.index)
X = df_train.copy()
X = X.drop(columns = ["quality"])
df_train_stats = X.describe()
df_train_stats = df_train_stats.transpose()
def norm(x):
    # standardize each column with the training-set mean and standard deviation
    return (x - df_train_stats['mean']) / df_train_stats['std']
# Split features and target
X_train = df_train.drop('quality', axis=1)
X_valid = df_valid.drop('quality', axis=1)
X_train = norm(X_train)
X_valid = norm(X_valid)
y_train = df_train['quality']
y_valid = df_valid['quality']
early_stopping = callbacks.EarlyStopping(
    min_delta=0.001, # minimum amount of change to count as an improvement
    patience=20, # how many epochs to wait before stopping
    restore_best_weights=True,
)
input_shape=X_train.shape[1]
model = keras.Sequential([
    # the hidden ReLU layers
    layers.Dense(units=512, activation='relu', input_shape=[input_shape]),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    layers.Dense(units=512, activation='relu'),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    layers.Dense(units=512, activation='relu'),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    # the linear output layer
    layers.Dense(units=1)
])
model.compile(
    optimizer='adam',
    loss='mae',
)
history = model.fit(
    X_train, y_train,
    validation_data=(X_valid, y_valid),
    batch_size=256,
    epochs=500,
    callbacks=[early_stopping], # put your callbacks in a list
    verbose=0, # turn off training log
)
history_df = pd.DataFrame(history.history)
# Get the experiment run context
run = Run.get_context()
# np.float was removed in NumPy 1.24; the built-in float works with every version
run.log('min_val_loss', float(history_df['val_loss'].min()))
os.makedirs('outputs', exist_ok=True)
# note file saved in the outputs folder is automatically uploaded into experiment record
model.save('outputs/my_model')
run.complete()
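The early_stopping callback in train.py stops training once the validation loss has failed to improve by at least min_delta for patience epochs in a row. The following is a simplified pure-Python sketch of that rule, not Keras's actual implementation. The function name early_stopping_epoch and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, min_delta=0.001, patience=20):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses."""
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement large enough to count: remember it and reset the counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs without triggering the stop
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses; a short patience keeps the example small
losses = [1.0, 0.8, 0.7, 0.6995, 0.71, 0.72, 0.73]
stop_epoch, best_epoch = early_stopping_epoch(losses, min_delta=0.001, patience=3)
print(stop_epoch, best_epoch)  # 5 2
```

In this example the drop from 0.7 to 0.6995 is smaller than min_delta, so it does not count as an improvement. With restore_best_weights=True, as in train.py, the model would end up with the weights from the best epoch rather than those from the epoch where training stopped.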
Create the Compute Environment
The compute environment makes use of the curated environment AzureML-TensorFlow-2.2-GPU provided by AzureML.
from azureml.core import Environment
curated_env_name = 'AzureML-TensorFlow-2.2-GPU'
tf_env = Environment.get(workspace=ws, name=curated_env_name)
Create the Estimator
We change the estimator to use the curated environment by passing environment_definition = tf_env.
from azureml.train.estimator import Estimator
script_params = {
    '--data-folder': ds.as_mount()
}
# Create an estimator
estimator = Estimator(source_directory=folder_training_script,
                      script_params=script_params,
                      compute_target=compute_target, # Run the experiment on the remote compute target
                      environment_definition=tf_env,
                      entry_script='train.py')
Predict the data
Predicting or Inferencing from the model in Azure ML has the following steps:
- Create the Inference Script
- Create the Inference Dependencies
- Create the Inference Config
- Create the Inference Clusters
- Deploy the Model in the Inference Cluster
- Get the predictions
As in the previous section, we do not explain in detail the steps that are the same as in the machine learning post. We highlight only the steps that have changed from the previous implementation.
Step | Change / No Change |
---|---|
Create the Inference Script | Change |
Create the Inference Dependencies | Not required |
Create the Inference Config | Change |
Create the Inference Clusters | No Change |
Deploy the Model in the Inference Cluster | No Change |
Get the predictions | No Change |
Create the Inference Script
%%writefile $folder_training_script/score.py
import json
from tensorflow import keras
import numpy as np
from azureml.core.model import Model
# Called when the service is loaded
def init():
    global model
    # Get the path to the registered model file and load it
    model_path = Model.get_model_path('wine_model')
    model = keras.models.load_model(model_path)
# Called when a request is received
def run(raw_data):
    try:
        # Get the input data as a numpy array
        data = np.array(json.loads(raw_data)['data'])
        # Get a prediction from the model
        predictions = model.predict(data)
        log_txt = 'Data:' + str(data) + ' - Predictions:' + str(predictions)
        print(log_txt)
        # Return the predictions as any JSON serializable format
        return predictions.tolist()
    except Exception as e:
        result = str(e)
        # return error message back to the client
        return json.dumps({"error": result})
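The run function above expects a JSON document with a data key holding a list of rows, one row per wine. A minimal sketch of building such a request body is shown below. It assumes the features are sent in the same order as the attributes listed at the top of this post, and the values are illustrative only.

```python
import json

# Assumed feature order: the attribute list from the introduction
FEATURES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]

# Illustrative values for a single wine (not taken from a real request)
sample = [7.4, 0.70, 0.00, 1.9, 0.076, 11.0, 34.0, 0.9978, 3.51, 0.56, 9.4]

# Request body: a list of rows under the "data" key
raw_data = json.dumps({"data": [sample]})

# This mirrors the first step of run(): parse the body and pull out the rows
rows = json.loads(raw_data)["data"]
print(len(rows), len(rows[0]))  # 1 11
```

Note that train.py fits the network on normalized features, while score.py passes the request data to the model as-is. A client would presumably need to apply the same normalization, using the training-set mean and standard deviation, before sending values like these.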
Create the Inference Dependencies
We do not require this step, since the curated environment already provides the dependencies.
Create the Inference Config
from azureml.core.model import InferenceConfig
classifier_inference_config = InferenceConfig(source_directory='./winecode',
                                              entry_script="score.py",
                                              environment=tf_env)
Here we use the curated environment tf_env created earlier to prepare the Inference Config.
Conclusion
In this post, we have highlighted the code that changes when the model is built with the Keras deep learning framework instead of the machine learning method used in the earlier post.
Thank you for reading. Please do leave your comments.