Sagemaker In A Nutshell
Background
Machine learning can be quite complex, so what is machine learning?
Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems utilise to progressively improve their performance on a specific task. Machine learning algorithms build a mathematical model of sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task. (Courtesy Wikipedia).

Use Case
AWS SageMaker is an AWS managed service, built around Jupyter notebooks, that provides a complete machine learning environment.
For the machine learning process to be executed properly, we need to consider a few factors.
Raw data needs to be preprocessed into a usable format.
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.
Based on your use case, existing machine learning algorithms need to be selected or new ones created.
These algorithms run on certain frameworks; some of these frameworks are:
TensorFlow
Apache Spark ML Library
Theano
Torch
Caffe
Microsoft (CNTK) Cognitive Toolkit
Keras
scikit-learn
Accord.NET
Microsoft Azure ML Studio
AML (Amazon Machine Learning)
AWS SageMaker supports a subset of these frameworks, such as TensorFlow and Apache Spark.
A machine learning model needs to be created.
This model needs to be trained repeatedly on data and tuned via hyperparameters.
Once the model is trained, it needs to be deployed to a production environment for consumption. Training is done via the SageMaker "jobs" option.
The deployed, trained model is then accessed through an "endpoint" for consumption.
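To make the jobs-then-endpoint flow concrete, here is a minimal sketch using the (version 1) SageMaker Python SDK; the bucket, data paths and instance types are placeholders, not values from a real project:

import sagemaker
from sagemaker.amazon.amazon_estimator import get_image_uri
from sagemaker.session import s3_input

session = sagemaker.Session()
role = sagemaker.get_execution_role()                            # IAM role of the notebook instance
container = get_image_uri(session.boto_region_name, 'xgboost')   # built-in XGBoost image

# The estimator only describes the job; .fit() actually launches a training "job"
estimator = sagemaker.estimator.Estimator(
    container,
    role,
    train_instance_count=1,
    train_instance_type='ml.m4.xlarge',
    output_path='s3://my-bucket/output',                         # hypothetical output location
    sagemaker_session=session,
)
estimator.set_hyperparameters(objective='multi:softmax', num_class=10, num_round=10)
estimator.fit({'train': s3_input('s3://my-bucket/train', content_type='libsvm')})  # hypothetical data

# Deploying the trained model creates an "endpoint" for consumption
predictor = estimator.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')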
AWS SageMaker takes care of all of the above processes. Also note that each of the above services can be utilized individually if necessary, independently of the others. You can also bring your own pre-trained models, import them into SageMaker, and run them as endpoints. If a framework is not supported within SageMaker, you can build a container with said framework and libraries and run it on SageMaker; this makes SageMaker extremely powerful. This utilizes Amazon ECR (the AWS container registry).
An example of this would be:
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/scikit_bring_your_own
Further examples for Sagemaker:
https://github.com/awslabs/amazon-sagemaker-examples
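Sketched with the SageMaker Python SDK, hosting a pre-trained model that you bring yourself could look roughly like this; the model artifact path and the ECR image URI are placeholders for your own container and artifact:

import sagemaker
from sagemaker.model import Model

role = sagemaker.get_execution_role()

# Hypothetical: a model artifact trained elsewhere plus a serving container already pushed to ECR
model = Model(
    model_data='s3://my-bucket/pretrained/model.tar.gz',
    image='123456789012.dkr.ecr.us-east-1.amazonaws.com/my-serving-image:latest',
    role=role,
)

# Registers the model with SageMaker and hosts it behind an endpoint
model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')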

Notebook Instance
These are machine-learning-capable EC2 instances with enhanced GPUs/CPUs for complex scientific processing and calculations.
These are prefixed with ml, and they come in variations based on characteristics such as CPU processing power; they are classed as Standard, Compute Optimized, and Accelerated Computing instances. They include standard AWS security options such as IAM, VPC, and encryption. They also have the option of mapping your home directory to Git repositories, or cloning Git repositories into your home directory. An IAM role gives the notebook access to specific S3 buckets. All data lives on S3, hence the requirement for bucket access.
Note:
Ensure that the bucket is in the same region as the ML instance. (Otherwise it may throw an error).
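From inside a notebook, the execution role and an S3 bucket for data are typically picked up with the SageMaker Python SDK. A minimal sketch:

import sagemaker

role = sagemaker.get_execution_role()   # the IAM role attached to the notebook instance
session = sagemaker.Session()
bucket = session.default_bucket()       # a default bucket in the same region as the notebook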
Notebooks give a developer access to a Jupyter lab environment.
https://jupyter.org/
These environments support Spark Magic (Apache), TensorFlow (Google), Python, etc. You have multiple IDE kernels to choose from, based on the framework chosen for the algorithm. A notebook can be defined as a programming environment based on a machine learning IDE kernel, containing machine learning algorithm code. AWS includes sample algorithm code that can be utilized, or one can upload new algorithm code to these notebooks.
In AWS terms:
For training data exploration and preprocessing, Amazon SageMaker provides fully managed instances running Jupyter notebooks that include example code for common model training and hosting exercises.

Sample ML Algorithms
As stated previously, AWS SageMaker provides a multitude of sample ML algorithms for consumption, implemented as notebooks containing the algorithm code. When selected, these populate the home directory of your ML instance. The sample algorithm code can be utilized to train models from your transformed data sets. Finally, the trained model can be deployed to your production environment through the deployment process.

Containers and ML Algorithm Images
The final piece of the puzzle is that all AWS sample algorithms are Docker images.
For example, once a notebook based on an AWS sample algorithm is created, the notebook contains the following:
import boto3
from sagemaker.amazon.amazon_estimator import get_image_uri

container = get_image_uri(boto3.Session().region_name, 'xgboost')
This resolves to the URI of the algorithm's Docker image in Amazon ECR. Training parameters then need to be configured for the data set.
An example is as follows:
common_training_params = \
{
    "AlgorithmSpecification": {
        "TrainingImage": container,
        "TrainingInputMode": "File"
    },
    "RoleArn": role,
    "OutputDataConfig": {
        "S3OutputPath": bucket_path + "/" + prefix + "/xgboost"
    },
    "ResourceConfig": {
        "InstanceCount": 1,
        "InstanceType": "ml.m4.10xlarge",
        "VolumeSizeInGB": 5
    },
    "HyperParameters": {
        "max_depth": "5",
        "eta": "0.2",
        "gamma": "4",
        "min_child_weight": "6",
        "silent": "0",
        "objective": "multi:softmax",
        "num_class": "10",
        "num_round": "10"
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 86400
    },
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": bucket_path + "/" + prefix + "/train/",
                    "S3DataDistributionType": "FullyReplicated"
                }
            },
            "ContentType": "libsvm",
            "CompressionType": "None"
        },
        {
            "ChannelName": "validation",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": bucket_path + "/" + prefix + "/validation/",
                    "S3DataDistributionType": "FullyReplicated"
                }
            },
            "ContentType": "libsvm",
            "CompressionType": "None"
        }
    ]
}
As you can see from the example above, the hyperparameters in this case are:
"HyperParameters": {
    "max_depth": "5",
    "eta": "0.2",
    "gamma": "4",
    "min_child_weight": "6",
    "silent": "0",
    "objective": "multi:softmax",
    "num_class": "10",
    "num_round": "10"
},
In addition to that, the S3 bucket and ML instance parameters need to be configured. The parameters that need to be configured can be found in the algorithm's documentation. Once a machine learning job (or jobs) is executed, you are able to see the jobs in progress. All job details are logged to CloudWatch Logs.
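A rough sketch of submitting the parameters above as a training job with boto3 and checking on it afterwards; the job name is just a placeholder, and common_training_params is the dictionary built earlier:

import time
import boto3

sm_client = boto3.client('sagemaker')

# Training job names must be unique per account and region; this one is hypothetical
job_name = 'xgboost-demo-' + time.strftime('%Y-%m-%d-%H-%M-%S')

# Submit the job using the common parameters defined above
sm_client.create_training_job(TrainingJobName=job_name, **common_training_params)

# Poll the job status; detailed output also lands in CloudWatch Logs
status = sm_client.describe_training_job(TrainingJobName=job_name)['TrainingJobStatus']
print(status)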
Finally, multiple trained models can be utilized behind the same endpoint for A/B testing.
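As a sketch of what that might look like with boto3, two already-created SageMaker models (the model and endpoint names below are hypothetical) can share traffic behind one endpoint according to their variant weights:

import boto3

sm_client = boto3.client('sagemaker')

# Hypothetical: 'my-model-a' and 'my-model-b' are two trained models already registered in SageMaker
sm_client.create_endpoint_config(
    EndpointConfigName='my-ab-test-config',
    ProductionVariants=[
        {
            'VariantName': 'model-a',
            'ModelName': 'my-model-a',
            'InitialInstanceCount': 1,
            'InstanceType': 'ml.m4.xlarge',
            'InitialVariantWeight': 0.5,   # half of the traffic
        },
        {
            'VariantName': 'model-b',
            'ModelName': 'my-model-b',
            'InitialInstanceCount': 1,
            'InstanceType': 'ml.m4.xlarge',
            'InitialVariantWeight': 0.5,   # the other half
        },
    ],
)

sm_client.create_endpoint(EndpointName='my-ab-test-endpoint',
                          EndpointConfigName='my-ab-test-config')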