What You Will Learn In This Lesson
In this lesson you’ll get an introduction to Machine Learning. You will learn about Generative AI and AWS DeepComposer. You’ll also learn how to build a custom Generative Adversarial Network.

AWS Account is Required
To complete the exercises in this course you will need an AWS Account ID.
To set up a new AWS Account ID, follow the directions here: How do I create and activate a new Amazon Web Services account?
You will be required to provide a payment method when you create the account. You can learn about which services are available at no cost in the AWS Free Tier documentation.
Will The Exercises Cost Anything?
Generate an Inference Exercise (Required)
Your AWS account includes free access for up to 500 inference jobs in the 12 months after you first use the AWS DeepComposer service. You can use one of these free instances to complete the exercise at no cost.
You can learn more about DeepComposer costs in the AWS DeepComposer pricing documentation.
Build a Custom GAN Demo (Optional)
Amazon SageMaker is a separate service and has its own service pricing and billing tier. To create the custom GAN, our instructor uses an instance type that is not covered in the Amazon SageMaker free tier. If you want to code along with the instructor and build a custom GAN on your own you may incur a cost.
Please note that creating your own custom GAN is completely optional. You are not required to do this exercise to complete the course.
You can learn more about SageMaker costs in the Amazon SageMaker pricing documentation.
AWS Mission
Put machine learning in the hands of every developer.
Why AWS?
- AWS offers the broadest and deepest set of AI and ML services with unmatched flexibility.
- You can accelerate your adoption of machine learning with Amazon SageMaker. Models that previously took months and required specialized expertise can now be built in weeks or even days.
- AWS offers the most comprehensive cloud offering optimized for machine learning.
- More machine learning happens at AWS than anywhere else.

AWS machine learning stack
More Relevant Enterprise Search With Amazon Kendra
- Natural language search with contextual search results
- ML-optimized index to find more precise answers
- 20+ Native Connectors to simplify and accelerate integration
- Simple API to integrate search and easily develop search applications
- Incremental learning through feedback to deliver up-to-date relevant answers
Online Fraud Detection with Amazon Fraud Detector
- Pre-built fraud detection model templates
- Automatic creation of custom fraud detection models
- One interface to review past evaluations and detection logic
- Models learn from past attempts to defraud Amazon
- Amazon SageMaker integration
Amazon CodeGuru

Better Insights And Customer Service With Contact Lens
- Identify common call types
- Identify recurring themes based on customer call feedback
- Alert supervisors when customers are having a poor experience
- Assist agents with a knowledge base to answer questions as they are being asked
How to Get Started?
- AWS DeepLens: deep learning and computer vision
- AWS DeepRacer and the AWS DeepRacer League: reinforcement learning
- AWS DeepComposer: Generative AI
- AWS ML Training and Certification: Curriculum used to train Amazon developers
- Partnerships with Online Learning Providers: Including this course and the Udacity AWS DeepRacer course!

ML Techniques and Generative AI
Machine Learning Techniques
- Supervised Learning: Models are presented with input data and the desired results. The model will then attempt to learn rules that map the input data to the desired results.
- Unsupervised Learning: Models are presented with datasets that have no labels or predefined patterns, and the model will attempt to infer the underlying structures from the dataset. Generative AI is a type of unsupervised learning.
- Reinforcement learning: The model or agent will interact with a dynamic world to achieve a certain goal. The dynamic world will reward or punish the agent based on its actions. Over time, the agent will learn to navigate the dynamic world and accomplish its goal(s) based on the rewards and punishments that it has received.
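As a minimal illustration of supervised learning, here is a toy sketch (the data and the threshold rule are hypothetical, chosen only for illustration) in which a model is presented with input data and desired results and learns a rule that maps one to the other:

```python
import numpy as np

# Hypothetical supervised-learning example: inputs (hours studied) paired
# with desired results (pass = 1, fail = 0). The "model" learns a simple
# threshold rule that maps inputs to results.
hours  = np.array([1.0, 2.0, 3.0, 6.0, 7.0, 8.0])   # input data
passed = np.array([0,   0,   0,   1,   1,   1])      # desired results (labels)

# Try candidate thresholds halfway between consecutive inputs and keep the
# one whose predictions best match the labels.
candidates = (hours[:-1] + hours[1:]) / 2
accuracies = [np.mean((hours > t).astype(int) == passed) for t in candidates]
threshold = candidates[int(np.argmax(accuracies))]

def predict(x):
    """Apply the learned rule to a new, unseen input."""
    return int(x > threshold)
```

A new input is then classified by the learned rule rather than by a hand-written one, which is the essence of supervised learning.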


Generative AI
Generative AI is one of the biggest recent advancements in artificial intelligence technology because of its ability to create something new. It opens the door to an entire world of possibilities for human and computer creativity, with practical applications emerging across industries, from turning sketches into images for accelerated product development to improving computer-aided design of complex objects. It pits two neural networks against each other to produce new and original digital works based on sample inputs.
AWS DeepComposer
AWS DeepComposer and Generative AI
AWS DeepComposer uses Generative AI, or specifically Generative Adversarial Networks (GANs), to generate music. GANs pit 2 networks, a generator and a discriminator, against each other to generate new content.
The best way we’ve found to explain this is to use the metaphor of an orchestra and conductor. In this context, the generator is like the orchestra and the discriminator is like the conductor. The orchestra plays and generates the music. The conductor judges the music created by the orchestra and coaches the orchestra to improve for future iterations. So the orchestra trains, practices, and tries to generate music, and then the conductor coaches it to produce more polished music.

AWS DeepComposer Workflow
- Use the AWS DeepComposer keyboard or play the virtual keyboard in the AWS DeepComposer console to input a melody.
- Use a model in the AWS DeepComposer console to generate an original musical composition. You can choose from jazz, rock, pop, symphony or Jonathan Coulton pre-trained models or you can also build your own custom genre model in Amazon SageMaker.
- Publish your tracks to SoundCloud or export MIDI files to your favorite Digital Audio Workstation (like Garage Band) and get even more creative.
Compose Music with AWS DeepComposer Models
Now that you know a little more about AWS DeepComposer including its workflow and what GANs are, let’s compose some music with AWS DeepComposer models. We’ll begin this demonstration by listening to a sample input and a sample output, then we’ll explore DeepComposer’s music studio, and we’ll end by generating a composition with a 4 part accompaniment.
- To get to the main AWS DeepComposer console, navigate to AWS DeepComposer. Make sure you are in the US East (N. Virginia) us-east-1 region.
- Once there, click on Get started
- In the left hand menu, select Music studio to navigate to the DeepComposer music studio
- To generate music you can use a virtual keyboard or the physical AWS DeepComposer keyboard. For this lab, we’ll use the virtual keyboard.
- To view sample melody options, select the drop down arrow next to Input
- Select Twinkle, Twinkle, Little Star
- Next, choose a model to apply to the melody by clicking Select model
- From the sample models, choose Rock and then click Select model
- Next, select Generate composition. The model will take the 1 track melody and create a multitrack composition (in this case, it created 4 tracks)
- Click play to hear the output
Now that you understand a little about the DeepComposer music studio and have created some AI-generated music, let’s move on to an exercise for generating an inference. There, you’ll have an opportunity to clone a pre-trained model to create your own AI-generated music!
Prerequisites/Requirements
- Chrome Browser: You will need to use the Chrome Browser for this exercise. If you do not have Chrome you can download it here: www.google.com/chrome
- AWS Account ID: You will need an AWS Account ID to sign into the console for this project. To set up a new AWS Account ID, follow the directions here: How do I create and activate a new Amazon Web Services account?
Your AWS account includes free access for up to 500 inference jobs in the 12 months after you first use the AWS DeepComposer service. You can use one of these free instances to complete the exercise at no cost.
You can learn more about DeepComposer costs in the AWS DeepComposer pricing documentation.
Access AWS DeepComposer console:
Click on DeepComposer link to get started: https://us-east-1.console.aws.amazon.com/deepcomposer
Enter the AWS account ID, IAM username, and password provided
Click Sign In

Note: You must access the console in the N. Virginia (us-east-1) AWS region. You can use the dropdown to select the correct region.

Get Started:
Click Music Studio from the left navigation menu

Click play to play the default input melody

Click Generate composition to generate a composition. An AI-generated composition will be created
Click play to play the new AI generated musical composition

Input melody:
Click record to start recording

Play the notes on the physical keyboard provided
Stop recording by clicking the record button again
Play the recorded music to verify the input. If you don’t like the recorded music, you can start recording again by clicking record

Select Jazz model from Model

Click Generate Composition to generate a composition based on the input melody you provided. Note: This step will take a few minutes to generate a composition inspired by the chosen genre
Click play to play the composition and enjoy the AI generated music
Try experimenting with different genres or sample input melodies
Model Training with AWS DeepComposer
AWS DeepComposer uses a GAN

Each iteration of the training cycle is called an epoch. The model is trained for thousands of epochs.
Loss Functions
In machine learning, the goal of iterating and completing epochs is to improve the output or prediction of the model. Any output that deviates from the ground truth is referred to as an error. The measure of an error, given a set of weights, is called a loss function. Weights represent how important an associated feature is to determining the accuracy of a prediction, and loss functions are used to update the weights after every iteration. Ideally, as the weights update, the model improves, making fewer and fewer errors. Convergence happens once the loss functions stabilize.
We use loss functions to measure how closely the output from the GAN models matches the desired outcome. Or, in the case of DeepComposer, how well DeepComposer’s output music matches the training music. Once the loss functions from the generator and discriminator converge, the GAN model is no longer learning, and we can stop its training.
We also measure the quality of the music generated by DeepComposer via additional quantitative metrics, such as drum pattern and polyphonic rate.

GAN loss functions have many fluctuations early on due to the “adversarial” nature of the generator and discriminator.
Over time, the loss functions stabilize to a point; we call this convergence. This convergence can be zero, but doesn’t have to be.
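The ideas above (error against the ground truth, a loss function, weight updates each epoch, and convergence) can be sketched with a tiny model. This is an illustrative example with made-up numbers, not DeepComposer’s training code:

```python
import numpy as np

# Fit y = w * x by gradient descent and watch the loss function stabilize.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x                # ground truth: the true weight is 2
w, lr = 0.0, 0.01          # initial weight and learning rate
losses = []
for epoch in range(200):   # each pass over the data is one epoch
    pred = w * x
    error = pred - y                    # deviation from the ground truth
    loss = np.mean(error ** 2)          # mean-squared-error loss function
    losses.append(loss)
    w -= lr * np.mean(2 * error * x)    # update the weight to reduce the loss
# Convergence: the loss has stabilized near zero and w is close to 2.
```

Here convergence happens to be at (nearly) zero loss; in a GAN the two adversarial losses typically converge to a stable non-zero level instead.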

How It Works
- Input melody captured on the AWS DeepComposer console
- Console makes a backend call to AWS DeepComposer APIs that triggers an execution Lambda.
- Book-keeping is recorded in DynamoDB.
- The execution Lambda performs an inference query to SageMaker which hosts the model and the training inference container.
- The query is run on the Generative AI model.
- The model generates a composition.
- The generated composition is returned.
- The user can hear the composition in the console.
- The user can share the composition to SoundCloud.
Training Architecture
How to measure the quality of the music we’re generating:
- We can monitor the loss function to make sure the model is converging
- We can check the similarity index to see how close the model is to mimicking the style of the data. When the graph of the similarity index smooths out and becomes less spiky, we can be confident that the model is converging
- We can listen to the music created by the generated model to see if it’s doing a good job. The musical quality of the model should improve as the number of training epochs increases.
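The “smoothing out” of a spiky metric curve can be checked numerically with a moving average. This is a hypothetical sketch of that idea, not DeepComposer’s actual monitoring code:

```python
import numpy as np

def moving_average(values, window=5):
    """Average each run of `window` consecutive values to reveal the trend."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# A noisy (spiky) metric curve that is actually trending downward.
spiky = [0.9, 0.5, 0.8, 0.4, 0.6, 0.3, 0.5, 0.2, 0.4, 0.2]
smooth = moving_average(spiky, window=3)
```

Plotting `smooth` instead of `spiky` makes it much easier to judge whether the curve has flattened out, i.e. whether the model is converging.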
Training architecture
- The user launches a training job from the AWS DeepComposer console by selecting hyperparameters and dataset filtering tags
- The backend consists of an API layer (API Gateway and Lambda) that writes the request to DynamoDB
- This triggers a Lambda function that starts the training workflow
- It then uses AWS Step Functions to launch the training job on Amazon SageMaker
- Status is continually monitored and updated in DynamoDB
- The console continues to poll the backend for the status of the training job and updates the results live so users can see how the model is learning

Challenges with GANs
- Clean datasets are hard to obtain
- Not all melodies sound good in all genres
- Convergence in GAN is tricky – it can be fleeting rather than being a stable state
- Complexity in defining meaningful quantitative metrics to measure the quality of music created
Generative AI
Generative AI has been described as one of the most promising advances in AI in the past decade by the MIT Technology Review.
Generative AI opens the door to an entire world of creative possibilities with practical applications emerging across industries, from turning sketches into images for accelerated product development, to improving computer-aided design of complex objects.
For example, Glidewell Dental is training a generative adversarial network adept at constructing detailed 3D models from images. One network generates images and the second inspects those images. This results in an image that has even more anatomical detail than the original teeth they are replacing.

Generative AI enables computers to learn the underlying pattern associated with a provided input (image, music, or text), and then they can use that input to generate new content. Examples of Generative AI techniques include Generative Adversarial Networks (GANs), Variational Autoencoders, and Transformers.
What are GANs?
GANs, a generative AI technique, pit 2 networks against each other to generate new content. The algorithm consists of two competing networks: a generator and a discriminator.
A generator is a convolutional neural network (CNN) that learns to create new data resembling the source data it was trained on.
The discriminator is another convolutional neural network (CNN) that is trained to differentiate between real and synthetic data.
The generator and the discriminator are trained in alternating cycles such that the generator learns to produce more and more realistic data while the discriminator iteratively gets better at learning to differentiate real data from the newly created data.
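The alternating training described above is driven by adversarial loss functions. Below is a minimal sketch of the standard GAN losses; this simplified form is assumed for illustration and is not DeepComposer’s implementation:

```python
import numpy as np

def discriminator_loss(real_preds, fake_preds):
    """Discriminator wants real data scored near 1 and synthetic data near 0.
    `real_preds`/`fake_preds` are its probability scores in (0, 1)."""
    real_preds = np.asarray(real_preds)
    fake_preds = np.asarray(fake_preds)
    return -np.mean(np.log(real_preds)) - np.mean(np.log(1.0 - fake_preds))

def generator_loss(fake_preds):
    """Generator wants the discriminator to score its synthetic data near 1."""
    return -np.mean(np.log(np.asarray(fake_preds)))
```

A generator that fools the discriminator (fake data scored near 1) has a low loss, while one whose output is easily spotted has a high loss; training alternates minimizing each loss in turn.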

Like the collaboration between an orchestra and its conductor
The best way we’ve found to explain this is to use the metaphor of an orchestra and conductor. An orchestra doesn’t create amazing music the first time they get together. They have a conductor who both judges their output, and coaches them to improve. So an orchestra trains, practices, and tries to generate polished music, and then the conductor works with them, as both judge and coach.
The conductor is both judging the quality of the output (were the right notes played with the right tempo) and at the same time providing feedback and coaching to the orchestra (“strings, more volume! Horns, softer in this part! Everyone, with feeling!”). Specifically to achieve a style that the conductor knows about. So, the more they work together the better the orchestra can perform.
The Generative AI that AWS DeepComposer teaches developers about uses a similar concept. We have two machine learning models that work together in order to learn how to generate musical compositions in distinctive styles.

Introduction to U-Net Architecture
Training a machine learning model using a dataset of Bach compositions
AWS DeepComposer uses GANs to create realistic accompaniment tracks. When you provide an input melody, such as “Twinkle, Twinkle, Little Star,” using the keyboard, U-Net will add three additional piano accompaniment tracks to create a new musical composition.
The U-Net architecture uses a publicly available dataset of Bach’s compositions for training the GAN. In AWS DeepComposer, the generator network learns to produce realistic Bach-style music while the discriminator uses real Bach music to differentiate between real music compositions and newly created ones.
Listen to sample of Bach’s music from the training dataset
Bach training sample 1
Bach training sample 2
Symphony-inspired composition created by U-Net architecture
Input melody
Generated composition
Apply your learning in AWS DeepComposer
Try generating a musical composition in Music studio
How U-Net based model interprets music
Music is written out as a sequence of human-readable notes. Experts have not yet discovered a way to translate this human-readable format into something computers can understand directly. Modern GAN-based models instead treat music as a series of images, and can therefore leverage existing techniques within the computer vision domain.
In AWS DeepComposer, we represent music as a two-dimensional matrix (also referred to as a piano roll) with “time” on the horizontal axis and “pitch” on the vertical axis. You might notice this representation looks similar to an image. A one or zero in any particular cell in this grid indicates if a note was played or not at that time for that pitch.
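The piano-roll representation can be sketched in a few lines. The note list for “Twinkle, Twinkle, Little Star” below is a hypothetical encoding (MIDI pitch numbers and time steps chosen for illustration), and the matrix sizes are assumptions:

```python
import numpy as np

def to_piano_roll(notes, n_pitches=128, n_steps=32):
    """Convert (time_step, pitch) note events into a binary piano-roll matrix.
    Rows are pitch (vertical axis), columns are time (horizontal axis);
    a 1 in a cell means that pitch was played at that time step."""
    roll = np.zeros((n_pitches, n_steps), dtype=np.int8)
    for t, p in notes:
        roll[p, t] = 1
    return roll

# Opening notes C C G G A A G as (time_step, MIDI pitch) pairs.
twinkle = [(0, 60), (2, 60), (4, 67), (6, 67), (8, 69), (10, 69), (12, 67)]
roll = to_piano_roll(twinkle)
```

Because the result is just a 2D grid of ones and zeros, image-oriented architectures like U-Net can process it the same way they process pictures.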
