Each region is a separate geographic area, completely independent and isolated from the other regions,
which helps achieve the greatest possible fault tolerance and stability
Communication between regions is across the public Internet
Each region has multiple Availability Zones
Each AZ is physically isolated and geographically separated from the others, and is designed as an
independent failure zone
AZs are connected with low-latency private links (not public internet)
Edge locations are locations maintained by AWS through a worldwide network of data centers for the
distribution of content to reduce latency.
Consolidated Billing
Paying account with multiple linked accounts
Paying account is independent and should be used only for billing purposes
Paying account cannot access resources of the other accounts unless given access explicitly through
cross-account roles
All linked accounts are independent, with a soft limit of 20 linked accounts
provides one consolidated bill for all the linked AWS accounts
provides Volume pricing discount for usage across the accounts
allows unused Reserved Instances to be applied across the group
Free tier is not applicable across the accounts
Tags & Resource Groups
are metadata, specified as key/value pairs, associated with AWS resources
are for labelling purposes and help in managing and organizing resources
can be inherited by resources created from Auto Scaling, CloudFormation, Elastic Beanstalk,
etc.
can be used for
Cost allocation to categorize and track the AWS costs
Conditional Access Control policies to define permissions that allow or deny access to resources
based on tags
Resource Group is a collection of resources that share one or more tags
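As a rough sketch of how tags drive this (the instance ID and tag values below are made up), tags can be attached and then used to filter resources from the AWS CLI:
aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key=Environment,Value=Production Key=Team,Value=Web
aws ec2 describe-instances --filters "Name=tag:Environment,Values=Production"
A Resource Group in the console is essentially a saved version of a tag-based filter like the one above.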
IDS/IPS
Promiscuous mode is not allowed, as AWS and the hypervisor will not deliver any traffic to an instance that
is not specifically addressed to it
IDS/IPS strategies
Host Based Firewall – Forward Deployed IDS where the IDS itself is installed on the instances
Host Based Firewall – Traffic Replication, where IDS agents installed on instances
send/duplicate the traffic to a centralized IDS system
In-Line Firewall – Inbound IDS/IPS Tier (like a WAF configuration) which identifies and drops
suspect packets
DDOS Mitigation
Minimize the Attack surface
use ELB/CloudFront/Route 53 to distribute load
maintain resources in private subnets and use Bastion servers
Scale to absorb the attack
scaling helps buy time to analyze and respond to an attack
auto scaling with ELB to handle increase in load to help absorb attacks
CloudFront, Route 53 inherently scales as per the demand
Safeguard exposed resources
use Route 53 aliases to hide source IPs and use private DNS
use CloudFront geo restriction and Origin Access Identity
use WAF as part of the infrastructure
Learn normal behavior (IDS/WAF)
VPC
helps define a logically isolated, dedicated virtual network within AWS
provides control of IP addressing using CIDR block from a minimum of /28 to maximum of /16 block size
supports IPv4 and IPv6 addressing
the primary CIDR block cannot be changed once the VPC is created, but
the VPC can be extended by associating secondary IPv4 CIDR blocks with it
Components
Internet gateway (IGW) provides access to the Internet
Virtual private gateway (VGW) provides access to the on-premises data center through VPN and Direct
Connect connections
VPC can have only one IGW and VGW
Route tables determine where network traffic from subnet is directed
Ability to create subnets within the VPC CIDR block
A Network Address Translation (NAT) server provides outbound Internet access for EC2 instances
in private subnets
Elastic IP addresses are static, persistent public IP addresses
Instances launched in the VPC will have a Private IP address and can have a Public or an Elastic IP
address associated with them
Security Groups and NACLs help define security
Flow logs – Capture information about the IP traffic going to and from network interfaces in your
VPC
Tenancy option for instances
shared, by default, allows instances to be launched on shared tenancy
dedicated allows instances to be launched on a dedicated hardware
Route Tables
defines rules, termed as routes, which determine where network traffic from the subnet would
be routed
Each VPC has a Main Route table, and can have multiple custom route tables created
Every route table contains a local route that enables communication within a VPC which cannot
be modified or deleted
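For illustration only (the CIDR blocks and resource IDs are placeholders), the components above could be wired together from the AWS CLI roughly like this:
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-0abc1234 --cidr-block 10.0.1.0/24
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-0abc1234 --vpc-id vpc-0abc1234
aws ec2 create-route --route-table-id rtb-0abc1234 --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0abc1234
The last command adds a default route to the IGW, which is what makes a subnet associated with that route table a public subnet.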
EC2
provides scalable computing capacity
Features
Virtual computing environments, known as EC2 instances
Preconfigured templates for EC2 instances, known as Amazon Machine Images (AMIs), that
package the bits needed for the server (including the operating system and additional software)
Various configurations of CPU, memory, storage, and networking capacity for your instances,
known as Instance types
Secure login information for your instances using key pairs (public-private keys where private is
kept by user)
Storage volumes for temporary data that’s deleted when you stop or terminate your instance,
known as Instance store volumes
Persistent storage volumes for data using Elastic Block Store (EBS)
Multiple physical locations for your resources, such as instances and EBS volumes,
known as Regions and Availability Zones
A firewall to specify the protocols, ports, and source IP ranges that can reach your instances
using Security Groups
Static IP addresses, known as Elastic IP addresses
Metadata, known as tags, can be created and assigned to EC2 resources
Virtual networks that are logically isolated from the rest of the AWS cloud, and can optionally
connect to on premises network, known as Virtual private clouds (VPCs)
Amazon Machine Image
template from which EC2 instances can be launched quickly
does NOT span across regions, and needs to be copied to other regions
can be shared with other specific AWS accounts or made public
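Since an AMI does not span regions, it has to be copied explicitly; a hedged example (the image ID, regions and name are placeholders):
aws ec2 copy-image --source-region us-east-1 --source-image-id ami-0123456789abcdef0 --region eu-west-1 --name "my-copied-ami"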
Data Pipeline
orchestration service that helps define data-driven workflows to automate and schedule regular data
movement and data processing activities
integrates with on-premises and cloud-based storage systems
allows scheduling, retry, and failure logic for the workflows
EMR
is a web service that utilizes a hosted Hadoop framework running on the web-scale infrastructure of EC2
and S3
launches all nodes for a given cluster in the same Availability Zone, which improves performance as it
provides higher data access rate
seamlessly supports Reserved, On-Demand and Spot Instances
consists of a Master node for management and Slave nodes, which comprise Core nodes that hold data and
Task nodes that only perform tasks
is fault tolerant for slave node failures and continues job execution if a slave node goes down
but does not automatically provision another node to take over for failed slaves
supports Persistent and Transient cluster types
Persistent which continue to run
Transient which terminates once the job steps are completed
supports EMRFS which allows S3 to be used as a durable HA data storage
Kinesis
enables real-time processing of streaming data at massive scale
provides ordering of records, as well as the ability to read and/or replay records in the same order to
multiple Kinesis applications
data is replicated across three data centers within a region and preserved for 24 hours by default, which can
be extended to 7 days
streams can be scaled using multiple shards, based on the partition key, with each shard providing the
capacity of 1MB/sec data input and 2MB/sec data output with 1000 PUT requests per second
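A minimal sketch of writing to a stream, assuming a stream named my-stream already exists; the partition key determines which shard receives the record (recent AWS CLI versions expect the data to be base64 encoded):
aws kinesis put-record --stream-name my-stream --partition-key user-42 --data "click-event-payload"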
Kinesis vs SQS
real-time processing of streaming big data vs reliable, highly scalable hosted queue for storing
messages
ordered records, as well as the ability to read and/or replay records in the same order vs no
guarantee on data ordering (with the standard queues before the FIFO queue feature was released)
data storage up to 24 hours, extended to 7 days vs up to 14 days, can be configured from 1
minute to 14 days but cleared if deleted by the consumer
supports multiple consumers vs single consumer at a time and requires multiple queues to
deliver message to multiple consumers
SQS
extremely scalable queue service and potentially handles millions of messages
helps build fault tolerant, distributed loosely coupled applications
stores copies of the messages on multiple servers for redundancy and high availability
guarantees At-Least-Once Delivery, but does not guarantee Exactly-Once Delivery, which might result
in duplicate messages (no longer true with the introduction of FIFO queues)
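As a rough illustration of the queue model (the queue URL is hypothetical), a producer sends, a consumer receives and then explicitly deletes the message:
aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue --message-body "order-created"
aws sqs receive-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue
aws sqs delete-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue --receipt-handle <receipt-handle-from-receive>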
AWS: Amazon Web Services (AWS) is a collection of cloud computing services and applications that offers
flexible, reliable, easy-to-use and cost-effective solutions
Cloud Computing: It is an internet-based computing service in which various remote servers are networked to
allow centralized data storage and online access to computer services and resources
Types of cloud:
Here we have the list of topics if you want to jump right into a specific one:
Cloud Computing
Need AWS Lambda?
What is AWS Lambda?
How does it work?
Building Blocks
Using AWS Lambda with S3
Why not other compute services but Lambda?
AWS Lambda VS AWS EC2
AWS Lambda VS AWS Elastic Beanstalk
Before proceeding towards AWS Lambda, let’s understand its domain cloud computing where AWS originated
from.
Cloud computing is simply a practice of using a network of remote servers hosted on the Internet to store,
manage, and process data, rather than a local server or a personal computer. For more information on cloud
computing, you can refer to this informative blog on Cloud computing.
But why are we talking about AWS when there are numerous cloud computing vendors? Here are some of the
major players in the marketplace when it comes to cloud computing.
If we look into the stats, currently AWS is a pioneer in providing cloud services, as is evident from the google trends
graph below:
We are going to talk about AWS Lambda today, which is a very reliable serverless compute service.
But why do we need AWS Lambda, when we already have 2 other reliable computing services?
Don’t worry, we will answer all your questions regarding AWS Lambda today in this blog.
Need of AWS Lambda
As you all know how cloud works, so let’s take an example and understand the need of AWS Lambda. Let’s take an
example of a website, suppose this website is hosted on AWS EC2, in which currently 90-100 users are reading a
blog, and in the back-end, admin user uploads 10 videos on the website for processing.
This increases the load on the server and triggers the auto scaling feature, so EC2 provisions more
instances for the task, since hosting and back-end processing are both taking place on the same instances. Auto
scaling takes a long time to provision more instances, which slows the website down when the initial spike in
load is received.
Suppose instead that, while users are reading a blog on the website and the admin uploads those 10 videos, the
website forwards the task of video processing to separate instances. This isolates the website from the video
processing, so website performance is not impacted. But video processing still takes a lot of time when the load
increases, because auto scaling on EC2 takes time.
We needed a stateless system to solve this problem, and AWS did exactly this with the launch of AWS Lambda!
Lambda Function: the custom code and libraries that you have created make up a Function.
Event Source: any AWS or custom service that triggers your function, helping to execute its logic.
Log Streams: Lambda monitors your function automatically and you can view its metrics directly in
CloudWatch, but you can also add custom logging statements to your function to analyze the flow of execution
and performance and check that it is working properly.
Using AWS Lambda with S3
In this section, we will see how AWS S3 can be used with AWS Lambda. Let's take an example where a user
uploads an image to the website to have it resized.
The user creates a Lambda function.
User uploads the code to the Lambda function.
Then uploads the image from the Website in the S3 bucket as an object.
After receiving the object, our S3 bucket triggers the Lambda Function.
Then the Lambda Function does its job by resizing the image in the back-end and sends a successful
completion email through SES.
Pseudo Code for Lambda function:
<code to resize image>
<once the image is resized, send the email for successful operation through SES>
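As a hedged sketch of exercising such a setup from the CLI (the bucket and function names are assumptions, not part of the example above):
aws s3 cp photo.jpg s3://my-upload-bucket/photo.jpg
aws lambda invoke --function-name resize-image out.json
The first command triggers the S3 event notification on upload; the second invokes the function manually for testing and writes its response to out.json.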
So, from this example, you should be able to figure out how AWS Lambda performs its tasks in the back-end.
Check the diagram below for a summary:
AWS Lambda is one of the compute services provided by Amazon. If other compute services, like AWS EC2 and
AWS Elastic Beanstalk, can also execute a task, why should we choose Lambda in their place? Let's try to
understand this:
AWS EC2 VS AWS Lambda
AWS Elastic Beanstalk VS AWS Lambda
AWS Lambda VS AWS EC2
As we know, in AWS EC2 one can host a website as well as run and execute back-end code.
AWS Lambda is a Platform as a Service (PaaS) that provides a platform to run and execute your back-end code,
whereas AWS EC2 is an Infrastructure as a Service (IaaS) that provides virtualized computing resources.
Lambda offers no flexibility to log in to compute instances or to choose a customized operating system or
language runtime, whereas EC2 offers the flexibility to choose a variety of instances, custom operating systems,
network and security patches, etc.
EC2 is a stateful system, whereas Lambda is a stateless system.
AWS Elastic Beanstalk VS AWS Lambda
Elastic Beanstalk lets you deploy and manage applications on the AWS Cloud without worrying about the
infrastructure that runs those applications, whereas AWS Lambda is used only for running and executing your
back-end code and cannot be used to deploy an application.
Elastic Beanstalk gives you the freedom to select AWS resources, such as an EC2 instance type optimal for your
application, whereas in Lambda you cannot select the AWS resources (such as a type of EC2 instance); Lambda
provides resources based on your workload.
Now that we understand how AWS Lambda plays its part, let’s take a sneak peek on its pros and cons.
Benefits of AWS Lambda
Due to its serverless architecture, one need not worry about provisioning or managing servers.
No need to set up any Virtual Machine (VM).
Limitations of AWS Lambda
There are several limitations of AWS Lambda due to its hardware as well as its architecture:
The maximum execution duration per request was originally limited to 300 seconds (5 minutes) and has
since been raised to 15 minutes.
In the case of hardware, the maximum disk space provided for the runtime environment is 512 MB, which
is very little.
Its memory allocation ranges from 128 MB to 1536 MB.
Event request body cannot exceed 128 KB.
Its code execution timeout was originally capped at only 5 minutes (now 15 minutes, as noted above).
Lambda functions write their logs only to CloudWatch, which is the only tool available in order to monitor
or troubleshoot your functions.
So, these are the limitations of AWS Lambda which are basically there to ensure that the services are used as
intended.
Now that we've discussed its limitations, let's move forward and look at its most common use cases.
Use Cases of AWS Lambda
Serverless Websites
Building a serverless website allows you to focus on your website code; you don't have to manage and operate its
infrastructure. Sounds cool, doesn't it? Yes, this is the most common and interesting use case of AWS Lambda, where
people are actually taking advantage of its pricing model. Hosting your static website on S3 and serving its dynamic
parts through AWS Lambda makes it easy to keep track of the resources being used, to understand whether your
code is feasible, and to troubleshoot and fix problems in no time.
Automated Backups of everyday tasks
One can easily schedule the Lambda events and create back-ups in their AWS Accounts. Create the back-ups, check
if there are any idle resources or not, generate reports and other tasks which can be implemented by Lambda in no
time.
Filter and Transform data
One can easily use it for transferring the data between Lambda and other Amazon Services like S3, Kinesis, Redshift
and database services along with the filtering of the data. One can easily transform and load data between Lambda
and these services. When we investigate its industrial use cases, a very apt implementation of Lambda can be
found in a company named Localytics.
Use case in Localytics
Localytics is a Boston-based web and mobile app analytics and engagement company. Its marketing and analytics
tools are being extensively used by some major brands such as ESPN, eBay, Fox, Salesforce, RueLaLa and the New
York Times to understand and evaluate the performance of their apps and to engage with existing as well as new
customers.
Regardless of how popular Localytics is now, it faced some serious challenges before it started using
Lambda. Let's look at those challenges before we discuss how Lambda came to the rescue and helped Localytics
overcome them.
Challenges
Billions of data points, uploaded every day from different mobile applications running Localytics
analytics software, are fed into the pipeline that they support.
Additional capacity planning, utilization monitoring, and infrastructure management were required since
the engineering team had to access subsets of data in order to create new services.
Platform team was more inclined towards enabling self-service for engineering teams.
Every time a microservice was added, the main analytics processing service for Localytics had to be
updated.
The rest you can understand from the diagram below:
With all the benefits it provides, Lambda has contributed to the popularity of Localytics.
Benefits
Lambda rules out the need to provision and manage infrastructure in order to run each Microservice.
Processing tens of billions of data points isn't as big a hassle as it was before, as Lambda automatically
scales up and down with the load.
Lambda enables new microservices that access the data stream to be created by decoupling product
engineering efforts from the platform analytics pipeline, eliminating the need for them to be bundled with the main
analytics application.
After addressing AWS Lambda, its function along with its workflow and use cases, now let’s end our tutorial with
running our first lambda function.
Hands-On
In this Hands-on, we will take you through the step wise guide on how to create a lambda function using lambda
console.
Enter the name and all the credentials. For the runtime, you can choose any language based on your
understanding of it; we're choosing Node.js 8.10 here, but you can choose from other options like Python,
Java, .NET or Go (these are the languages it supports).
As here we have already defined our role with the name of service-role/shubh-intel.
The next step after this is Writing Code for your Lambda Function.
We're choosing the Lambda console here, but you can choose from different code editors like the Cloud9 editor or
an editor on your local machine.
You can check that your function has been created; here we have created it with the name example-lambda.
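The same function could also be created from the CLI instead of the console; a hedged sketch, assuming the handler code has been zipped into function.zip and the role above exists (the account ID is a placeholder, and the nodejs8.10 runtime shown in this tutorial has since been deprecated):
aws lambda create-function --function-name example-lambda --runtime nodejs8.10 --role arn:aws:iam::123456789012:role/service-role/shubh-intel --handler index.handler --zip-file fileb://function.zip
aws lambda invoke --function-name example-lambda output.json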
AWS IoT acts as the mediator between the components and the network; it gathers
information from those things and works on it. AWS IoT is defined as a platform which enables you to
connect devices to AWS services along with other devices, secure data and interactions, process and
act upon device data, and also allows applications to interact with devices even when they are offline.
AWS IoT has the following gears:
Message broker – Provides a secure mechanism for things and applications to publish and receive messages
from each other.
Rules engine – Correlates and processes messages and, using SQL-based rules, routes the information to other
AWS services such as Amazon S3, DynamoDB, Lambda, etc.
Thing shadow – Also called a device shadow; it holds the current state data of any component connected to IoT
as a JSON document.
Security and identity service – Takes care of protection on a shared-responsibility basis; messages transferred
to and from the things use these services with confidential credentials.
Device gateway – Allows the devices to interact with AWS IoT in a safe and sound environment.
When you have created the function, you will be directed to a Function Code screen, where you will be
defining your function; you can either use the code provided there or make your own template for it, it's
quite easy.
If you want to define key values then you can; here we've defined key1 and key2, with key1 = 'Hello
from Lambda!'.
Then create the test event, as we did with the name mytestevent, and click Save and Test in order to run
your function.
After running it, you will get an output where you can check the details, as shown below:
The maximum size of a dataset is 1 MB and that of an identity is 20 MB.
Developing a dataset and putting keys is done by the following command:
As the number of applications in the cloud increases, the cost of Cognito also increases. We are able to use 10 GB of
storage for the first year.
The Mobile SDK allows easy building of AWS applications. A few of its characteristics are as follows:
Object Mapper
It helps in accessing DynamoDB from your applications. It assists in mapping client-side objects to
tables and vice versa. The object mapper supports read, write and delete operations on items and also supports
queries.
S3 Transfer Manager
It helps in uploading documents to and downloading documents from S3 with better performance and reliability.
File transfer operations can now be further controlled. Rebuilt using BFTask, this tool now has a better and cleaner
interface.
iOS / Objective-C Enhancements
The Mobile SDK supports ARC and BFTask for better use of Objective-C, and supports CocoaPods.
AWS provides a set of developer tools that work brilliantly with its services. The tools used by developers are as follows:
AWS Management Console : It manages the quickly growing Amazon infrastructure. It controls your
compute, storage and other cloud-based activities using a very simple graphical interface.
AWS Toolkit for Eclipse : It is a tool for using Java with AWS. It helps in installing, deploying and
developing Java applications on AWS. Various AWS services can be reached by making use of the explorer. It even
includes the most up-to-date edition of the Java SDK.
AWS Toolkit for Microsoft Visual Studio : This tool makes it easy to use .NET applications with AWS.
Various AWS services can be reached from the Visual Studio IDE by making use of the explorer. It even includes
the most up-to-date edition of the .NET SDK. Along with all this, it also supports cloud-related services.
Some of the tools and their descriptions are as follows.
AWS SDKs – Make it easy to work with AWS APIs in your preferred programming language and platform.
AWS Command Line Tools – AWS also offers the AWS Command Line Interface (CLI), a single tool
for controlling as well as managing multiple AWS services.
IDE Toolkits – The AWS Toolkits give specialized cloud tools integrated into your development environment.
In order to download the AWS SDKs, the AWS CLI or the PowerShell tools, you can go to Tools for Amazon Web Services.
Tier 1: AWS Global Infrastructure
Tier 2: Networking
Tier 3: Compute, Storage, Databases
Tier 4: Analytics, Security and Identity, Management Tools
Tier 5: App Services, Dev Tools, Mobile Services
Tier 6: Enterprise Applications, Internet of Things
There are a number of different regions spread across the world for AWS.
What are edge locations? CDN locations for CloudFront. CloudFront is AWS CDN service. Currently over 50 edge
locations.
Networking
VPC: Virtual Private Cloud - A virtual data center. You can have multiple VPCs. Basically a data center in your AWS
account. Isolated set of resources.
Route53 - Amazon's DNS service. Named 53 because of the port that DNS sits on.
Compute
EC2 Container Service - Sometimes ECS. A highly scalable service for Docker.
Lambda - By far the most powerful service. Lets you run code without provisioning or managing servers. You only
pay for the compute time; you pay for execution time.
Storage
S3 - Object Based Storage as a place to store your flat files in the cloud. It is secure, scalable and durable. You pay
for the storage amount.
CloudFront - AWS CDN service. It integrates with other applications. It uses edge locations for caching files.
Glacier - Low cost storage for long term storage and back up. Up to 4 hours to access it. Think of it as an archiving
service.
EFS - Elastic File System - used with EC2. NAS for the cloud. Connects up to multiple EC2 instances. File-level
storage. It is still in preview, and not currently in exams.
Snowball - Import/Export service. It allows you to send in your own hard disks and they will load the data onto the
platform using their own internal network. Amazon gives you the device and you pay for it daily.
Storage Gateway - The service connecting on-premises storage to AWS. Essentially a little VM you run in your office or data
center that replicates your data to AWS.
Databases
DMS: Database Migration Services - Essentially a way of migrating DB into AWS. You can even convert DBs.
Analytics
EMR: Elastic Map Reduce - This can come up in the exam. It's a way of processing big data.
Data Pipeline - Moving data from one service to another. Required for pro.
Elastic Search: A managed service that makes it easy to deploy, operate and scale Elastic Search in the AWS cloud.
A popular search and analytics option.
Kinesis - Streaming data on AWS. This is a way of collecting, storing and processing large flows of data.
Machine Learning - Service that makes it easy for devs to use machine learning. Amazon use it for things like
products you might be interested in etc.
Quick Sight - A new service. It's a business intelligence service. Fast cloud-powered service.
IAM: Identity Access Management - Where you can control your users, groups, roles etc. - multifactor auth etc etc
Inspector - allows you to install agents onto your EC2 instances. It searches for weaknesses.
Management Tools
CloudFormation - A way of scripting infrastructure, e.g. deploying a WordPress site. Does an amazing amount of automated work for you.
Cloud Trail - A way of providing audit access to what people are doing on the platform. Eg. changes to EC2
instances etc.
OpsWorks - Configuration Management service using Chef. We will create our own OpsWork Stack.
Config - Relatively new service. Fully managed service with an AWS resource inventory, configuration history and change notifications for security and
governance, etc. Automatically checks the configuration of services, e.g. ensures all volumes attached to EC2 instances are encrypted, etc.
Trusted Advisor - Does come up in the exam. Automated service that scans the environment and suggests ways you can
be more secure and save money.
Application Services
AppStream - AWS version of XenApp. Stream Windows apps from the cloud.
CloudSearch - Managed search service in the cloud that makes it easy to set up a scalable search solution and supports multiple
languages, etc.
Elastic Transcoder - A media transcoding service in the cloud. A way to convert media into a format that will play
on varying devices.
SES: Simple Email Service - Transactional emails, marketing messages, etc. Can also be used to receive emails, which
can be integrated with other services.
SWF: Simple Workflow Service - Think of when you place an order on Amazon: they use SWF so that people in the
warehouse can start the process of collecting and sending packages.
CodeCommit - Host secure private Git repositories.
CodeDeploy - A service that deploys code to any instance.
CodePipeline - Continuous delivery service for fast updates, based on the release process models you define.
Mobile Services
Device Farm - Improve the quality of apps by testing against real phones.
SNS: Simple Notification Service - Big topic in the exam. Sending notifications from the cloud. You use it all the time
in production.
Enterprise Apps
WorkDocs - Fully managed enterprise equivalent to Dropbox etc. (safe and secure)
Internet of Things
Internet of things - A new service that may become the most important.
IAM 101
It allows you to manage users and their level of access to the AWS Console. It is important to understand IAM and
how it works, both for the exam and for administering a company's AWS account in real life.
You'll find the IAM users sign-in link near the top.
Go through the Security Status and tick off all the boxes!
Configuring a Role
It'll make sense when you start using EC2. It's about having resources access other resources in AWS.
Create a role.
We'll choose Amazon EC2 for our role. Select S3 full access as your policy for now.
S3 is easy to use, with a simple web services interface to store and retrieve any amount of data from anywhere on
the web. Think of it as a hard drive on the web.
Data is stored across multiple devices and facilities. It's built for failure.
Not a place for databases, etc.; you need block storage for that.
Files from 1 byte to 5TB. You can store up to the petabyte if you wanted to.
Files are stored in buckets - like the directories.
When you create a bucket, you are reserving that name.
Read after Write consistency for PUTS of new objects
Eventual Consistency for overwrite PUTS and DELETES (can take some time to propagate)
The basics
S3 Storage Tiers/Classes
1. S3 - Normal
2. S3 IA (Infrequently Accessed) - retrieval fee
3. RRS - Reduced Redundancy Storage. Great for objects that can be lost.
4. Glacier - Very cheap, but 3-5 hours to restore from Glacier! As low as $0.01 per gigabyte per month
S3 is charged for the amount of storage, the number of requests and data transfer pricing.
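A hedged example of choosing a storage class at upload time (the bucket name is a placeholder):
aws s3 cp backup.zip s3://my-bucket/backup.zip --storage-class STANDARD_IA
aws s3 cp scratch.csv s3://my-bucket/scratch.csv --storage-class REDUCED_REDUNDANCY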
If it is the first time accessing it, you will be greeted with a screen that simply allows you to create a bucket.
Top right hand side gives you options for what the side bar view is.
Under properties, you can change things such as Permissions, Static Website Hosting etc.
For the Static Websites, you don't have to worry about load balances etc.
o You can't run server-side scripts on these websites, e.g. PHP, etc.
Logging can be used to keep a log of requests made to the bucket
Events are about triggering something on a given action eg. notifications etc.
You can allow versioning
Cross-Region Replication can be done for other regions
If you click on a bucket, you can see details of what is there. By default, permissions are set that access is denied.
You can set the Storage class and Server-Side Encryption from here too.
{
"Version": "2008-09-17",
"Statement": [
{
"Sid": "AllowPublicRead",
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::dok-basics/*"
}
]
}
If you turn Versioning on for the bucket, you can only suspend it; you cannot turn it off. Click an object and you can see
the versions.
To add files, you can go and select "Actions" and select upload.
If you show the Versions, it will give you the version ID.
If you delete the file, it will show you the file and the delete markers. You can restore the file by selecting the
delete marker and choosing Actions > Delete (deleting the delete marker restores the object).
Bear in mind, if you do versioning, you will have copies of the same file.
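Versioning can also be turned on (or later suspended) from the CLI; a sketch with a placeholder bucket name:
aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled
aws s3api list-object-versions --bucket my-bucket --prefix index.html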
Cross-Region Replication
To enable this, go to your properties. You will need to Enable Versioning for this to be enabled.
In order for this to happen, you also need to Create/Select IAM Roles for policies etc.
Existing Objects will not be replicated, only uploads from then on.
Amazon handle all the secure transfers of the data for you.
Versioning integrates with Lifecycle rules. You can turn on MFA so that it requires an auth code to delete.
AWSCSA-7: CloudFront
A CDN is a system of distributed servers (network) that deliver webpages and other web content to a user based
on the geographic locations of the user, the origin of the webpage and a content delivery server.
Key Terms
Edge Location: The location where content will be cached. This is separate from an AWS Region/Availability Zone.
Origin: This is the origin of all the files that the CDN will distribute. This can be a S3 bucket, EC2, Route53, Elastic
Load Balancer etc. that comes from the source region.
Distribution: This is the name given to the CDN which consists of a collection of Edge Locations.
TTL (Time to Live): TTL is the time that content remains cached at the edge location. This caching makes it faster for
other users.
Summary
__*Amazon CloudFront can be used to deliver your entire website, including dynamic, static, streaming and
interactive content using a global network of edge locations. Requests for your content are automatically routed to
the nearest edge location, so content is delivered with the best possible performance.
CloudFront is optimized to work with other AWS services. CloudFront also works seamlessly with any non-AWS origin
server, which stores the original, definitive versions of your files.*__
You can also remove (invalidate) cached objects, but it will cost you.
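A hedged example of invalidating cached objects from the CLI (the distribution ID is a placeholder):
aws cloudfront create-invalidation --distribution-id E1ABCDEF123456 --paths "/index.html" "/images/*"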
You can restrict the bucket access to come only from the CDN.
Follow the steps to allow things like Read Permissions and Cache Behaviour.
Distribution Settings
If you want to use CNAMEs, you can upload the certificate for SSL.
After it is done, you can use the domain name provided to start accessing the cached version.
You can create multiple Origins for the CDN, and you can also update behaviours etc. for accessing certain files
from certain types of buckets.
You can also create custom error pages, restrict content based on Geography etc.
Bucket Policies
Access Control Lists
S3 buckets can be configured to create access logs which log all requests made to the S3 bucket
Encryption
2 types
AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to
provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage
infrastructure. The service enables you to securely store data to the AWS cloud for scalable and cost-effective
storage.
Essentially replicates data from your own data center to AWS. You install it as a virtual appliance in your data center.
You can use the Management Console to create the right gateway type for you.
1. Gateway Stored Volumes - To keep your entire data set on site. Storage Gateway then backs this data up asynchronously
to Amazon S3. Gateway Stored volumes provide durable and inexpensive off-site backups that you can recover locally
or on Amazon EC2.
2. Gateway Cached Volumes - Only your most frequently accessed data is stored. Your entire data set is
stored in S3. You don't have to go out and buy large SAN arrays for your office/data center, so you can get
significant cost savings. If you lose internet connectivity however, you will not be able to access all of your
data.
3. Gateway Virtual Tape Libraries (VTL) - A limitless collection of virtual tapes. Each virtual tape can be stored in a VTL; if it is
stored in Glacier, it is a Virtual Tape Shelf. If you use products like NetBackup etc. you can do away with physical tapes and just use
the VTL. It will get rid of your physical tapes and create virtual ones.
AWSCSA-10: Import/Export
AWS Import/Export Disk accelerates moving large amounts of data into and out of the AWS cloud using portable
storage devices for transport. AWS Import/Export Disk transfers your data directly onto and off of storage devices
using Amazon's high-speed internal network and bypassing the Internet.
You essentially go out and buy a disk and then send it to Amazon who will then import all that data, then send your
disks back.
There is a table that shows how connection speed equates to a certain amount of data uploaded in a time frame to
give you an idea of what is worthwhile.
Snowball is Amazon's product that you can use to transfer large amounts of data, into the petabytes. It can cost as little as one fifth
the price.
Summary
Import/Export Disk:
Import to EBS
Import to S3
Import to Glacier
Export from S3
Import/Export Snowball:
Only S3
Only currently in the US (check this out on the website for the latest)
Uses the CloudFront Edge Network to accelerate the uploads to S3. Instead of uploading directly to S3, you can use
a distinct URL to upload directly to an edge location which will then transfer that file to S3. You will get a distinct
URL to upload to.
eg prefix.s3-accelerate.amazonaws.com
From the console, access your bucket. From here, what you want to do is enable Transfer Acceleration. This
endpoint will incur an additional fee.
You can check the speed comparison and it will show you how effective it is depending on distance from a region.
If you see similar results, the bandwidth may be limiting the speed.
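Transfer Acceleration can also be enabled from the CLI; a sketch, assuming the bucket is named dok-example:
aws s3api put-bucket-accelerate-configuration --bucket dok-example --accelerate-configuration Status=Enabled
aws configure set default.s3.use_accelerate_endpoint true
The second command tells the CLI to use the accelerate endpoint for subsequent S3 transfers.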
EC2 is a web service that provides resizable compute capacity in the cloud. Amazon EC2 reduces the time required
to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as
your computing requirements change.
To get a new server online used to take a long time, but then public cloud came online and you could provision
virtual instances in a matter of time.
EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. Amazon EC2
provides developers the tools to build failure-resilient applications and isolate themselves from common failure
scenarios.
Pricing Models
1. On Demand - allows you to pay a fixed rate by the hour with no commitment.
2. Reserved - provide you with a capacity reservation, and offer a significant discount on the hourly charge
for an instance. 1 year or 3 year terms.
3. Spot - enable you to bid whatever price you want for instance capacity, providing for even greater savings
if your applications have flexible start and end times.
You would use Reserved if you have a steady state eg. two web servers that you must always have running.
Users are able to make upfront payments to reduce their total computing costs even further.
On Demand would be for things like a "black Friday" sale where you spin up some web servers for a certain
amount of time.
This is for low cost and flexible EC2 without any up-front payment or long-term commitment. Useful for
Applications with short term, spiky, or unpredictable workloads that cannot be interrupted, or being tested or
developed on EC2 for the first time.
Spot Instances go with your bidding, but if the spot price goes above your bid, you will be given a two-minute warning
before the instance is terminated. Large compute requirements are normally run this way: customers time
these instances and look for where to get the best pricing.
You can check Spot Pricing on the AWS website to see the history etc. to make an educated guess.
This is for applications that are only feasible at very low compute costs, and for users with urgent computing needs for large
amounts of additional capacity.
Instance Types
DIRTMCG
D for Density
I for IOPS
R for RAM
T for cheap general purpose (e.g. T2)
M for Main choice for apps
C for Compute
G for Graphics
Amazon EBS allows you to create storage volumes and attach them to Amazon EC2 instances. Once attached, you
can create a file system on top of these volumes, run a database, or use them in any other way you would use a
block device.
It is basically a disk in the cloud. The OS is installed on here + any DB and applications. You can add multiple EBS
instances to one EC2 instance.
Volume Types
Click on "Launch Instance" and will take you to choose an AMI (Amazon Machine Image).
For this example, we will choose Amazon Linux because it comes with a suite of things already available. This will
then take you to choose the Instance Types.
#!/bin/bash
yum update -y
In Step 4: Add Storage, we can change the IOPS by changing the size of the instance and we can alter the Volume
Type.
Step 5: Tag Instance is used to give a key value pair for the tag.
This is useful for things like resource tagging and billing purposes. You can monitor which staff are using
what resources.
After reviewing and Selecting launch, you will need to download a new key pair.
From here, we are back to the EC2 Dashboard and that shows us the status.
Once the status is available, head to terminal and configure the file the way you normally would in your
~/.ssh/config file (for easy access - refer to the SSH-Intro.md SSH-7 file for more info).
Note: Ensure that you change the permissions on the .pem key first!
Go to the EC2 section in the AWS console and select Security Groups.
You can edit Security Groups on the fly that takes effect immediately.
In terminal
1. SSH in
2. Turn to root
3. Install apache yum install httpd -y
From the EC2 Dashboard, you can select "Volumes" and then set a name for the volume. It is best practice to name
your volumes.
mkfs -t ext4 /dev/xvdf   # format the newly attached volume first (skip if it already contains data)
mkdir /fileserver
mount /dev/xvdf /fileserver
cd /fileserver
ls   # shows lost+found
rm -rf lost+found/
ls
nano helloworld.txt
nano index.html
Create a snapshot
We can then attach the new volume. Now we can go through the above process and mount again.
Using the command file -s /dev/xvdf we can check the files available.
Security
To create a snapshot for Amazon EBS volumes that serve as root devices, you should stop the instance before
taking the snapshot.
You specify the AMI you want when you launch an instance, and you can launch as many instances from the AMI
as you need.
Three Components
1. A template for the root volume of the instance (operating system, applications, etc.)
2. Launch permissions that control which AWS accounts can use the AMI
3. A block device mapping that specifies the volumes to attach to the instance when it is launched
If you have a running instance (from before), we can create a snapshot of the volume that is attached to that
instance.
From here, we can select that Snapshot and select "Create Image". Set the name and choose settings, then select
create.
Under the Images > AMIs on the left hand bar, you can see the images owned by you and the public image. You
can share these AMIs if you know the AWS Account Number.
If you are looking to make a public AMI, there are a few things that you would like to consider. Check the website
for that.
history -c   # e.g. clear the shell history before creating a public AMI
Summary
EBS backed and Instance Store backed are two different types.
We can select our AMI based on Region, OS, Architecture, Launch Permissions and Storage for the Root Device.
When launching an instance, it will also mention what type of AMI type it is.
After Instance Store backed instances have been launched, you can only add additional EBS volumes from then on.
You cannot stop an IS-backed instance; however, with EBS-backed, you can. So why would I want to stop an instance? If the underlying
hypervisor is in a failed state, you can stop and start and it will start on another hypervisor. However, you cannot
do that with an IS-backed instance. You also cannot detach the root volume. EBS-backed is better for provisioning speed times.
IS volumes are created from a template stored in S3 (and may take more time to launch). Also known as Ephemeral Storage.
EBS you can tell to keep the root device volume if you so wish.
In the terminal
cd /var/www/html
Now head back to the console for EC2. Head to the load balancer section.
Leave the other defaults and move on to choose the security groups.
The health check will hit the file. The unhealthy threshold will check twice before taking the instance out of service. The
healthy threshold will then wait for 10 successful checks before bringing our service back.
Response Timeout: How long to wait for the response
Interval: How often to check
Unhealthy Threshold: How many failed checks before the instance is marked unhealthy
Healthy Threshold: How many successful checks it needs before the instance is marked healthy again
Back on the dashboard, it should then come InService after the given time limit.
If this isn't working...
The DNS name in the information is what you will use to resolve the load balancer.
CloudWatch looks after all of your observations for things like CPU usage etc.
In CloudWatch
EC2 metrics are only at the hypervisor level. Memory itself is actually missing.
You can update the time frame on the top right hand side.
This whole thing is about being able to keep a heads up on the different instances etc.
Events
CW Events help you react to changes in the state of the AWS environment. They can automatically invoke things like an AWS Lambda
function to update DNS entries for that event.
Logs
You can install an agent on an EC2 instance and it will send monitoring data about that instance to CloudWatch. These
will be things like HTTP response codes in Apache logs.
Alarms
You can select the Period and Statistics on how you want this to work etc.
Summary
We create a new user in IAM with certain permissions. After downloading the keys for this new user, we can
create a group that has full access to S3.
Back on the EC2, we can see on the dashboard a running instance that has no IAM role. You can only assign this
role when you create this instance.
aws configure
If it asks for the Access Key ID and Secret Access Key, then copy and paste them in. Then choose the default region. You do not
need to put anything for the output format.
cd ~
cd .aws
ls
#shows config and credentials
You can nano into credentials. Others could easily get into this; anyone could access your environment using these
credentials. Therefore, it can be unsafe to store them here.
That's where roles come in. An EC2 instance can assume a role.
Then go back to EC2 and launch a new instance. You can select the role for IAM role and go through and create the
instance.
Again, a role's permissions can be changed and will take effect immediately, but you cannot assign a new role to an EC2
instance after launching it; this is important.
Now if we ssh into our instance, we will find that in the root file there is no .aws file. Therefore, there are no
credentials that we have to be worried about.
Summary
These are scripts that our EC2 instances will run when they are created.
In the AWS console itself, go into S3, create a bucket. This bucket will contain all the website code. Upload the
code.
Create a new instance of the EC2 instance, use T2 micro and then go into advanced details and add in some text.
#!/bin/bash
yum install httpd -y
yum update -y
aws s3 cp s3://dok-example/index.html /var/www/html
service httpd start
chkconfig httpd on
Now after the instance is up, we should easily be able to navigate to the IP address and everything should be
running.
This is data about the EC2 instance and how we can access this data from the command line.
Elevate the privileges and then use the following to get some details.
curl http://169.254.169.254/latest/meta-data/
What is returned is the list of meta-data values that you can append to the URL to receive data about the instance.
I am healthy.
Drop that guy into the relevant bucket. Ensure the load balancer is set up.
From here, you can select the AMIs related. Select the T2 micro from Amazon if you wish.
Add the roles etc and add in the advanced details if required.
Note: Using the aws command line, you can copy files from a bucket.
Select the security group, and you will then get a warning about the key file.
After, you will create the Auto Scaling Group. You can choose the group size too. Using the subnet, you can create
a subnet and network.
If you have three groups and three availability zones, it will put each group in each availability zone.
In the advanced details, we can set up the Elastic Load Balance and Health Check Type (ELB in this case).
Health Check Grace Period is how long to wait after launch before checking health. The health check will also fail until Apache is turned on.
Scaling Policies
This allows us to automatically increase and decrease the number of instances depending on the scaling
settings that we define.
We can scale between 1 and 5 instances. You will choose to create an alarm and execute a policy when it triggers.
The group will grow and shrink depending on the settings.
Once they are up, you can check each IP address and see if they are up. If you choose the DNS, it will go towards
one of the addresses.
We can use things like Route 53 and use it to help start sending traffic to other parts of the world.
What is it? A logical grouping of instances within a single Availability Zone. This enables applications to participate in
a low-latency, 10 Gbps network. Placement groups are recommended for applications that benefit from low
network latency, high network throughput, or both.
Recommended for things like grid computing where you need low latency, for example Cassandra nodes.
EFS is a file storage service for EC2. It is easy to use and has a simple interface for configuration. Storage grows and
shrinks as needed.
Set up is similar to other set ups. We can predetermine our IP addresses and security groups.
While the set up is created, you can create a two or more instances and provision them.
Once they are all up, head back to the EFS and if it is ready in the availability zones, go and note down the public
ips for the instances.
Note: Make sure that the instances are in the same security group as the EFS.
If they are all set up, we can head back to EC2. Again, grab the ips and run the two instances in two different
windows.
Once you've SSH'd into the instances, install Apache to run the webserver. Start the server up!
You can select the EC2 Mount Instructions, then run the command to mount the EFS and ensure that it mounts to
/var/www/html/.
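The mount command from the EC2 Mount Instructions looks roughly like the following (the file system ID and region are placeholders):
sudo mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-east-1.amazonaws.com:/ /var/www/html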
Now these directories are mounted on EFS, and we can nano index.html and create a home page.
This will move the files across both instances!
What is Lambda? It's a compute service where you can upload your code and create a Lambda function. AWS
Lambda takes care of provisioning and managing the servers that you use to run the code. You don't have to worry
about OS, patching, scaling, etc.
Event driven compute service. Lambda can run code in response to events. eg. uploading a photo to S3
and then Lambda triggers and turns the photo into a thumbnail etc.
Transcoding for media
As a compute service to run your code in response to HTTP requests using Amazon API Gateway or API
calls made using AWS SDKs.
The Structure
Data Centres
Hardware
Assembly Code/Protocols
High Level Languages
OS
App Layer/AWS APIs
Lambda abstracts away ALL of this. You only have to worry about the code, Lambda looks after everything else.
Pricing is ridiculously cheap. You pay for number of requests and duration. You only pay for this when the code
executes.
Why is it cool?
No servers. No worry for security vulnerabilities etc! Continuously scales. No need to worry about auto scaling.
AWSCSA-23: Route53
DNS is used to convert human friendly domain names into an Internet Protocol address (IP).
Domain Registrars
Names are registered with InterNIC - a service of ICANN. They enforce the uniqueness.
Route53 isn't free, but domain registrars include things like GoDaddy.com etc.
SOA Records
NS Records
used by Top Level Domains to direct traffic to the Content DNS servers which contains the authoritative
DNS records.
A Records
Address record - used to translate from a domain name to the IP address. A records are always IPv4. The IPv6 equivalent is AAAA.
TTL
Length that the DNS is cached on either the Resolving Server or on your PC. This is important from an architectural
point of view.
CNAMES
Canonical Name (CName) can be used to resolve one domain name to another. eg. you may have a mobile website
m.example.com that is used for when users browse to your domain on a mobile. You may also want
mobile.example.com to point there as well.
Alias Records
Used to map resource record sets in your hosted zone to Elastic Load Balancers, CloudFront Distribution, or S3
buckets that are configured as websites.
Alias records work like a CNAME record in that you can map one DNS name (www.example.com) to another
'target' DNS name (aeijrioea.elb.amazonaws.com)
The naked domain name MUST always be an A record, not a C name. eg dennis.com.
Summary
For an ELB, you need a DNS name to resolve to an ELB. You will always need an IPv4 domain to resolve this... which
is why you have the Alias Record.
Queries to alias records are not charged, whereas queries to CNAME records are.
Bootstrap Script
#!/bin/bash
yum install httpd -y
service httpd start
yum update -y
echo "Hello Cloud Gurus" > /var/www/html/index.html
After moving through and launching the instance, create a load balancer.
Create a Record Set. You need to create an alias and the target address will have your ELB DNS addresses.
Once this is created, we should be able to type in the domain name and visit the website!
Simple: This is the default. Most commonly used when you have a single resource that performs a given function
for your domain, e.g. one web server that serves content for the website.
Weighted: Example you can send a percentage of users to be directed to one region, and the rest to others. Or
split to different ELBs etc.
Used for splitting traffic regionally or if you want to do some A/B testing on a new website.
Latency: This is based on the lowest network latency for your end user (routing to the best region). You create a
latency resource set for each region.
Route53 will select the latency resource set for the region that will give the user the best result and respond with that
resource set.
User -> DNS -> the better latency for an EC2 instance
Failover: When you want to create an active/passive set up. Route53 will monitor the health of your primary site
using a health check.
Summary
AWSCSA-24: Databases
There is a Bootstrap bash script you can use for practising purposes from this course.
#!/bin/bash
yum install httpd php php-mysql -y
yum update -y
chkconfig httpd on
service httpd start
echo "<?php phpinfo();?>" > /var/www/html/index.php
cd /var/www/html
wget https://s3-eu-west-1.amazonaws.com/acloudguru/connect.php
To create an RDS instance
Choose an instance class, "No" for Multi-AZ Deployment and leave everything else as default.
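The equivalent instance could be created from the CLI; a hedged sketch (the identifiers and password are placeholders):
aws rds create-db-instance --db-instance-identifier my-rds --db-instance-class db.t2.micro --engine mysql --allocated-storage 20 --master-username admin --master-user-password mysecretpassword --no-multi-az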
Back in EC2
We want to check if the bootstrap has worked successfully. We can do so by checking the site.
In the security group, we want to allow security MYSQL/aurora to be able to connect and work for our security
group. It's an inbound rule.
Automated Backups
2 Different Types
1. Automated Backups
2. Database Snapshots
Auto backups allow you to recover your database to any point in time within a retention period. The retention period can be
between 1 and 35 days.
Auto Backups will take a full daily snapshot and will also store transaction logs throughout the day.
When you do a recovery, AWS will first choose the most recent daily back up and then apply transaction logs
relevant to that day.
This allows you to do a point in time recovery down to a second, within a retention period.
Snapshots are done manually, and they are stored even after you delete the original RDS instance, unlike
automated back ups.
Whenever you restore the database, it will come on a new RDS instance with a new RDS endpoint.
Encryption
Once your RDS instance is encrypted the data stored at rest in the underlying storage is encrypted, as are its
automated backups, read replicas and snapshots.
To restore, you can click in and restore. It will create a new database.
For Point In Time, you can select the option and click back to a point in time.
You can migrate the Snapshot onto another database, you can also copy it and move it to another region and you
can of course restore it.
If it is encrypted, you will need to then use the KMS key ID and more.
Multi-AZ
With Multi-AZ, AWS maintains a synchronous standby copy of the database in another Availability Zone; if the
primary fails, AWS fails over to the standby automatically, so you do not need to change the database endpoint.
It is for Disaster Recovery only. It is not primarily used for improving performance. For performance improvement,
you need Read Replicas.
Read Replica
Different to Multi-AZ: a read replica is an exact copy of the database that you can read from.
Multi-AZ is more for disaster recovery; read replicas are how you improve read performance.
You can change the connection strings in your application so that reads go to the read replicas, etc.
They allow you to have a read-only replica. This is achieved using asynchronous replication from the primary RDS instance
to the read replica. You use read replicas primarily for very read-heavy database workloads.
Remember, RR is used for SCALING. You can also use things like Elasticache. This comes later.
YOU MUST have automated backups turned on in order to deploy a read replica. You can have up to 5 read replica copies of any database. You can also create read replicas of read replicas, but replication latency increases.
Each replica has its own DNS endpoint. A read replica cannot itself be configured as Multi-AZ (though you can create read replicas of a Multi-AZ source database).
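A rough CLI sketch of creating a read replica (not from the course - the instance identifiers are placeholders).
# Hypothetical sketch: create a read replica (automated backups must be enabled on the source).
aws rds create-db-instance-read-replica \
  --db-instance-identifier my-rds-replica \
  --source-db-instance-identifier my-rds-instance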
DynamoDB vs RDS
What is it?
A fast and flexible noSQL database service for all applications that need consistent, single-digit millisecond latency
at any scale.
It is a fully managed database and supports both document and key-value data models. The flexible data model and
reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT and many other applications.
Facts
Creating DynamoDB
You can go into the table and start creating items from the dashboard. From here, you can start creating fields (attributes).
You can then add more attributes ("columns") to your documents as you grow.
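A rough CLI sketch of the same idea (not from the course - the table name, key and item values are placeholders).
# Hypothetical sketch: create a table with a partition key and provisioned throughput.
aws dynamodb create-table \
  --table-name Users \
  --attribute-definitions AttributeName=UserId,AttributeType=S \
  --key-schema AttributeName=UserId,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

# Items (and new attributes/"columns") can then be added as the application grows.
aws dynamodb put-item \
  --table-name Users \
  --item '{"UserId": {"S": "u-001"}, "Name": {"S": "Alice"}}'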
Redshift is a data warehousing service in the cloud. You can start small and then scale out, for roughly $1,000/TB/year.
Config
You can also have a Multi Node configuration: a Leader Node (manages client connections and receives queries) plus Compute Nodes (store data and perform queries and computations), with up to 128 compute nodes. The two node types work in tandem.
Instead of storing data as a series of rows, Redshift organises data by column. While row-based storage is ideal for transaction processing, column-based storage is ideal for data warehousing and analytics, where queries often involve aggregates performed over large data sets.
Since only the columns involved in the queries are processed and columnar data is stored sequentially on the
storage media, column-based systems require far fewer I/Os, greatly improving query performance.
Advanced Compression - columnar data stores can be compressed much more than row-based data stores because
similar data is stored sequentially on disk.
Redshift employs multiple compression techniques and can often achieve significant compression relative to
traditional relational data stores.
In addition, Amazon Redshift doesn't require indexes or materialised views and so uses less space than traditional
relational database systems. When loading into an empty table, Amazon Redshift automatically samples your data
and selects the most appropriate compression scheme.
Redshift auto distributes data and query load across all nodes. Redshift makes it easy to add nodes to your data
warehouse and enables you to maintain fast query performance as your data warehouse grows.
This whole thing is priced on compute nodes. 1 unit per node per hour. Eg. a 3-node data warehouse cluster
running persistently for an entire month would incur 2160 instance hours. You will not be charged for the leader
node hours; only compute nodes will incur charges.
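The 2160 figure in the example is just compute nodes multiplied by hours in the month; a quick check:
# 3 compute nodes x 24 hours x 30 days = 2160 instance hours
echo $((3 * 24 * 30))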
Security
Design
What is it?
ElastiCache is a web service that makes it easy to deploy, operate and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve info from fast, managed, in-memory caches, instead of relying entirely on slower disk-based databases.
It can be used to significantly improve latency and throughput for many read-heavy or compute-intensive application workloads.
Caching improves application performance by storing critical pieces of data in memory for low-latency access.
Cached info may include the results of I/O-intensive database queries or the results of computationally intensive
calculations.
1. Memcached
o Widely adopted memory object caching system. ElastiCache is protocol compliant with
Memcached, so popular tools that you use today with existing Memcached environments will
work seamlessly with the service.
2. Redis
o Open-source in-memory key-value store that supports data structures such as sorted sets and
lists. ElastiCache supports Master/Slave replication and Multi-AZ which can be used to achieve
cross AZ redundancy.
ElastiCache is a good choice if your database is particularly read heavy and not prone to frequent change.
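A rough CLI sketch of spinning up a single-node Redis cache (not from the course - the cluster ID and node type are placeholders).
# Hypothetical sketch: launch a single-node Redis cluster.
aws elasticache create-cache-cluster \
  --cache-cluster-id my-redis-cache \
  --engine redis \
  --cache-node-type cache.t2.micro \
  --num-cache-nodes 1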
AWSCSA-25: VPC
This is the most important topic for the CSA exam. You should know how to build a VPC from memory for the exam.
What is it?
It lets you provision a logically isolated section of AWS where you can launch AWS resources in a virtual network
that you define.
You have complete control over your virtual networking environment, including selection of your own IP address
range, creation of subnets, and configuration of route tables and network gateways.
You can also create Hardware Virtual Private Network (VPN) connections between your corporate datacenter and
your VPC and leverage the AWS cloud as an extension of your corporate datacenter.
VPC Peering
Connect one VPC with another via a direct network route using a private IP address.
You can peer VPCs with other AWS accounts as well as with other VPCs in the same account.
Peering uses a star configuration: e.g. one central VPC that peers with four others. The outer four can only talk to the central VPC, not to each other - there is no such thing as transitive peering.
From the dashboard, head to the VPC section and create a VPC.
Under Subnet, we want to create some subnets. Select the Test-VPC and the availability zone. REALLY
IMPORTANT - the subnet is always mapped to one availability zone.
Under Subnet > Route Table, we can see the target. Now if we deploy those 3 subnets, they could all communicate
to each other through the Route Table.
Now we need to create an Internet Gateway, which allows Internet access for your EC2 instances.
Create this. There can only be one Internet Gateway per VPC. Then attach it to the desired VPC.
Now we need to create another Route Table. Under Route Tables > Routes, edit it: set the Destination to 0.0.0.0/0 and set the Target to the Internet Gateway we just created.
Now in Route Table > Subnet Associations, we need to decide which subnet we want to be internet accessible.
Select one through edit and save. The other route table we looked at has no Internet route associated, so subnets that remain associated with it are not internet accessible.
So now we effectively have a subnet 10.0.1.0 with internet access, but not for the other 2.
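The same public/private layout can be sketched with the CLI; the CIDR blocks mirror the lab, while the resource IDs, region and names are placeholders (each call returns the ID you would reuse in the next one).
# Hypothetical sketch of the console steps above.
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-xxxx --cidr-block 10.0.1.0/24 --availability-zone eu-west-1a
aws ec2 create-subnet --vpc-id vpc-xxxx --cidr-block 10.0.2.0/24 --availability-zone eu-west-1b
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-xxxx --vpc-id vpc-xxxx
aws ec2 create-route-table --vpc-id vpc-xxxx
# Route out to the internet via the IGW, then associate only the public subnet.
aws ec2 create-route --route-table-id rtb-xxxx --destination-cidr-block 0.0.0.0/0 --gateway-id igw-xxxx
aws ec2 associate-route-table --route-table-id rtb-xxxx --subnet-id subnet-public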
If we deploy instances into one with internet access and one without, we select the network as the new VPC and
then select the correct subnet. We can auto assign the public IP.
For the security group, we can set HTTP to be accessed for everywhere.
Ensure that you create a new key pair. Copy the key value to the clipboard and launch another "database" server. This will sit inside a subnet that is not internet accessible. Select the correct VPC, put it into 10.0.2.0 (the other subnet), name it "MyDBServer" and stick it inside the same security group.
Hit review and launch, then use the existing key pair (the new one that was created).
Once you have SSH'd in, you can use the update to install and prove you have Internet access.
If you head back to the DBServer instance, you can see there is no public IP address. SSH in from the first WebServer (SSH to that, then SSH on to the DBServer). Once in, downloads will fail because the private subnet is not internet accessible, so we create a NAT instance to let it download things like updates.
Move into the security groups and create a new one for the NAT instance.
We want http and https to flow from public subnets to private subnets.
For Inbound, let's create HTTP and set our DBServer IP. Do the same for HTTPS.
For the Outbound, we could configure it to only allow the HTTP and HTTPS traffic out for anywhere.
Head back to EC2, launch an instance and use a community AMI: search for "nat" and pick the top result. Deploy it into our VPC, leave the auto-assigned public IP disabled, and select the web-accessible (public) subnet.
About that disabled IP: even if an instance sits inside a public subnet, that doesn't make it internet accessible. You need to give it either a public IP or put it behind an ELB.
We can call it "MyNATVM" and add it to the NAT security group and then review and launch. We don't need to
connect into this instance which is cool. It's a gateway. Use an existing key pair and launch.
Head to Elastic IPs and allocate a new address. This does incur a charge if it isn't associated with an EC2 instance.
Associate the address with the NATVM.
Now, go back to the instance, select Actions > Networking > Change Source/Dest. Check. Each EC2 instance performs source/destination checks by default, but a NAT instance must be able to forward traffic whose source or destination is not itself, so we need to disable this check! Otherwise, the private instances will not be able to communicate through it.
Jump into VPC and look at the Route Tables. In the main Route Table (the one without the Internet route), go to Routes, Edit and add another route: set the destination to 0.0.0.0/0 and the target to the MyNATVM instance.
Now we can start running updates etc into the instance with a private IP.
The NAT instance translates the private addresses of instances in the private subnet so that they can reach the Internet (for updates and similar commands) without being directly reachable from it.
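A rough CLI sketch of those last two steps (not from the course - the instance and route table IDs are placeholders).
# Hypothetical sketch: disable the source/destination check on the NAT instance
# and point the private route table's default route at it.
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --no-source-dest-check
aws ec2 create-route --route-table-id rtb-0123456789abcdef0 --destination-cidr-block 0.0.0.0/0 --instance-id i-0123456789abcdef0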
Network ACLs act like a firewall at the subnet level, letting you set up network rules between subnets. A DENY rule in a network ACL overrides an ALLOW in a security group.
Amazon suggest numbering rules in increments of 100. By default, each custom ACL starts out closed (denying everything).
The key thing to remember is that each subnet must be associated with a network ACL - otherwise it is associated
with the default ACL.
How to do it
Under VPC > Network ACL, you check the Test-VPC and see the Inbound and Outbound rules set to 0.0.0.0/0.
For rules, the lowest is evaluated first. This means 100 is evaluated before 200.
Everything by default is denied for Inbound and Outbound. When you associate subnets with the ACL, it will only
be associated with that ACL.
When you disassociate the subnets, it will default back to the "default" ACL.
Created a VPC
o Defined the IP Address Range using CIDR
o By default this created a Network ACL & a main Route Table
Created a custom Route Table
Created 3 subnets
Created an internet gateway
Attached it to the VPC and added a route to it in our custom Route Table
Adjusted our public subnet to use the newly defined route
Provisioned an EC2 instance with an Elastic IP address
NAT Lecture
ACL
SQS gives you access to a message queue that can be used to store messages while they wait for a computer to process them.
Eg. a user uploads an image file and a job needs to be done on it. A message describing the job goes onto SQS; the app servers poll the queue, access the image from e.g. S3, do something like adding a watermark, and when the job is done they remove the message from the queue.
Therefore, if you've lost a web server that message will still stay in that queue and other app services can go ahead
and do that task.
You can decouple components. A message can contain 256KB of text in any format. Any component can then
access that message later using the SQS API.
The queue acts as a buffer between the component producing and saving data, and the component receiving the
data for processing. This resolves issues that arise if the producer is producing work faster than the consumer can
process it, or if the producer or consumer are only intermittently connected to the network. - referencing
autoscaling or fail over.
SQS ensures delivery of each message at least once, and a queue can be read from and written to by many components simultaneously - great for scaling outwards.
Standard SQS queues do not guarantee first in, first out; as long as all messages are delivered, sequencing isn't important. If you need ordering, you can enable it with FIFO queues.
Eg. an image encode queue. A pool of EC2 instances running the needed image processing software does the following (a CLI sketch of this loop follows the list):
1. Asynchronously pulls task messages from the queue. SQS ALWAYS PULLS (polling).
2. Retrieves the named file.
3. Processes the conversion.
4. Writes the image back to Amazon S3.
5. Writes a "task complete" to another queue.
6. Deletes the original task.
7. Looks for new tasks.
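A rough CLI sketch of that worker loop (not from the course - the queue URLs, message body and receipt-handle variable are placeholders).
# Hypothetical sketch of the poll -> process -> delete cycle.
QUEUE_URL=https://sqs.eu-west-1.amazonaws.com/111122223333/image-encode-queue

# 1. Long-poll for a task message.
aws sqs receive-message --queue-url "$QUEUE_URL" --wait-time-seconds 20 --max-number-of-messages 1

# ... retrieve the named file from S3, process the conversion, write the result back to S3 ...

# 5./6. Report completion to another queue and delete the original task using its receipt handle.
aws sqs send-message --queue-url "$DONE_QUEUE_URL" --message-body '{"task": "img-42", "status": "complete"}'
aws sqs delete-message --queue-url "$QUEUE_URL" --receipt-handle "$RECEIPT_HANDLE"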
Example
You can configure Auto Scaling on the worker fleet: as the queue grows quickly, Auto Scaling launches more instances in response to what you have set. SQS working in conjunction with Auto Scaling is one of the backbones of some of the biggest websites out there.
Summary
It's a web service that makes it easy to coordinate work across distributed application components. SWF enables
applications for a range of use cases, including media processing, web application back-ends, business process
workflows, and analytics pipelines, to be designed as a coordination of tasks.
Tasks represent invocations of various processing steps in an application which can be performed by executable
code, web service calls, human actions and scripts.
Amazon use this for things like processing orders online. They use it to organise how to get things to you: if you place an order online, that transaction kicks off a new workflow, and a member of staff in the warehouse gets the task of picking the order so it can be posted to you.
That person will then have the task of finding the hammer, packaging the hammer, and so on.
SQS has a retention period of 14 days. SWF has up to a year for workflow executions.
Amazon SWF presents a task-oriented API, whereas Amazon SQS offers a message-orientated API.
SWF ensures that a task is assigned only once and is never duplicated. With Amazon SQS, you need to handle
duplicated messages and may also need to ensure that a message is processed only once.
SWF keeps track of all tasks in an application. With SQS, you need to implement your own application level
tracking.
SWF Actors
1. Workflow starters - an app that can initiate a workflow. Eg. e-commerce website when placing an order.
2. Deciders - Control the flow of activity tasks in a workflow execution. If something has finished in a
workflow (or fails) a Decider decides what to do next.
3. Activity workers - carry out the activity tasks.
SNS (found under the mobile services in the console) is a service that makes it easy to set up, operate and send notifications from the cloud. It's a scalable, flexible and cost-effective way to publish messages from an application and immediately deliver them to subscribers or other applications.
Useful for things like Auto Scaling alerts: SNS can email you or send you a text letting you know that your group is growing.
SNS can also deliver notifications by text, email, SQS queues or any HTTP endpoint. It can also invoke Lambda functions; the message payload is passed as input to the function, which reacts to it.
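A rough CLI sketch of the topic / subscribe / publish flow (not from the course - the topic name, account ID and email address are placeholders).
# Hypothetical sketch: create a topic, subscribe an email address, publish a message.
aws sns create-topic --name my-alerts
aws sns subscribe --topic-arn arn:aws:sns:eu-west-1:111122223333:my-alerts --protocol email --notification-endpoint me@example.com
aws sns publish --topic-arn arn:aws:sns:eu-west-1:111122223333:my-alerts --message "Auto Scaling group is growing"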
Benefits
SNS vs SQS
Both are messaging services, but SNS is push while SQS is pull (polling).
Elastic Transcoder is a relatively new service. It converts media files from their original source format into different formats.
Example: uploading media to an S3 bucket triggers a Lambda function that uses Elastic Transcoder to convert it into a bunch of different formats; the transcoded files are then put back into the S3 bucket.
SQS
It is a web service that gives you access to a message queue that can be used to store messages while
waiting for a computer to process them.
Eg. Processing images for memes and then storing in a S3 bucket.
Not FIFO.
Visibility timeout of up to 12 hours.
At least once messaging.
Messages can contain up to 256 KB of text but are billed in 64 KB chunks.
You can implement priority by using two SQS queues: poll the premium (priority) queue first, and only poll the second queue when the first is empty.
SWF
SNS
Set up the two security groups: one for the EC2 and the other for the RDS.
After those have been created, go into the web security group and ensure that you allow port 80 for HTTP and 22 for SSH.
For the RDS group, allow MySQL traffic and choose the source as the web security group.
Head to S3 and create a new bucket for the WordPress code. Choose the region for the security groups that you
just made.
Once the bucket has been created, sort out the CDN. So head to CloudFront and create a new distribution. Use the
web distribution and use the bucket domain. The Origin Path would be the subdirectory. Restrict Bucket Access so that requests go through CloudFront only and not directly to S3, and allow CloudFront to update the bucket policy so that it has read permissions on the content.
Head and create the RDS instance. Launch the MySQL instance. You can use a free-tier if you want, but multi-AZ
will require an incurred cost.
Ensure you've set the correct settings that you would like.
Add that to the RDS security group and ensure that it is not publicly accessible.
Head over to EC2 and provision a load balancer. Put that within the web security group and configure the health
checks.
Once the Load Balancer is up, head to Route 53 and set up the correct info for the naked domain name. You will
need to set an alias record, and set that to the ELB.
Head to EC2 and provision an instance. The example uses the Amazon Linux of course.
Ensure you assign the S3 role to this. Add the bootstrap script from the course if you want.
Ensure that you have given the correct user rights to wp-content.
In the uploads directory for wp-content, you'll notice that it has nothing in it yet. If you upload an image through wp-admin, it'll become available there (as you could imagine).
Back in the console in uploads, you can then ls and see the file there. We want it so that all of our images head to
S3 and they will be served out of CloudFront.
cd /var/www/html
List out the S3 bucket and start synchronising the uploads content to it (a rough CLI sketch follows).
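Not from the course - the bucket name below is a placeholder.
# Hypothetical sketch: list the bucket, then sync the WordPress uploads into it.
aws s3 ls s3://my-wordpress-bucket
aws s3 sync /var/www/html/wp-content/uploads s3://my-wordpress-bucket/wp-content/uploads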
The site's .htaccess then rewrites requests for uploaded files to the CloudFront distribution:
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^wp-content/uploads/(.*)$ https://[cloudfront-link]/wp-content/uploads/$1 [R=301,NC]
# BEGIN WordPress
# END WordPress
Now, we actually want to edit all this.
Creating an Image
Select the EC2 instance and create an image (AMI) from it. After it has finished building, it appears under AMIs and you can launch it on another instance.
When you launch from that image, you can use a bash script to run the updates and then do an AWS S3 sync.
When the instance is live, we should be able to just go straight to the IP address.
Create an Auto Scaling group from the menu. First, create a launch configuration.
There is a TemplateWPWebServer AMI from A Cloud Guru that they use here.
In the scaling policies, we want between 2 and 4 instances. Sort out the rest of the settings for the ASG, then review and launch.
Once the instances are provisioned, you'll also note that the ELB will have no instances until everything has set up.
Now we can run a stress test using the "stress" tool that was installed by the A Cloud Guru bootstrap.
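A rough CLI sketch of the launch configuration plus 2-4 instance Auto Scaling group (not from the course - the names, AMI ID, key, subnets and ELB name are placeholders).
# Hypothetical sketch: launch configuration and Auto Scaling group behind an ELB.
aws autoscaling create-launch-configuration \
  --launch-configuration-name wp-web-lc \
  --image-id ami-0123456789abcdef0 \
  --instance-type t2.micro \
  --key-name my-key

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name wp-web-asg \
  --launch-configuration-name wp-web-lc \
  --min-size 2 --max-size 4 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-aaaa,subnet-bbbb" \
  --load-balancer-names my-web-elb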
CloudFormation scripts a lot of what you would otherwise do by hand for these services - CF is like bootstrapping for AWS.
In the EC2 instance, ensure that you've done a tear down and then remove the RDS instance as well.
Create a new stack. Here, we can design a template ourselves or we can hit a template designer.
From here, you will specify details. You can set the parameters from here!
Running this we can actually have the CF Stack provision everything for us.
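A rough CLI sketch of creating a stack with parameters (not from the course - the stack name, template file and parameter values are placeholders).
# Hypothetical sketch: create a stack from a template and pass parameters.
aws cloudformation create-stack \
  --stack-name my-wordpress-stack \
  --template-body file://wordpress-template.json \
  --parameters ParameterKey=KeyName,ParameterValue=my-key ParameterKey=InstanceType,ParameterValue=t2.micro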
Key items you should know before you take the exam:
1. How to configure and troubleshoot a VPC inside and out, including basic IP subnetting. VPC is arguably
one of the more complex components of AWS and you cannot pass this exam without a thorough
understanding of it.
2. The difference in use cases between Simple Workflow (SWF), Simple Queue Services (SQS), and Simple
Notification Services (SNS).
3. How an Elastic Load Balancer (ELB) interacts with auto-scaling groups in a high-availability deployment.
4. How to properly secure a S3 bucket in different usage scenarios
5. When it would be appropriate to use either EBS-backed or ephemeral instances.
6. A basic understanding of CloudFormation.
7. How to properly use various EBS volume configurations and snapshots to optimize I/O performance and
data durability.
Excellent understanding of typical multi-tier architectures: web servers, caching, application servers, load
balancers, and storage
Understanding of Relational Database Management System (RDBMS) and NoSQL
Knowledge of message queuing and Enterprise Service Bus (ESB)
Familiarity with loose coupling and stateless systems
Understanding of different consistency models in distributed systems
Knowledge of Content Delivery Networks (CDN)
Hands-on experience with core LAN/WAN network technologies
Experience with route tables, access control lists, firewalls, NAT, HTTP, DNS, IP and OSI Network
Knowledge of RESTful Web Services, XML, JSON
Familiarity with the software development lifecycle
Work experience with information and application security concepts, mechanisms, and tools
Awareness of end-user computing and collaborative technologies
Agile vs traditional (plan-driven) approaches
Essence: Agile is change driven and value driven, emphasising people, collaboration and shared value; traditional is plan driven, emphasising plan, process and change control.
Project success: Agile measures success against the final outcome; traditional measures it against the plan.
Ideal for: Agile suits projects with fast-changing environments (e.g. software development); traditional suits large projects with relatively fixed requirements (e.g. construction).
1. Act as an advocate for Agile principles with customers and the team to ensure a shared Agile mindset
2. Create a common understanding of the values and principles of Agile through practising Agile practices
and using Agile terminology effectively.
3. Educate the organization and influence project and organizational processes, behaviors and people to
support the change to Agile project management.
4. Maintain highly visible information radiators about the progress of the projects to enhance transparency
and trust.
5. Make it safe to experiment and make mistakes so that everyone can benefit from empirical learning.
6. Carry out experiments as needed to enhance creativity and discover efficient solutions.
7. Collaborate with one another to enhance knowledge sharing as well as removing knowledge silos and
bottlenecks.
8. Establish a safe and respectful working environment to encourage emergent leadership through self-organization and empowerment.
9. Support and encourage team members to perform their best by being a servant leader.
1) Which of the following is an Agile Manifesto principle?
A) Welcome changing requirements, early in development. Agile processes handle changes for the customer's
competitive advantage.
B) Welcome changing priorities, early in development. Agile processes harness change for the customer's
competitive advantage.
C) Welcome changing priorities, even late in development. Agile processes handle changes for the customer's
competitive advantage.
D) Welcome changing requirements, even late in development. Agile processes harness change for the
customer's competitive advantage.
Answer: D
Explanation: The correct wording of the principle is “Welcome changing requirements, even late in
development. Agile processes harness change for the customer's competitive advantage.” The agile principles
do not speak to changing priorities or to welcoming only early changes.
2) When managing an agile software team, engaging the business in prioritizing the backlog is an example of:
Answer: B
Explanation: We engage the business in prioritizing the backlog to better understand and incorporate
stakeholder values. Although such engagement will likely impact technical risk reduction, vendor
management, or stakeholder story mapping, these are not the main reasons we engage the business.
Answer: C
Explanation: Product demonstrations provide the benefits of learning about feature suitability and usability,
and they can prompt discussions of new requirements. They are not typically used to learn about feature
estimates, however, since estimating is done during estimation sessions, rather than during demonstrations.
4) Choose the correct combination of XP practice names from the following options:
Answer: C
Explanation: The XP practices include test-driven development, refactoring, and pair programming. “Test-
driven design,” “reforecasting,” and “peer programming” are not XP practice names.
5) An agile team is planning the tools they will use for the project. They are debating how they should show
what work is in progress. Of the following options, which tool are they most likely to select?
Answer: C
Explanation: Of the options presented, the best tool to show work in progress is a task board. The user story
backlog shows what work is still remaining to be done on the project. The product roadmap shows when work
is planned to be completed. Work breakdown structures are not commonly used on agile projects.
6) When using a Kanban board to manage work in progress, which of the following best summarizes the
philosophy behind the approach?
A) It is a sign of the work being done and should be maximized to boost performance.
B) It is a sign of the work being done and should be limited to boost performance.
C) It is a sign of the work queued for quality assurance, which should not count toward velocity.
D) It is a sign of the work queued for user acceptance, which should not count toward velocity.
Answer: B
Explanation: The correct answer is "It is a sign of the work being done and should be limited to boost performance."
7) Which of the following is not true of how burn up charts that also track total scope differ from burn down
charts?
A) Burn up charts separate out the rate of progress from the scope fluctuations
B) Burn up charts and burn down charts trend in opposite vertical directions
C) Burn up charts can be converted to cumulative flow diagrams by the addition of WIP
D) Burn down charts indicate whether rate of effort changes are due to changes in progress rates or scope
Answer: D
Explanation: It is true that burn up charts can be converted to cumulative flow diagrams by the addition of
WIP, and they trend in the opposite vertical direction from burn down charts. It is also true that burn up charts
that also track total scope (rather than burn down charts) separate out the rate of progress from the scope
fluctuations. So the option that is not true is “Burn down charts indicate whether rate of effort changes are
due to changes in progress rates or scope.”
8) As part of stakeholder management and understanding, the team may undertake customer persona
modeling. Which of the following would a persona not represent in this context?
A) Stereotyped users
B) Real people
C) Archetypal description
D) Requirements
Answer: D
Explanation: Personas do represent real, stereotyped, composite, and fictional people. They are archetypal
(exemplary) descriptions, grounded in reality, goal-oriented, specific, and relevant to generate focus. Personas
are not a replacement for requirements on a project, however.
9) An agile team is beginning a new release. Things are progressing a little slower than they initially estimated.
The project manager is taking a servant leadership approach. Which of the following actions is the project
manager most likely to do?
Answer: C
Explanation: In taking a servant leadership approach, the project manager is most likely to do administrative
activities for the team. As implied by the term, the role of a servant leader is focused on serving the team. A
servant leader recognizes that the team members create the business value and does what is necessary to support them in doing so.
10) The PMO has asked you to generate some financial information to summarize the business benefits of
your project. To best describe how much money you hope the project will return, you should show an
estimate of:
Answer: B
Explanation: Since we are being asked to show how much the project will return, the metric to choose is the
return on investment (ROI). You might have been tempted to choose net present value (NPV) since this
calculation accounts for inflation, but the question did not ask for an adjusted value. Instead, it simply asked
how much money the project would return. IRR and GDP would not provide the information the PMO has
asked for.
Answer: C
Explanation: Risk burn down graphs do not show the impacts of the risks on the schedule or budget, but they
do show the cumulative risk severities over time. Tracking just the probabilities over time would be of little use
without knowing the impacts of these risks. After all, they could all be trivial, and in that case, why would we
need to be concerned?
12) What is the process cycle efficiency of a 2-hour meeting if it took you 2 minutes to schedule the meeting
in the online calendar tool and 8 minutes to write the agenda and e-mail it to participants?
A) 90%
B) 8%
C) 92%
D) 96%
Answer: C
Explanation: The formula for finding process cycle efficiency is: Total value-added time / total cycle time. In
this question, the value-added time is 2 hours, and the total cycle time is 2 minutes + 8 minutes + 120 minutes
= 130 minutes. So the correct answer is 120 / 130 = 92%.
Answer: B
Explanation: The only option here that is a step in value stream analysis is “Create a value stream map of the
current process, identifying steps, queues, delays, and information flows.” None of the other options are valid
steps in value stream mapping.
14) When we practice active listening, what are the levels through which our listening skills progress?
Answer: D
Explanation: The progression is internal listening (how will this affect me?) to focused listening (what are they
really trying to say?) and then finally to global listening (what other clues do I notice to help me understand
what they are saying?).
15) The Agile Manifesto value “customer collaboration over contract negotiation” means that:
A) Agile approaches encourage you not to focus too much on negotiating contracts, since most vendors are
just out for themselves anyway.
B) Agile approaches focus on what we are trying to build with our vendors, rather than debating the details
of contract terms.
C) Agile approaches prefer not to use contracts, unless absolutely necessary, because they hamper our ability
to respond to change requests.
D) Agile approaches recommend that you only collaborate with vendors who are using agile processes
themselves.
Answer: B
Explanation: Valuing customer collaboration over contract negotiation means we look for mutual
understanding and agreement, rather than spend our time debating the fine details of the agreement.
16) To ensure the success of our project, in what order should we execute the work, taking into account the
necessary dependencies and risk mitigation tasks?
Explanation: It is largely the business representatives who outline the priority of the functional requirements
on the project. That prioritization is then a key driver for the order in which we execute the work.
Answer: D
Explanation: Incremental delivery means that we deploy functional increments over the course of the project.
It does not relate to retrospectives, testing, or changes to the process, so the other options are incorrect, or
“less correct”.
Answer: D
Explanation: In agile approaches, negotiation is viewed as a healthy process of give and take rather than a
zero-sum game, a competitive challenge, or a fail proof win-win scenario.
Answer: D
Explanation: The whole team, including the development team, product owner, and Scrum Master, is
responsible for creating a shared definition of “done.” Since “process owner” is a made-up term, this is the
correct choice for someone who would NOT be involved in defining done.
20) When working with a globally distributed team, the most useful approach would be to:
A) Bring the entire team together for a diversity and sensitivity training day before starting the first
iteration.
B) Bring the entire group together for a big celebration at the end of the project
Answer: D
Explanation: Having the team work together for an iteration would be a great way to help integrate a globally
distributed team. Diversity training and get-to-know-you sessions are nice, but having the team members
actually work together would be the best opportunity for them to learn each other's work habits and
interaction modes.