
AWS Region, AZs, Edge locations

 Each region is a separate geographic area, completely independent, isolated from the other regions
& helps achieve the greatest possible fault tolerance and stability
 Communication between regions is across the public Internet
 Each region has multiple Availability Zones
 Each AZ is physically isolated, geographically separated from each other and designed as an independent
failure zone
 AZs are connected with low-latency private links (not public internet)
 Edge locations are sites in a worldwide network of AWS data centers used to distribute content closer to users and reduce latency.
Consolidated Billing
 Paying account with multiple linked accounts
 Paying account is independent and should be used only for billing purposes
 Paying account cannot access resources of the other accounts unless explicitly given access through Cross Account roles
 All linked accounts are independent, with a soft limit of 20 linked accounts
 One bill per AWS account
 provides Volume pricing discount for usage across the accounts
 allows unused Reserved Instances to be applied across the group
 Free tier is not applicable across the accounts
Tags & Resource Groups
 are metadata, specified as key/value pairs on AWS resources
 are for labelling purposes and help in managing and organizing resources
 can be inherited when resources are created by Auto Scaling, CloudFormation or Elastic Beanstalk
 can be used for
 Cost allocation to categorize and track the AWS costs
 Conditional Access Control policy to define permission to allow or deny access on resources
based on tags
 Resource Group is a collection of resources that share one or more tags
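Conditional access control on tags, mentioned above, can be sketched as an IAM policy document. This is a minimal illustrative example, not a policy from the source: the tag key `Environment`, the value `Dev`, and the chosen EC2 actions are all assumptions.

```python
import json

# Illustrative tag-based IAM policy (assumed tag key/value and actions):
# allows stopping/starting only EC2 instances tagged Environment=Dev.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:StartInstances", "ec2:StopInstances"],
        "Resource": "arn:aws:ec2:*:*:instance/*",
        "Condition": {
            "StringEquals": {"ec2:ResourceTag/Environment": "Dev"}
        }
    }]
}

print(json.dumps(policy, indent=2))
```

The `ec2:ResourceTag/<key>` condition key is what ties the permission to the tag rather than to a specific resource list.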
 Promiscuous mode is not allowed, as AWS and the Hypervisor will not deliver any traffic to an instance that is not specifically addressed to it
 IDS/IPS strategies
 Host Based Firewall – Forward Deployed IDS where the IDS itself is installed on the instances
 Host Based Firewall – Traffic Replication where IDS agents installed on instances which
send/duplicate the data to a centralized IDS system
 In-Line Firewall – Inbound IDS/IPS Tier (like a WAF configuration) which identifies and drops
suspect packets
DDOS Mitigation
 Minimize the Attack surface
 use ELB/CloudFront/Route 53 to distribute load
 maintain resources in private subnets and use Bastion servers
 Scale to absorb the attack
 scaling helps buy time to analyze and respond to an attack
 auto scaling with ELB to handle increase in load to help absorb attacks
 CloudFront, Route 53 inherently scales as per the demand
 Safeguard exposed resources
 use Route 53 for aliases to hide source IPs and Private DNS
 use CloudFront geo restriction and Origin Access Identity
 use WAF as part of the infrastructure
 Learn normal behavior (IDS/WAF)

Sensitivity: Internal & Restricted

 analyze and benchmark to define rules on normal behavior
 use CloudWatch
 Create a plan for attacks
AWS Services – Region, AZ, Subnet & VPC Limitations
 Services like IAM (user, role, group, SSL certificate), Route 53, STS are Global and available across regions
 All other AWS services are limited to a Region or within a Region and do not automatically copy data across regions unless configured
 AMIs are limited to a region and need to be copied over to other regions
 EBS volumes are limited to the Availability Zone, and can be migrated by creating snapshots and copying
them to another region
 Reserved instances are limited to an Availability Zone (they can now be modified to another Availability Zone) and cannot be migrated to another region
 RDS instances are limited to the region and can be recreated in a different region by either using
snapshots or promoting a Read Replica
 Placement groups
o Cluster Placement groups are limited to a single Availability Zone
o Spread Placement groups can span across multiple Availability Zones
 S3 data is replicated within the region and can be moved to another region using cross-region replication
 DynamoDB maintains data within the region can be replicated to another region using DynamoDB cross
region replication (using DynamoDB streams) or Data Pipeline using EMR (old method)
 A Redshift cluster spans a single Availability Zone only, and can be recreated in another AZ using snapshots
Disaster Recovery Whitepaper
 RTO (Recovery Time Objective) is the time it takes after a disruption to restore a business process to its service level, and RPO (Recovery Point Objective) is the acceptable amount of data loss measured in time before the disaster occurs
 Techniques (going down the list, RTO & RPO decrease while the cost goes up)
 Backup & Restore – data is backed up and restored, with nothing running
 Pilot light – Only minimal critical service like RDS is running and rest of the services can be
recreated and scaled during recovery
 Warm Standby – Fully functional site with minimal configuration is available and can be scaled
during recovery
 Multi-Site – Fully functional site with identical configuration is available and processes the load
 Services
 Region and AZ to launch services across multiple facilities
 EC2 instances with the ability to scale and launch across AZs
 EBS with Snapshot to recreate volumes in different AZ or region
 AMI to quickly launch preconfigured EC2 instances
 ELB and Auto Scaling to scale and launch instances across AZs
 VPC to create private, isolated section
 Elastic IP address as static IP address
 ENI with a pre-allocated MAC address
 Route 53 is highly available and scalable DNS service to distribute traffic across EC2 instances and
ELB in different AZs and regions
 Direct Connect for high-speed data transfer (takes time to set up and is more expensive than VPN)
 S3 and Glacier (with RTO of 3-5 hours) provides durable storage
 RDS snapshots and Multi AZ support and Read Replicas across regions
 DynamoDB with cross region replication
 Redshift snapshots to recreate the cluster
 Storage Gateway to backup the data in AWS
 Import/Export to move large amount of data to AWS (if internet speed is the bottleneck)
 CloudFormation, Elastic Beanstalk and OpsWorks as orchestration tools to automate and recreate the infrastructure


Identity & Access Management – IAM
 helps securely control access to AWS services and resources
 helps create and manage user identities and grant permissions for those users to access AWS resources
 helps create groups for multiple users with similar permissions
 not appropriate for application authentication
 is Global and does not need to be migrated to a different region
 helps define Policies,
 in JSON format
 all permissions are implicitly denied by default
 an explicit Deny always overrides an Allow (the most restrictive policy wins)
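The evaluation order above can be illustrated with a toy sketch (this is not the real IAM engine, just the decision logic the notes describe): everything starts as an implicit deny, a matching Allow flips the decision, and any matching explicit Deny wins regardless.

```python
# Toy sketch of IAM policy evaluation logic (illustrative, not the real engine).
def evaluate(statements, action):
    decision = "ImplicitDeny"          # nothing is allowed by default
    for effect, actions in statements:
        if action in actions:
            if effect == "Deny":
                return "ExplicitDeny"  # explicit Deny always wins
            decision = "Allow"
    return decision

stmts = [("Allow", {"s3:GetObject", "s3:PutObject"}),
         ("Deny",  {"s3:PutObject"})]

print(evaluate(stmts, "s3:GetObject"))     # Allow
print(evaluate(stmts, "s3:PutObject"))     # ExplicitDeny - Deny overrides Allow
print(evaluate(stmts, "s3:DeleteObject"))  # ImplicitDeny - no matching statement
```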
 IAM Role
 helps grant and delegate access to users and services without the need to create permanent credentials
 IAM users or AWS services can assume a role to obtain temporary security credentials that can
be used to make AWS API calls
 needs a Trust policy to define who can assume the role, and a Permission policy to define what the user or service can do
 used with Security Token Service (STS), a lightweight web service that provides temporary,
limited privilege credentials for IAM users or for authenticated federated users
 IAM role scenarios
o Service access for e.g. EC2 to access S3 or DynamoDB
o Cross Account access for users
o with a user within the same account
o with a user within an AWS account owned by the same owner
o with a user from a Third Party AWS account, with an External ID for enhanced security
o Identity Providers & Federation
o Web Identity Federation, where the user can be authenticated using external identity providers like Amazon, Google or any OpenID Connect IdP, using AssumeRoleWithWebIdentity
o Identity Provider using SAML 2.0, where the user can be authenticated using on-premises Active Directory, OpenLDAP or any SAML 2.0 compliant IdP, using AssumeRoleWithSAML
o For other Identity Providers, use an Identity Broker to authenticate and provide temporary credentials using AssumeRole (recommended) or GetFederationToken
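The cross-account scenarios above hinge on two separate policy documents attached to the role. This is an illustrative sketch: the account ID, external ID, and bucket name are made-up placeholders, not values from the source.

```python
import json

# Illustrative pair of IAM role policies (all identifiers are placeholders):
# the Trust policy defines WHO may assume the role, the Permission policy
# defines WHAT the role may do once assumed.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
        "Action": "sts:AssumeRole",
        # External ID condition for third-party cross-account access
        "Condition": {"StringEquals": {"sts:ExternalId": "example-external-id"}}
    }]
}
permission_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::example-bucket/*"
    }]
}
print(json.dumps({"trust": trust_policy, "permissions": permission_policy}, indent=2))
```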
 IAM Best Practices
 Do not use Root account for anything other than billing
 Create Individual IAM users
 Use groups to assign permissions to IAM users
 Grant least privilege
 Use IAM roles for applications on EC2
 Delegate using roles instead of sharing credentials
 Rotate credentials regularly
 Use Policy conditions for increased granularity
 Use CloudTrail to keep a history of activity
 Enforce a strong IAM password policy for IAM users
 Remove all unused users and credentials
CloudHSM
 provides secure cryptographic key storage to customers by making hardware security modules (HSMs) available in the AWS cloud
 single tenant, dedicated physical device to securely generate, store, and manage cryptographic keys used
for data encryption
 are inside the VPC (not EC2-classic) & isolated from the rest of the network


 can use VPC peering to connect to CloudHSM from multiple VPCs
 integrated with Amazon Redshift and Amazon RDS for Oracle
 EBS volume encryption, S3 object encryption and key management can be done with CloudHSM but
requires custom application scripting
 is NOT fault tolerant on its own; a cluster needs to be built, as if a single HSM fails all the keys are lost
 expensive; prefer AWS Key Management Service (KMS) if cost is a criterion
AWS Directory Services
 gives applications in AWS access to Active Directory services
 different from SAML + AD, where the access is granted to AWS services through Temporary Credentials
 Simple AD
 least expensive option, but does not support advanced Microsoft AD features
 provides a Samba 4 Microsoft Active Directory compatible standalone directory service on AWS
 No single point of Authentication or Authorization, as a separate copy is maintained
 trust relationships cannot be setup between Simple AD and other Active Directory domains
 don’t use it if the requirement is to leverage access and control through a centralized authentication service
 AD Connector
 acts just as a hosted proxy service for instances in AWS to connect to on-premises Active Directory
 enables consistent enforcement of existing security policies, such as password expiration,
password history, and account lockouts, whether users are accessing resources on-premises or in the
AWS cloud
 needs VPN connectivity (or Direct Connect)
 integrates with existing RADIUS-based MFA solutions to enable multi-factor authentication
 does not cache data which might lead to latency
 Read-only Domain Controllers (RODCs)
 works out as a Read-only Active Directory
 holds a copy of the Active Directory Domain Service (AD DS) database and respond to
authentication requests
 they cannot be written to and are typically deployed in locations where physical security cannot
be guaranteed
 helps maintain a single point of authentication & authorization control
 are expensive to setup
 operate in a multi-master model; changes can be made on any writable server in the forest, and
those changes are replicated to servers throughout the entire forest
AWS WAF
 is a web application firewall that helps monitor HTTP/HTTPS requests and allows controlling access to the content
 helps protect web applications from attacks by allowing rules configuration that allow, block, or monitor
(count) web requests based on defined conditions. These conditions include IP addresses, HTTP headers,
HTTP body, URI strings, SQL injection and cross-site scripting.
 helps define Web ACLs, which are combinations of Rules; each Rule is a combination of Conditions and an Action to block or allow
 integrated with CloudFront, Application Load Balancer (ALB), API Gateway services commonly used to
deliver content and applications
 supports custom origins outside of AWS, when integrated with CloudFront
 a WAF sandwich pattern can be implemented, where an auto-scaled WAF tier sits between the Internet-facing and the internal load balancer
AWS Shield


 is a managed service that provides protection against Distributed Denial of Service (DDoS) attacks for
applications running on AWS
 provides protection for all AWS customers against common and most frequently occurring infrastructure
(layer 3 and 4) attacks like SYN/UDP floods, reflection attacks, and others to support high availability of
applications on AWS.
 provides AWS Shield Advanced with additional protections against more sophisticated and larger attacks
for applications running on EC2, ELB, CloudFront, AWS Global Accelerator, and Route 53
AWS GuardDuty
 offers threat detection that enables continuous monitoring and protection of the AWS accounts and workloads
 analyzes continuous streams of meta-data generated from AWS account and network activity found in
AWS CloudTrail Events, VPC Flow Logs, and DNS Logs.
 uses integrated threat intelligence, such as known malicious IP addresses, anomaly detection, and machine learning, to identify threats more accurately
 operates completely independently from the resources so there is no risk of performance or availability
impacts to the workloads
AWS Inspector
 is an automated security assessment service that helps test the network accessibility of EC2 instances and
the security state of the applications running on the instances.
 helps automate security vulnerability assessments throughout the development and deployment pipeline
or against static production systems

Virtual Private Cloud – VPC
 helps define a logically isolated dedicated virtual network within AWS
 provides control of IP addressing using CIDR block from a minimum of /28 to maximum of /16 block size
 supports IPv4 and IPv6 addressing
 primary CIDR block cannot be changed once created
 can be extended by associating secondary IPv4 CIDR blocks to the VPC
 Components
 Internet gateway (IGW) provides access to the Internet
 Virtual gateway (VGW) provides access to on-premises data center through VPN and Direct
Connect connections
 VPC can have only one IGW and VGW
 Route tables determine where network traffic from subnet is directed
 Ability to create subnets within the VPC CIDR block
 A Network Address Translation (NAT) server provides outbound Internet access for EC2 instances
in private subnets
 Elastic IP addresses are static, persistent public IP addresses
 Instances launched in the VPC will have a Private IP address and can have a Public or an Elastic IP address associated with them
 Security Groups and NACLs help define security
 Flow logs – capture information about the IP traffic going to and from network interfaces in the VPC
 shared, by default, allows instances to be launched on shared tenancy
 dedicated allows instances to be launched on a dedicated hardware
 Route Tables
 defines rules, termed as routes, which determine where network traffic from the subnet would
be routed
 Each VPC has a Main Route table, and can have multiple custom route tables created
 Every route table contains a local route that enables communication within a VPC which cannot
be modified or deleted


 Route priority is decided by choosing the most specific route in the route table that matches the traffic (longest prefix match)
 Subnets
 map to AZs and do not span across AZs
 have a CIDR range that is a portion of the whole VPC.
 CIDR ranges cannot overlap between subnets within the VPC.
 AWS reserves 5 IP addresses in each subnet – the first four (network address, VPC router, DNS, future use) and the last one (broadcast)
 Each subnet is associated with a route table which define its behavior
o Public subnets – inbound/outbound Internet connectivity via IGW
o Private subnets – outbound Internet connectivity via a NAT or VGW
o Protected subnets – no outbound connectivity and used for regulated workloads
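The subnet CIDR rules above can be checked with the Python standard library. This is a sketch using assumed example ranges: a /16 VPC carved into /24 subnets, minus the 5 addresses AWS reserves per subnet.

```python
import ipaddress

# Sketch: carve an example /16 VPC CIDR into /24 subnets and account for
# the 5 addresses AWS reserves per subnet (network, VPC router, DNS,
# future use, and the last/broadcast address).
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))

print(len(subnets))                    # 256 /24 subnets fit in a /16
per_subnet = subnets[0].num_addresses  # 256 addresses in a /24
usable = per_subnet - 5                # minus the 5 AWS-reserved addresses
print(usable)                          # 251 usable addresses per /24 subnet
```

The same arithmetic explains why /28 is the smallest allowed subnet: 16 addresses minus the 5 reserved leaves only 11 usable.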
 Elastic Network Interface (ENI)
 a default ENI, eth0, is attached to the instance and cannot be detached; one or more secondary, detachable ENIs (eth1–ethN) can be attached
 has primary private, one or more secondary private, public, Elastic IP address, security groups,
MAC address and source/destination check flag attributes associated
 an ENI in one subnet can be attached to an instance in the same or another subnet, in the same AZ and the same VPC
 Security group membership of an ENI can be changed
 a pre-allocated MAC address can be used for applications with special licensing requirements
 Security Groups vs Network Access Control Lists
 Stateful vs Stateless
 At instance level vs At subnet level
 Only allows Allow rule vs Allows both Allow and Deny rules
 Evaluated as a Whole vs Evaluated in defined Order
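The last comparison point can be made concrete with a toy sketch (illustrative only, not the real AWS evaluation code): NACL rules are checked in rule-number order and the first match wins, while security-group rules are allow-only and evaluated as a whole.

```python
# Toy sketch of the evaluation difference (rule numbers/ports are examples):
# NACL rules: checked in ascending rule-number order, first match wins.
def nacl_decision(rules, port):
    for _, action, ports in sorted(rules):   # ascending rule number
        if port in ports:
            return action                    # first match wins
    return "DENY"                            # implicit deny ('*' rule)

# Security group rules: allow-only, evaluated as a whole.
def sg_decision(allow_rules, port):
    return "ALLOW" if any(port in ports for ports in allow_rules) else "DENY"

nacl = [(100, "ALLOW", range(80, 81)), (200, "DENY", range(0, 65536))]
print(nacl_decision(nacl, 80))   # ALLOW - rule 100 matches before the deny
print(nacl_decision(nacl, 22))   # DENY  - falls through to rule 200
print(sg_decision([range(80, 81), range(443, 444)], 443))  # ALLOW
```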
 Elastic IP
 is a static IP address designed for dynamic cloud computing.
 is associated with AWS account, and not a particular instance
 can be remapped from one instance to another instance
 is charged for non-usage, i.e. when not associated with any instance or when the associated instance is in a stopped state
 NAT
 allows internet access to instances in private subnets
 performs the function of both address translation and port address translation (PAT)
 needs the source/destination check flag to be disabled, as it is not the actual destination of the traffic
 NAT gateway is an AWS-managed NAT service that provides better availability, higher bandwidth, and requires less administrative effort
 are not supported for IPv6 traffic
 Egress-Only Internet Gateways
 outbound communication over IPv6 from instances in the VPC to the Internet, and prevents the
Internet from initiating an IPv6 connection with your instances
 supports IPv6 traffic only
 Shared VPCs
 allows multiple AWS accounts to create their application resources, such as EC2 instances, RDS
databases, Redshift clusters, and AWS Lambda functions, into shared, centrally-managed VPCs
 VPC Peering
 allows routing of traffic between the peer VPCs using private IP addresses and no IGW or VGW
 No single point of failure and bandwidth bottlenecks
 was originally limited to a single region; inter-region VPC peering is now supported
 IP space or CIDR blocks cannot overlap
 cannot be transitive, one-to-one relationship between two VPC


 only one peering connection between any two VPCs, and they have to be explicitly peered
 Private DNS values cannot be resolved
 Security groups from the peered VPC could not originally be referenced in ingress and egress rules (CIDR blocks had to be used instead); they can now be referenced, provided the peered VPC is in the same region
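The CIDR-overlap restriction on peering can be checked with the standard library. The CIDR blocks here are assumed examples:

```python
import ipaddress

# Sketch: VPC peering requires non-overlapping CIDR blocks; the stdlib
# ipaddress module can check this directly with overlaps().
def can_peer(cidr_a, cidr_b):
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))

print(can_peer("10.0.0.0/16", "10.1.0.0/16"))  # True  - disjoint ranges
print(can_peer("10.0.0.0/16", "10.0.1.0/24"))  # False - second is inside first
```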
 VPC Endpoints
 enables you to privately connect VPC to supported AWS services and VPC endpoint services
powered by PrivateLink
 does not require a public IP address, access over the Internet, NAT device, a VPN connection or
Direct Connect
 traffic between VPC & AWS service does not leave the Amazon network
 are virtual devices.
 are horizontally scaled, redundant, and highly available VPC components that allow
communication between instances in your VPC and services without imposing availability risks or
bandwidth constraints on your network traffic.
 Gateway Endpoints
o is a gateway that is a target for a specified route in the route table, used for traffic
destined to a supported AWS service.
o only S3 and DynamoDB are currently supported
 Interface Endpoints
o is an elastic network interface with a private IP address that serves as an entry point for
traffic destined to a supported service
o services supported: API Gateway, AWS CloudFormation, CloudWatch, CloudWatch Events, CloudWatch Logs, AWS CodeBuild, AWS CodeCommit, AWS Config, EC2 API, Elastic Load Balancing API, Elastic Container Registry, Elastic Container Service, AWS Key Management Service, Kinesis Data Streams, SageMaker, SageMaker Runtime, SageMaker Notebook Instance, AWS Secrets Manager, AWS Security Token Service, AWS Service Catalog, SNS, SQS, AWS Systems Manager
Direct Connect & VPN
 VPN provides secure IPSec connections from on-premises computers or services to AWS over the Internet
 is quick to set up and cheap, however it depends on the Internet speed
 Direct Connect
 is a network service that provides an alternative to using Internet to utilize AWS services by using
private dedicated network connection
 provides Virtual Interfaces
o Private VIF to access instances within an VPC via VGW
o Public VIF to access non VPC services
 requires time to set up (possibly months), and should not be considered as an option if the turnaround time is short
 does not provide redundancy, use either second direct connection or IPSec VPN connection
 Virtual Private Gateway is on the AWS side and Customer Gateway is on the Customer side
 route propagation is enabled on VGW and not on CGW
 Direct Connect vs VPN IPSec
 Expensive to Setup and Takes time vs Cheap & Immediate
 Dedicated private connections vs Internet
 Reduced data transfer rate vs Internet data transfer cost
 Consistent performance vs Internet inherent variability
 Do not provide Redundancy vs Provides Redundancy
Route 53
 Highly available and scalable DNS & Domain Registration Service


 Reliable and cost-effective way to route end users to Internet applications
 Supports multi-region and backup architectures for High Availability; ELB, being limited to a region, does not support multi-region HA architectures
 supports private Intranet facing DNS service
 internal resource record sets only work for requests originating from within the VPC and currently cannot extend to on-premises
 Global propagation of any changes made to the DNS records within ~ 1 min
 Route 53 can create an alias resource record set that points to an ELB, S3 or CloudFront distribution. An alias resource record set is a Route 53 extension to DNS. It’s similar to a CNAME resource record set, but supports both the root domain (zone apex) and subdomains
 CNAME resource record sets can be created only for subdomains and cannot be mapped to the zone apex
 Route 53 Split-view (Split-horizon) DNS enables you to access an internal version of your website using
the same domain name that is used publicly
 Routing policy
 Simple routing – simple round robin policy
 Weighted round robin – assign weights to resource record sets to specify the proportion of traffic routed to each
 Latency based routing – helps improve global applications, as requests are sent to the server from the location with minimal latency; it is based on latency and cannot guarantee that users from the same geographic region will be served from the same location for any compliance reasons
 Geolocation routing – specify geographic locations by continent, country, or state (states limited to the US); is based on IP accuracy
 Failover routing – failover to a backup site if the primary site fails and becomes unreachable
 Weighted, Latency and Geolocation can be used for Active-Active while Failover routing can be used for
Active-Passive multi region architecture
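The weighted routing policy above distributes traffic in proportion to the assigned weights. A minimal sketch, with illustrative record-set names and weights:

```python
# Sketch of Weighted Round Robin proportions: each record set receives
# weight / (sum of all weights) of the traffic. Names/weights are examples.
weights = {"blue-stack": 9, "green-stack": 1}
total = sum(weights.values())
shares = {name: w / total for name, w in weights.items()}

print(shares["blue-stack"])   # 0.9 -> 90% of requests to the current stack
print(shares["green-stack"])  # 0.1 -> 10%, e.g. a canary of a new version
```

This is why weighted routing suits Active-Active setups such as blue-green or canary deployments: shifting traffic is just a matter of changing the weights.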

Elastic Compute Cloud – EC2
 provides scalable computing capacity
 Features
 Virtual computing environments, known as EC2 instances
 Preconfigured templates for EC2 instances, known as Amazon Machine Images (AMIs), that
package the bits needed for the server (including the operating system and additional software)
 Various configurations of CPU, memory, storage, and networking capacity for your instances,
known as Instance types
 Secure login information for your instances using key pairs (public-private keys where private is
kept by user)
 Storage volumes for temporary data that’s deleted when you stop or terminate your instance,
known as Instance store volumes
 Persistent storage volumes for data using Elastic Block Store (EBS)
 Multiple physical locations for your resources, such as instances and EBS volumes,
known as Regions and Availability Zones
 A firewall to specify the protocols, ports, and source IP ranges that can reach your instances
using Security Groups
 Static IP addresses, known as Elastic IP addresses
 Metadata, known as tags, can be created and assigned to EC2 resources
 Virtual networks that are logically isolated from the rest of the AWS cloud, and can optionally
connect to on premises network, known as Virtual private clouds (VPCs)
 Amazon Machine Image
 template from which EC2 instances can be launched quickly
 does NOT span across regions, and needs to be copied
 can be shared with other specific AWS accounts or made public


 Purchasing Option
 On-Demand Instances
o pay for instances and compute capacity that you use by the hour
o with no long-term commitments or up-front payments
 Reserved Instances
o provides lower hourly running costs by providing a billing discount
o capacity reservation that is applied to instances
o suited if consistent, heavy, predictable usage
o provides benefits with Consolidate Billing
o can be modified to switch Availability Zones or the instance size within the same
instance type, given the instance size footprint (Normalization factor) remains the same
o pay for the entire term regardless of the usage, so if the question targets cost effective
solution and answer mentions reserved instances are purchased & unused, it can be ignored
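The "normalization factor" mentioned above is the arithmetic behind modifying instance sizes within a family: the total footprint must stay the same. A sketch with the published per-size factors (the exchange examples are illustrative):

```python
# Sketch of the Reserved Instance normalization factor: a modification is
# allowed only if the total footprint (factor x count) stays the same.
FACTOR = {"nano": 0.25, "micro": 0.5, "small": 1, "medium": 2,
          "large": 4, "xlarge": 8, "2xlarge": 16}

def footprint(size, count):
    return FACTOR[size] * count

# 1 x xlarge (8) can be exchanged for 2 x large (2 x 4 = 8)...
print(footprint("xlarge", 1) == footprint("large", 2))   # True
# ...but not for 3 x large (12), since the footprint changes
print(footprint("xlarge", 1) == footprint("large", 3))   # False
```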
 Spot Instances
o cost-effective choice but does NOT guarantee availability
o applications flexible in the timing when they can run and also able to handle
interruption by storing the state externally
o AWS will give a two minute warning if the instance is to be terminated to save any
unsaved work
 Dedicated Instances, is a tenancy option which enables instances to run in VPC on hardware
that’s isolated, dedicated to a single customer
 Light, Medium, and Heavy Utilization Reserved Instances are no longer available for purchase
and were part of the Previous Generation AWS EC2 purchasing model
 Enhanced Networking
 results in higher bandwidth, higher packet per second (PPS) performance, lower latency,
consistency, scalability and lower jitter
 supported using Single Root I/O Virtualization (SR-IOV) only on supported instance types
 is supported only within a VPC (not EC2-Classic), with the HVM virtualization type, and is available by default on Amazon Linux AMIs but can be installed on other AMIs as well
 Placement Group
 provide low latency, High Performance Computing via 10Gbps network
 is a logical grouping on instances within a Single AZ
 don’t span availability zones, can span multiple subnets but subnets must be in the same AZ
 NOTE – Spread Placement Groups can span multiple AZs.
 can span across peered VPCs within the same Availability Zone
 existing instances cannot be moved into an existing placement group
 Enhancement – Existing instance can now be moved to a placement group, or moved from one
placement group to another, or removed from a placement group, given it is in the stopped state. 
 for capacity errors, stop and start the instances in the placement group
 use homogenous instance types which support enhanced networking and launch all the
instances at once

Elastic Load Balancer & Auto Scaling

 Elastic Load Balancer
 Managed load balancing service and scales automatically
 distributes incoming application traffic across multiple EC2 instances
 is a distributed system that is fault tolerant and actively monitored by AWS, which scales it as per the demand
 are engineered to not be a single point of failure
 need to Pre Warm ELB if the demand is expected to shoot especially during load testing


 supports routing traffic to instances in multiple AZs in the same region
 performs Health Checks to route traffic only to the healthy instances
 support Listeners with HTTP, HTTPS, SSL, TCP protocols
 has an associated IPv4 and dual stack DNS name
 can offload the work of encryption and decryption (SSL termination) so that the EC2 instances
can focus on their main work
 supports Cross Zone load balancing to help route traffic evenly across all EC2 instances
regardless of the AZs they reside in
 to help identify the IP address of a client
o supports Proxy Protocol header for TCP/SSL connections
o supports X-Forward headers for HTTP/HTTPS connections
 supports Sticky Sessions (session affinity) to bind a user’s session to a specific application instance
o it is not fault tolerant; if an instance is lost the information is lost
o requires an HTTP/HTTPS listener and does not work with TCP
o requires SSL termination on the ELB, as it uses the headers
 supports Connection Draining to help complete the in-flight requests when an instance is deregistered or becomes unhealthy
 For High Availability, it is recommended to attach one subnet per AZ for at least two AZs, even if
the instances are in a single subnet.
 cannot assign an Elastic IP address to an ELB
 supports IPv4 & IPv6 (VPC originally did not support IPv6, but now does)
 HTTPS listener does not support Client Side Certificate
 for SSL termination at backend instances or support for Client Side Certificate use TCP for
connections from the client to the ELB, use the SSL protocol for connections from the ELB to the back-
end application, and deploy certificates on the back-end instances handling requests
 supports a single SSL certificate, so for multiple SSL certificate multiple ELBs need to be created
 Auto Scaling
 ensures correct number of EC2 instances are always running to handle the load by scaling up or
down automatically as demand changes
 cannot span multiple regions.
 attempts to distribute instances evenly between the AZs that are enabled for the Auto Scaling group
 performs checks either using EC2 status checks or can use ELB health checks to determine the
health of an instance and terminates the instance if unhealthy, to launch a new instance
 can be scaled using manual scaling, scheduled scaling or demand based scaling
 cooldown period helps ensure instances are not launched or terminated before the previous scaling activity takes effect, to allow the newly launched instances to start handling traffic and reduce unnecessary scaling
 Auto Scaling & ELB can be used for High Availability and Redundancy by spanning Auto Scaling groups
across multiple AZs within a region and then setting up ELB to distribute incoming traffic across those AZs
 With Auto Scaling use ELB health check with the instances to ensure that traffic is routed only to the
healthy instances

Elastic Block Store – EBS

 is virtual, network-attached block storage
 volumes CANNOT be shared with multiple EC2 instances, use EFS instead
 persists and is independent of EC2 lifecycle
 multiple volumes can be attached to a single EC2 instance
 can be detached & attached to another EC2 instance in that same AZ only
 volumes are created in a specific AZ and CANNOT span across AZs
 snapshots CANNOT span across regions


 for making volume available to different AZ, create a snapshot of the volume and restore it to a new
volume in any AZ within the region
 for making the volume available to different Region, the snapshot of the volume can be copied to a
different region and restored as a volume
 provides high durability and are redundant in an AZ, as the data is automatically replicated within that AZ
to prevent data loss due to any single hardware component failure
 PIOPS is designed to run transactional applications that require high and consistent IO, e.g. relational databases, NoSQL, etc.
Simple Storage Service – S3
 key-value based object storage with unlimited storage and unlimited objects of up to 5 TB each, accessible over the internet
 is an Object level storage (not a Block level storage) and cannot be used to host OS or dynamic websites
(but can work with Javascript SDK)
 provides durability by redundantly storing objects on multiple facilities within a region
 support SSL encryption of data in transit and data encryption at rest
 regularly verifies the integrity of data using checksums and provides auto healing capability
 integrates with CloudTrail, CloudWatch and SNS for event notifications
 S3 resources
 consists of bucket and objects stored in the bucket which can be retrieved via a unique,
developer-assigned key
 bucket names are globally unique
 data model is a flat structure with no hierarchies or folders
 Logical hierarchy can be inferred using the keyname prefix e.g. Folder1/Object1
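Since S3 has no real folders, the "hierarchy" is just key grouping by a shared prefix. A minimal sketch with assumed example keys, mimicking what a delimiter listing returns as common prefixes:

```python
# Sketch: S3 has no real folders; a listing with a delimiter simply groups
# flat keys by their common prefix. The keys below are illustrative.
keys = ["Folder1/Object1", "Folder1/Object2", "Folder2/Object1", "root.txt"]

def common_prefixes(keys, delimiter="/"):
    prefixes = set()
    for key in keys:
        if delimiter in key:
            # everything up to and including the first delimiter
            prefixes.add(key.split(delimiter, 1)[0] + delimiter)
    return sorted(prefixes)

print(common_prefixes(keys))  # ['Folder1/', 'Folder2/']
```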
 Bucket & Object Operations
 list operations return up to 1000 objects per request with pagination support, and are NOT suited for list or prefix queries over a large number of objects
 a single PUT operation can upload an object of up to 5 GB
 use Multipart Upload to upload large objects up to 5 TB; recommended for objects over
100 MB for fault tolerant uploads
 support Range HTTP Header to retrieve partial objects for fault tolerant downloads where the
network connectivity is poor
 Pre-Signed URLs can also be shared for uploading/downloading objects for a limited
time without requiring AWS security credentials
 allows deletion of a single object or multiple objects (max 1000) in a single call
 Multipart Uploads allows
 parallel uploads with improved throughput and bandwidth utilization
 fault tolerance and quick recovery from network issues
 ability to pause and resume uploads
 begin an upload before the final object size is known
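The single-PUT limit and part-based layout described above can be sketched with a small helper (hypothetical, not part of any AWS SDK):

```python
# Illustrative constants matching the S3 limits noted above.
MAX_OBJECT_SIZE = 5 * 1024**4        # 5 TB cap for any S3 object
MULTIPART_THRESHOLD = 100 * 1024**2  # multipart recommended above 100 MB

def plan_upload(object_size, part_size=64 * 1024**2):
    """Return 'single-put' for small objects, else the (start, end)
    byte ranges to upload as individual parts."""
    if object_size > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the 5 TB S3 limit")
    if object_size <= MULTIPART_THRESHOLD:
        return "single-put"
    return [(start, min(start + part_size, object_size) - 1)
            for start in range(0, object_size, part_size)]
```

For a 1 GB object with 64 MB parts this yields 16 byte ranges, which can then be uploaded in parallel threads; the same ranges work for Range-header GETs on download.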
 Versioning
 allows preserve, retrieve, and restore every version of every object
 protects individual files but does NOT protect from Bucket deletion
 Storage tiers
 Standard
o default storage class
o 99.999999999% durability & 99.99% availability
o Low latency and high throughput performance
o designed to sustain the concurrent loss of data in two facilities
 Standard IA
o optimized for long-lived and less frequently accessed data
o designed to sustain the concurrent loss of data in two facilities
o 99.999999999% durability & 99.9% availability
o suitable for objects greater than 128 KB kept for at least 30 days

 Reduced Redundancy Storage
o designed for noncritical, reproducible data stored at lower levels of redundancy than
the STANDARD storage class
o reduces storage costs
o 99.99% durability & 99.99% availability
o designed to sustain the loss of data in a single facility
 Glacier
o suitable for archiving data where data access is infrequent and a retrieval time of several
(3-5) hours is acceptable
o 99.999999999% durability
 allows Lifecycle Management policies
 transition to move objects to different storage classes and Glacier
 expiration to remove objects
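As an illustration of the two rule types, a lifecycle configuration might look like the following sketch (the prefix and day counts are arbitrary examples):

```json
{
  "Rules": [
    {
      "ID": "archive-then-expire",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [ { "Days": 30, "StorageClass": "GLACIER" } ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```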
 Data Consistency Model
 provide read-after-write consistency for PUTS of new objects and eventual consistency for
overwrite PUTS and DELETES
 for new objects, synchronously stores data across multiple facilities before returning success
 updates to a single key are atomic
 Security
 IAM policies – grant users within your own AWS account permission to access S3 resources
 Bucket and Object ACL – grant other AWS accounts (not specific users) access to S3 resources
 Bucket policies – allows to add or deny permissions across some or all of the objects within a
single bucket
 Data Protection – Pending
 Best Practices
 use a random hash prefix for keys to ensure a random access pattern; as S3 stores objects
lexicographically, randomness helps distribute the contents across multiple partitions for better performance
 use parallel threads and Multipart upload for faster writes
 use parallel threads and Range Header GET for faster reads
 for list operations with a large number of objects, it's better to build a secondary index (e.g. in DynamoDB)
 use Versioning to protect from unintended overwrites and deletions, but this does not protect
against bucket deletion
 use VPC S3 Endpoints with VPC to transfer data using Amazon internal network
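The hash-prefix best practice above can be sketched with a hypothetical helper:

```python
import hashlib

def randomized_key(key):
    """Prepend a short hash of the key so that date-like names such as
    'logs/2015/01/01/a.log' no longer sort into one lexicographic range,
    spreading objects across S3 partitions."""
    prefix = hashlib.md5(key.encode("utf-8")).hexdigest()[:4]
    return prefix + "/" + key
```

The prefix is deterministic, so the original key can always be recomputed and looked up; only the sort order seen by S3 changes.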
Glacier
 suitable for archiving data, where data access is infrequent and a retrieval time of several hours (3 to 5
hours) is acceptable (no longer strictly true with later AWS retrieval enhancements)
 provides high durability by storing archives in multiple facilities and on multiple devices at a very low cost
 performs regular, systematic data integrity checks and is built to be automatically self healing
 aggregate files into bigger files before sending them to Glacier and use range retrievals to retrieve
partial file and reduce costs
 improve speed and reliability with multipart upload
 automatically encrypts the data using AES-256
 upload or download data to Glacier via SSL encrypted endpoints
CloudFront
 provides low latency and high data transfer speeds for distribution of static, dynamic web or streaming
content to web users
 delivers the content through a worldwide network of data centers called Edge Locations
 keeps persistent connections with the origin servers so that the files can be fetched from the origin
servers as quickly as possible.

 dramatically reduces the number of network hops that users’ requests must pass through
Relational Database Service (RDS)
 provides a managed relational database service
 supports MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server, and the new, MySQL-compatible
Amazon Aurora DB engine
 as it is a managed service, shell (root ssh) access is not provided
 manages backups, software patching, automatic failure detection, and recovery
 supports user-initiated manual backups and snapshots
 daily automated backups with database transaction logs enables Point in Time recovery up to the last
five minutes of database usage
 snapshots are user-initiated storage volume snapshots of the DB instance, backing up the entire DB instance
and not just individual databases, and can be restored as an independent RDS instance
 support encryption at rest using KMS as well as encryption in transit using SSL endpoints
 for encrypted database
 logs, snapshots, backups, read replicas are all encrypted as well
 cross-region replicas and snapshots do not work across regions (Note – this is now possible with
later AWS enhancements)
 Multi-AZ deployment
 provides high availability and automatic failover support and is NOT a scaling solution
 maintains a synchronous standby replica in a different AZ
 transaction success is returned only if the commit is successful both on the primary and the
standby DB
 Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon's failover technology, while SQL
Server DB instances use SQL Server Mirroring
 snapshots and backups are taken from standby & eliminate I/O freezes
 automatic failover is seamless: RDS switches to the standby instance and updates
the DNS record to point to the standby
 failover can be forced with the Reboot with failover option
 Read Replicas
 uses the PostgreSQL, MySQL, and MariaDB DB engines’ built-in replication functionality to create
a separate Read Only instance
 updates are asynchronously copied to the Read Replica, and data might be stale
 can help scale applications and reduce the read-only load
 requires automatic backups enabled
 replicates all databases in the source DB instance
 for disaster recovery, can be promoted to a full fledged database
 can be created in a different region for MySQL, Postgres and MariaDB, for disaster recovery,
migration and low latency across regions
 RDS does not support all the features of underlying databases, and if required the database instance can
be launched on an EC2 instance
 RMAN (Recovery Manager) can be used for Oracle backup and recovery when running on an EC2 instance
DynamoDB
 is a fully managed NoSQL database service
 synchronously replicates data across three facilities in an AWS Region, giving high availability and data durability
 runs exclusively on SSDs to provide high I/O performance
 provides provisioned table reads and writes
 automatically partitions, reallocates and re-partitions the data and provisions additional server capacity
as data or throughput changes
 provides Eventually Consistent (by default) or Strongly Consistent read option to be specified during a read
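The provisioned read model can be illustrated with the standard capacity-unit arithmetic (a sketch; verify against the current DynamoDB documentation):

```python
import math

def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=False):
    """One strongly consistent read/sec of an item up to 4 KB costs 1 RCU;
    an eventually consistent read costs half as much."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    total = units_per_read * reads_per_second
    return total if strongly_consistent else math.ceil(total / 2)
```

So 10 strongly consistent reads/sec of 6 KB items need 20 RCUs, but only 10 if eventual consistency is acceptable.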

 creates and maintains indexes for the primary key attributes for efficient access of data in the table
 supports secondary indexes
 allows querying attributes other than the primary key attributes without impacting performance
 are automatically maintained as sparse objects
 Local vs Global secondary index
 shares partition key + different sort key vs different partition + sort key
 search limited to partition vs across all partition
 unique attributes vs non unique attributes
 linked to the base table vs independent separate index
 only created during the base table creation vs can be created later
 cannot be deleted after creation vs can be deleted
 consumes provisioned throughput capacity of the base table vs independent throughput
 returns all attributes for item vs only projected attributes
 Eventually or Strongly vs Only Eventually consistent reads
 size limited to 10 GB per partition vs unlimited
 supports cross region replication using DynamoDB streams which leverages Kinesis and provides time-
ordered sequence of item-level changes and can help for lower RPO, lower RTO disaster recovery
 Data Pipeline jobs with EMR can be used for disaster recovery with higher RPO, lower RTO requirements
 supports triggers to allow execution of custom actions or notifications based on item-level updates
ElastiCache
 managed web service that provides in-memory caching to deploy and run Memcached or Redis protocol-
compliant cache clusters
 ElastiCache with Redis,
 like RDS, supports Multi-AZ, Read Replicas and Snapshots
 Read Replicas are created across AZ within same region using Redis’s asynchronous replication
 Multi-AZ differs from RDS as there is no standby, but if the primary goes down a Read Replica is
promoted as primary
 Read Replicas cannot span across regions, unlike RDS which supports cross-region replicas
 cannot be scaled out and if scaled up cannot be scaled down
 allows snapshots for backup and restore
 AOF can be enabled for recovery scenarios, to recover the data in case the node fails or service
crashes. But it does not help in case the underlying hardware fails
 Enabling Redis Multi-AZ as a Better Approach to Fault Tolerance
 ElastiCache with Memcached
 can be scaled up by increasing size and scaled out by adding nodes
 nodes can span across multiple AZs within the same region
 cached data is spread across the nodes, and a node failure will always result in some data loss
from the cluster
 supports auto discovery
 every node should be homogenous and of the same instance type
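The spread of cached data across Memcached nodes can be sketched with a toy client-side hashing function (real clients use more elaborate schemes such as consistent hashing):

```python
import hashlib

def node_for_key(key, nodes):
    """Toy client-side distribution: hash the key, pick a node by modulo.
    Losing a node therefore loses the slice of cached data it held."""
    digest = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return nodes[digest % len(nodes)]
```

Because keys map to nodes deterministically, a node failure loses exactly the keys hashed to it, which is why the notes above say some data loss is unavoidable.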
 ElastiCache Redis vs Memcached
 complex data objects vs simple key value storage
 persistent vs non persistent, pure caching
 automatic failover with Multi-AZ vs Multi-AZ not supported
 scaling using Read Replicas vs using multiple nodes
 backup & restore supported vs not supported
 can be used for state management to keep the web application stateless
Redshift
 fully managed, fast and powerful, petabyte scale data warehouse service
 uses replication and continuous backups to enhance availability and improve data durability and can
automatically recover from node and component failures

 provides Massive Parallel Processing (MPP) by distributing & parallelizing queries across multiple physical nodes
 columnar data storage improves query performance and allows advanced compression techniques
 only supports Single-AZ deployments and the nodes are available within the same AZ, if the AZ supports
Redshift clusters
 spot instances are NOT an option

Data Pipeline
 orchestration service that helps define data-driven workflows to automate and schedule regular data
movement and data processing activities
 integrates with on-premises and cloud-based storage systems
 allows scheduling, retry, and failure logic for the workflows
Elastic MapReduce (EMR)
 is a web service that utilizes a hosted Hadoop framework running on the web-scale infrastructure of EC2
and S3
 launches all nodes for a given cluster in the same Availability Zone, which improves performance as it
provides higher data access rate
 seamlessly supports Reserved, On-Demand and Spot Instances
 consists of Master Node for management and Slave nodes, which consists of Core nodes holding data and
Task nodes for performing tasks only
 is fault tolerant for slave node failures and continues job execution if a slave node goes down
 does not automatically provision another node to take over failed slaves
 supports Persistent and Transient cluster types
 Persistent which continue to run
 Transient which terminates once the job steps are completed
 supports EMRFS which allows S3 to be used as a durable HA data storage
Kinesis
 enables real-time processing of streaming data at massive scale
 provides ordering of records, as well as the ability to read and/or replay records in the same order to
multiple Kinesis applications
 data is replicated across three data centers within a region and preserved for 24 hours, by default and can
be extended to 7 days
 streams can be scaled using multiple shards, based on the partition key, with each shard providing the
capacity of 1MB/sec data input and 2MB/sec data output with 1000 PUT requests per second
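The shard capacity figures above translate directly into a sizing calculation (a sketch):

```python
import math

def shards_needed(ingress_mb_per_sec, egress_mb_per_sec):
    """Each shard handles 1 MB/s of input and 2 MB/s of output, so size
    the stream for whichever direction needs more shards (minimum 1)."""
    return max(math.ceil(ingress_mb_per_sec / 1.0),
               math.ceil(egress_mb_per_sec / 2.0),
               1)
```

A stream ingesting 5 MB/s and serving 4 MB/s needs 5 shards (ingress dominates); one ingesting 1 MB/s but fanning out 6 MB/s needs 3 (egress dominates).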
 Kinesis vs SQS
 real-time processing of streaming big data vs a reliable, highly scalable hosted queue for storing messages
 ordered records, as well as the ability to read and/or replay records in the same order vs no
guarantee on data ordering (with the standard queues before the FIFO queue feature was released)
 data storage up to 24 hours, extended to 7 days vs up to 14 days, can be configured from 1
minute to 14 days but cleared if deleted by the consumer
 supports multiple consumers vs single consumer at a time and requires multiple queues to
deliver message to multiple consumers

Simple Queue Service (SQS)
 extremely scalable queue service that can potentially handle millions of messages
 helps build fault tolerant, distributed loosely coupled applications
 stores copies of the messages on multiple servers for redundancy and high availability
 guarantees At-Least-Once Delivery, but does not guarantee Exact One Time Delivery which might result
in duplicate messages (Not true anymore with the introduction of FIFO queues)

 does not maintain or guarantee message order, and if needed sequencing information needs to be added
to the message itself (Not true anymore with the introduction of FIFO queues)
 supports multiple readers and writers interacting with the same queue as the same time
 holds messages for 4 days by default; retention can be changed from 1 min – 14 days, after which the message is deleted
 message needs to be explicitly deleted by the consumer once processed
 allows send, receive and delete batching which helps club up to 10 messages in a single batch while
charging price for a single message
 handles visibility of the message to multiple consumers using Visibility Timeout, where the message once
read by a consumer is not visible to the other consumers till the timeout occurs
 can handle load and performance requirements by scaling the worker instances as the demand changes
(Job Observer pattern)
 supports short and long polling for receiving messages
 short polling returns immediately vs long polling waits for a fixed time e.g. 20 secs
 might not return all messages as it samples a subset of servers vs returns all available messages
 repetitive polling vs helps save cost with a long connection
 supports delay queues to make messages available after a certain delay; can be used to differentiate
from priority queues
 supports dead letter queues, to redirect messages which failed to process after certain attempts instead
of being processed repeatedly
 Design Patterns
 Job Observer Pattern can help coordinate number of EC2 instances with number of job requests
(Queue Size) automatically thus Improving cost effectiveness and performance
 Priority Queue Pattern can be used to setup different queues with different handling either by
delayed queues or low scaling capacity for handling messages in lower priority queues
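The visibility-timeout behaviour described above can be sketched with a toy in-memory queue (illustrative only; real SQS is distributed and approximate):

```python
class ToyQueue:
    """In-memory sketch of SQS visibility-timeout semantics: a received
    message is hidden from other consumers until the timeout elapses or
    it is explicitly deleted."""
    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # id -> (body, invisible_until)

    def send(self, msg_id, body):
        self.messages[msg_id] = (body, 0)

    def receive(self, now):
        # Return the first visible message and hide it until the timeout.
        for msg_id, (body, invisible_until) in self.messages.items():
            if now >= invisible_until:
                self.messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id):
        self.messages.pop(msg_id, None)
```

A message received but never deleted becomes visible again after the timeout, producing the at-least-once redelivery the notes mention.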
Simple Notification Service (SNS)
 provides delivery or sending of messages to subscribing endpoints or clients
 publisher-subscriber model
 Producers and Consumers communicate asynchronously with subscribers by producing and sending a
message to a topic
 supports Email (plain or JSON), HTTP/HTTPS, SMS, SQS
 supports Mobile Push Notifications to push notifications directly to mobile devices with services like
Amazon Device Messaging (ADM), Apple Push Notification Service (APNS), Google Cloud Messaging (GCM)
etc. supported
 order is not guaranteed and No recall available
 integrated with Lambda to invoke functions on notifications
 for Email notifications, use SNS or SES directly, SQS does not work
Simple Workflow Service (SWF)
 orchestration service to coordinate work across distributed components
 helps define tasks, stores, assigns tasks to workers, define logic, tracks and monitors the task and
maintains workflow state in a durable fashion
 helps define tasks which can be executed on AWS cloud or on-premises
 helps coordinating tasks across the application which involves managing intertask dependencies,
scheduling, and concurrency in accordance with the logical flow of the application
 supports built-in retries, timeouts and logging
 supports manual tasks
 Characteristics
 deliver exactly once
 uses long polling, which reduces number of polls without results
 Visibility of task state via API
 Timers, signals, markers, child workflows
 supports versioning

 keeps workflow history for a user-specified time
 SWF vs SQS
 task-oriented vs message-oriented
 tracks all tasks and events vs needs custom handling
Simple Email Service (SES)
 highly scalable and cost-effective email service
 uses content filtering technologies to scan outgoing emails to check standards and email content for
spam and malware
 supports full fledged emails to be sent as compared to SNS where only the message is sent in Email
 ideal for sending bulk emails at scale

AWS: Amazon Web Services (AWS) is a collection of various cloud computing services and applications that offers
flexible, reliable, easy to use and cost-effective solutions
Cloud Computing: It is an internet-based computing service in which various remote servers are networked to
allow centralized data storage and online access to computer services and resources
Types of cloud:

There are three types of clouds

 Public cloud: The resources and services provided by third-party service providers are available to
customers via the internet
 Private cloud: The resources and services are managed by the organization, or by a third party
only for the customer's organization
 Hybrid cloud: It is a combination of both Public and Private Cloud. The decision to run the services on
public or private depends on the parameters like sensitivity of the data and applications, industry certifications and
required standards etc

Types of EC2 computing instances:

 General Instances: It is used for applications that require a balance of performance and cost
 Compute Instances: It is used for applications that require a lot of processing from the CPU
 Memory Instances: It is used for applications that need a lot of RAM
 Storage instances: It is used for applications with a data set that occupies a lot of space
 GPU instances: It is used for applications requiring heavy graphics rendering
Basic CLI commands:
 cat /proc/mounts: Displays a list of mounted drives.
 rm <filename>: Removes the specified file from the current directory.
 rpm -ql '<package name>': Obtains a list of utilities contained within a package.
 sudo chmod <options>: Changes the access mode for the current directory.
 sudo mkdir <directory name>: Creates a new directory to hold files.
 sudo reboot: Reboots the remote AWS system so that you can see the results of any changes you make.
 sudo rmdir <directory name>: Removes the specified directory.
 sudo yum groupinstall "<group package name>": Installs the specified group of packages.
 sudo yum search '<package name>': Searches for a package.

 sudo yum update: Performs required AWS updates.
 sudo yum -y install <service or feature>: Installs a required support service or feature onto the AWS system.
CloudFormation
 gives developers and systems administrators an easy way to create and manage a collection of related
AWS resources
 Resources can be updated, deleted and modified in an orderly, controlled and predictable fashion, in
effect applying version control to the AWS infrastructure, as is done for software code
 CloudFormation Template is an architectural diagram, in JSON format, and Stack is the end result of that
diagram, which is actually provisioned
 template can be used to set up the resources consistently and repeatedly over and over across multiple
regions and consists of
 List of AWS resources and their configuration values
 An optional template file format version number
 An optional list of template parameters (input values supplied at stack creation time)
 An optional list of output values like public IP address using the Fn::GetAtt function
 An optional list of data tables used to look up static configuration values, e.g. AMI names per region
 supports Chef & Puppet integration to deploy and configure right down to the application layer
 supports bootstrap scripts to install packages, files and services on the EC2 instances by simply describing
them in the CF template
 automatic rollback on error feature is enabled, by default, which will cause all the AWS resources that CF
created successfully for a stack up to the point where an error occurred to be deleted
 provides a WaitCondition resource to block the creation of other resources until a completion signal is
received from an external source
 allows DeletionPolicy attribute to be defined for resources in the template
 retain to preserve resources like S3 even after stack deletion
 snapshot to backup resources like RDS after stack deletion
 DependsOn attribute to specify that the creation of a specific resource follows another
 Service role is an IAM role that allows AWS CloudFormation to make calls to resources in a stack on the
user’s behalf
 support Nested stacks that can separate out reusable, common components and create dedicated
templates to mix and match different templates but use nested stacks to create a single, unified stack
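A minimal template illustrating the parts listed above might look like the following sketch (the logical names and parameter values are placeholders):

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Parameters": {
    "AmiId": { "Type": "String" },
    "InstanceType": { "Type": "String", "Default": "t2.micro" }
  },
  "Resources": {
    "WebServer": {
      "Type": "AWS::EC2::Instance",
      "DeletionPolicy": "Retain",
      "Properties": {
        "ImageId": { "Ref": "AmiId" },
        "InstanceType": { "Ref": "InstanceType" }
      }
    }
  },
  "Outputs": {
    "PublicIp": { "Value": { "Fn::GetAtt": ["WebServer", "PublicIp"] } }
  }
}
```

The Parameters block supplies input values at stack creation, DeletionPolicy: Retain preserves the instance after stack deletion, and the Output uses Fn::GetAtt to expose the public IP, matching the template elements described above.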
Elastic BeanStalk
 makes it easier for developers to quickly deploy and manage applications in the AWS cloud.
 automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling and
application health monitoring
 CloudFormation supports ElasticBeanstalk
 provisions resources to support
 a web application that handles HTTP(S) requests or
 a web application that handles background-processing (worker) tasks
 supports Out Of the Box
 Apache Tomcat for Java applications
 Apache HTTP Server for PHP applications
 Apache HTTP server for Python applications
 Nginx or Apache HTTP Server for Node.js applications
 Passenger for Ruby applications
 Microsoft IIS 7.5 for .Net applications
 Single and Multi Container Docker
 supports custom AMI to be used
 is designed to support multiple running environments such as one for Dev, QA, Pre-Prod and Production.

 supports versioning and stores and tracks application versions over time allowing easy rollback to prior versions
 can provision an RDS DB instance, with connectivity information exposed to the application via environment
variables, but this is NOT recommended for production as the RDS instance is tied to the Elastic Beanstalk
lifecycle and would be deleted along with the environment
OpsWorks
 is a configuration management service that helps configure and operate applications in a cloud
enterprise by using Chef
 helps deploy and monitor applications in stacks with multiple layers
 supports preconfigured layers for Applications, Databases, Load Balancers, Caching
 OpsWorks Stacks features is a set of lifecycle events – Setup, Configure, Deploy, Undeploy, and
Shutdown – which automatically runs specified set of recipes at the appropriate time on each instance
 Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying apps,
running scripts, and so on
 OpsWorks Stacks runs the recipes for each layer, even if the instance belongs to multiple layers
 supports Auto Healing and Auto Scaling to monitor instance health, and provision new instances
CloudWatch
 allows monitoring of AWS resources and applications in real time, collecting and tracking preconfigured or
custom metrics and configuring alarms to send notifications or make resource changes based on defined rules
 does not aggregate data across regions
 stores the log data indefinitely, and the retention can be changed for each log group at any time
 alarm history is stored for only 14 days
 can be used as an alternative to S3 to store logs, with the ability to configure alarms and generate metrics;
however, logs cannot be made public
 Alarms exist only in the created region and the Alarm actions must reside in the same region as well
CloudTrail
 records access to API calls for the AWS account made from the AWS management console, SDKs, CLI and
higher level AWS services
 supports many AWS services and tracks who did what, from where and when
 can be enabled per-region basis, a region can include global services (like IAM, STS etc), is applicable to all
the supported services within that region
 log files from different regions can be sent to the same S3 bucket
 can be integrated with SNS to notify logs availability, CloudWatch logs log group for notifications when
specific API events occur
 call history enables security analysis, resource change tracking, troubleshooting and compliance

Here we have the list of topics if you want to jump right into a specific one:
 Cloud Computing
 Need for AWS Lambda
 What is AWS Lambda?
 How does it work?
 Building Blocks
 Using AWS Lambda with S3
 Why not other compute services but Lambda?
 AWS Lambda VS AWS EC2
 AWS Lambda VS AWS Elastic Beanstalk

 Benefits of AWS Lambda
 Limitations of AWS Lambda
 AWS Lambda Pricing
 Use Cases of Lambda
 Hands-On: How to create a Lambda function using Lambda console
 Conclusion
Cloud Computing

Before proceeding to AWS Lambda, let's understand its domain, cloud computing, from which AWS originated.
Cloud computing is simply the practice of using a network of remote servers hosted on the Internet to store,
manage, and process data, rather than a local server or a personal computer.

But why are we talking about AWS when there are numerous cloud computing vendors? Here are some of the
major players in the marketplace when it comes to cloud computing.

If we look at the stats, AWS is currently the leader in providing cloud services, as is evident from the Google Trends
graph below:

If we talk of services, AWS Compute services play a major role when you start with AWS, as they provide
secure, resizable computing capacity in the cloud.

In AWS Compute services, there are multiple reliable services:

 AWS Elastic Compute Cloud (EC2)
 AWS Elastic Beanstalk
 AWS Lambda

We are going to talk about AWS Lambda today, which is a very reliable serverless compute service.

But why do we need AWS Lambda, when we already have 2 other reliable computing services?

Don’t worry, we will answer all your questions regarding AWS Lambda today in this blog.
Need for AWS Lambda

As you all know how the cloud works, let's take an example to understand the need for AWS Lambda. Consider a
website hosted on AWS EC2, where 90-100 users are currently reading a blog, and in the back-end an admin
uploads 10 videos to the website for processing.

This increases the load on the server, which triggers the auto scaling feature, so EC2 provisions more
instances for this task, as hosting and back-end processing both take place on the same instances. Auto scaling
takes a long time to provision more instances, which slows the website down when the initial spike
of work is received.

This problem was partly solved using distributed computing: the website is hosted on one instance and the back-
end code runs on another.

While users read a blog on the website, the admin uploads the 10 videos, and the website forwards the
video-processing task to the other instance. This insulates the website from the processing load, so its
performance is not impacted. But video processing still took a lot of time when the load increased,
because auto scaling on EC2 is slow.

We needed a stateless system to solve this problem, and AWS did exactly this with the launch of AWS Lambda!

How? We shall discuss as we move along in this blog.

So, let’s understand what exactly is AWS Lambda?

What is AWS Lambda?

It is one of the computing services provided by AWS, which is event-driven and serverless.
AWS Lambda is a Stateless Serverless system which helps us run our background tasks in the most efficient
manner possible.

Serverless doesn't mean that servers are nowhere in play; rather, you don't have to worry about the
provisioning or management of your servers or instances. It helps you focus on your main goal, i.e., coding:
just put your code in AWS Lambda and you're good to go. Whatever resources your code requires in
response to your events, Lambda automatically provides. The best feature of AWS Lambda is that you
pay only for the requests made and the compute time used.

Now let’s understand how it works?

How does it work?


Before we proceed and understand how AWS Lambda works, we first need to understand a few
aspects of a Lambda-based application.
Building Blocks

 Lambda Function: the custom code and libraries that you've created, packaged as a function
 Event Source: any AWS or custom service that triggers your function and helps execute its logic
 Log Streams: Lambda monitors your function automatically and you can view its metrics on CloudWatch
directly, but you can also add custom logging statements to your function to analyze the flow of execution
and performance and check if it's working properly
Using AWS Lambda with S3

In this section, we will see how AWS S3 can be used with AWS Lambda. Let's take an example where a user
uploads an image to a website for resizing.
 The user creates a Lambda function.
 User uploads the code to the Lambda function.
 Then uploads the image from the Website in the S3 bucket as an object.
 After receiving the object, our S3 bucket triggers the Lambda Function.
 Then the Lambda function does its job by resizing the image in the back-end and sends a successful-
completion notification through SQS.
Pseudo Code for Lambda function:
<code to resize image>
<once the image is resized, send the success notification through SQS>
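A minimal Python handler corresponding to the pseudo code might look like this (the event layout is the standard S3 notification payload; resizing and the notification call are left as placeholders):

```python
def lambda_handler(event, context):
    """Sketch of the S3-triggered function described above: pull the
    bucket and key out of the S3 notification event, then do the work."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    # <resize the image fetched from bucket/key here>
    # <send the success notification here>
    return {"status": "resized", "bucket": bucket, "key": key}
```

Lambda invokes this handler once per notification; the return value is only illustrative here, so the flow can be traced without any AWS resources.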

So, from this example, you should be able to figure out how AWS Lambda performs its tasks in the back-end.
Check the diagram below for a summary:

Let’s understand the reason behind it.
Why not other computing services but Lambda?

AWS Lambda is one of the computing services provided by Amazon, so if other computing services
like AWS EC2 and AWS Elastic Beanstalk can also execute a task, why should we choose Lambda
in their place? Let's try to understand this with two comparisons:
 AWS Lambda VS AWS EC2
 AWS Lambda VS AWS Elastic Beanstalk

As we know, in AWS EC2 one can host a website as well as run and execute back-end code.

AWS Lambda VS AWS EC2

 AWS Lambda is a Platform as a Service (PaaS) providing a managed platform to run and execute your
back-end code vs AWS EC2 is an Infrastructure as a Service (IaaS) providing virtualized computing resources
 no flexibility to log in to compute instances or choose a customized operating system or language
runtime vs offers the flexibility to choose from a variety of instances, custom operating systems,
and network & security patches
 just choose your environment and push your code into AWS Lambda vs in EC2 you first have to choose
the OS, install all the required software and then push your code
 environment restrictions, as it is restricted to a few languages only vs no environment restrictions

AWS Lambda VS AWS Elastic Beanstalk

 Elastic Beanstalk lets you deploy and manage applications on the AWS Cloud without worrying about the
infrastructure that runs those applications, whereas AWS Lambda is used only for running and executing
your back-end code; it can't be used to deploy an application.
 Elastic Beanstalk gives you the freedom to select AWS resources, such as an EC2 instance type that is
optimal for your application, whereas in Lambda you cannot select the resources; Lambda provides
resources based on your workload.
 Elastic Beanstalk is a stateful system, whereas Lambda is a stateless system.

Now that we understand how AWS Lambda plays its part, let's take a quick look at its pros and cons.
Benefits of AWS Lambda

 Due to its serverless architecture, one need not worry about provisioning or managing servers.
 No need to set up any Virtual Machine (VM).


 Allows developers to run and execute code in response to events without having to build and maintain any infrastructure.
 Pay as you go: Just pay for the compute time taken, only when your code runs.
 Pay only for the memory used, the number of processed code requests, and the code execution time,
rounded up to the nearest 100 milliseconds.
 You can easily monitor your code performance in real time through CloudWatch.
Limitations of AWS Lambda

There are several limitations of AWS Lambda due to its hardware as well as its architecture:
 The maximum execution duration per request is 900 seconds (15 minutes).
 In the case of hardware, the maximum disk space provided is 512 MB for the runtime environment, which
is very little.
 Its memory allocation ranges from 128 MB to 1536 MB.
 The event request body for an asynchronous invocation cannot exceed 128 KB.
 Lambda functions write their logs only to CloudWatch, which is the only tool available to monitor
or troubleshoot your functions.

So, these are the limitations of AWS Lambda, which exist basically to ensure that the service is used as intended.

Now let’s come to its pricing part!

AWS Lambda Pricing


Just like every service provided by AWS, AWS Lambda is also a pay per use service.
 It is based on the number of requests made
 Pay only for the number of requests you made on all your lambda functions.
 Prices are as follows:
 The first 1 million requests every month are free.
 $0.20 per million requests thereafter.
 Also based on the duration
 These prices depend on the amount of memory you allocate to your function.
 First 400,000 GB-seconds per month, up to 3.2M seconds of computing time, are free.
*Source: AWS Official Website
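The listed prices can be turned into a worked estimate. The per-GB-second duration price used below ($0.0000166667) is an assumption taken from the public price list at the time of writing, so treat the sketch as illustrative only:

```javascript
// Rough monthly Lambda cost estimate from the published free tiers.
// The duration price per GB-second is an assumed figure; the request
// price and free tiers match the numbers quoted above.
function estimateMonthlyCost(requests, avgDurationMs, memoryMb) {
  const REQUEST_PRICE_PER_MILLION = 0.20;
  const FREE_REQUESTS = 1000000;
  const FREE_GB_SECONDS = 400000;
  const PRICE_PER_GB_SECOND = 0.0000166667;

  // Request charge: first million free, $0.20 per million after that.
  const billableRequests = Math.max(0, requests - FREE_REQUESTS);
  const requestCost = (billableRequests / 1000000) * REQUEST_PRICE_PER_MILLION;

  // Duration charge, billed in GB-seconds: execution seconds x GB allocated.
  // (Real billing rounds each invocation up to the nearest 100 ms.)
  const gbSeconds = requests * (avgDurationMs / 1000) * (memoryMb / 1024);
  const billableGbSeconds = Math.max(0, gbSeconds - FREE_GB_SECONDS);
  const durationCost = billableGbSeconds * PRICE_PER_GB_SECOND;

  return requestCost + durationCost;
}

// Example: 3M requests at 200 ms with 512 MB allocated is 300,000 GB-seconds,
// inside the free compute tier, so only the 2M extra requests are billed.
```

For that example the estimate comes to roughly $0.40 for the month, all of it from the request charge.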

Now that we've discussed its pricing, let's move forward and look at its most common use cases.
Use Cases of AWS Lambda


Serverless Websites

Building a serverless website allows you to focus on your website code; you don't have to manage and operate its
infrastructure. Sounds cool, doesn't it? This is the most common and interesting use case of AWS Lambda, and one
where people really take advantage of its pricing model. Hosting your static website on S3 and then serving it with
the help of AWS Lambda makes it easier for users to keep track of the resources being used, to judge whether their
code is feasible, and to troubleshoot and fix problems in no time.
Automated Backups of everyday tasks

One can easily schedule Lambda events and create back-ups in an AWS account: create the back-ups, check
whether there are any idle resources, generate reports, and carry out other tasks with Lambda in no time.
Filter and Transform data

One can easily use Lambda for transferring data between Lambda and other Amazon services such as S3, Kinesis,
Redshift, and the database services, along with filtering the data. One can easily transform and load data between
Lambda and these services. Looking at industrial use cases, a very apt implementation of Lambda can be found in a
company named Localytics.
Use case in Localytics

Localytics is a Boston-based, web and mobile app analytics and engagement company. Its marketing and analytics
tools are being extensively used by some major brands such as ESPN, eBay, Fox, Salesforce, RueLaLa and the New
York Times to understand and evaluate the performance of the apps and to engage with the existing as well as the


new customers. Their software is employed in more than 37000 apps on more than three billion devices all around
the globe.

Regardless of how popular Localytics is now, it faced some serious challenges before it started using Lambda.

Let's look at those challenges before we discuss how Lambda came to the rescue and helped Localytics overcome
them.
 Billions of data points, uploaded every day from different mobile applications running Localytics
analytics software, are fed into the pipeline that they support.
 Additional capacity planning, utilization monitoring, and infrastructure management were required since
the engineering team had to access subsets of data in order to create new services.
 Platform team was more inclined towards enabling self-service for engineering teams.
 Every time a microservice was added, the main analytics processing service for Localytics had to be
updated, since each new service was bundled into it.
The big solution to all these challenges is “Lambda”.

The Solution
 Localytics now uses AWS to send about 100 billion data points monthly through Elastic Load Balancing
where ELB helps in distributing incoming application traffic across multiple targets
 Afterward, it goes to Amazon Simple Queue Service where it enables you to decouple and scale
microservices, distributed systems, and serverless applications.


 Then to Amazon Elastic Compute Cloud, and finally into an Amazon Kinesis stream, which makes it easy to
collect, process, and analyze real-time streaming data so you can get timely insights and react quickly to new
information.
 With the help of AWS Lambda, a new Microservice is created for each new feature of marketing software
to access the Amazon Kinesis. Microservices can access the data in parallel with others.

The diagram below illustrates the rest of the flow:

With all the benefits it provides, Lambda has contributed to the popularity of Localytics.
 Lambda rules out the need to provision and manage infrastructure in order to run each Microservice.
 Processing tens of billions of data points isn't as big of a hassle as it was before, as Lambda automatically
scales up and down with the load.
 Lambda enables the creation of new microservices that access the data stream while decoupling product
engineering efforts from the platform analytics pipeline, eliminating the need for each microservice to be
bundled with the main analytics service.
After addressing AWS Lambda, its function along with its workflow and use cases, now let’s end our tutorial with
running our first lambda function.

In this hands-on, we will take you through a step-wise guide on how to create a Lambda function using the Lambda console.

Let’s start by creating a Lambda Function.

 After setting up your AWS account.
 Go and type AWS Lambda in your AWS Management console.
 Click on Create a Function.


 You'll see a setup page show up where you have to fill in a few aspects of your function, such as the name,
runtime, and role; you can also choose from blueprints, but here we're going to author it from scratch.

 Enter the name and all the credentials. For the runtime, you can choose any language based on your
understanding; we're choosing Node.js 8.10, but you can pick from options like Python, Java, .NET, or Go
(these are the languages it supports).


 Then create a role; you'll have to create a new role if you don't have one. Either create a new
template for the role or leave the template blank.
 In our case, we've chosen an existing role that we created earlier.

 As here we have already defined our role with the name of service-role/shubh-intel.

 The next step after this is Writing Code for your Lambda Function.
 We're choosing the Lambda console here; you can choose from different code editors, such as the Cloud9
editor or one on your local machine.
 You can check your function being created, as here we have created it with the name of example-lambda.
 AWS IoT acts as the mediator between the components and the network: it gathers information from
those things and acts on it. AWS IoT is a platform that enables you to connect devices to AWS services and
to other devices, secure data and interactions, process and act upon device data, and allow applications to
interact with devices even when they are offline.
 AWS IoT has the following gears:
 Message broker: The publish/subscribe channel through which things and applications exchange
messages, using MQTT or an HTTP REST interface.
 Rules engine: Correlates IoT messages with other AWS services; SQL-based rules select, process, and
forward information to services such as Amazon S3, DynamoDB, or Lambda.
 Thing shadow: Also called a device shadow; it holds the present state of any component connected to
AWS IoT as a JSON document, so applications can read it even when the device is offline.
 Security and identity service: Takes care of protection on a shared-responsibility basis; messages to and
from things are transferred using confidential credentials.
 Device gateway: Allows the gadgets to interact with AWS IoT in a safe and sound environment.
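To make the thing-shadow idea concrete, a shadow is just a JSON document recording desired versus reported state. The field layout follows the AWS IoT shadow document format, but the device and values here are made up:

```json
{
  "state": {
    "desired":  { "lampOn": true },
    "reported": { "lampOn": false }
  },
  "metadata": {
    "desired": { "lampOn": { "timestamp": 1520000000 } }
  },
  "version": 7
}
```

An application sets the desired state; the device reports its actual state when it reconnects, and AWS IoT computes the difference so the device knows what to change.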

 When you have created the function, you will be directed to a Function Code screen where you define
your function. You can either use the code below or write your own template; it's quite easy.


exports.handler = async (event) => {
    // TODO implement
    return 'Hello from Lambda!';
};

 If you want to define key values, you can; here we've defined key1 and key2, with key1 = 'Hello
from Lambda!'.
 Then create a test event, like the one we created with the name mytestevent, and click Save and Test to run
your function.

 After running it, you will get an output where you can check the details, as below:

Congratulations! Now you’ve created and executed your Lambda Function successfully.
AWS Analytics Overview
The services provided by the analytics tools of AWS are as follows:
Amazon Elastic MapReduce organizes and processes data with the assistance of the Hadoop framework: it makes
it very simple to set up and manage a Hadoop cluster, controls the compute resources, and carries out the
MapReduce processing. Streaming of data at larger scale is handled by Amazon Kinesis; we can transfer the
information from Kinesis to any store such as Redshift or an EMR cluster. AWS Data Pipeline manages the transfer
of information and the actions performed on it; the pipeline definition specifies parameters such as the input
data, the conditions for the transfer, and the location the data needs to be carried to.
Some of the analytics tools are:
 Amazon EMR: Uses Hadoop, an open-source framework, for managing and processing data. It uses the
MapReduce engine to distribute processing across a cluster.
 AWS Data Pipeline: Helps in regularly moving and processing data. Using a pipeline, we define the input
data source, the computing resources to perform the processing, any conditions that must be validated
before any processing is performed, and the output data location, such as Amazon S3 or Amazon
DynamoDB.
 Amazon Kinesis: Allows real-time processing of streaming data at a humongous scale. One can also send
data from Amazon Kinesis to a data warehouse such as Amazon S3 or Amazon Redshift, or to an Amazon
EMR cluster.
 Amazon ML: Using Amazon Machine Learning, developers can easily apply machine learning technology
to obtain predictions for their applications through simple APIs.
Amazon Cognito
Amazon Cognito provides ways to identify unique users, obtain temporary, limited-privilege credentials, and
manage data synchronization operations.

To initiate with Cognito, the steps are:

1. Register in AWS.
2. Get your application's token.
3. Create an identity pool in Cognito.
4. Integrate the SDK, then store and synchronize the data.

The maximum size of the dataset is 1 MB and that of the identity is 20 MB.

Creating a dataset and putting keys is done with the following commands:

DataSet *dataset = [syncClient openOrCreateDataSet:@"myDataSet"];
NSString *value = [dataset readStringForKey:@"myKey"];
[dataset putString:@"my value" forKey:@"myKey"];

As the number of applications in the cloud increases, the cost of Cognito also increases. We are able to use 10 GB
of storage for the first year.
The mobile SDK allows easy building of AWS applications. A few of its characteristics are as follows:
Object Mapper

It helps in accessing DynamoDB from applications. It assists us in programmatically converting objects into
table items and vice versa. The object mapper supports read, write, and delete operations on the items, along
with query and scan.
S3 Transfer Manager

It helps with uploading and downloading documents to and from S3 while increasing performance and
reliability. File-transfer operations can now be further customized. Using BFTask, this tool has been reworked
into a better and cleaner interface.
iOS / Objective-C Enhancements

The Mobile SDK supports ARC and BFTask for better utilization of Objective-C, and supports CocoaPods.

AWS Development Tools Overview

Developer tools make building applications on AWS easier. Tools used by developers are as follows:
 AWS Management Console: Manages the quickly growing Amazon infrastructure. It controls your
compute, storage, and other cloud-based activities through a very simple graphical interface.
 AWS Toolkit for Eclipse: A tool for using Java with AWS. It helps in developing, debugging, and deploying
Java applications with AWS. Various services can be reached from Java by making use of the explorer. It also
includes the most up-to-date edition of the Java SDK.
 AWS Toolkit for Microsoft Visual Studio: This tool makes .NET applications easy to use with AWS.
Various services can be reached from the Visual Studio IDE by making use of the explorer. It also includes
the most up-to-date edition of the .NET SDK, along with support for cloud-related services.
 Some of the tools and their descriptions are as follows:
 AWS SDKs: Make it easy to work with AWS APIs in your preferred programming language and platform.
 AWS Command Line Tools: AWS also offers the AWS Command Line Interface (CLI), a single tool for
controlling and managing multiple AWS services.
 IDE Toolkits: The AWS Toolkits bring specialized cloud tools into your development environment.

 To download the AWS SDKs, the AWS CLI, or the PowerShell tools, go to Tools for Amazon Web Services.

AWSCSA-3: 10,000 Foot Overview

Each of the AWS Components

Tier 1: AWS Global Infrastructure
Tier 2: Networking
Tier 3: Compute, Storage, Databases
Tier 4: Analytics, Security and Identity, Management Tools
Tier 5: App Services, Dev Tools, Mobile Services
Tier 6: Enterprise Applications, Internet of Things

There are a number of different regions spread across the world for AWS.

What's a region? Geographical area.

Each region has at least two Availability Zones. These are distinct data centers.

What are edge locations? CDN locations for CloudFront, AWS's CDN service. There are currently over 50 edge locations.


VPC: Virtual Private Cloud - A virtual data center. You can have multiple VPCs. Basically a data center in your AWS
account. Isolated set of resources.

Direct Connect - A way of connecting into AWS without an internet connection.

Route53 - Amazon's DNS service. Named 53 after the port that the DNS service sits on.


EC2: Elastic Compute Cloud - A virtual server you can provision fast.

EC2 Container Service - Sometimes ECS. A highly scalable service for Docker.


Elastic Beanstalk - Easy-to-use service for deploying and scaling web services developed with Java, .NET, Node,
PHP, JS, Go, Docker and more. It is designed for developers to upload code and have AWS provision the services.
It is essentially AWS for beginners.

Lambda - By far the most powerful service. Lets you run code without provisioning or managing servers. You pay
only for the compute time; you pay for execution time.


S3 - Object Based Storage as a place to store your flat files in the cloud. It is secure, scalable and durable. You pay
for the storage amount.

CloudFront - AWS CDN service. It integrates with other applications. It is an edge-location service for caching files.

Glacier - Low cost storage for long term storage and back up. Up to 4 hours to access it. Think of it as an archiving service.

EFS: Elastic File System - used with EC2. NAS for the cloud. Connects to multiple EC2 instances. File-level storage.
It is still in preview, and not currently in exams.

Snowball - Import/Export service. It allows you to send in your own hard disks, and AWS will load the data onto
the platform using their own internal network. Amazon gives you the device and you pay for it daily.

Storage Gateway - The service connecting on-premises storage to AWS. Essentially a little VM you run in your
office or data center that replicates data to AWS.


RDS: Relational Database Services - Plenty of well known platforms.

DynamoDB - NoSQL Database Storage. Comes up a lot in the dev exam.

Elasticache - A way of caching data in the cloud, taking load off your databases.

Redshift - Amazon's data warehousing service. Extremely good performance.

DMS: Database Migration Services - Essentially a way of migrating DB into AWS. You can even convert DBs.


EMR: Elastic Map Reduce - This can come up in the exam. It's a way of processing big data.

Data Pipeline - Moving data from one service to another. Required for pro.

Elasticsearch Service: A managed service that makes it easy to deploy, operate and scale Elasticsearch in the AWS
cloud. A popular search and analytics option.

Kinesis - Streaming data on AWS. This is a way of collecting, storing and processing large flows of data.

Machine Learning - Service that makes it easy for devs to use machine learning. Amazon uses it for things like
suggesting products you might be interested in.

Quick Sight - A new service. It's a business intelligence service. Fast cloud-powered service.


Security and Identity

IAM: Identity Access Management - Where you can control your users, groups, roles etc. - multifactor auth etc etc

Directory Service - required to know

Inspector - allows you to install agents onto your EC2 instances. It searches for weaknesses.

WAF: Web App Firewall service - Recent product.

Cloud HSM (Hardware Security Module) - Comes up in the professional exam.

KMS: Key Management Service - Comes up lightly.

Management Tools

Cloud Watch - Different metrics you can create

Cloud Formation - Deploying a Wordpress Site. Does an amazing amount of autonomous work for you.

Cloud Trail - A way of providing audit access to what people are doing on the platform. Eg. changes to EC2
instances etc.

OpsWorks - Configuration Management service using Chef. We will create our own OpsWork Stack.

Config - Relatively new service. Fully managed service with an AWS resource history and change notifications for
security and governance. It automatically checks the configuration of services, e.g. ensuring all volumes attached
to EC2 instances are encrypted.

Service Catalog - Manage service catalogs approved by AWS.

Trusted Advisor - Does come up in the exam. Automated service that scans the environment and gives way you can
be more secure and save money.

Application Services

API Gateway - A way to create, publish, and monitor APIs.

AppStream - AWS version of XenApp. Stream Windows apps from the cloud.

CloudSearch - Managed service on the cloud that makes it manageable for a scale solution and supports multiple
languages etc.

Elastic Transcoder - A media transcoding service in the cloud. A way to convert media into a format that will play
on varying devices.

SES: Simple Email Service - Transactional emails, marketing messages etc. Can also be used to receive emails,
which can be integrated with other services.

SQS - Decoupling the infrastructure. First service ever launched.

SWF: Simple Workflow Service - Think of when you place an order on Amazon: they use SWF so that people in the
warehouse can start the process of collecting and sending packages.


Developer Tools

CodeCommit - Host secure private Git repositories.
CodeDeploy - A service that deploys code to any instance.
CodePipeline - Continuous delivery service for fast updates, based on the release model you define.

Mobile Services

Mobile Hub - Building, testing and running use of phone apps.

Cognito - Save User preference data.

Device Farm - Improve the quality of apps tested against real phones.

Mobile analytics - Manage app usage etc.

SNS: Simple Notification Service - Big topic in the exam. Sending notifications from the cloud. You use it all the time
in production.

Enterprise Apps

Workspaces - Virtual Desktop in the Cloud

WorkDocs - Fully managed enterprise equivalent to Dropbox etc. (safe and secure)

WorkMail - Amazon's answer to Exchange. Their email service.

Internet of Things

Internet of things - A new service that may become the most important.

AWSCSA-4: Identity Access Management (IAM)

IAM 101

It's the best place to start with AWS.

It allows you to manage users and their level of access to the AWS Console. It is important to understand IAM and
how it works, both for the exam and for administering a company's AWS account in real life.

What does IAM give you?

 Centralised control of your AWS account

 Shared Access to your AWS account
 Granular permissions
 Identity Federation (including FB, LinkedIn etc)
 Multifactor Auth
 Provide temporary access for users/devices and services where necessary
 Allows you to set up your own password rotation policy
 Integrates with many AWS services
 Supports PCI DSS Compliance


Critical Terms

1. User - End Users (people)

2. Group - A collection of users under one set of permissions
3. Roles - You create roles and can then assign them to AWS resources
4. Policies - documents that define permissions. Attach these to the above.
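As a concrete illustration, a policy is just a JSON document. This hedged sample (the bucket name is hypothetical) grants read-only access to a single S3 bucket and could be attached to a user, group, or role:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-bucket",
        "arn:aws:s3:::example-bucket/*"
      ]
    }
  ]
}
```

Anything not explicitly allowed is denied by default, which is why new users start with no permissions at all.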

AWSCSA-5: IAM - Identity Access Management Crash Course

Log into IAM.

You'll find the IAM users sign-in link near the top.

 You can customize this sign-in link with a friendly name instead of the account number

Go through the Security Status and tick off all the boxes!

Activate MFA on your root account

 you can add multifactor auth to secure your root account.

 select the appropriate device

Create individual IAM users

 Currently we'll be logged in as the root account

 Manage and create the user
 The keys given are for the command line or API interaction. Download and store.
 Add a password.
 By default, users have no permissions.
 You can use policies to give permissions. Policies in JSON. Attach them to a user.
 Instead, you can create a group with these policies. Afterwards, you can attach users to the group.

Apply an IAM password policy

Manage the password policy.

Configuring a Role

It'll make sense when you start using EC2. It's about having resources access other resources in AWS.

Create a role.

Different Types of Roles

We'll choose Amazon EC2 for our role. Select S3 full access as your policy for now.

AWSCSA-6: S3 Crash Course


S3 provides developer and IT teams with secure, durable, highly-scalable object storage.

It's easy to use, has a simple web services interface to store and retrieve any amount of data from anywhere on
the web. Think of a hard drive on the web.

 Data is stored across multiple devices and facilities. It's built for failure.
 Not a place for databases etc. you need block storage for that.
 Files from 0 bytes to 5 TB. You can store petabytes of data if you want to.
 Files are stored in buckets - like the directories.
 When you create a bucket, you are reserving that name.
 Read after Write consistency for PUTS of new objects
 Eventual Consistency for overwrite PUTS and DELETES (can take some time to propagate)

S3 is a key, value store

 You have your "key" - the name of the object

 You have your value - simply the data and is made up of a sequence of bytes
 You have the version ID (important for versioning)
 Metadata (data about the data)
 Subresources
 Access Control Lists
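The key/value-plus-version model above can be made concrete with a tiny in-memory analogue. This is purely illustrative: it is not an S3 client, and it ignores metadata, subresources, and ACLs beyond a placeholder field:

```javascript
// Tiny in-memory analogue of S3's key -> (value, versionId, metadata) model.
// Each put appends a new version, mirroring how a versioned bucket behaves.
class MiniBucket {
  constructor() {
    this.objects = new Map(); // key -> array of versions, newest last
    this.nextVersion = 1;
  }

  // Store a value under a key; returns the new version ID.
  put(key, value, metadata = {}) {
    const versionId = String(this.nextVersion++);
    const versions = this.objects.get(key) || [];
    versions.push({ value, versionId, metadata });
    this.objects.set(key, versions);
    return versionId;
  }

  // Retrieve the latest version of a key, or null if it doesn't exist.
  get(key) {
    const versions = this.objects.get(key);
    return versions ? versions[versions.length - 1] : null;
  }
}
```

Overwriting a key keeps the earlier versions around, which is the behavior the versioning section below relies on when restoring deleted files.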

The basics

 It guarantees 99.99% availability.

 11 9's durability for S3 info.
 Tiered Storage available
 Lifecycle Management
 Versioning
 Encryption
 Secure through access control lists and bucket policies

S3 Storage Tiers/Classes

1. S3 - Normal
2. S3 IA (Infrequently Accessed) - retrieval fee
3. RRS - Reduced Redundancy Storage. Great for objects that can be lost.
4. Glacier - Very cheap, but 3-5 hours to restore from Glacier! As low as $0.01 per gigabyte per month

S3 is charged for the amount of storage, the number of requests and data transfer pricing.

Not a place to upload an OS or a database.


AWSCSA-6.1: Create an S3 Bucket

From the Console, select S3

Again, think of the bucket as a "folder".

If it is the first time accessing it, you will be greeted with a screen that simply allows you to create a bucket.

Top right hand side gives you options for what the side bar view is.

Under properties, you can change things such as Permissions, Static Website Hosting etc.

 For static websites, you don't have to worry about load balancers etc.
o You can't run server-side scripts on these websites, e.g. PHP.
 Logging can be used to keep a log
 Events are about triggering something on a given action eg. notifications etc.
 You can enable versioning
 Cross-Region Replication can be done for other regions

If you click on a bucket, you can see details of what is there. By default, permissions are set that access is denied.

You can set the Storage class and Server-Side Encryption from here too.

Allowing public read of the bucket

"Version": "2008-09-17",
"Statement": [
"Sid": "AllowPublicRead",
"Effect": "Allow",
"Principal": {
"AWS": "*"
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::dok-basics/*"


---- AWSCSA-6.2: S3 Version Control

If you turn versioning on for a bucket, you can only suspend it; you cannot turn it off. Click it and you can set
the versions.

To add files, you can go and select "Actions" and select upload.

If you show the Versions, it will give you the version ID.

If you delete the file, it will show the file and a delete marker. You can restore the file by selecting the
delete marker and deleting it via Actions.

Bear in mind, if you do versioning, you will have copies of the same file.

Cross-Region Replication

To enable this, go to your properties. You will need to Enable Versioning for this to be enabled.
In order for this to happen, you also need to Create/Select IAM Roles for policies etc.

Existing Objects will not be replicated, only uploads from then on.

Amazon handle all the secure transfers of the data for you.

Versioning integrates with Lifecycle rules. You can turn on MFA so that it requires an auth code to delete.

AWSCSA-7: CloudFront

It's important to understand the key fundamentals.

A CDN is a system of distributed servers (network) that deliver webpages and other web content to a user based
on the geographic locations of the user, the origin of the webpage and a content delivery server.

Key Terms

Edge Location: The location where content will be cached. This is separate from an AWS Region/Availability Zone.

Origin: This is the origin of all the files that the CDN will distribute. This can be a S3 bucket, EC2, Route53, Elastic
Load Balancer etc. that comes from the source region.

Distribution: This is the name given to the CDN which consists of a collection of Edge Locations.

TTL (Time to Live): TTL is the time that content remains cached at the edge location. This caching makes delivery
faster for other users.


Web Distribution: Typically used for Websites.

RTMP: Used for Media Streaming.


__*Amazon CloudFront can be used to deliver your entire website, including dynamic, static, streaming and
interactive content using a global network of edge locations. Requests for your content are automatically routed to
the nearest edge location, so content is delivered with the best possible performance.

CloudFront is optimized to work with other AWS. CloudFront also works seamlessly with any non-AWS origin
server, which stores the original, definitive versions of your files.*__

You can also PUT to an Edge Location, not just READ.

You can also remove cached objects, but it will cost you.

---- AWSCSA-7.1: Create a CloudFront CDN

Head to CloudFront from the Dashboard.

Create a distribution and create a web distribution.

Select the bucket domain name. The origin path is the folder path, used if you just want to serve content from
an individual folder.
You can have multiple distributions from the one bucket.

You can restrict the bucket access to come only from the CDN.

Follow the steps to allow things like Read Permissions and Cache Behaviour.

Distribution Settings

If you want to use CNAMEs, you can upload the certificate for SSL.

Default Root Object is if you access the naked URL.

Turn Logging "on" if you want.

Once the Distribution is Ready

After it is done, you can use the domain name provided to start accessing the cached version.

You can create multiple Origins for the CDN, and you can also updates behaviours etc. for accessing certain files
from certain types of buckets.

You can also create custom error pages, restrict content based on Geography etc.

AWSCSA-8: Securing Buckets

By default, all newly created buckets are PRIVATE.


Access Controls

 Bucket Policies
 Access Control Lists
 S3 buckets can be configured to create access logs which log all requests made to the S3 bucket


Encryption

2 types

 In Transit: SSL/TLS (using HTTPS)

 At Rest:
o Server-side
 S3 Managed Keys - SSE-S3
 AWS Key Management Service, Managed Keys - SSE-KMS
 Server-Side Encryption with Customer Provided Keys - SSE-C
o Client Side Encryption

AWSCSA-9: Storage Gateway

This is just a theoretical level intro.

AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to
provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage
infrastructure. The service enables you to securely store data to the AWS cloud for scalable and cost-effective
storage.
Essentially replicates data from your own data center to AWS. You install it as a virtual appliance (host) in your
data center.

You can use the Management Console to choose the right gateway type for you.

Three Types of Storage Gateways

1. Gateway Stored Volumes - To keep your entire data set on site. SG then backs this data up asynchronously
to Amazon S3. GS volumes provide durable and inexpensive off-site backups that you can recover locally
or on Amazon EC2.

2. Gateway Cached Volumes - Only your most frequently accessed data is stored locally. Your entire data set
is stored in S3. You don't have to go out and buy large SAN arrays for your office/data center, so you can
get significant cost savings. If you lose internet connectivity however, you will not be able to access all of
your data.
3. Gateway Virtual Tape Libraries (VTL) - A limitless collection of virtual tapes. Each virtual tape can be stored
in a Virtual Tape Library; if it is stored in Glacier, it is a Virtual Tape Shelf. If you use backup products like
NetBackup etc., you can replace your physical tape infrastructure with the VTL - it gets rid of your physical
tapes and creates virtual ones.



 Know and understand the different gateways.


AWSCSA-10: Import/Export

AWS Import/Export Disk accelerates moving large amounts of data into and out of the AWS cloud using portable
storage devices for transport. AWS Import/Export Disk transfers your data directly onto and off of storage devices
using Amazon's high-speed internal network and bypassing the Internet.

You essentially go out and buy a disk and then send it to Amazon who will then import all that data, then send your
disks back.

There is a table that shows how connection speed equates to a certain amount of data uploaded in a time frame to
give you an idea of what is worthwhile.
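As a back-of-the-envelope version of that comparison table (the line speeds, data size and the ~80% sustained utilization factor here are hypothetical assumptions, not AWS's published figures):

```shell
# Rough days-to-upload estimate: data size in TB, line speed in Mbps,
# assuming ~80% sustained network utilization (a made-up figure).
transfer_days() {
  awk -v tb="$1" -v mbps="$2" \
    'BEGIN { printf "%.1f", (tb * 8e12) / (mbps * 1e6 * 0.8) / 86400 }'
}

transfer_days 1 10     # ~11.6 days for 1 TB over a 10 Mbps link
echo
transfer_days 1 1000   # a fraction of a day over a 1 Gbps link
echo
```

If the estimate runs into weeks, shipping a disk starts to look worthwhile.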

---- AWSCSA-10.1: Snowball

Amazon's product that you can use to transfer large amounts of data in the Petabytes. It can be as little as one fifth
the price.

Check the FAQ table on when they recommend to use Snowball.


Import/Export Disk:

 Import to EBS
 Import to S3
 Import to Glacier
 Export from S3

Import/Export Snowball:

 Only S3
 Only currently in the US (check this out on the website for the latest)

AWSCSA-11: S3 Transfer Acceleration

Uses the CloudFront Edge Network to accelerate uploads to S3. Instead of uploading directly to S3, you will get a
distinct URL to upload to an edge location, which will then transfer that file to S3.



Using the new URL, you just send the file to the edge location, and that edge location will send that to S3 over their
Backbone network.
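The "distinct URL" is the bucket's s3-accelerate endpoint. A quick sketch of how it differs from the standard endpoint (the bucket name here is hypothetical):

```shell
# Standard vs Transfer Acceleration endpoints for an S3 bucket.
# Bucket name is a made-up example.
bucket="my-example-bucket"
standard_url="https://${bucket}.s3.amazonaws.com"
accelerated_url="https://${bucket}.s3-accelerate.amazonaws.com"
echo "$standard_url"
echo "$accelerated_url"
```

Uploads sent to the accelerated URL hit the nearest edge location first, then ride the AWS backbone to S3.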

---- AWSCSA-11.1: Turning on S3 Transfer Acceleration

From the console, access your bucket. From here, what you want to do is "enable" Transfer Acceleration. This
endpoint will incur an additional fee.

You can check the speed comparison and it will show you how effective it is depending on distance from a region.
If you see similar results, the bandwidth may be limiting the speed.

AWSCSA-12: EC2 - Elastic Compute Cloud

---- AWSCSA-12.1: EC2 Intro

Arguably the most important topic.

EC2 is a web service that provides resizable compute capacity in the cloud. Amazon EC2 reduces the time required
to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as
your computing requirements change.

To get a new server online used to take a long time, but then public cloud came online and you could provision
virtual instances in a matter of minutes.

EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. Amazon
EC2 provides developers the tools to build failure resilient applications and isolate themselves from common
failure scenarios.
Pricing Models

1. On Demand - allows you to pay a fixed rate by the hour with no commitment.
2. Reserved - provide you with a capacity reservation, and offer a significant discount on the hourly charge
for an instance. 1 year or 3 year terms.
3. Spot - enable you to bid whatever price you want for instance capacity, providing for even greater savings
if your applications have flexible start and end times.

You would use Reserved if you have a steady state eg. two web servers that you must always have running.

Used for applications that required reserved capacity.

Users are able to make upfront payments to reduce their total computing costs even further.


It makes more sense to use this if you know the amount of memory etc you may need and that you'll need it.
Useful after understanding the steady state.

On Demand would be for things like a "black Friday" sale where you spin up some web servers for a certain
amount of time.

This is for low cost and flexible EC2 without any up-front payment or long-term commitment. Useful for
Applications with short term, spiky, or unpredictable workloads that cannot be interrupted, or being tested or
developed on EC2 for the first time.

Spot Instances go with your bidding, but if the spot price goes above your bid, your instance will be terminated
with a short warning (currently a two-minute notice). Large compute jobs are commonly run this way: they are
timed to run when and where spot pricing is cheapest.

You can check Spot Pricing on the AWS website to see the history etc. to make an educated guess.

This is for applications only feasible at very low compute costs, and for users with urgent computing needs for
large amounts of additional capacity.

Spot is usually the most commercially feasible option for this kind of workload.

Spot won't charge a partial hour if Amazon terminate the instance.
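A small sketch of the savings arithmetic, using made-up example prices rather than real spot history:

```shell
# Percent saving of a Spot price vs an On Demand rate.
# Both prices below are hypothetical examples; real prices vary
# by instance type, region and time.
saving_pct() {
  awk -v od="$1" -v spot="$2" 'BEGIN { printf "%.0f", (od - spot) / od * 100 }'
}

saving_pct 0.10 0.03   # a $0.03/hr spot price against a $0.10/hr on-demand rate
echo
```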

Instance Types

 T2 is the lowest of the family.

 Applications use M normally.
 C4/3 is for processor intensive.
 R3 is memory optimized.
 I2 - noSQL databases etc.
 D2 Data warehouses etc.

To remember them, think of the MCG digging up the dirt (D-I-R-T-M-C-G):

 D for Density
 I for IOPS
 R for RAM
 T for cheap general purpose (the T2)
 M for main choice for apps
 C for Compute
 G for Graphics

This is VERY useful for working professionally.

---- AWSCSA-12.2: What is EBS? (Elastic Block Store)

Amazon EBS allows you to create storage volumes and attach them to Amazon EC2 instances. Once attached, you
can create a file system on top of these volumes, run a database, or use them in any other way you would use a
block device.


Amazon EBS volumes are placed in a specific Availability Zone, where they are automatically replicated to protect
you from the failure of a single component.

It is basically a disk in the cloud. The OS is installed on here + any DB and applications. You can attach multiple
EBS volumes to one EC2 instance.

You cannot share one EBS with multiple EC2.

Volume Types

1. General Purpose SSD (GP2)

o 99.999% availability
o Ratio of 3 IOPS per GB, with up to 10,000 IOPS and the ability to burst up to 3,000 IOPS for short
periods for volumes under 1 TiB.
o IOPS are Input/Output Operations Per Second, a measure of how fast the read/write performance is
2. Provisioned IOPS SSD (IO1)
o Designed for I/O intensive apps like relational/NoSQL databases. Use if you need more than
10,000 IOPS.
3. Magnetic (standard)
o Lowest cost per GB, where data is accessed frequently and applications where the lowest storage
cost is important.
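The 3 IOPS per GB baseline for gp2 can be sketched as a small helper (the volume sizes below are hypothetical examples):

```shell
# gp2 baseline IOPS: 3 per GB, capped at 10,000 (per the ratio above).
gp2_baseline_iops() {
  local size_gb=$1
  local iops=$(( size_gb * 3 ))
  [ "$iops" -gt 10000 ] && iops=10000
  echo "$iops"
}

gp2_baseline_iops 100    # 300 baseline IOPS
gp2_baseline_iops 5000   # capped at 10000 - time to look at io1
```

Once your sustained need passes that 10,000 cap, that's the cue to move to Provisioned IOPS (io1).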

Initially, there will be no instances etc. except for 1 Security Group.

Click on "Launch Instance" and will take you to choose an AMI (Amazon Machine Image).

You will see different classification types in the brackets:

1. HVM: Hardware Virtual Machine.

2. PV: Para Virtual

For this example, we will choose Amazon Linux because it comes with a suite of things already available. This will
then take you to choose the Instance Types.

You can then Configure your Instance Details.

For Subnets, you can choose a different Availability Zone.

You can choose the IAM role from here too.

The Advanced Details section is where we can run a set of bash scripts at launch.

Example: run updates after launch.

yum update -y
In Step 4: Add Storage, we can change the IOPS by changing the size of the instance, and we can alter the Volume
Type.

By default, the EBS volume will be deleted when the instance is terminated.


We can also add a new volume etc.

Step 5: Tag Instance is used to give a key value pair for the tag.

 This is useful for things like resource tagging and billing purposes. You can monitor which staff are using
what resources.

Step 6: Security Groups

Add in the security groups you need eg. HTTP, SSH

After reviewing and Selecting launch, you will need to download a new key pair.

From here, we are back to the EC2 Dashboard and that shows us the status.

Once the status is available, head to terminal and configure the file the way you normally would in your
~/.ssh/config file (for easy access - refer to the SSH-7 file for more info).

Note: Ensure that you change the permissions on the .pem key first!

chmod 600 <keyname.pem>

We can use sudo su to update our privileges to root.
Then yum update -y to run the updates.

 Termination Protection is turned off by default

 The default action for an EBS-backed instance's root volume is to be deleted on termination.
 Root Volumes cannot be encrypted by default. You need a third party tool.
 Additional Volumes can be encrypted.

---- AWSCSA-12.4: Security Groups Basics

Go to the EC2 section in the AWS console and select Security Groups.

You can edit Security Groups on the fly that takes effect immediately.

In terminal

1. SSH in
2. Turn to root
3. Install apache yum install httpd -y

We check the server status using service httpd status

To enable the server - use service httpd start
To auto start up - use chkconfig httpd on
For everything that is publicly accessible, we move in /var/www/html
Feel free to nano a website and test it out. We can see the website by navigating to on a web
Note: You need to ensure that HTTP access is allowed from anywhere on your security group! You can only allow
rules, not deny rules.


If we were to delete the Outbound traffic, it won't change anything just yet as it is "stateful". If something can go
in, it can also go back out.

---- AWSCSA-12.5: Volumes VS Snapshots + Creating From Snapshots

 Volumes exist on EBS

o Virtual Hard Disk
 Snapshots exist on S3
 You can take a snapshot of a volume, this will store that volume on S3
 Snapshots are point in time copies of Volumes
 Snapshots are incremental, this means that only the blocks that have changed since your last snapshot are
moved to S3.
 If this is the first snapshot, it may take some time to create.
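A quick sketch of why incremental snapshots keep storage costs down (the sizes are made up):

```shell
# Incremental snapshot storage: only blocks changed since the last
# snapshot are pushed to S3. Sizes below are hypothetical.
full_gb=100        # first snapshot copies the whole volume
changed_gb=5       # second snapshot copies only the changed blocks
total_stored=$(( full_gb + changed_gb ))
echo "${total_stored} GB stored across both snapshots, not $(( full_gb * 2 )) GB"
```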

From the EC2 Dashboard, you can select "Volumes" and then set a name for the volume. It is best practice to name
your volumes.

You can Create Volumes and set them available.

You can use Actions to attach this to an Instance

To check what volumes are attached to an instance, you can run the following from the command line: lsblk (think
list block)
The first thing we generally want to do is to check if there is any data on it.

file -s /dev/xvdf for this particular example.

Format the volume

mkfs -t ext4 /dev/xvdf

Now we want to make a fileserver directory.

mkdir /fileserver
mount /dev/xvdf /fileserver
cd /fileserver
ls // shows lost+found
rm -rf lost+found/
nano helloworld.txt
nano index.html

# now let's unmount the volume

cd ..
umount /dev/xvdf
cd /fileserver
# check there are no files
Now go ahead and detach that volume. This will change the state back to available.

Create a snapshot


Select Action > Create Snapshot and fill in the details and select Create
You can select Snapshot on the left and check the Snapshot is there. If you delete the volume, and then go back to
the snapshot, you can select it and go to actions and you can create a new volume and create it with this Snapshot.

We can then attach the new volume. Now we can go through the above process and mount again.
Using the command file -s /dev/xvdf we can check the files available.

Snapshots of encrypted volumes are encrypted automatically.

You can share Snapshots if they are unencrypted.

To create a snapshot for Amazon EBS volumes that serve as root devices, you should stop the instance before
taking the snapshot.

AWSCSA-13: Create an Amazon Machine Image (AMI)

You specify the AMI you want when you launch an instance, and you can launch as many instances from the AMI
as you need.

Three Components

1. A template for the root volume for the instance

2. Launch permissions that control which AWS accounts can use the AMI to launch instances
3. A block device mapping that specifies the volumes to attach to the instance when it's launched

If you have a running instance (from before), we can create a snapshot of the volume that is attached to that
instance.
From here, we can select that Snapshot and select "Create Image". Set the name and choose settings, then select
Create.
Under the Images > AMIs on the left hand bar, you can see the images owned by you and the public image. You
can share these AMIs if you know the AWS Account Number.

If you are looking to make a public AMI, there are a few things that you would like to consider. Check the website
for that.

There is also a segment for Shared usage.

You will also want to delete your bash history.

history -c

 The snapshot is stored in S3.

 AMIs are regional. You can copy them to other regions, but you need to do this with the console or the
command line.

AWSCSA-13.1: EBS root volumes vs Instance Store


What is the difference between AMI types?

EBS backed and Instance Store backed are two different types.

We can select our AMI based on Region, OS, Architecture, Launch Permissions and Storage for the Root Device.

You can choose either:

 Instance Store
 EBS Backed Volumes

When launching an instance, it will also mention what type of AMI type it is.

After an Instance Store-backed instance has been launched, you can only add additional EBS volumes from then on.

You cannot stop an IS instance. However, with EBS, you can. So why would I want to stop an instance? If the
underlying hypervisor is in a failed state, you can stop and start and it will start on another hypervisor. However,
you cannot do that on an IS instance. You also cannot detach IS volumes. EBS is better for provisioning speed.

IS volumes are created from a template stored in S3 (and may take more time to provision). Also known as Ephemeral Storage.

EBS you can tell to keep the root device volume if you so wish.

AWSCSA-14: Load Balancer and Health Checks

In the terminal

cd /var/www/html
Now head back to the console for EC2. Head to the load balancer section.

Leave the other defaults and move on to choose the security groups.

Configure the health check, which will use the healthcheck.html file.

Under the Adv Settings:

The check will hit the file. The unhealthy threshold is how many failed checks (two here) it takes before the
instance is taken out of service. The healthy threshold will then wait for 10 successful checks before bringing back
our service.

 Response Timeout: How long to wait for the response
 Interval: How often to check
 Unhealthy Threshold: How many fails before the instance is taken out of service
 Healthy Threshold: How many successes it needs before being brought back into service
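Those settings determine how quickly an instance flips between InService and OutOfService; a sketch with hypothetical values:

```shell
# How long before an instance is marked unhealthy / healthy again.
# Interval and thresholds below are hypothetical example settings.
interval=10              # seconds between health checks
unhealthy_threshold=2    # consecutive fails before OutOfService
healthy_threshold=10     # consecutive passes before InService

echo "OutOfService after $(( interval * unhealthy_threshold ))s of failed checks"
echo "InService after $(( interval * healthy_threshold ))s of passing checks"
```

A low unhealthy threshold reacts fast; a higher healthy threshold avoids flapping instances back in too early.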

Move on and select the instance.

Move on and add tags.

Back on the dashboard, it should then come InService after the given time limit.
If this isn't working...

1. Check the website

2. Check the zone status
3. Ensure the web server is running

The DNS name in the information is what you will use to resolve the load balancer.


The Elastic Load Balancer is never given a static IP address; you always resolve it by its DNS name. You can have public static IPs for instances, but not for the ELB.

AWSCSA-15: CloudWatch for EC2

CloudWatch looks after all of your observations for things like CPU usage etc.

You can set up detailed monitoring from the instances.

In CloudWatch

We have Dashboards, Alarms, Events, Logs and Metrics.

Metrics are added to our dashboards. Create a dashboard to check this.

EC2 metrics are only collected at a Hypervisor level. Memory utilisation itself is actually missing (it needs an agent).

EC2 Metrics include CPU, Disk, Network and Status.

You can update the time frame on the top right hand side.

This whole thing is about being able to keep a heads up on the different instances etc.


CW Events help you react to changes in the state of the AWS environment. Auto invoke things like a AWS Lambda
function to update DNS entries for that event.


You can store an agent on an EC2 instance and it will monitor data about that and send it to CloudWatch. These
will be things like HTTP response codes in Apache logs.


Alarms can trigger, e.g., when CPUUtilization is higher than a certain amount.

You can then send actions to send emails.

You can select the Period and Statistics on how you want this to work etc.


 Standard monitoring is on by default at 5 minute intervals (detailed monitoring is 1 minute).

 You can have dashboards that tell you about the environment
 Set alarms and take actions
 Events help you respond to state changes in your AWS resources
 Logs can be used to help aggregate, monitor and store logs

AWSCSA-16: The AWS Command Line


Create a new instance without an IAM role.

You can use an existing key pair if you wish.

If we create a new user in IAM with certain permissions. After downloading the keys for this new user, we can
create a group that has full access for S3.

Back on the EC2, we can see on the dashboard a running instance that has no IAM role. You can only assign this
role when you create this instance.

Then, after connecting to the instance in the terminal.

Jump up to root using sudo su.

Then we can use the AWS command line tool to check certain things. Ensure that you have the AWS CLI installed.

We can configure things from here.

aws configure
If it needs the Access Key ID and Secret Access Key, then copy and paste them in. Then choose the default region.
You do not need to put anything for the output format.

We can then type aws s3 ls to see the list of what is in that environment.

For things like help, you can type things like aws s3 help
Where are credentials stored?

cd ~
cd .aws
#shows config and credentials
You can nano into credentials. Others could easily get into this file, and anyone with these credentials can access
your environment. Therefore, it can be unsafe to store these here.

That's where roles come in. An EC2 instance can assume a role.

---- AWSCSA-16.1: Using IAM roles with EC2

Under Roles, we can create a new role with AmazonS3FullAccess.

Then go back to EC2 and launch a new instance. You can select the role for IAM role and go through and create the
instance.
Again, a role's permissions can be changed and will take effect immediately, but you cannot assign a new role to an
EC2 instance after launching - again, this is important.

Now if we ssh into our instance, we will find that in the root file there is no .aws file. Therefore, there are no
credentials that we have to be worried about.


1. Roles are more secure

2. Roles are much easier to manage
3. Roles can only be assigned when the EC2 instance is provisioned


4. Roles are universal, you can use them in any region

AWSCSA-17: Using Bootstrap Bash Scripts

These are scripts that our EC2 instances will run when they are created.

For an example, just create and save a html file.

In the AWS console itself, go into S3, create a bucket. This bucket will contain all the website code. Upload the
file.
All new buckets have objects that are by default private.

Create a new EC2 instance, use T2 micro, and then go into advanced details and add the following script as text.

yum install httpd -y
yum update -y
aws s3 cp s3://dok-example/index.html /var/www/html
service httpd start
chkconfig httpd on
Now after the instance is up, we should easily be able to navigate to the IP address and everything should be
working.
AWSCSA-18: EC2 Instance Meta-data

This is data about the EC-2 instance and how we can access this data from the command line.

ssh into an instance.

Elevate the privileges and then use the following to get some details:

curl

What is returned is the list of meta-data values you can append to the URL to receive data about.

Commands like this can write a value to a html file:

curl > mypublicip.html

AWSCSA-19: Autoscaling 101

How to use launch configs and autoscaling groups.

In a text editor, we can create the healthcheck.html test.

I am healthy.
Drop that guy into the relevant bucket. Ensure the load balancer is set up.

Head to launch configurations under autoscaling.


First, you need to create this launch config. Click Launch Config.

From here, you can select the AMIs related. Select the T2 micro from Amazon if you wish.

Add the roles etc and add in the advanced details if required.

Note: Using the aws command line, you can copy files from a bucket.

You can also assign the IP type.

Select the security group, and review any warning you get about the file.

Create your key etc. too.

Auto Scaling Group

After, you will create the Auto Scaling Group. You can choose the group size too. Using the subnet, you can create
a subnet and network.

If you have three groups and three availability zones, it will put each group in each availability zone.
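The even spread described above is just integer division when the desired capacity divides evenly (the numbers here are hypothetical):

```shell
# Even spread of Auto Scaling instances across Availability Zones.
# Desired capacity and AZ count below are made-up example numbers.
desired=6
azs=3
per_az=$(( desired / azs ))
echo "$per_az instances per AZ"
```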

In the advanced details, we can set up the Elastic Load Balance and Health Check Type (ELB in this case).

The Health Check Grace Period is how long Auto Scaling waits after launching an instance before it starts checking its health. The health check will also fail until Apache is turned on.

Scaling Policies

This allows us to automatically increase and decrease the number of instances depending on the number of
settings that we do.

We can scale between 1 and 5 instances. You will choose to create an alarm and execute a policy at a certain
time. These will grow and shrink depending on the settings.

Once they are up, you can check each IP address and see if they are up. If you choose the DNS, it will go towards
one of the addresses.

We can use things like Route 53 and use it to help start sending traffic to other parts of the world.

AWSCSA-20: EC2 Placement Groups

What is it? A logical grouping of instances within a single Availability Zone. This enables applications to participate
in a low-latency, 10 Gbps network. Placement groups are recommended for applications that benefit from low
network latency, high network throughput, or both.

Recommended for things like grid computing where you need low latency between nodes, e.g. Cassandra nodes.

High network throughput, low latency - they're the goals!


 A placement group cannot span multiple Availability Zones.
 The name you specify must be unique within your AWS account.
 Only certain types of instances can be launched in a placement group (Compute Optimized, GPU, Memory
Optimized, Storage Optimized.)
 AWS recommend homogenous instances within placement groups (same size and family)
 You can't merge placement groups.
 You can't move an existing instance into a placement group. You can create an AMI from your existing
instance, and then launch a new instance from the AMI into a placement group.

AWSCSA-21: EFS (Elastic File System) Concepts and Lab

EFS is a file storage service for EC2. It is easy to use and has a simple interface for configuration. Storage grows and
shrinks as needed.

 Pay for used storage.

 Supports NFSv4 protocol.
 Scale to petabytes
 Supports thousands of concurrent NFS connections
 Read after write consistency

You can create an EFS from the dashboard.

Set up is similar to other set ups. We can predetermine our IP addresses and security groups.

While the set up is being created, you can create two or more instances and provision them.

If you have set up security groups, feel free to use them.

Provision a load balance for this as well.

Once they are all up, head back to the EFS and if it is ready in the availability zones, go and note down the public
ips for the instances.

Note: Make sure that the instances are in the same security group as the EFS.

If they are all set up, we can head back to EC2. Again, grab the IPs and connect to the two instances in two
different terminal sessions.
Once you've ssh'd into the instances, install Apache to run the webserver. Start the server up!

service httpd start

Now we can cd /var/www/html
However, head to the root folder for both and then move to EFS.

You can select the EC2 Mount Instructions, then run the command to mount the EFS and ensure that it mounts to
/var/www/html.
Now these will be mounted on EFS, and if we now nano index.html and create a home page, the file will be shared
across both instances!


AWSCSA-22: Lambda

Very, very advanced!

What is Lambda? It's a compute service where you can upload your code and create a Lambda function. AWS
Lambda takes care of provisioning and managing the servers that you use to run the code. You don't have to worry
about OS, patching, scaling, etc.

You can use Lambda in the following ways:

 Event driven compute service. Lambda can run code in response to events. eg. uploading a photo to S3
and then Lambda triggers and turns the photo into a thumbnail etc.
 Transcoding for media
 As a compute service to run your code in response to HTTP requests using Amazon API Gateway or API
calls made using AWS SDKs.

The Structure

 Data Centres
 Hardware
 Assembly Code/Protocols
 High Level Languages
 OS
 App Layer/AWS APIs

Lambda abstracts ALL of this away. You only have to worry about the code, Lambda looks after everything else.

Pricing is ridiculously cheap. You pay for number of requests and duration. You only pay when the code runs.
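A rough sketch of the requests + duration pricing, using indicative list prices as assumptions (check the current pricing page for real rates):

```shell
# Rough Lambda cost: per-request charge plus GB-seconds of duration.
# The $0.20 per million requests and $0.0000166667 per GB-second
# rates are indicative assumptions, not a quoted price list.
lambda_cost() {
  awk -v req="$1" -v ms="$2" -v mb="$3" 'BEGIN {
    gb_s = req * (ms / 1000) * (mb / 1024)           # total GB-seconds
    printf "%.2f", (req / 1e6) * 0.20 + gb_s * 0.0000166667
  }'
}

lambda_cost 1000000 100 128   # 1M invocations, 100 ms each, at 128 MB
echo
```

Even a million short invocations comes out to well under a dollar on these assumptions, which is why the "you only pay when the code runs" model is so attractive.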

Why is it cool?

No servers. No worry for security vulnerabilities etc! Continuously scales. No need to worry about auto scaling.

AWSCSA-23: Route53

---- AWSCSA-23.1: DNS101

DNS is used to convert human friendly domain names into an Internet Protocol address (IP).

IP addresses come in 2 different forms: IPv4 and IPv6.

IPv4 is 32-bit. IPv6 is 128 bits.

IPv4 is used, but IPv6 is for the future.

Top Level Domains


 .com
 .edu
 etc

Domain Registrars

Names are registered with InterNIC - a service of ICANN. They enforce the uniqueness.

Route53 isn't free; there are also many third-party domain registrars.

SOA Records

The record stores info about:

 supplies name of the server

 admin of the zone
 current version of the data file
 number of seconds a server should wait before retrying a failed zone transfer
 Default number of seconds for TTL on resource records

NS Records

Name Server Records

 used by Top Level Domains to direct traffic to the Content DNS servers which contains the authoritative
DNS records.

A Records

Address record - used to translate a domain name to an IP address. A records are always IPv4; the IPv6 equivalent
is the AAAA record.


TTL (Time To Live)

Length of time that the DNS record is cached on either the Resolving Server or on your PC. This is important from
an architectural point of view.


CName Records

A Canonical Name (CName) can be used to resolve one domain name to another. eg. you may have a mobile
website that is used when users browse to your domain on a mobile device, and you may want to point a separate
mobile domain name there as well.

Alias Records

Used to map resource record sets in your hosted zone to Elastic Load Balancers, CloudFront Distribution, or S3
buckets that are configured as websites.

Alias records work like a CNAME record in that you can map one DNS name to another 'target' DNS name.

Sensitivity: Internal & Restricted

Key Difference - A CNAME can't be used for naked domain names (zone apex). You can't have a CNAME for the
naked domain; it must be either an A record or an Alias.

The naked domain name MUST always be an A record, not a CName.

The Alias will map this A record to an ELB.


For an ELB, you need a DNS name to resolve to the ELB. You will never get a static IPv4 address to resolve it...
which is why you have the Alias Record.

Queries against Alias records won't have you charged, whereas CName queries will.

---- AWSCSA-23.2: Creating a Route 53 Zone

Going from Route 53 to a load balancer to an EC2 Instance.

In EC2, launch an instance.

Bootstrap Script

yum install httpd -y
service httpd start
yum update -y
echo "Hello Cloud Gurus" > /var/www/html/index.html
After moving through and launching the instance, create a load balancer.

Configure the health check for index.html.

Once the ELB is up.

Head to Route53 afterwards.

1. Create a Hosted Zone

2. Use a domain name you have purchased for the Domain Name and you can have a Public Hosted Zone or
a private for VPC
3. Create this and it will create a Start of Authority record and a Name Server name.
4. Cut and paste the NS record, head back to the name server and customise that and enter in the NS values.

Configure the naked domain name

Create a Record Set. You need to create an alias and the target address will have your ELB DNS addresses.

There are routing policies (see the next section).

Once this is created, we should be able to type in the domain name and visit the website!


---- AWSCSA-23.3: Routing Policies

Simple: This is the default. Most commonly used when you have a single resource that performs a given function
for your domain eg. one web server that serves content for a website.

Think of one EC2 instance.

Weighted: Example you can send a percentage of users to be directed to one region, and the rest to others. Or
split to different ELBs etc.

Used for splitting traffic regionally or if you want to do some A/B testing on a new website.

Latency: This is based on the lowest network latency for your end user (routing to the best region). You create a
latency resource set for each region.

Route53 will select the latency resource set for the region that will give them the best result and respond with
that resource set.

User -> DNS -> the better latency for an EC2 instance
Failover: When you want to create an active/passive set up. Route53 will monitor the health of your primary site
using a health check.

Geolocation: Sending the user somewhere based on the location of the user.


 ELBs don't have a pre-defined IPv4 address, you need to resolve to them by DNS name.

 Understand the difference between a CName and an Alias. CName requests are billed, Alias requests are free.
A naked domain name will want an Alias record because you cannot use a CName. It is always cheaper to use
an Alias.

AWSCSA-24: Databases

---- AWSCSA-24.1: Launching an RDS Instance

Head into EC2 and create a webserver.

There is a Bootstrap bash script you can use for practising purposes from this course.

yum install httpd php php-mysql -y
yum update -y
chkconfig httpd on
service httpd start
echo "<?php phpinfo();?>" > /var/www/html/index.php
cd /var/www/html
To create an RDS instance

Select an engine - MySQL. Select production or dev/test.

Choose an instance class, "No" for Multi-AZ Deployment and leave everything else as default.


Set up the Settings Instance ID, username etc.

Ensure the current selection is available for free tier.

For the options, select the database name and port.

 some instances can be encrypted at rest

 there is also a back up window
 select launch at the end when you're ready

Back in EC2

We want to check if the bootstrap has worked successfully. We can do so by checking the site.

We can also try <ip>/connect.php to try and call a connection string.

In connect.php, we can check the settings. Make sure these aren't still the placeholder values from the course's bootstrap bash script.

ssh into EC2 and move into /var/www/html and see the files.

For the hostname, we need to point it towards our RDS instance endpoint.

Ensure the webserver security can talk to the RDS instance.

In the security group, add an inbound rule allowing MySQL/Aurora (port 3306) from the web server's security
group, so the web server can connect to the RDS instance.

---- AWSCSA-24.2: Backups, Multi AZ and Read Replicas

Automated Backups

2 Different Types

1. Automated Backups
2. Database Snapshots

Automated backups allow you to recover your database to any point in time within a retention period, which can be
set between 1 and 35 days.

Auto Backups will take a full daily snapshot and will also store transaction logs throughout the day.

When you do a recovery, AWS will first choose the most recent daily back up and then apply transaction logs
relevant to that day.

This allows you to do a point in time recovery down to a second, within a retention period.
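The recovery logic described above can be sketched as follows. This is a conceptual model, not how RDS is implemented; the timestamps and operations are illustrative integers and strings:

```python
# Point-in-time recovery: pick the newest daily snapshot at or before
# the target time, then replay every transaction logged after it.
def restore_to(target_time, daily_snapshots, tx_logs):
    base = max(t for t in daily_snapshots if t <= target_time)
    replayed = [op for t, op in tx_logs if base < t <= target_time]
    return base, replayed

snapshots = [0, 86_400, 172_800]   # midnight snapshots for days 0-2 (seconds)
logs = [(90_000, "INSERT a"), (95_000, "UPDATE b"), (180_000, "DELETE c")]

# Restoring to second 100,000 starts from the day-1 snapshot and
# replays only the two operations logged before the target second.
base, ops = restore_to(100_000, snapshots, logs)
```

This is why recovery is only possible within the retention period: once the daily snapshot and its transaction logs age out, there is no base to replay from.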

Snapshots are done manually, and they are stored even after you delete the original RDS instance, unlike
automated back ups.

Whenever you restore the database, it will come on a new RDS instance with a new RDS endpoint.



Encryption at rest is supported for MySQL, Oracle, SQL Server, PostgreSQL & MariaDB. Encryption is done using the
AWS Key Management Service (KMS).

Once your RDS instance is encrypted the data stored at rest in the underlying storage is encrypted, as are its
automated backups, read replicas and snapshots.

In the instance actions, you can take a snapshot.

To restore, you can click in and restore. It will create a new database.

For Point In Time, you can select the option and click back to a point in time.

You can migrate the Snapshot onto another database, you can also copy it and move it to another region and you
can of course restore it.

If it is encrypted, you will need to then use the KMS key ID and more.


Multi-AZ

With Multi-AZ, AWS keeps a synchronous standby copy of your database in another Availability Zone. If the primary
fails, AWS fails over to the standby automatically and the endpoint stays the same, so you do not need to move the
entry point over.

It is for Disaster Recovery only. It is not primarily used for improving performance. For performance improvement,
you need Read Replicas.

Read Replica

Different to Multi-AZ: a read replica is an exact copy of your database that you can read from. Multi-AZ is more
for Disaster Recovery; read replicas are how you improve read performance.

You can change the connection strings to your instances to read from the read replicas etc.

You can also have read replicas of read replicas.

They allow you to have a read-only replica. This is achieved using async replication from the primary RDS instance
to the read replica. You use read replicas primarily for very read-heavy database workloads.

Remember, RR is used for SCALING. You can also use things like Elasticache. This comes later.

YOU MUST have auto back ups turned on in order to deploy a read replica. You can have up to 5 read replica
copies of any database. You can chain replicas of replicas to get more, but then you could have latency issues.

Each replica will have its own DNS endpoint. A read replica cannot itself be deployed as Multi-AZ.

You can however create Read Replicas of a DB with Multi-AZ.

RR can be promoted to be their own databases. This breaks the replication.

You can create this from the Instance Actions menu.

DynamoDB vs RDS


In RDS, we have to manually create a snapshot and then scale etc. - not automatic. You can scale reads out with
replicas, but read/write capacity can only really be scaled up (a bigger instance), not out.

"Push button" scaling is DynamoDB.

---- AWSCSA-24.3: DynamoDB

What is it?

A fast and flexible noSQL database service for all applications that need consistent, single-digit millisecond latency
at any scale.

It is a fully managed database and supports both document and key-value data models. The flexible data model and
reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT and many other applications.


 Always stored on SSD storage

 Stored across 3 geographically distinct data centres.
 Eventual Consistency Reads (default): consistency across all copies of data is usually reached within a
second (best read performance). If your application can tolerate reads that may be up to a second stale, this is the best option.
 Strong Consistency Reads: A strongly consistent read returns a result that reflects all writes that received
a successful response prior to the read.

Pricing is based on Provision Throughput Capacity:

 Write capacity is billed in blocks of 10 units per hour

 Read capacity is billed in blocks of 50 units per hour
 Storage also factors in

DynamoDB can be expensive for writes, but cheap for reads.
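The shape of that billing can be sketched as below. The per-block prices here are made-up placeholders (real rates vary by region and over time); the point is that writes are billed in blocks of 10 units while reads come in blocks of 50, so equal throughput costs five times more on the write side:

```python
import math

WRITE_BLOCK, READ_BLOCK = 10, 50
PRICE_PER_WRITE_BLOCK = 0.0065   # assumed $/hour per 10 write units
PRICE_PER_READ_BLOCK = 0.0065    # assumed $/hour per 50 read units

def hourly_cost(write_units, read_units):
    """Provisioned capacity is billed in blocks, rounded up."""
    write_blocks = math.ceil(write_units / WRITE_BLOCK)
    read_blocks = math.ceil(read_units / READ_BLOCK)
    return (write_blocks * PRICE_PER_WRITE_BLOCK
            + read_blocks * PRICE_PER_READ_BLOCK)

# 100 provisioned write units cost 5x what 100 read units cost
# at equal per-block prices.
cost_writes = hourly_cost(100, 0)
cost_reads = hourly_cost(0, 100)
```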

Creating DynamoDB

To create it, go through the dashboard.

Add in a primary key (eg. number - student ID).

You can even go into the tables and start creating items directly from the dashboard.

You can then add in more columns for documents as you grow.

You can then scan and query from the same dashboard.

There is no downtime during scaling.

---- AWSCSA-24.4: Redshift

 Data warehousing service in the cloud. You can start small and then scale for $1000/TB/year.


The course example shows a massive sales record set being aggregated - summing up what was sold, etc.


You can start at a single node (160GB)

You can also have a Multi Node setup (think Toyota level): a Leader Node (manages client connections and receives
queries) and Compute Nodes (store data and perform queries and computations), with up to 128 compute nodes. The
two node types work in tandem.

Columnar Data Storage

Instead of storing data as a series of rows, Redshift organises data by column. While row-based storage is ideal for
transaction processing, column-based storage is ideal for data warehousing and analytics, where queries often
involve aggregates performed over large data sets.

Since only the columns involved in the queries are processed and columnar data is stored sequentially on the
storage media, column-based systems require far fewer I/Os, greatly improving query performance.
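A toy illustration of that I/O difference, with a made-up three-row table (not Redshift's actual storage engine): an aggregate over one column only has to touch that column's values in a columnar layout, but touches every field of every row in a row layout.

```python
rows = [
    {"order_id": 1, "region": "EU", "amount": 120},
    {"order_id": 2, "region": "US", "amount": 80},
    {"order_id": 3, "region": "EU", "amount": 50},
]

# Row store: scanning reads whole rows, so every field is touched.
def sum_amount_rowstore(rows):
    fields_read = sum(len(r) for r in rows)
    return sum(r["amount"] for r in rows), fields_read

# Column store: the same table pivoted into per-column arrays.
columns = {k: [r[k] for r in rows] for k in rows[0]}

def sum_amount_colstore(columns):
    col = columns["amount"]          # only this column is read
    return sum(col), len(col)

row_total, row_io = sum_amount_rowstore(rows)
col_total, col_io = sum_amount_colstore(columns)
# Same answer either way, but the column store reads a third of the values.
```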

Advanced Compression - columnar data stores can be compressed much more than row-based data stores because
similar data is stored sequentially on disk.

Redshift employs multiple compression techniques and can often achieve significant compression relative to
traditional relational data stores.

In addition, Amazon Redshift doesn't require indexes or materialised views and so uses less space than traditional
relational database systems. When loading into an empty table, Amazon Redshift automatically samples your data
and selects the most appropriate compression scheme.

Massive Parallel Processing (MPP)

Redshift auto distributes data and query load across all nodes. Redshift makes it easy to add nodes to your data
warehouse and enables you to maintain fast query performance as your data warehouse grows.

Massive advantage when you start using multi nodes.

This whole thing is priced on compute nodes. 1 unit per node per hour. Eg. a 3-node data warehouse cluster
running persistently for an entire month would incur 2160 instance hours. You will not be charged for the leader
node hours; only compute nodes will incur charges.
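The billing arithmetic from that example works out as a simple multiplication - compute nodes bill per node-hour and the leader node is free:

```python
# Redshift compute billing sketch: node-hours = compute nodes x hours.
# The leader node is not charged.
def compute_node_hours(compute_nodes, hours):
    return compute_nodes * hours

# A 3-node cluster running persistently for a 30-day month (720 hours):
month_hours = compute_node_hours(3, 24 * 30)   # 2160 instance hours
```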

You are also charged for backup storage and for data transfer (within a VPC).


 Encrypted in transit using SSL

 Encrypted at rest using AES-256 encryption
 By default Redshift takes care of key management


 Not multi AZ. 1 AZ at this point in time.

 Can restore snapshots to new AZ's in the event of an outage.


 Extremely efficient on infrastructure and software layer.

---- AWSCSA-24.5: Elasticache

What is it?

It's a web service that makes it easy to deploy, operate and scale an in-memory cache in the cloud. The service
improves the performance of web applications by allowing you to retrieve info from fast, managed, in-memory
caches, instead of relying entirely on slow disk-based databases.

It can be used to significantly improve latency and throughput for many read-heavy application workloads or
compute-intensive workloads.

Caching improves application performance by storing critical pieces of data in memory for low-latency access.
Cached info may include the results of I/O-intensive database queries or the results of computationally intensive
calculations.
Two different engines

1. Memcached
o Widely adopted memory object caching system. ElastiCache is protocol compliant with
Memcached, so popular tools that you use today with existing Memcached environments will
work seamlessly with the service.

2. Redis
o Open-source in-memory key-value store that supports data structures such as sorted sets and
lists. ElastiCache supports Master/Slave replication and Multi-AZ which can be used to achieve
cross AZ redundancy.

ElastiCache is a good choice if your database is particularly read heavy and not prone to frequent change.
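That read-heavy pattern is usually implemented as cache-aside: check the cache first, and only on a miss go to the slow database and remember the result. A minimal sketch, with an illustrative dictionary standing in for both ElastiCache and the database:

```python
DB_QUERIES = {"count": 0}            # instrument the "slow" database

def query_database(key):
    DB_QUERIES["count"] += 1
    return f"value-for-{key}"        # stand-in for a disk-based lookup

cache = {}                           # stand-in for Memcached/Redis

def get(key):
    if key in cache:                 # cache hit: no database round trip
        return cache[key]
    value = query_database(key)      # cache miss: fetch and remember
    cache[key] = value
    return value

first = get("user:42")               # miss -> hits the database
second = get("user:42")              # hit  -> served from memory
```

Note this only pays off when the data is "not prone to frequent change" - otherwise you are serving stale values or constantly invalidating.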

Redshift is a good answer if management needs to run OLAP (analytics) queries on the data.


AWSCSA-25: Virtual Private Cloud (VPC)

This is the most important topic for the CSA exam. You should know how to build a VPC from memory.

What is it?

Think of a virtual data centre in the cloud.

It lets you provision a logically isolated section of AWS where you can launch AWS resources in a virtual network
that you define.

You have complete control over your virtual networking environment, including selection of your own IP address
range, creation of subnets, and configuration of route tables and network gateways.


Easily customizable config for your VPC. Eg. you can create a public-facing subnet for your webservers that has
access to the Internet, and place your backend systems such as databases or application servers in a private facing
subnet with no Internet access. You can leverage multiple layers of security to help control access to EC2 instances
in each subnet.

This is multiple tier architecture.

You can also create Hardware Virtual Private Network (VPN) connections between your corporate datacenter and
your VPC and leverage the AWS cloud as an extension of your corporate datacenter.

What can you do with a VPC?

 You can launch instances into a subnet of your choosing

 Assign custom IP address ranges in each subnet
 Configure route tables between subnets
 Create internet gateways and attach them to subnets
 Better security control over AWS resources
 Create instant security groups for each instance
 ACLs - subnet network control lists

Default VPC vs Custom VPC

 Default is very user friendly allowing you to immediately deploy instances

 All Subnets in default VPC have an internet gateway attached
 Each EC2 instance has both a public and private IP address
 If you delete the default VPC, you can only get it back by contacting AWS

VPC Peering

Connect one VPC with another via a direct network route using a private IP address.

Instances behave as if they're on the same private network.

You can peer VPCs with other AWS accounts as well as with other VPCs in the same account.

Peering uses a star configuration: e.g. 1 central VPC peered with 4 others. The outer four can only talk to the one
in the middle, not to each other - there is no such thing as transitive peering.
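The non-transitivity rule can be sketched as a reachability check over explicit peering links (VPC names are illustrative):

```python
# Peering is point-to-point: traffic flows only over a direct peering
# link, never through an intermediate VPC.
PEERINGS = {("hub", "vpc-a"), ("hub", "vpc-b"),
            ("hub", "vpc-c"), ("hub", "vpc-d")}

def can_route(src, dst):
    """Reachable only if a direct peering exists (either direction)."""
    return (src, dst) in PEERINGS or (dst, src) in PEERINGS

hub_to_a = can_route("hub", "vpc-a")   # spokes reach the hub
a_to_b = can_route("vpc-a", "vpc-b")   # but not each other: no transit
```

If vpc-a and vpc-b need to talk, they need their own peering connection.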

---- AWSCSA-25.1: Build your own custom VPC

No matter which exam you sit, you need to do this as well.

From the dashboard, head to the VPC section and create a VPC.

Under Route Tables, the route has been created automatically.

Under Subnet, we want to create some subnets. Select the Test-VPC and the availability zone. REALLY
IMPORTANT - the subnet is always mapped to one availability zone.


Give the subnet a CIDR block within the VPC's range (each additional subnet gets the next block in the same pattern).

Once it is created, we can choose from the available IPs.

Feel free to create more. The example gives up to 3.

Under Subnet > Route Table, we can see the target. Now if we deploy those 3 subnets, they could all communicate
to each other through the Route Table.
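The subnet carving can be sketched with Python's standard ipaddress module. The course's actual CIDR values were not captured in these notes, so 10.0.0.0/16 for the VPC and /24 subnets (one per AZ) are assumptions:

```python
import ipaddress

# Assumed VPC range; carve out /24 subnets, one per availability zone.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))[:3]

first = str(subnets[0])               # "10.0.0.0/24"
# AWS reserves 5 addresses in every subnet (network, router, DNS,
# future use, broadcast), so usable hosts per /24 = 256 - 5 = 251.
usable = subnets[0].num_addresses - 5
```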

Now we need to create an Internet Gateway, which gives your EC2 instances Internet access.

Create this. There is only one Internet Gateway per VPC. Then attach that to the desired VPC.

Now we need to create another Route Table. Under Route Tables > Routes, edit it: set the Destination to 0.0.0.0/0 and set the target to our Internet Gateway.

Now in Route Table > Subnet Associations, we need to decide which subnet we want to be internet accessible.
Select one through edit and save. In the Route Table > Routes section for the other table, there is no
Internet Route associated.

So now we effectively have a subnet with internet access, but not for the other 2.

If we deploy instances into one with internet access and one without, we select the network as the new VPC and
then select the correct subnet. We can auto assign the public IP.

For the security group, we can set HTTP to be accessed for everywhere.

Ensure that you create a new key pair and save it. Then launch another "database" server. This will sit inside a
subnet that is not internet accessible: select the correct VPC, put it into one of the other (private) subnets, and
name it "MyDBServer". Stick it inside the same security group.

Hit review and launch, then use the existing key pair (the new one that was created).

The IP for the web server will now be added.

Once you have SSH'd in, you can run yum update to prove you have Internet access.

If you head back to the DBServer instance, you can see there is no public IP address. SSH in from the first
WebServer (ssh to that, then ssh to the DBServer). Once in, outbound requests from the private subnet (not internet
accessible) will fail, so next we create a NAT instance so we can actually download things like updates etc.

---- AWSCSA-25.2: Network Address Translation (NAT)

Move into the security groups and create a new one for the NAT instance.

Call it (MyNATSG) and the assign the correct VPC.

We want http and https to flow from public subnets to private subnets.

For Inbound, let's create an HTTP rule and set the source to our DBServer IP. Do the same for HTTPS.

For the Outbound, we could configure it to only allow the HTTP and HTTPS traffic out for anywhere.


Creating the NAT Instance

Head back to EC2, and launch an instance and use a community AMI. Search for nat and find the top result.

Deploy it into our VPC, disable auto-assign public IP, and select the internet-accessible (public) subnet.

On the disabled IP: even if you have instances inside a public subnet, it doesn't mean they're internet accessible.
You need to give them either a public IP or an ELB.

We can call it "MyNATVM" and add it to the NAT security group and then review and launch. We don't need to
connect into this instance which is cool. It's a gateway. Use an existing key pair and launch.

Head to Elastic IPs and allocate a new address. This incurs a charge if it isn't associated with an EC2 instance.
Associate the address with the NATVM.

Now, go back to the instance, select Actions, choose Networking and select Change Source/Dest. Check. Each EC2
instance performs source/destination checks by default, but a NAT instance must send and receive traffic that isn't
addressed to itself - we need to disable this! Otherwise, the instances will not be able to communicate through it.

Jump into VPC and look at the Route Tables. In the main Route Table (the one without the Internet Route), go to
Routes, hit Edit and add another route: set the destination to 0.0.0.0/0 and the target to MyNATVM.

Now we can start running updates etc. on the instance that only has a private IP.

The NAT instance translates traffic on behalf of the private instances, enabling machines with only a private IP to
reach the internet for updates and downloads.

---- AWSCSA-25.3: Access Control Lists (ACLs)

ACLs act like a firewall that lets you set up network rules at the subnet level. ACLs are evaluated before security
groups, so a deny in an ACL cannot be overridden by a security group.

Amazon suggests numbering rules in increments of 100. By default, each custom ACL starts out closed (deny all).

The key thing to remember is that each subnet must be associated with a network ACL - otherwise it is associated
with the default ACL.

How to do it

Under VPC > Network ACL, check the Test-VPC's default ACL and see that the Inbound and Outbound rules allow all traffic.

For rules, the lowest is evaluated first. This means 100 is evaluated before 200.
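That evaluation order can be sketched as below. This is a conceptual model, not AWS's implementation; the rule numbers and ports are illustrative. The first matching rule wins, and anything no rule matches falls through to the implicit default deny:

```python
# (rule number, protocol, port, action) - checked in ascending order.
RULES = [
    (100, "tcp", 80, "ALLOW"),
    (200, "tcp", 80, "DENY"),    # never reached for port 80: 100 wins
    (300, "tcp", 22, "ALLOW"),
]

def evaluate(protocol, port, rules):
    for _num, proto, rule_port, action in sorted(rules):
        if proto == protocol and rule_port == port:
            return action        # first match wins
    return "DENY"                # implicit default deny ('*' rule)

http = evaluate("tcp", 80, RULES)   # ALLOW: rule 100 beats rule 200
ssh = evaluate("tcp", 22, RULES)    # ALLOW via rule 300
dns = evaluate("udp", 53, RULES)    # DENY by the implicit rule
```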

We can create a new one. "My Test Network Control list."

Everything in a custom ACL is denied by default for Inbound and Outbound. A subnet can only be associated with one
ACL at a time.

When you disassociate the subnets, it will default back to the "default" ACL.


---- AWSCSA-25.4: VPC Summary

 Created a VPC
o Defined the IP Address Range using CIDR
o By default this created a Network ACL & Route Table
 Created a custom Route Table
 Created 3 subnets
 Created an internet gateway
 Attached it to our custom route table
 Adjusted our public subnet to use the newly defined route
 Provisioned an EC2 instance with an Elastic IP address

NAT Lecture

 Created a security group

 Allowed inbound connections for certain IPs on HTTP and HTTPS
 Allowed outbound connections on HTTP and HTTPS for all traffic
 Provisioned our NAT instance inside our public subnet
 Disabled Source/Destination Checks -> SUPER IMPORTANT
 Set up a route on our private subnet to route through the private NAT instance


 ACLs over multiple subnets

 ACLs encompass all security groups under the subnets associated with them
 Rules are numbered; the lowest number is evaluated first

AWSCSA-26: Application Services

---- AWSCSA-26.1: SQS

This was the very first AWS service.

Gives you access to a message queue that can be used to store messages while waiting for a computer to process
them.
Eg. uploading an image file that needs a job run on it. A message describing the job goes onto SQS. The app servers
poll the queue, fetch the image from e.g. S3, do something like adding a watermark, and when the job is done remove
the message from the queue.

Therefore, if you've lost a web server that message will still stay in that queue and other app services can go ahead
and do that task.


A queue is a temporary repository for messages that are awaiting processing.

You can decouple components. A message can contain 256KB of text in any format. Any component can then
access that message later using the SQS API.

The queue acts as a buffer between the component producing and saving data, and the component receiving the
data for processing. This resolves issues that arise if the producer is producing work faster than the consumer can
process it, or if the producer or consumer are only intermittently connected to the network. - referencing
autoscaling or fail over.

Ensures delivery of each message at least once. A queue can be read by multiple components simultaneously. Great for scaling outwards.

SQS does not guarantee first in, first out. As long as all messages are delivered, sequencing isn't important. You can
enable sequencing.

Eg. an image encode queue. A pool of EC2 instances running the needed image processing software does the
following:
1. Async pull task messages from the queue. ALWAYS PULLS. (Polling)
2. Retrieves the named file.
3. Processes the conversion.
4. Writes the image back to Amazon S3.
5. Writes a "task complete" to another queue.
6. Deletes the original task.
7. Looks for new tasks.


 Component 1 -> Message queue.

 Message queue is polled by Component 2 (the visibility timeout clock starts).
 Component 2 processes and deletes it from the queue during the visibility timeout period.

You can configure auto scaling: it will see the queue growing fast and start scaling in response to what you have
set. SQS is one of the backbones of the biggest websites out there and works in conjunction with autoscaling.
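The visibility-timeout mechanics above can be modelled in a few lines. This is a toy in-memory queue with a logical clock, not the SQS API; it shows why delivery is "at least once" - an unacknowledged message reappears after the timeout:

```python
class ToyQueue:
    """Minimal at-least-once queue with a visibility timeout."""
    def __init__(self, timeout):
        self.timeout = timeout
        self.messages = {}           # id -> (body, visible_at tick)
        self.clock = 0

    def send(self, mid, body):
        self.messages[mid] = (body, 0)

    def receive(self):
        for mid, (body, visible_at) in self.messages.items():
            if visible_at <= self.clock:
                # Hide the message for `timeout` ticks.
                self.messages[mid] = (body, self.clock + self.timeout)
                return mid, body
        return None                  # nothing currently visible

    def delete(self, mid):
        self.messages.pop(mid, None)

    def tick(self, n=1):
        self.clock += n

q = ToyQueue(timeout=30)
q.send("m1", "encode image.png")
got = q.receive()        # consumer A takes the message
missed = q.receive()     # consumer B sees nothing (message invisible)
q.tick(31)               # consumer A crashed; the timeout expires
redelivered = q.receive()  # consumer B now gets the same message again
q.delete("m1")           # processed: remove it from the queue for good
```

This is exactly why your processing must be idempotent: "m1" was delivered twice.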


 Does not offer FIFO

 Maximum visibility timeout is 12 hours
 SQS is engineered to provide "at least once" delivery of all messages in its queues - you should design
your system so that processing a message more than once does not create any errors or inconsistencies
 For billing, a 64KB chunk is billed as 1 request.


---- AWSCSA-26.2: SWF (Simple Workflow Service)

It's a web service that makes it easy to coordinate work across distributed application components. SWF enables
applications for a range of use cases, including media processing, web application back-ends, business process
workflows, and analytics pipelines, to be designed as a coordination of tasks.

Tasks represent invocations of various processing steps in an application which can be performed by executable
code, web service calls, human actions and scripts.

Amazon use this for things like processing orders online. They use it to organise how to get things to you. If you
place an order online, that transaction then kicks off a new job to the warehouse where the member of staff can
get the order and then you need to get that posted to them.

That person will then have the task of finding the hammer and then package the hammer etc.

SQS has a retention period of 14 days. SWF has up to a year for workflow executions.

Amazon SWF presents a task-oriented API, whereas Amazon SQS offers a message-orientated API.

SWF ensures that a task is assigned only once and is never duplicated. With Amazon SQS, you need to handle
duplicated messages and may also need to ensure that a message is processed only once.

SWF keeps track of all tasks in an application. With SQS, you need to implement your own application-level
tracking, especially if your application uses multiple queues.
SWF Actors

1. Workflow starters - an app that can initiate a workflow. Eg. e-commerce website when placing an order.
2. Deciders - Control the flow of activity tasks in a workflow execution. If something has finished in a
workflow (or fails) a Decider decides what to do next.
3. Activity workers - carry out the activity tasks.

---- AWSCSA-26.3: SNS (Simple Notification Service)

This is found under mobile services. It makes it easy to set up, operate and send notifications from the cloud. It's a
scalable, flexible and cost-effective way to publish messages from an application and immediately deliver them to
subscribers or other applications.

Useful for things like Auto Scaling events: it can email you or send you a text letting you know that your group is growing.

You can use it push notifications to Apple, Google etc.

SNS can also deliver notifications by text, email, to SQS queues or to any HTTP endpoint. It can also invoke Lambda
functions; the message payload is passed as the input the function reacts to.

It can connect up a whole myriad of things.

It can group multiple recipients using a topic.

One topic can have multiple end point types.

It delivers appropriately formatted notifications and is Multi-AZ.
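The topic/fan-out model can be sketched as follows. This is a conceptual toy, not the SNS API; the endpoint names are illustrative. One publish pushes the message to every subscriber, with no polling involved:

```python
class ToyTopic:
    """One publish fans out to every subscribed endpoint."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, endpoint):
        self.subscribers.append(endpoint)

    def publish(self, message):
        delivered = []
        for endpoint in self.subscribers:   # push, not poll
            endpoint["inbox"].append(message)
            delivered.append(endpoint["name"])
        return delivered

# Two endpoint types on the same topic: an email address and a queue.
email = {"name": "ops@example.com", "inbox": []}
queue = {"name": "sqs:work-queue", "inbox": []}

topic = ToyTopic()
topic.subscribe(email)
topic.subscribe(queue)
receipt = topic.publish("CPU alarm on i-1234")
```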


You can use it for CloudWatch and autoscaling.


 Instantaneous, push-based delivery (no polling)

 Simple APIs and easy integration with applications
 Flexible message delivery over multiple transport protocols
 Inexpensive, pay as you go model.
 Simple point-and-click GUI


Both are messaging services, but SNS is push while SQS is pull (polling).

---- AWSCSA-26.4: Elastic Transcoder

Relatively new service. It converts media files from their original source format into different formats.

It provides presets for popular output formats.

Example uploading media to an S3 bucket will trigger a Lambda function that then uses the Elastic Transcoder to
put it into a bunch of different formats. The transcoded files could then be put back into the S3 Bucket.

Check the Elastic Transcoder documentation to read some more about that.

---- AWSCSA-26.5: Application Services Summary


 It is a web service that gives you access to a message queue that can be used to store messages while
waiting for a computer to process them.
 Eg. Processing images for memes and then storing in a S3 bucket.
 Not FIFO.
 12 hour maximum visibility timeout.
 At least once messaging.
 Messages are 256KB but billed in 64KB chunks
 You can use two SQS queues and the premium can be polled first and when it is emptied then you can use
the second queue.


 SQS has a retention period of 14 days vs up to a year for SWF

 SQS is message orientated whereas SWF is task-orientated
 SWF ensures that a task is assigned only once and never duplicated
 SWF keeps track of all the tasks and events in an application. With Amazon SQS, you need to implement
your own application-level tracing, especially if your application uses multiple queues.


3 Different types of actors:

1. Workflow Starter - starts a workflow

2. Deciders - Control the flow of activity tasks in a workflow execution
3. Activity workers - carry out the activity tasks


 HTTP, HTTPS, Email, Application, Lambda etc etc

 SNS and SQS are both messaging services in AWS. SNS is push, whereas SQS is polling (pull)
 Pay based on the minutes that you transcode and the resolution at which you transcode

AWSCSA-27: Real World Application - Wordpress Deployment

---- AWSCSA-27.1: Setting up the environment

First of all, create a new role in IAM.

It will be for Amazon S3 access.

Set up the two security groups: one for the EC2 and the other for the RDS.

After that has been created, go into the web security group and ensure that you allow port 80 for HTTP and 22
for SSH.

For the RDS group, allow MySQL traffic and choose the source as the web security group.

Head to S3 and create a new bucket for the WordPress code. Choose the region for the security groups that you
just made.

Once the bucket has been created, sort out the CDN. So head to CloudFront and create a new distribution. Use the
web distribution and use the bucket domain. The Origin path would be the subdirectory. Restrict the Bucket Access
to hit CloudFront only and not S3. Ensure that you update the bucket policy so that it always has read permissions
for the public.

Head and create the RDS instance. Launch the MySQL instance. You can use a free-tier if you want, but multi-AZ
will require an incurred cost.

Ensure you've set the correct settings that you would like.

Add that to the RDS security group and ensure that it is not publicly accessible.

Head over to EC2 and provision a load balancer. Put that within the web security group and configure the health
check.
Once the Load Balancer is up, head to Route 53 and set up the correct info for the naked domain name. You will
need to set an alias record, and set that to the ELB.


---- AWSCSA-27.2: Setting up EC2

Head to EC2 and provision an instance. The example uses the Amazon Linux of course.

Ensure you assign the S3 role to this. Add the bootstrap script from the course if you want.

Ensure that you have given the correct user rights to wp-content.

chmod -R 755 wp-content

chown -R apache.apache wp-content
cd wp-content

---- AWSCSA-27.3: Automation and Setting up the AMI

In the uploads directory for wp-content, you'll notice that it has nothing in it yet. If you do upload an image through
wp-admin, it'll be available (as you could imagine).

Back in the console in uploads, you can then ls and see the file there. We want it so that all of our images head to
S3 and they will be served out of CloudFront.
cd /var/www/html
List out the s3 bucket to start synchronising the content.

aws s3 ls to figure out the bucket.

# ensure the s3 bucket is your media bucket

aws s3 cp --recursive /var/www/html/wp-content/uploads s3://wordpressmedia16acloudguru

Now we can sync up after adding another photo.

# ensure the s3 bucket is your media bucket

aws s3 sync /var/www/html/wp-content/uploads s3://wordpressmedia16acloudguru --delete --dryrun
# --delete makes it a perfect sync (removes files no longer in the source)
# --dryrun won't do anything but show you what it will do
We need to create a .htaccess file and make some changes.

In the .htaccess file, we have the following.

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^wp-content/uploads/(.*)$ [cloudfront-link]/$1 [R=301,NC]

# BEGIN Wordpress

# END Wordpress
Now, we actually want to edit all this.

cd etc && nano crontab

We want to now schedule some tasks.

*/2 * * * * root aws s3 sync --delete /var/www/html/ s3://wordpresscodebucket/ (the s3 bucket)

*/2 * * * * root aws s3 sync --delete /var/www/html/wp-content/uploads/ s3://mediabucket/


Normally you would just have one EC2 instance that is a write and then the rest are read only.

For the other EC2 instances that aren't the dedicated write instance, sync in the opposite direction:

*/3 * * * * root aws s3 sync --delete s3://wordpresscodebucket/ /var/www/html/

If you try to upload, it is still waiting to replicate to the CDN.

Creating an Image

Select the EC2 instance and create an image from it. After it has finished in the AMIs section, you can launch it as
another instance.

When launching from that image, you can have a bash script run the updates and then
do an aws s3 sync.

When the instance is live, we should be able to just go straight to the IP address.

---- AWSCSA-27.4: Configuring Autoscaling and Load Testing

Create an autoscaling group from the menu. First, create a launch configuration.

There is a TemplateWPWebServer AMI from A Cloud Guru that they use here.

For the Auto-Scaling Group.

Create it and select your subnets.

In Advanced, run off the ELB we created.

In the scaling policies, we want 2 to 4. Sort out the rest of the stuff for the ASG, review and launch.

Once the instances are provisioned, you'll also note that the ELB will have no instances until everything has set up.

If you reboot the RDS instance, you can test the failover.

Now we can run a stress test using a tool called stress that was installed by the A Cloud Guru bootstrap.

---- AWSCSA-27.5: CloudFormation

A lot of the provisioning you have been doing by hand can be scripted: CF is like bootstrapping for AWS.

In the EC2 instance, ensure that you've done a tear down and then remove the RDS instance as well.

Select into "CloudFormation"

Create a new stack. Here, we can design a template ourselves or we can hit a template designer.

From here, you will specify details. You can set the parameters from here!

Running this we can actually have the CF Stack provision everything for us.

AWS Knowledge required for the Exam:


 Hands-on experience using compute, networking, storage, and database AWS services
 Professional experience architecting large-scale distributed systems
 Understanding of elasticity and scalability concepts
 Understanding of the AWS global infrastructure
 Understanding of network technologies as they relate to AWS
 A good understanding of all security features and tools that AWS provides and how they relate to
traditional services
 A strong understanding of client interfaces to the AWS platform
 Hands-on experience with AWS deployment and management services

Key items you should know before you take the exam:

1. How to configure and troubleshoot a VPC inside and out, including basic IP subnetting. VPC is arguably
one of the more complex components of AWS and you cannot pass this exam without a thorough
understanding of it.
2. The difference in use cases between Simple Workflow (SWF), Simple Queue Services (SQS), and Simple
Notification Services (SNS).
3. How an Elastic Load Balancer (ELB) interacts with auto-scaling groups in a high-availability deployment.
4. How to properly secure a S3 bucket in different usage scenarios
5. When it would be appropriate to use either EBS-backed or ephemeral instances.
6. A basic understanding of CloudFormation.
7. How to properly use various EBS volume configurations and snapshots to optimize I/O performance and
data durability.

General IT Knowledge preferred for the Exam:

 Excellent understanding of typical multi-tier architectures: web servers, caching, application servers, load
balancers, and storage
 Understanding of Relational Database Management System (RDBMS) and NoSQL
 Knowledge of message queuing and Enterprise Service Bus (ESB)
 Familiarity with loose coupling and stateless systems
 Understanding of different consistency models in distributed systems
 Knowledge of Content Delivery Networks (CDN)
 Hands-on experience with core LAN/WAN network technologies
 Experience with route tables, access control lists, firewalls, NAT, HTTP, DNS, IP, and the OSI network model
 Knowledge of RESTful Web Services, XML, JSON
 Familiarity with the software development lifecycle
 Work experience with information and application security concepts, mechanisms, and tools
 Awareness of end-user computing and collaborative technologies

Agile Project Management vs PMBOK® Guide

Essences
 Agile: change driven; value driven; people, collaboration and shared value
 PMBOK® Guide: plan driven; plan, process and change control

Project Nature
 Agile: iterative
 PMBOK® Guide: waterfall

Attitude to Changes
 Agile: embrace changes
 PMBOK® Guide: control changes; gold-plating is to be avoided

Customer Involvement
 Agile: collaborative – actively involved throughout the project
 PMBOK® Guide: authoritative – approving the plan and product

Project Requirements
 Agile: requirements get more detailed as the project evolves
 PMBOK® Guide: rather fixed from the beginning

Documentation
 Agile: barely sufficient
 PMBOK® Guide: detailed to track changes and deviation; needs formal change approval

Project Success
 Agile: measured against the final outcome
 PMBOK® Guide: measured against the plan

Advantages
 Agile: can respond to changes rapidly for competitive advantage
 PMBOK® Guide: easily controlled and measured (costs, time and quality); the output is known from the beginning; when the processes are designed right, quality is easily guaranteed

Disadvantages
 Agile: final product may be vastly different from what was planned at the beginning; depends very much on getting the right people on board; cost overrun may be expected
 PMBOK® Guide: relies heavily on initial requirements gathering; project may fail owing to faults in the plan; time spent on plans and documentation is costly; changes are not easily accommodated

Ideal For
 Agile: projects with fast-changing environments (e.g. software development)
 PMBOK® Guide: large projects with relatively fixed requirements (e.g. construction)

PMI-ACP® Study Notes: Domain I Agile Principles and Mindset

Below is a collection of the key knowledge addressed in Domain I Agile Principles and Mindset and the nine tasks
related to the domain:

 Agile Manifesto and 12 Agile Manifesto Principles

 Individuals and interactions over Processes and tools
 Working software over Comprehensive documentation
 Customer collaboration over Contract negotiation
 Responding to change over Following a plan
 Agile Project Management Fundamentals
 User Involvement

 Team Empowerment
 Fixed Time Box
 Requirements at a High Level
 Incremental Project Releases
 Frequent Delivery
 Finish Tasks One by One
 Pareto Principle
 Testing – Early and Frequent
 Teamwork
 Agile Methodologies
 The following are the common Agile methodologies in practice these days, listed in
order of importance for the PMI-ACP® Exam. An understanding of the processes and terminology of
these Agile methodologies will help ensure Agile practices are carried out effectively.
 Scrum
 XP (eXtreme Programming)
 Kanban
 LSD (Lean Software Development)
 Crystal Family
 FDD (Feature Driven Development)
 ASD (Adaptive Software Development)
 DSDM (Dynamic Systems Development Method) Atern
 Information Radiators
 Information radiators are highly visible charts and figures displaying project progress and status,
e.g. Kanban boards, burn-down charts. These show the real progress and performance of the project
and team, which enhances transparency and trust among team members and other stakeholders.
 Agile Experimentations
 Agile projects make use of empirical process control for project decisions; ongoing observation
and experimentation are carried out during project execution to inform and influence planning
 Introduce spike (including architecture spike) to carry out a technical investigation to reduce risks
by failing fast
 Sharing of Knowledge
 Ideally, Agile teams should be co-located (working within the same room with seats facing
each other) to enhance pro-active support, free discussion, open collaboration and osmotic communication
 Face-to-face communication is always encouraged
 Practice pair programming if feasible
 Make use of daily stand-up, review and retrospectives
 Make use of Agile tooling to enhance sharing of knowledge:
 Kanban boards
 white boards
 bulletin boards
 burn-down/burn-up charts
 wikis website
 instant messaging – Skype, web conferencing, etc.
 online planning poker
 Since heavy documentation is not encouraged, co-located teams can share tacit knowledge more easily
 Self-organization and Empowerment
 Self-organizing teams are the foundation for Agile project management
 Self-organization includes: team formation, work allocation (members are encouraged to take up
work beyond their expertise), self-management, self-correction and determining when work is
considered “done”

 The Agile team is given the power to self-direct and self-organize by making and implementing
decisions, including work priority, time frames, etc., as they believe “the best person to make the
decision is the one whose hands are actually doing the work”
 In Agile projects, the project manager/Coach/ScrumMaster practices servant leadership to
remove roadblocks and obstacles and to enable the team to perform at its best
According to the PMI-ACP® Exam Content Outline, Domain I Agile Principles and Mindset consists of nine tasks:

1. Act as an advocate for Agile principles with customers and the team to ensure a shared Agile mindset.
2. Create a common understanding of the values and principles of Agile by applying Agile practices
and using Agile terminology effectively.
3. Educate the organization and influence project and organizational processes, behaviors and people to
support the change to Agile project management.
4. Maintain highly visible information radiators about the progress of the projects to enhance transparency
and trust.
5. Make it safe to experiment and make mistakes so that everyone can benefit from empirical learning.
6. Carry out experiments as needed to enhance creativity and discover efficient solutions.
7. Collaborate with one another to enhance knowledge sharing and to remove knowledge silos.
8. Establish a safe and respectful working environment to encourage emergent leadership through
self-organization and empowerment.
9. Support and encourage team members to perform their best by being a servant leader.
1) Which of the following is an Agile Manifesto principle?

A) Welcome changing requirements, early in development. Agile processes handle changes for the customer's
competitive advantage.
B) Welcome changing priorities, early in development. Agile processes harness change for the customer's
competitive advantage.
C) Welcome changing priorities, even late in development. Agile processes handle changes for the customer's
competitive advantage.
D) Welcome changing requirements, even late in development. Agile processes harness change for the
customer's competitive advantage.

Answer: D

Explanation: The correct wording of the principle is “Welcome changing requirements, even late in
development. Agile processes harness change for the customer's competitive advantage.” The agile principles
do not speak to changing priorities or to welcoming only early changes.

2) When managing an agile software team, engaging the business in prioritizing the backlog is an example of:

A) Technical risk reduction    

B) Incorporating stakeholder values    
C) Vendor management    
D) Stakeholder story mapping    

Answer: B

Explanation: We engage the business in prioritizing the backlog to better understand and incorporate
stakeholder values. Although such engagement will likely impact technical risk reduction, vendor
management, or stakeholder story mapping, these are not the main reasons we engage the business.

3) Which of the following items is not a benefit associated with product demonstrations?    

A) Learn about feature suitability    

B) Learn about feature usability    
C) Learn about feature estimates
D) Learn about new requirements    

Answer: C    

Explanation: Product demonstrations provide the benefits of learning about feature suitability and usability,
and they can prompt discussions of new requirements. They are not typically used to learn about feature
estimates, however, since estimating is done during estimation sessions, rather than during demonstrations.

4) Choose the correct combination of XP practice names from the following options:    

A) Test-driven design, refactoring, pair programming    

B) Test-driven development, reforecasting, peer programming    
C) Test-driven development, refactoring, pair programming
D) Test-driven design, refactoring, peer programming    

Answer: C    

Explanation: The XP practices include test-driven development, refactoring, and pair programming. “Test-
driven design,” “reforecasting,” and “peer programming” are not XP practice names.

5) An agile team is planning the tools they will use for the project. They are debating how they should show
what work is in progress. Of the following options, which tool are they most likely to select?    

A) User story backlog    

B) Product roadmap    
C) Task board    
D) Work breakdown structure    

Answer: C    

Explanation: Of the options presented, the best tool to show work in progress is a task board. The user story
backlog shows what work is still remaining to be done on the project. The product roadmap shows when work
is planned to be completed. Work breakdown structures are not commonly used on agile projects.

6) When using a Kanban board to manage work in progress, which of the following best summarizes the
philosophy behind the approach?    

A) It is a sign of the work being done and should be maximized to boost performance.    
B) It is a sign of the work being done and should be limited to boost performance.    
C) It is a sign of the work queued for quality assurance, which should not count toward velocity.    
D) It is a sign of the work queued for user acceptance, which should not count toward velocity.    

Answer: B    

Explanation: The correct answer is “It is a sign of the work being done and should be limited to boost

performance.” A Kanban board shows work in progress (WIP), which represents work started but not
completed. Therefore, the WIP should be limited and carefully managed to maximize performance. More WIP
does not equal more output; in fact, it is quite often the opposite. Also, WIP is any work that is in progress,
regardless of what stage the work is at, so the answer options that limit it to work waiting for quality
assurance or user acceptance are wrong.
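The pull-based discipline described in the explanation above can be sketched in a few lines. This is an illustrative toy (the column contents and limit are made up): an item enters "In Progress" only while the column is under its WIP limit, which is the mechanism that keeps WIP limited rather than maximized.

```python
# Hypothetical WIP-limited "In Progress" column on a Kanban board.
WIP_LIMIT = 3

def pull_item(in_progress, item, limit=WIP_LIMIT):
    """Pull an item into the In Progress column only if under the WIP limit."""
    if len(in_progress) >= limit:
        return False  # column is full: finish something before starting more
    in_progress.append(item)
    return True

column = ["story-A", "story-B"]
assert pull_item(column, "story-C") is True   # under the limit: allowed
assert pull_item(column, "story-D") is False  # at the limit: blocked
```
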

7) Which of the following is not true of how burn up charts that also track total scope differ from burn down
charts?

A) Burn up charts separate out the rate of progress from the scope fluctuations    
B) Burn up charts and burn down charts trend in opposite vertical directions    
C) Burn up charts can be converted to cumulative flow diagrams by the addition of WIP    
D) Burn down charts indicate whether rate of effort changes are due to changes in progress rates or scope    

Answer: D    

Explanation: It is true that burn up charts can be converted to cumulative flow diagrams by the addition of
WIP, and they trend in the opposite vertical direction from burn down charts. It is also true that burn up charts
that also track total scope (rather than burn down charts) separate out the rate of progress from the scope
fluctuations. So the option that is not true is “Burn down charts indicate whether rate of effort changes are
due to changes in progress rates or scope.”
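The difference described in the explanation above can be made concrete with made-up numbers: a burn down chart plots only remaining work, so a scope increase and stalled progress look identical, while a burn up chart keeps completed work and total scope as separate series.

```python
# Illustrative snapshots (invented data, five reporting points).
completed = [0, 10, 20, 35, 45]          # story points completed so far
total_scope = [100, 100, 110, 110, 110]  # scope grows at the third snapshot

# Burn down: remaining work = scope - completed (one collapsed series).
burn_down = [s - c for s, c in zip(total_scope, completed)]

# Burn up: completed and total scope kept as separate series.
burn_up = list(zip(completed, total_scope))

# At the third snapshot the team completed 10 more points, but scope also
# grew by 10, so the burn down line is flat and the progress is hidden.
print(burn_down)  # [100, 90, 90, 75, 65]
```
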

8) As part of stakeholder management and understanding, the team may undertake customer persona
modeling. Which of the following would a persona not represent in this context?

A) Stereotyped users    
B) Real people    
C) Archetypal description    
D) Requirements    

Answer: D    

Explanation: Personas do represent real, stereotyped, composite, and fictional people. They are archetypal 
(exemplary) descriptions, grounded in reality, goal-oriented, specific, and relevant to generate focus. Personas
are not a replacement for requirements on a project, however.

9) An agile team is beginning a new release. Things are progressing a little slower than they initially estimated.
The project manager is taking a servant leadership approach. Which of the following actions is the project
manager most likely to do?

A) Create a high-level scope statement and estimates.    

B) Intervene in nonproductive team arguments.    
C) Do administrative activities for the team.    
D) Demonstrate the system to senior executives.    

Answer: C    

Explanation: In taking a servant leadership approach, the project manager is most likely to do administrative
activities for the team. As implied by the term, the role of a servant leader is focused on serving the team. A
servant leader recognizes that the team members create the business value and does what is necessary to

help the team be successful. Of the choices presented, the action of doing administrative work best supports
this goal.

10) The PMO has asked you to generate some financial information to summarize the business benefits of
your project. To best describe how much money you hope the project will return, you should show an
estimate of:    

A) Internal rate of return (IRR)    

B) Return on investment (ROI)    
C) Gross domestic product (GDP)    
D) Net present value (NPV)    

Answer: B    

Explanation: Since we are being asked to show how much the project will return, the metric to choose is the
return on investment (ROI). You might have been tempted to choose net present value (NPV) since this
calculation accounts for inflation, but the question did not ask for an adjusted value. Instead, it simply asked
how much money the project would return. IRR and GDP would not provide the information the PMO has
asked for.
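The distinction the explanation draws can be sketched numerically. The figures below are illustrative only: ROI answers "how much did the money grow relative to cost?", while NPV discounts future cash flows back to today.

```python
def roi(gain, cost):
    """Return on investment as a fraction: (gain - cost) / cost."""
    return (gain - cost) / cost

def npv(rate, cashflows):
    """Net present value; cashflows[0] occurs at time 0 (no discounting)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# A project costing 100k that returns 150k has a 50% ROI.
print(roi(150_000, 100_000))  # 0.5

# The same idea discounted: invest 100k now, receive 60k in each of the
# next two years, at a 10% discount rate.
print(round(npv(0.10, [-100_000, 60_000, 60_000]), 2))
```
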

11) What do risk burn down graphs show?    

A) The impacts of project risks on the project schedule    

B) The impacts of project risks on the project budget    
C) The cumulative risk severities over time    
D) The cumulative risk probabilities over time    

Answer: C    

Explanation: Risk burn down graphs do not show the impacts of the risks on the schedule or budget, but they
do show the cumulative risk severities over time. Tracking just the probabilities over time would be of little use
without knowing the impacts of these risks. After all, they could all be trivial, and in that case, why would we
need to be concerned?
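A common way to compute the severities that a risk burn down graph plots is severity = probability × impact, summed per reporting period. The numbers below are invented purely to illustrate the downward trend as mitigation takes effect.

```python
def cumulative_severity(risks):
    """Sum probability * impact over a list of (probability, impact) pairs."""
    return sum(p * i for p, i in risks)

# Snapshots over three iterations: mitigation lowers probabilities, and one
# risk is retired entirely by the third snapshot.
iteration_1 = [(0.5, 10), (0.4, 8), (0.2, 5)]
iteration_2 = [(0.3, 10), (0.2, 8), (0.2, 5)]
iteration_3 = [(0.1, 10), (0.1, 8)]

trend = [cumulative_severity(r) for r in (iteration_1, iteration_2, iteration_3)]
print(trend)  # a falling series is what the "burn down" graph shows
```
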

12) What is the process cycle efficiency of a 2-hour meeting if it took you 2 minutes to schedule the meeting
in the online calendar tool and 8 minutes to write the agenda and e-mail it to participants?    

A)  90%    
B)  8%    
C)  92%    
D)  96%    

Answer: C    

Explanation: The formula for finding process cycle efficiency is: Total value-added time / total cycle time. In
this question, the value-added time is 2 hours, and the total cycle time is 2 minutes + 8 minutes + 120 minutes
= 130 minutes. So the correct answer is 120 / 130 = 92%.
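The calculation from the explanation above, written as a small helper (the function name is mine, not standard terminology):

```python
def process_cycle_efficiency(value_added_minutes, non_value_added_minutes):
    """Process cycle efficiency = value-added time / total cycle time."""
    total = value_added_minutes + non_value_added_minutes
    return value_added_minutes / total

# The 2-hour meeting is the value-added time; the 2 + 8 minutes of
# scheduling and agenda writing are non-value-added.
pce = process_cycle_efficiency(120, 2 + 8)
print(f"{pce:.0%}")  # -> 92%
```
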

13) The steps involved in value stream analysis include:    

A) Create a value stream map to document delays and wasted time, such as meetings and coffee breaks.    
B) Create a value stream map of the current process, identifying steps, queues, delays, and information flows.
C) Review the value stream map of the current process and compare it to the goals set forth in the project
D) Review how to adjust the value stream charter to be more flexible.    

Answer: B    

Explanation: The only option here that is a step in value stream analysis is “Create a value stream map of the
current process, identifying steps, queues, delays, and information flows.” None of the other options are valid
steps in value stream mapping.

14) When we practice active listening, what are the levels through which our listening skills progress?    

A) 1) Global listening, 2) Focused listening, 3) Intuitive listening    

B) 1) Interested listening, 2) Focused listening, 3) Global listening    
C) 1) Self-centered listening, 2) Focused listening, 3) Intuitive listening    
D) 1) Internal listening, 2) Focused listening, 3) Global listening    

Answer: D    

Explanation: The progression is internal listening (how will this affect me?) to focused listening (what are they
really trying to say?) and then finally to global listening (what other clues do I notice to help me understand
what they are saying?).

15) The Agile Manifesto value “customer collaboration over contract negotiation” means that:    

A)  Agile approaches encourage you not to focus too much on  negotiating contracts, since most vendors are
just out for themselves anyway.    
B)  Agile approaches focus on what we are trying to build with our vendors, rather than debating the details
of contract terms.    
C)  Agile approaches prefer not to use contracts, unless absolutely necessary, because they hamper our ability
to respond to change requests.    
D)  Agile approaches recommend that you only collaborate with vendors who are using agile processes

Answer: B    

Explanation: Valuing customer collaboration over contract negotiation means we look for mutual
understanding and agreement, rather than spend our time debating the fine details of the agreement.

16) To ensure the success of our project, in what order should we execute the work, taking into account the
necessary dependencies and risk mitigation tasks?    

A)  The order specified by the project management office (PMO)    

B)  The order specified by the business representatives    
C)  The order specified by the project team    
D)  The order specified by the project architect    

Answer: B    

Explanation: It is largely the business representatives who outline the priority of the functional requirements
on the project. That prioritization is then a key driver for the order in which we execute the work.

17) Incremental delivery means that:    

A)  We deliver nonfunctional increments in the iteration retrospectives.    

B)  We release working software only after testing each increment.    
C)  We improve and elaborate our agile process with each increment delivered.    
D)  We deploy functional increments over the course of the project.    

Answer: D    

Explanation: Incremental delivery means that we deploy functional increments over the course of the project.
It does not relate to retrospectives, testing, or changes to the process, so the other options are incorrect, or
“less correct”.

18) In agile approaches, negotiation is viewed as:    

A)  A zero-sum game    

B)  A winner-takes-all challenge    
C)  A fail proof win-win scenario    
D)  A healthy process of give and take    

Answer: D    

Explanation: In agile approaches, negotiation is viewed as a healthy process of give and take rather than a
zero-sum game, a competitive challenge, or a fail proof win-win scenario.

19) In Scrum, the definition of “done” is created by everyone EXCEPT:    

A)  Development team    

B)  Product owner    
C)  Scrum Master    
D)  Process owner    

Answer: D    

Explanation: The whole team, including the development team, product owner, and Scrum Master, is
responsible for creating a shared definition of “done.” Since “process owner” is a made-up term, this is the
correct choice for someone who would NOT be involved in defining done.

20) When working with a globally distributed team, the most useful approach would be to:

A)  Bring the entire team together for a diversity and sensitivity training day before starting the first iteration    
B)  Bring the entire group together for a big celebration at the end of the project    

C)  Bring the entire group together for a get-to-know-you session before starting the first iteration    
D)  Gather the entire team for a kickoff event and keep them working together for at least the first iteration    

Answer: D    

Explanation: Having the team work together for an iteration would be a great way to help integrate a globally
distributed team. Diversity training and get-to-know-you sessions are nice, but having the team members
actually work together would be the best opportunity for them to learn each other's work habits and
interaction modes.
