Main Topics Covered on the Exam
- Design Resilient Architectures
- Define Performant Architectures
- Specify Secure Applications and Architectures
- Design Cost-Optimised Architectures
- Define Operationally Excellent Architectures
Simple Storage Service – S3
S3 is an object-based, serverless storage service in the cloud.
S3 objects contain your data. They are like files.
An object may consist of a key, value, version ID, and metadata.
You can store objects from 0 bytes to 5 terabytes in size.
S3 buckets hold objects. Buckets can also have folders, which in turn hold objects.
S3 uses a universal namespace, so bucket names must be globally unique.
S3 – Storage Classes
- Standard: Fast. 99.99% availability, 11 9’s durability. Replicated across at least three AZs.
- Intelligent Tiering: Uses machine learning to analyse your object usage and determine the appropriate storage class. Data is moved to the most cost-effective access tier, without any performance impact or added overhead.
- Standard Infrequent Access (IA): Still fast, and cheaper if you access files less than once a month. An additional retrieval fee is applied. About 50% cheaper than Standard, with reduced availability.
- One Zone IA: Still fast, but objects exist in only one AZ. Availability is 99.5%. About 20% cheaper than Standard IA. Data could be destroyed if that AZ is lost. A retrieval fee is applied.
- Glacier: For long-term cold storage. Retrieval of data can take minutes to hours, but the trade-off is very cheap storage.
- Glacier Deep Archive: The lowest-cost storage class. Data retrieval time is 12 hours.
By default, all new buckets are private when created.
Access control is configured using bucket policies and access control lists (ACLs).
Encryption in transit: traffic between your local host and S3 is encrypted via SSL/TLS.
Server-side encryption (SSE), that is, encryption at rest: SSE-S3 (S3-managed keys, AES-256 algorithm), SSE-KMS (envelope encryption via AWS KMS), SSE-C (customer-provided key).
Client-side encryption: you encrypt your own files before uploading them to S3.
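As a rough illustration of how a bucket policy can back up the encryption-in-transit point above, the sketch below builds a policy document that denies any request not sent over TLS. The bucket name example-bucket and the helper function name are made up for this example; the aws:SecureTransport condition key is a standard AWS global condition key. Treat it as a starting point, not a complete policy.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> dict:
    """Build a bucket policy that rejects requests made without SSL/TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # The bucket itself and every object inside it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # Matches only when the request did NOT use TLS.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# "example-bucket" is a hypothetical bucket name.
policy = deny_insecure_transport_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document you would attach to the bucket as its bucket policy.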
S3 – Cross Region Replication – CRR
With CRR enabled, any object that is uploaded is automatically replicated to another region. This provides higher durability and potential disaster recovery for objects.
You must have versioning turned on for both the source and destination buckets. You can have CRR replicate to a bucket in another AWS account.
S3 – Versioning
Stores all versions of an object in S3.
Once enabled, it cannot be disabled, only suspended on the bucket.
Fully integrates with S3 lifecycle rules.
MFA delete feature provides extra protection against deletion of your data.
S3- Lifecycle Management
Automate the process of moving objects to different storage classes or deleting objects altogether.
Can be used together with versioning.
Can be applied to both current and previous versions.
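A minimal sketch of what a lifecycle rule might look like as data. The dictionary follows the shape that the S3 lifecycle configuration API expects (for example when passed to an SDK call), but the rule ID, prefix, and day counts are illustrative assumptions, and the helper function is hypothetical.

```python
import json


def lifecycle_rule(rule_id, prefix, ia_after_days, glacier_after_days,
                   expire_after_days, noncurrent_expire_days):
    """Build one lifecycle rule: tier objects down, then delete them."""
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("days must increase: IA < Glacier < expiration")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Current versions: move to cheaper storage classes over time.
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # Current versions: delete after this many days.
        "Expiration": {"Days": expire_after_days},
        # Previous versions (requires versioning): delete after this many days.
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_expire_days},
    }


# All values below are hypothetical.
config = {"Rules": [lifecycle_rule("archive-logs", "logs/", 30, 90, 365, 30)]}
print(json.dumps(config, indent=2))
```

The rule covers both halves of the notes above: transitions and expiration for current versions, and a separate expiration for previous versions.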
S3 – Transfer Acceleration
Fast and secure transfer of files over long distances between your end users and an S3 bucket.
Utilizes CloudFront distributed Edge locations.
Instead of uploading directly to your bucket, users use a distinct URL for an edge location.
As data arrives at the edge location, it is automatically routed to S3 over a specially optimised network path.
S3 – MFA Delete
MFA Delete ensures users cannot delete objects from a bucket unless they provide their MFA code.
MFA Delete can only be enabled under these conditions:
- The AWS CLI must be used to turn on MFA Delete.
- The bucket must have versioning turned on.
Once it is enabled, only the bucket owner logged in as the root user can delete objects from the bucket.
Command line interface – S3
#aws s3 ls
#aws s3 ls sukheshcs
Copying a file from AWS S3 to the desktop
#aws s3 cp s3://sukheshcs/mypic.jpg ~/Desktop/folder/test.jpg
Uploading a file to S3 from the desktop
#aws s3 cp ~/Desktop/folder/test.jpg s3://sukheshcs/test.jpg
Temporary access to the file on S3
#aws s3 presign s3://sukheshcs/test.jpg --expires-in 300
- Simple Storage Service (S3) is object-based storage. Store an unlimited amount of data without worrying about the underlying storage infrastructure.
- S3 replicates data across at least 3 AZs to ensure 99.99% availability and 11 9s of durability.
- Objects contain your data.
- Objects can be anywhere from 0 bytes up to 5 terabytes in size.
- Buckets contain objects. Buckets can also contain folders, which can in turn contain objects.
- Bucket names are unique across all AWS accounts, like a domain name.
- When you upload a file to S3 successfully, you’ll receive an HTTP 200 code.
- Lifecycle management: objects can be moved between storage classes or deleted automatically based on a schedule.
- Versioning: objects are given a version ID. When new objects are uploaded, the old objects are kept, and you can access any object version. When you delete the latest version of an object, the previous version becomes the current one again. Once versioning is turned on it cannot be turned off, only suspended.
- MFA Delete forces delete operations to require an MFA token. Versioning must be turned on to use it. You can only turn on MFA Delete from the AWS CLI. Only the root account is allowed to delete objects.
- All new buckets are private by default.
- Logging can be turned on for a bucket to track operations performed on objects.
- Access control is configured using Bucket policies and Access control lists.
- Bucket policies are JSON documents which let you write complex access control rules.
- ACLs are the legacy method, where you grant access to objects and buckets with simple actions.
- Security in transit: uploading files is done over SSL/TLS.
- SSE stands for server-side encryption. S3 has 3 options for SSE:
- SSE-S3 (also seen written as SSE-AES): S3 handles the key and uses the AES-256 algorithm.
- SSE-KMS: envelope encryption via AWS KMS, and you manage the keys.
- SSE-C: customer-provided key.
- Client-side encryption: you must encrypt your own files before uploading them to S3.
- Cross Region Replication (CRR) allows you to replicate files across regions for greater durability. You must have versioning turned on in the source and destination buckets. You can have CRR replicate to a bucket in another AWS account.
- Transfer Acceleration provides faster and secure uploads from anywhere in the world. Data is uploaded via a distinct URL to an edge location, then transported to your S3 bucket via the AWS backbone network.
- Presigned URLs are URLs generated via the AWS CLI or SDK. They provide temporary access to upload or download object data. Presigned URLs are commonly used to access private objects.
- S3 has 6 different storage classes.
- Standard: fast. 99.99% availability, 11 9s durability. Replicated across at least three AZs.
- Intelligent Tiering: uses ML to analyse your object usage and determine the appropriate storage class. Data is moved to the most cost-effective access tier, without any performance impact or added overhead.
- Standard Infrequent Access (IA): still fast, and cheaper if you access files less than once a month. An additional retrieval fee is applied. About 50% cheaper than Standard.
- One Zone IA: still fast, but objects exist in only one AZ. Availability is 99.5%. About 20% cheaper than Standard IA. Data could be destroyed if that AZ is lost. A retrieval fee is applied.
- Glacier: for long-term cold storage. Retrieval of data can take minutes to hours, but the trade-off is very cheap storage.
- Glacier Deep Archive: the lowest-cost storage class. Data retrieval time is 12 hours.
AWS Snowball
Petabyte-scale data transfer service. Move data onto AWS via a physical briefcase computer.
Low cost: transferring data with Snowball can cost as little as 1/5th of transferring it over high-speed internet.
Speed: it can take over 100 days to transfer 100TB over high-speed internet. Snowball can reduce that transfer time to less than a week.
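To see where a figure like "over 100 days for 100TB" can come from, here is a back-of-the-envelope calculation. The link speeds and the 80% utilisation factor are assumptions chosen for illustration; they are not stated in these notes.

```python
def transfer_days(data_terabytes: float, link_mbps: float,
                  utilization: float = 1.0) -> float:
    """Days needed to push the data through a link at the given speed."""
    bits_to_send = data_terabytes * 1e12 * 8          # decimal TB -> bits
    effective_bps = link_mbps * 1e6 * utilization     # usable bits per second
    seconds = bits_to_send / effective_bps
    return seconds / 86_400                           # seconds per day


# 100 TB over a fully used 100 Mbps link
print(round(transfer_days(100, 100), 1))        # 92.6
# Same link, but only 80% of it available for the transfer (assumed)
print(round(transfer_days(100, 100, 0.8), 1))   # 115.7
# A 1 Gbps link for comparison
print(round(transfer_days(100, 1000), 1))       # 9.3
```

Under these assumptions a 100 Mbps connection that is not fully dedicated to the transfer lands past the 100-day mark, which is the scenario where shipping a device starts to look attractive.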
Snowball features and Limitations
- E-Ink display (shipping information)
- Tamper and weather proof
- Data is encrypted end to end with 256-bit encryption
- Uses a Trusted Platform Module (TPM)
- For security purposes, data transfers must be completed within 90 days of the Snowball being prepared.
- Snowball can import to and export from S3 buckets
Snowballs come in two sizes: 50TB (42TB usable) and 80TB (72TB usable).
AWS Snowball Edge
Petabyte-scale data transfer service. Move data onto AWS via a physical briefcase computer, with more storage and on-site compute capabilities.
Similar to Snowball, but with more storage and local processing.
Snowball edge features and limitations
- LCD display (shipping information and other functionality)
- Can undertake local processing and edge computing workloads
- Can be used in a cluster, in groups of 5 to 10 devices
- Three options for device configuration: Storage Optimised (24 vCPUs), Compute Optimised (54 vCPUs), GPU Optimised (54 vCPUs).
- Snowball Edge comes in two sizes: 100TB (83TB usable) and 100TB Clustered (45TB usable)
AWS Snowmobile
A 45-foot-long ruggedised shipping container, pulled by a semi-trailer truck. Transfer up to 100PB per Snowmobile.
AWS personnel will help you connect your network to the Snowmobile, and when the data transfer is complete they’ll drive it back to AWS to import into S3 or Glacier.
Snowmobile security Features
- GPS tracking
- Alarm monitoring
- 24/7 video surveillance
- an escort security vehicle while in transit(optional).
Virtual Private Cloud – VPC
Provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define.
Think of an AWS VPC as your own personal data centre.
VPC core components
- Internet Gateway – IGW
- Virtual Private Gateway – VPN Gateway
- Routing Tables
- Network Access Control Lists – NACLs (stateless)
- Security Groups – SG (stateful)
- Public Subnets
- Private Subnets
- NAT Gateway
- Customer Gateway
- VPC Endpoints
- VPC Peering
- VPCs are region specific; they do not span regions.
- You can create up to 5 VPCs per region.
- Every region comes with a default VPC.
- You can have 200 subnets per VPC.
- You can use an IPv4 CIDR block and, in addition, an IPv6 CIDR block.
- Cost nothing: VPCs, Route Tables, NACLs, Internet Gateways, Security Groups, Subnets, and VPC Peering.
- Some things cost money: NAT Gateway, VPC Endpoints, VPN Gateway, Customer Gateway.
- DNS hostnames: whether your instances should receive a domain name address.
AWS has a default VPC in every region so you can immediately deploy instances. To set up a default VPC, AWS will:
- Create a VPC with a size /16 IPv4 CIDR block.
- Create a size /20 default subnet in each availability zone.
- Create an internet gateway and connect it to your default VPC.
- Create a default security group and associate it with your default VPC.
- Create a default network access control list (NACL) and associate it with your default VPC.
- Associate the default DHCP options set for your AWS account with your default VPC.
- When you create a VPC, it automatically has a main route table.
0.0.0.0/0 is also known as the default route; it represents all possible IP addresses.
When you see 0.0.0.0/0, just think of giving access from anywhere, that is, from the internet.
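The CIDR sizes mentioned above (/16 for the default VPC, /20 for its subnets, and 0.0.0.0/0 for "anywhere") can be checked with Python's standard ipaddress module. The 172.31.0.0/16 range is the one default VPCs use; the test address 203.0.113.7 is just an arbitrary documentation address picked for this sketch.

```python
import ipaddress

vpc = ipaddress.ip_network("172.31.0.0/16")      # default VPC range
subnet = ipaddress.ip_network("172.31.0.0/20")   # one default subnet
anywhere = ipaddress.ip_network("0.0.0.0/0")     # the "default" route

print(vpc.num_addresses)                          # 65536
print(subnet.num_addresses)                       # 4096
# How many /20 subnets fit inside the /16 VPC:
print(len(list(vpc.subnets(new_prefix=20))))      # 16
# 0.0.0.0/0 covers the whole IPv4 space, so any address matches it:
print(anywhere.num_addresses)                     # 4294967296
print(ipaddress.ip_address("203.0.113.7") in anywhere)  # True
```

Note that AWS reserves five addresses in every subnet, so slightly fewer than 4096 are usable by instances in a /20.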
VPC Peering allows you to connect one VPC with another over a direct network route using private IP addresses.
- Instances on peered VPCs behave just as if they were on the same network.
- Connect VPCs across the same or different AWS accounts and regions.
- Peering uses a star configuration: 1 central VPC, 4 other VPCs.
- No overlapping CIDR blocks.
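Before peering, it can be handy to check the "no overlapping CIDR blocks" rule up front. The sketch below does that with the standard ipaddress module; the VPC names and ranges are invented for the example.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(vpc_cidrs: dict) -> list:
    """Return every pair of VPC names whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr)
                for name, cidr in vpc_cidrs.items()}
    return [(a, b) for a, b in combinations(networks, 2)
            if networks[a].overlaps(networks[b])]


# Hypothetical VPCs: spoke-b sits inside the hub's range, so it cannot peer.
vpcs = {
    "hub": "10.0.0.0/16",
    "spoke-a": "10.1.0.0/16",
    "spoke-b": "10.0.128.0/17",
}
print(overlapping_pairs(vpcs))   # [('hub', 'spoke-b')]
```

An empty list means none of the candidate VPCs clash on address space.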
Route tables are used to determine where network traffic is directed.
Each subnet in your VPC must be associated with a route table.
A subnet can only be associated with one route table at a time, but you can associate multiple subnets with the same route table.
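A route table can be thought of as a lookup from destination CIDR to target, where the most specific matching route wins. The sketch below models that with a plain dictionary; the target names igw-example and nat-example are placeholders, not real resource IDs, and the pick_route helper is hypothetical.

```python
import ipaddress


def pick_route(route_table: dict, destination: str):
    """Return the target of the most specific route matching the destination."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # Longest prefix (largest prefix length) is the most specific route.
    return max(matches)[1]


# A public subnet's table sends internet traffic to an internet gateway;
# a private subnet's table sends it to a NAT device instead.
public_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"}
private_routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-example"}

print(pick_route(public_routes, "10.0.3.25"))       # local
print(pick_route(public_routes, "198.51.100.10"))   # igw-example
print(pick_route(private_routes, "198.51.100.10"))  # nat-example
```

Because several subnets can share one table, the same dictionary could be reused for every public subnet in this model.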
VPC Endpoint cheatsheet
- VPC Endpoints help keep traffic between AWS services within the AWS network.
- There are two kinds of VPC Endpoints. Interface Endpoints and Gateway Endpoints.
- Interface Endpoints cost money, Gateway Endpoints are free.
- An Interface Endpoint uses an Elastic Network Interface (ENI) with a private IP.
- A Gateway Endpoint is a target for a specific route in your route table.
- Gateway Endpoints only support DynamoDB and S3.
VPC Flow Logs
VPC flow logs allow you to capture IP traffic information in and out of network interfaces within your VPC.
Flow logs can be created for a VPC, a subnet, or a network interface.
Log data is stored using Amazon CloudWatch Logs, or it can be delivered to S3.
VPC Flow Logs Cheatsheet
- VPC flow logs monitor the in and out traffic of your network interfaces within your VPC.
- You can turn on flow logs at the VPC, Subnet or network interface level.
- VPC flow logs cannot be tagged like other AWS resources.
- You cannot change the configuration of a flow log after it’s created.
- You cannot enable flow logs for VPCs which are peered with your VPC unless they are in the same account.
- VPC flow logs can be delivered to an S3 bucket or CloudWatch Logs.
- VPC flow logs contain the source and destination IP addresses.
- Some instance traffic is not monitored: traffic generated by contacting the AWS DNS server, Windows license activation traffic from instances, traffic to and from the instance metadata address, DHCP traffic, and any traffic to the reserved IP address of the default VPC router.
- Network access control list is commonly known as NACL
- VPCs are automatically given a default NACL which allows all outbound and inbound traffic.
- Each subnet within a VPC must be associated with a NACL
- Subnets can only be associated with 1 NACL at a time. Associating a subnet with a new NACL will remove the previous association.
- If a NACL is not explicitly associated with a subnet, the subnet will automatically be associated with the default NACL.
- NACL has inbound and outbound rules.
- Rules can either allow or deny traffic.
- NACLs are stateless.
- When you create a custom NACL, it will deny all traffic by default.
- NACLs contain a numbered list of rules that get evaluated in order from lowest to highest.
- If you need to block a single IP address, you can do so via NACLs (security groups cannot deny).
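The rule-ordering behaviour described above can be sketched as a small simulation: rules are sorted by number, the first match decides, and anything unmatched is denied. This is a simplified model that assumes a single port per rule (real NACL rules also carry protocols and port ranges), and the rule numbers and addresses are made up.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int   # lower numbers are evaluated first
    cidr: str     # source range the rule applies to
    port: int     # simplified: one port instead of a range
    allow: bool   # True = ALLOW, False = DENY


def evaluate(rules, source_ip: str, port: int) -> str:
    """Return the decision of the first matching rule, lowest number first."""
    src = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if src in ipaddress.ip_network(rule.cidr) and port == rule.port:
            return "ALLOW" if rule.allow else "DENY"
    return "DENY"  # nothing matched: the implicit final deny applies


rules = [
    NaclRule(100, "0.0.0.0/0", 443, allow=True),       # HTTPS from anywhere
    NaclRule(90, "203.0.113.7/32", 443, allow=False),  # block one address
]

print(evaluate(rules, "203.0.113.7", 443))   # DENY
print(evaluate(rules, "198.51.100.1", 443))  # ALLOW
print(evaluate(rules, "198.51.100.1", 22))   # DENY
```

The deny for the single IP only works because it has a lower number than the broad allow; with the numbers swapped, rule 90 would allow the traffic before the deny was ever reached.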
Security Groups Cheatsheet
- Security Groups act as a firewall at the instance level.
- Unless allowed specifically, all inbound traffic is blocked by default.
- All outbound traffic from the instance is allowed by default.
- You can specify the source to be either an IP range, a single IP address, or another security group.
- Security Groups are stateful (if traffic is allowed inbound, the response is also allowed outbound).
- Any changes to a security group take effect immediately.
- EC2 instances can belong to multiple security groups.
- Security Groups can contain multiple EC2 instances.
- You cannot block a specific IP address with security groups; for this you would need a network access control list.
- You can have up to 10,000 security groups per region (default is 2,500).
- You can have 60 inbound and 60 outbound rules per security group.
- You can have up to 16 security groups associated with an ENI (default is 5).
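To contrast with NACLs, here is a toy model of a security group: it holds allow rules only, blocks inbound traffic by default, and remembers allowed connections so that replies are let back out (the "stateful" part). The class and method names are invented for this sketch, and connection tracking is reduced to a simple set.

```python
import ipaddress


class SecurityGroup:
    """Toy stateful firewall: allow-only inbound rules plus connection tracking."""

    def __init__(self, inbound_rules):
        # Each rule is (source CIDR, port); there is no way to express a deny.
        self.inbound = [(ipaddress.ip_network(cidr), port)
                        for cidr, port in inbound_rules]
        self.tracked = set()

    def inbound_allowed(self, source_ip: str, port: int) -> bool:
        src = ipaddress.ip_address(source_ip)
        allowed = any(src in net and port == p for net, p in self.inbound)
        if allowed:
            # Remember the connection so the reply can leave.
            self.tracked.add((source_ip, port))
        return allowed

    def response_allowed(self, source_ip: str, port: int) -> bool:
        # Replies to tracked connections pass without any outbound rule.
        return (source_ip, port) in self.tracked


sg = SecurityGroup([("0.0.0.0/0", 443), ("198.51.100.0/24", 22)])
print(sg.inbound_allowed("203.0.113.7", 443))   # True
print(sg.response_allowed("203.0.113.7", 443))  # True
print(sg.inbound_allowed("203.0.113.7", 22))    # False
print(sg.response_allowed("203.0.113.7", 22))   # False
```

Since rules can only allow, the one way to shut out 203.0.113.7 on port 443 in this model is to narrow the 0.0.0.0/0 rule, which is why a NACL is the tool for blocking a single address.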
Network Address Translation – NAT instance Cheatsheet
- When creating a NAT instance you must disable source and destination checks on the instance.
- NAT instances must exist in a public subnet.
- You must have a route out of the private subnet to the NAT instance.
- The size of a NAT instance determines how much traffic can be handled.
- High availability can be achieved using autoscaling groups and multiple subnets in different AZs, with failover between them automated by a script.
NAT GATEWAY cheatsheet
- NAT Gateways are redundant inside an availability zone.
- A NAT Gateway lives inside a single Availability Zone; it cannot span AZs.
- Starts at 5 Gbps and scales all the way up to 45Gbps.
- NAT gateways are the preferred setup for enterprise systems.
- There is no requirement to patch NAT Gateways, and there is no need to disable source/destination checks for the NAT Gateway.
- NAT Gateways are automatically assigned a public IP address.
- Route tables for the NAT Gateway must be updated.
- Resources in multiple AZs sharing a Gateway will lose internet access if the Gateway goes down, unless you create a Gateway in each AZ and configure route tables accordingly.
- Identity and Access Management (IAM) is used to manage the access of users to resources.
- IAM is a universal system applied to all regions at the same time. IAM is a free service.
- A root account is the account initially created when AWS is set up.
- New IAM users have no permissions by default until granted.
- New users are assigned an access key ID and secret when first created, if you give them programmatic access.
- Access keys are only used for CLI and SDK
- Access keys are only shown once when created. If lost they must be deleted/recreated again.
- Always set up MFA for root accounts.
- Users must enable MFA on their own; an administrator cannot turn it on for each user.
- IAM allows you to set password policies to enforce minimum password requirements or rotate passwords.
- IAM identities are Users, Groups, and Roles.
- IAM Users: end users who log into the console or interact with AWS resources programmatically.
- IAM Groups: group up your users so they all share the permission levels of the group, for example Administrators, Developers, Auditors.
- IAM Roles: associate permissions with a role, which users or AWS services can then assume.
- IAM Policies: JSON documents which grant permissions for a specific user, group, or role to access services. Policies are attached to IAM identities.
- Managed policies are policies provided by AWS and cannot be edited.
- Customer managed policies are policies created by you, the customer, which you can edit.
- Inline policies are policies which are directly attached to a user.
- Cognito is a decentralised managed authentication system. When you need to easily add authentication to your mobile and desktop apps, think Cognito.
- User pools: a user directory that allows users to authenticate using OAuth to an IdP such as Facebook, Google, or Amazon to connect to web applications. A Cognito user pool is in itself an IdP.
- User pools use JWTs to persist authentication.
- Identity pools provide temporary AWS credentials to access services, e.g. S3, DynamoDB.
- Cognito Sync can sync user data and preferences across devices with one line of code.
- Web identity federation: exchange identity and security information between an identity provider and an application.
- Identity Provider (IdP): a trusted provider of your user identity that lets you authenticate to access other services, e.g. Facebook, Twitter, Google, Amazon.
- OIDC is a type of identity provider which uses OAuth.
- SAML is a type of identity provider which is used for single sign-on.
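To make the earlier IAM policy bullets more concrete, the sketch below shows two policy documents as Python dictionaries and a heavily simplified evaluator: no matching statement means no access, an Allow grants it, and an explicit Deny always wins. Real IAM evaluation also considers conditions, principals, permission boundaries, and more, none of which are modelled here. The bucket name and the is_allowed helper are hypothetical, and matching is case-sensitive for simplicity.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def is_allowed(policies, action: str, resource: str) -> bool:
    """Simplified check: default deny, Allow grants, explicit Deny overrides."""
    allowed = False
    for policy in policies:
        for stmt in policy["Statement"]:
            action_match = any(fnmatchcase(action, a)
                               for a in _as_list(stmt["Action"]))
            resource_match = any(fnmatchcase(resource, r)
                                 for r in _as_list(stmt["Resource"]))
            if action_match and resource_match:
                if stmt["Effect"] == "Deny":
                    return False      # explicit deny always wins
                allowed = True
    return allowed


# Hypothetical policies for a bucket called example-bucket.
read_only = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": ["arn:aws:s3:::example-bucket",
                     "arn:aws:s3:::example-bucket/*"],
    }],
}
deny_secrets = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": "s3:*",
        "Resource": "arn:aws:s3:::example-bucket/secret/*",
    }],
}

obj = "arn:aws:s3:::example-bucket/report.txt"
secret = "arn:aws:s3:::example-bucket/secret/key.pem"
print(is_allowed([], "s3:GetObject", obj))                            # False
print(is_allowed([read_only], "s3:GetObject", obj))                   # True
print(is_allowed([read_only], "s3:PutObject", obj))                   # False
print(is_allowed([read_only, deny_secrets], "s3:GetObject", secret))  # False
```

The first call mirrors the point that a new user with no policies attached has no permissions at all.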
AWS CLI and SDK cheatsheet
- CLI stands for command line interface
- SDK stands for Software Development Kit.
- The AWS CLI lets you interact with AWS from anywhere by simply using a command line.
- The AWS SDK is a set of API libraries that let you integrate AWS services into your application.
- Programmatic access must be enabled per user via the IAM console to use CLI or SDK.
- The aws configure command is used to set up your AWS credentials for the CLI.
- The CLI is installed via a Python script.
- Credentials get stored in a plain text file.
- Domain Name System (DNS): internet service that converts domain names into routable IP addresses.
- IPv4 – Internet Protocol Version 4 – 32 bit address space
- IPv4 eg. 18.104.22.168
- IPv6 – Internet protocol version 6 – 128 bit address space.
- IPv6 eg: 2001:0db8:85a3:0000:0000:8a2e:0370:7334
- Top Level Domain Eg: .com
- Second level domain eg: .co.uk
- Domain registrar: 3rd party company through which you register a domain.
- Name Server: The server which contains the DNS records for a domain.
- Start of Authority (SOA): Contains information about the DNS zone and associated DNS records.
- A Record: DNS record which directly converts a domain name into an IP address.
- CNAME Record: DNS record which lets you convert a domain name into another domain name.
- Time to Live(TTL): The time that a DNS record will be cached for
- Route53 is a DNS provider: register and manage domains, create record sets. Think GoDaddy or Namecheap.
- Simple Routing: Default routing policy, multiple addresses result in a random endpoint selection.
- Weighted Routing: Split up traffic based on different weights assigned.
- Latency Based Routing: Directs traffic based on region, for lowest possible latency for users.
- Failover Routing: Primary site in one location, secondary data recovery site in another.
- Geolocation Routing: Route traffic based on the geographic location of a request’s origin.
- Geo proximity routing: Route traffic based on geographic location using Bias values.
- Multi value answer routing: Return multiple values in response to DNS queries.
- Traffic flow: visual editor, for chaining routing policies, can version policy records for easy rollback.
- AWS Alias Record: AWS smart DNS record, detects changed IP for AWS resources and adjusts automatically.
- Route53 Resolver: Lets you regionally route DNS queries between your VPCs and your network (hybrid environments).
- Health Checks can be created to monitor and automatically fail over endpoints. You can have health checks monitor other health checks.
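As a rough picture of how weighted routing from the list above splits traffic, the simulation below picks an endpoint with probability equal to its weight divided by the sum of all weights. The hostnames and the 80/20 split are invented for illustration, and a seeded random generator stands in for real DNS queries.

```python
import random
from collections import Counter


def pick_endpoint(weighted_records: dict, rng: random.Random) -> str:
    """Choose one endpoint; each one's share is weight / sum(weights)."""
    endpoints = list(weighted_records)
    weights = [weighted_records[name] for name in endpoints]
    return rng.choices(endpoints, weights=weights, k=1)[0]


# Hypothetical records: send about 80% of traffic to blue, 20% to green.
records = {"blue.example.com": 80, "green.example.com": 20}

rng = random.Random(42)   # seeded so repeated runs give the same split
counts = Counter(pick_endpoint(records, rng) for _ in range(10_000))
for name, hits in counts.most_common():
    print(f"{name}: {hits / 10_000:.1%}")
```

Over many simulated queries the observed shares settle close to the configured weights, which is what makes this policy useful for gradually shifting traffic to a new deployment.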
- Elastic Compute Cloud (EC2) is a cloud computing service.
- Configure your EC2 instance by choosing your OS, storage, and network throughput.
- Launch and SSH into your server within minutes.
- EC2 comes in a variety of instance types specialized for different roles:
- General Purpose : balance of compute, memory and networking resources.
- Compute Optimized: ideal for compute bound applications that benefit from high performance processor.
- Memory Optimized: fast performance for workloads that process large data sets in memory.
- Accelerated Computing: hardware accelerators, or co-processors.
- Storage Optimized: High sequential read and write access to very large data sets on local storage.