1 - API Gateway
Facts
- Automatically scales
- Can cache
- Can throttle
- Can log to CloudWatch
- Ensure you enable CORS if required by your application
2 - Associate Solutions Architect Exam
Direct Connect - direct line from your data center to AWS
KMS - allows you to import your own keys, disable/re-enable keys, and define key management roles
AWS Shield - Protects against DDOS
AWS Macie - Uses ML to protect sensitive data
AWS WAF - Protects against XSS attacks and can block IP addresses
Consolidated billing allows bills coming from multiple AWS accounts to be rolled up to a single bill. Charges are still traceable to their original accounts, there is no charge for consolidation, and it could potentially reduce the overall bill
AWS Trusted Advisor - Advises on security, such as whether MFA is configured on the root account, and calls out security groups and ports that have unrestricted access
3 - CloudFront
Concepts
- Edge Location - Location where the content will be cached
- Origin - Where the files to distribute come from (S3, EC2, ELB, Route53)
- Distribution - Consists of a set of edge locations
- RTMP - Used for Media Streaming
Facts
- Edge locations are read/write
- Objects are cached for their TTL
- Clearing cached objects incurs a cost
4 - Cognito
Facts
- Can provide identity federation with Google, Facebook, or Amazon
- Can be the identity broker for your application
- User Pools handle things like registration, authentication, and account recovery
- Identity pools authorize access to AWS resources
5 - Databases
RDS
- Read Replicas are for performance
- Multi-AZ is for DR
- SQL Server, MySQL, PostgreSQL, Oracle, Aurora, MariaDB
- Runs on VMs but you do not have OS level access to them
- Patched by Amazon
- Not serverless (except for Aurora Serverless)
- Encryption at rest is supported
- SQL Server and Oracle can have a maximum of 2 databases per instance
- Aurora
- 2 copies of the data are stored in each AZ, across a minimum of 3 AZs
- Snapshots can be shared across accounts
- Automated backups turned on by default
DynamoDB
- NoSQL
- Uses SSDs
- Spread across 3 geographically distinct data centers
- Eventually consistent reads by default, but strongly consistent reads can be requested (for a cost)
- An item (attribute names and values combined) cannot exceed 400 KB
ElastiCache
- Memcached (caching)
- Redis (caching + pub/sub)
Redshift
- For Business Intelligence or Data Warehousing
- Only available in 1 AZ
- Can restore snapshots to new AZs if there is an outage
- Can retain backups for a maximum of 35 days
6 - EC2
Instance Types
- Field Programmable Gate Array (F1) - Genomics research, financial analytics, video processing, big data
- High Speed Storage (I3) - NoSQL DBs, Data warehousing
- Graphics Intensive (G3) - Video Encoding, 3D Application Streaming
- High Disk Throughput (H1) - MapReduce-based workloads, distributed file systems
- Low cost, General Purpose (T3) - Web Servers, small DBs
- Dense Storage (D2) - File servers, Data warehousing, Hadoop
- Memory Optimized (R5) - Memory Intensive Apps/DBs
- General Purpose (M5) - Application Servers
- Compute Optimized (C5) - CPU Intensive Apps/DBs
- Arm-based (A1) - Scale-out workloads
Pricing
- On Demand - Fixed rate, no commitment
- Reserved - Capacity reservation and discount with upfront commitment of 1 or 3 years
- Spot - Bid for a price you want to pay. If terminated by Amazon you will not be charged for a partial hour of usage.
- Dedicated - Physical EC2 server dedicated for your use
Placement Groups
- Clustered - Low latency/High Throughput, single AZ
- Spread - Individual critical instances, can be multi-AZ
- Partitioned - Multiple EC2 instances, can be multi-AZ
- Name must be unique for your account
- Not all types can be in placement groups
- Can’t move a running instance into a placement group (an existing instance must be stopped first)
Facts
- Termination protection is off by default
- Instance Store Volumes are ephemeral
- Retrieve metadata for an instance with: curl http://169.254.169.254/latest/meta-data/
- Retrieve user data for an instance with: curl http://169.254.169.254/latest/user-data/
- When an instance is stopped you can switch its tenancy between “dedicated” (single-tenant hardware) and “host” (isolated server), but not back to “default” (shared hardware)
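The two curl commands above can also be issued from code. The following is a minimal sketch, assuming it runs on an EC2 instance that still accepts plain GET requests to the metadata service; the helper names `metadata_url` and `fetch` are hypothetical, and instances configured to require IMDSv2 session tokens would need an extra step that is omitted here.

```python
from urllib.request import urlopen

# Link-local address of the instance metadata service, as in the curl commands above.
METADATA_BASE = "http://169.254.169.254/latest"


def metadata_url(path: str = "", kind: str = "meta-data") -> str:
    """Build the URL for a metadata or user-data path (hypothetical helper)."""
    if kind not in ("meta-data", "user-data"):
        raise ValueError("kind must be 'meta-data' or 'user-data'")
    return f"{METADATA_BASE}/{kind}/{path.lstrip('/')}"


def fetch(url: str, timeout: float = 2.0) -> str:
    """GET the URL and return the body as text. Only reachable from inside an instance."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On an instance, `fetch(metadata_url("instance-id"))` would return the instance ID, and `fetch(metadata_url(kind="user-data"))` the user data, if any was supplied at launch.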
Storage
EBS
Elastic Block Store (for most EC2 workloads).
Types
- General Purpose SSD (gp2) - Most workloads
- Provisioned IOPS SSD (io2) - Databases
- Throughput Optimized HDD (st1) - Big Data/Data Warehouses
- Cold HDD (sc1) - File Servers
- EBS Magnetic (Standard) - Infrequently accessed data
Facts
- Root EBS volumes can be encrypted (so can other volumes)
- EBS Snapshots exist on S3
- EBS Snapshots are incremental
- Snapshots should not be taken of a root volume when an instance is running
- EBS volume sizes can be changed on the fly
- EBS Volumes will always be in the same AZ as the instance they are attached to
- By default the root EBS volume is destroyed if an instance is terminated
EFS
Elastic File System (super scalable NFS).
- Supports the NFSv4 protocol
- Does not require pre-provisioning
- Can scale to petabytes
- Can support thousands of concurrent connections
- Provides read after write consistency
7 - ELB
Concepts
- Application Load Balancer - Layer 7, can route based off application needs.
- Network Load Balancer - Layer 4, can route based off network information
- Classic Load Balancer - Previous generation, supports basic Layer 4 and Layer 7 load balancing
Facts
- 504 means the gateway has timed out, which indicates an issue with your application (it did not respond in time)
- The end user’s IPv4 address is available in the X-Forwarded-For header
- You are only ever given a DNS name for the load balancer, never an IP address
- With cross-zone load balancing, load is distributed evenly across all instances in all enabled AZs. Without it, load is distributed evenly between AZs but not evenly across instances
- Sticky sessions can be configured so one user is always routed to the same instance
- Path patterns allow you to route to instances based off the path of the request
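The effect of cross-zone load balancing described above is easiest to see with numbers. Below is a small arithmetic sketch; the function name `per_instance_share` and the AZ labels are hypothetical, and it assumes every AZ has at least one healthy instance.

```python
def per_instance_share(instances_per_az: dict[str, int], cross_zone: bool) -> dict[str, float]:
    """Return the fraction of total traffic that each instance in an AZ receives."""
    total_instances = sum(instances_per_az.values())
    az_count = len(instances_per_az)
    shares = {}
    for az, count in instances_per_az.items():
        if cross_zone:
            # Traffic is spread evenly over every instance, regardless of AZ.
            shares[az] = 1 / total_instances
        else:
            # Each AZ gets an equal slice, which is then split among its instances.
            shares[az] = (1 / az_count) / count
    return shares


fleet = {"az-a": 2, "az-b": 8}
print(per_instance_share(fleet, cross_zone=False))
print(per_instance_share(fleet, cross_zone=True))
```

With 2 instances in one AZ and 8 in the other, each instance in the smaller AZ handles 25% of traffic without cross-zone balancing, while each instance in the larger AZ handles 6.25%. With cross-zone balancing every instance handles 10%.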
8 - IAM
Concepts
- Users
- Groups - Can be used to organize users and their permissions
- Roles - Used for AWS resources to authenticate with each other. Access/Secret Keys should never be used by AWS resources.
- Policies - JSON Document describing what resources can be accessed and in what capacity
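As an illustration of the policy document mentioned above, here is a minimal sketch that builds one in Python and serializes it to JSON. The bucket name `example-bucket` is hypothetical; the policy grants read-only access to objects in that one bucket.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-bucket"

# A policy is a JSON document: which actions are allowed (or denied) on which resources.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Attaching a document like this to a group or role, rather than to individual users, keeps permissions easier to manage.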
Facts
- Global not regional
- Users have no permissions when created
- Root account is the account used when creating the organization, it should be secured and then not used
9 - Kinesis
For Streams
Types
- Kinesis Streams - Producers send data to shards, where it is available to consumers for between 24 hours and 7 days. Typically consumed by an EC2 instance and forwarded to a data store (such as DynamoDB, S3, EMR, or Redshift) where it can be processed further
- Kinesis Firehose - Data must be processed right away, typically sent to Elasticsearch, S3, or Redshift (via S3)
- Kinesis Analytics - Can be used in conjunction with Streams or Firehose, automatically processes data right away
10 - Lambda
Facts
- Priced on the amount of memory assigned combined with the duration of execution
- Scales out automatically
- Each event triggers a unique instance of a lambda function
- One function can trigger one or more other functions
- X-ray can be used to debug serverless applications
- Lambda can perform operations globally
- Lambda Triggers
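The memory-times-duration pricing noted above can be sketched as a small calculation. This is an illustration only: the function names are hypothetical, the per-GB-second price is passed in as a parameter rather than assumed, and per-request charges are ignored.

```python
def gb_seconds(memory_mb: int, duration_ms: float, invocations: int = 1) -> float:
    """Compute usage in GB-seconds: memory (GB) x duration (s) x number of invocations."""
    return (memory_mb / 1024) * (duration_ms / 1000) * invocations


def compute_cost(memory_mb: int, duration_ms: float, invocations: int,
                 price_per_gb_second: float) -> float:
    """Duration cost for a workload, given a price per GB-second supplied by the caller."""
    return gb_seconds(memory_mb, duration_ms, invocations) * price_per_gb_second


# One million invocations of a 512 MB function that runs for 200 ms each.
usage = gb_seconds(memory_mb=512, duration_ms=200, invocations=1_000_000)
print(f"{usage:.0f} GB-seconds")
```

The example workload comes to about 100,000 GB-seconds. Halving either the assigned memory or the execution time halves that figure.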
11 - Route 53
Routing Policies
- Simple - No health checks
- Weighted - Split requests by %, supports health checks
- Latency - Sends traffic to region with lowest latency
- Failover - Active/Passive Routing
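For the weighted policy above, each record receives a share of requests equal to its weight divided by the sum of all weights. A minimal sketch of that arithmetic, with hypothetical record names:

```python
def traffic_share(weights: dict[str, int]) -> dict[str, float]:
    """Return the fraction of requests each weighted record should receive."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one record needs a non-zero weight")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical blue/green split: send roughly 10% of requests to the new stack.
print(traffic_share({"blue": 90, "green": 10}))
```

Weights do not need to add up to 100. Weights of 9 and 1 produce the same 90/10 split.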
Common DNS Types
- Start of authority record (SOA) - Specifies authoritative information about a DNS zone, including the primary name server, the email of the domain administrator, the domain serial number, and several timers relating to refreshing the zone.
- Nameserver (NS) - Delegates a DNS zone to use the given authoritative name servers
- Address (A) - IP address to direct traffic to
- Canonical Name (CNAME) - Alias of one name to another
- Mail Exchange (MX) - Maps a domain name to a list of message transfer agents
- PTR - Pointer to a canonical name
Facts
- You can register domains on AWS
- Sometimes it can take days to register a new domain name
- You can integrate SNS to be notified of health check failures
- Health checks can be applied to individual record sets
- Given the choice always choose an alias record over a CNAME
12 - S3
Tiers
S3 Standard - 99.99% availability, 99.999999999% (11 9’s) durability, designed to sustain the concurrent loss of 2 data centers
S3 Standard IA - For data that is infrequently accessed but requires rapid access when it is accessed
S3 One Zone IA - For data that does not require multi-AZ durability and is infrequently accessed, but requires rapid access when it is accessed
S3 Glacier - Secure, durable, cheap. Can take minutes to hours to retrieve (configurable)
S3 Glacier Deep Archive - Can take up to 12 hours to retrieve
S3 Intelligent-Tiering - Automatically adjusts tier of objects to optimize cost while maintaining performance
Pricing
- Storage
- Requests
- Storage Management
- Data Transfer
- Transfer Acceleration
- Cross Region Replication
Object Attributes
- Key
- Value (object)
- Version ID
- Metadata
Concepts
- Versioning - Stores a version of every change (including deletes). Charged storage for each version.
- Life Cycle Management - Can be used to manage tier of storage based off rules you define or automatically. Can be applied to current and previous versions.
- Cross Region Replication - Can automatically replicate objects to a different bucket in another region. Does not apply to objects created before Cross Region Replication is configured. Versioning must be enabled (in both buckets), and delete markers are not replicated.
- Acceleration - When using acceleration you always upload to edge locations rather than directly to the bucket, then the file is transferred along Amazon’s backbone to the data center(s) your bucket resides in. Acceleration does not always result in faster uploads.
Facts
- Files can be 0B-5TB
- Bucket names must be globally unique
- Supports MFA Delete
- Strong read-after-write consistency for PUTs of new objects, overwrite PUTs, and DELETEs (before December 2020, overwrite PUTs and DELETEs were only eventually consistent)
- Once enabled, versioning can never be disabled, only suspended
- Snowball is a physical device that can be used to import or export data from S3
Storage Gateway
Virtual or physical appliance that connects on-premises storage to AWS.
- File Gateway - For flat files stored directly on S3
- Volume Gateway
- Stored Volumes - Entire dataset is stored on site and asynchronously backed up to S3
- Cached Volumes - Entire dataset is stored on S3 and most frequently accessed data is cached on site
13 - SNS
Facts
- Push based
- Simple API’s
- Flexible message delivery over multiple transport protocols
- Cheap
- Can be used to de-couple your infrastructure
14 - SQS
Facts
- Standard SQS - Order is not guaranteed and messages can be delivered multiple times
- FIFO SQS - Order is strictly maintained and messages are delivered only once
- Pull based
- Messages can be up to 256 KB
- Messages can be in the queue from 1 minute to 14 days, default retention is 4 days
- Visibility Time Out - the amount of time the message will be invisible after it is picked up. If the job is finished and the message is deleted before the visibility timeout expires, it is removed from the queue; otherwise it is made visible again. Default is 30 seconds, maximum is 12 hours
- Guarantees messages will be processed at least once
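The visibility timeout behaviour described above can be modelled with a toy in-memory queue. This is a sketch of the concept only, not the SQS API: the class `ToyQueue` and its methods are hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
import itertools


class ToyQueue:
    """In-memory model of a pull-based queue with a visibility timeout."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> (body, time at which it becomes visible)
        self._ids = itertools.count(1)

    def send(self, body: str) -> int:
        message_id = next(self._ids)
        self._messages[message_id] = (body, 0.0)
        return message_id

    def receive(self, now: float):
        """Return (id, body) of the first visible message and hide it, or None."""
        for message_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                # Hide the message from other consumers until the timeout expires.
                self._messages[message_id] = (body, now + self.visibility_timeout)
                return message_id, body
        return None

    def delete(self, message_id: int) -> None:
        """Called by the consumer once the job has finished successfully."""
        self._messages.pop(message_id, None)
```

A consumer that receives a message at `now=0` with a 30 second timeout hides it from everyone else until `now=30`. If the consumer crashes without calling `delete`, the message reappears and another consumer picks it up, which is why a message can be processed more than once.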
15 - SWF
Facts
- Workflow executions can last up to a year
- Task Oriented API (vs message oriented)
- Ensures a task is assigned only once and is never duplicated
- Actors
- Workflow Starters - Anything that can initiate a workflow
- Deciders - Control the flow of tasks in the workflow
- Activity Workers - Perform the tasks
16 - VPC
Concepts
- Internet Gateway (IGW) - An internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between instances in your VPC and the internet. It therefore imposes no availability risks or bandwidth constraints on your network traffic. Only one internet gateway can exist per VPC
- Virtual Private Gateways - Allows you to peer your local network with a VPC
- Egress-Only Internet Gateway - Prevents IPv6 based internet resources from connecting into a VPC while allowing IPv6 traffic to the internet
- Route Tables - A route table contains a set of rules, called routes, that are used to determine where network traffic from your subnet or gateway is directed.
- Network ACL
- Default ACL comes with each VPC and allows all inbound and outbound traffic
- Custom ACLs deny all inbound and outbound traffic by default
- Each subnet must be associated with an ACL, if one is not explicitly attached the default ACL is applied
- ACLs allow you to block IP Addresses
- A single ACL can be attached to multiple subnets
- Each rule is numbered, rules are evaluated in order
- Inbound and Outbound rules are separate
- Subnets
- A single subnet cannot span multiple AZ’s
- A public subnet always has at least one route in its table that uses an IGW
- AWS reserves the first 4 and the last IP for each subnet’s CIDR block
- Security Groups
- All inbound traffic is blocked by default
- All outbound traffic is allowed by default
- Changes take effect immediately
- Unique to each VPC
- Multiple groups can be assigned to a single instance
- Multiple instances can be assigned to a single group
- Can specify allow rules but not deny rules
- NAT Instances - provide internet access
- Must be in public subnet
- Disable source/destination check on the instance
- A route to it must exist in the private subnet’s route table for instances there to be able to use it
- If there is a bottleneck consider making the instance larger
- Can be HA if it is in an Autoscaling Group and failover is scripted
- Uses security groups
- Can be used as a bastion (unlike a NAT Gateway)
- NAT Gateways - provide internet access
- Redundant within a single AZ
- Scales from 5 Gbps up to 45 Gbps
- Does not use security groups
- No need to patch or disable source/destination checks
- Automatically gets public IP
- If using multiple AZ’s put a NAT Gateway in each AZ with appropriate routing to ensure availability
- Flow Logs
- Log traffic within a VPC
- Cannot enable flow logs for peered VPCs unless those VPCs are in your account
- Flow logs cannot be tagged
- Internal DNS Traffic is not logged
- Traffic generated for Windows license validation is not logged
- Traffic to/from 169.254.169.254 is not logged
- DHCP Traffic is not logged
- Can be generated at the network interface, subnet, and VPC levels
- VPC Endpoints - allows traffic to AWS services to stay within AWS. Endpoints are virtual, horizontally scaled, and highly available
- Interface Endpoint - API Gateway, CloudFormation, CloudWatch, CodeBuild, Config, EC2 API, ELB API, Kinesis, KMS, SageMaker, Secrets Manager, STS, Service Catalog, SNS, SQS, Systems Manager, Endpoints in another AWS account
- Gateway Endpoints - DynamoDB, S3
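The subnet note above (the first 4 addresses and the last address of each CIDR block are reserved) translates into a simple capacity calculation. A minimal sketch using Python's standard `ipaddress` module; the function names are hypothetical.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Number of addresses left for your own resources in a subnet."""
    network = ipaddress.ip_network(cidr)
    return network.num_addresses - RESERVED_PER_SUBNET


def reserved_addresses(cidr: str) -> list[str]:
    """The five addresses in the subnet that cannot be assigned to resources."""
    network = ipaddress.ip_network(cidr)
    first_four = [str(network[i]) for i in range(4)]
    return first_four + [str(network[-1])]


print(usable_addresses("10.0.0.0/24"))
print(reserved_addresses("10.0.0.0/24"))
```

A /24 subnet therefore offers 251 usable addresses rather than 256, and a /28 offers only 11, which matters when sizing small subnets.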
Facts
- No Transitive Peering
- Security Groups are stateful, Network ACL’s are stateless
- When creating a custom VPC a Route Table, ACL, and Security Group are all automatically created
- A VPN connection consists of a customer gateway and a virtual private gateway
- By design Amazon DNS ignores requests coming from outside a VPC
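The Network ACL notes above say that rules are numbered and evaluated in order. The sketch below models that for inbound traffic in a deliberately simplified form: each rule matches one CIDR range and one port, the lowest-numbered matching rule decides, and traffic that matches no rule is denied. The `Rule` class and `evaluate` function are hypothetical and are not part of any AWS SDK.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    number: int   # lower numbers are evaluated first
    cidr: str     # source range the rule applies to
    port: int     # single destination port (simplification)
    allow: bool   # True for ALLOW, False for DENY


def evaluate(rules: list[Rule], source_ip: str, port: int) -> bool:
    """Return True if inbound traffic is allowed by the first matching rule."""
    address = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if address in ipaddress.ip_network(rule.cidr) and port == rule.port:
            return rule.allow
    return False  # nothing matched: deny


rules = [
    Rule(number=200, cidr="0.0.0.0/0", port=443, allow=True),
    Rule(number=100, cidr="203.0.113.7/32", port=443, allow=False),  # block one address
]
print(evaluate(rules, "203.0.113.7", 443))
print(evaluate(rules, "198.51.100.1", 443))
```

Because rule 100 is evaluated before rule 200, the single blocked address is denied even though the broader rule would allow it. Since ACLs are stateless, a matching set of outbound rules would be needed for the return traffic.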