AWS Solution Architect Associate - SAA-C02 - Cheat Sheet
I passed my AWS Solutions Architect Associate (SAA-C02) certification on July 02, 2021.
Here is my cheat sheet.
Section 3: Getting started with AWS
AWS Cloud Overview - Regions & AZ
- Region: a cluster of AZs (minimum 2, usually 3 or more)
- Availability Zones (AZ): one or more discrete data centers (DC)
- Points of Presence (Edge Locations): edge caching for CloudFront & S3 Transfer Acceleration
Section 4: IAM & AWS CLI
IAM Security Tools
- One or more policies can be attached to an IAM role or an IAM user
- Credential Report (account-level)
- Access Advisor (user-level): shows which services a user can access and when they were last used
- The SDKs (Java, .NET, etc.) are the coding equivalent of the CLI (the CLI itself is built on the Python SDK)
Section 5: EC2 Fundamentals
EC2 Instances Purchasing Options
- On demand (Linux billing per second, other per hour)
- Reserved
- Reserved Instances (up to 72% discount) reserved for 1 year OR 3 years
- Convertible (up to 45% discount)
- Scheduled (1 year)
- Spot (up to 90% discount)
- Dedicated Host (compliance reasons, server-bound software licenses) – 3-year reservation
- EC2 Instance Metadata - http://169.254.169.254/latest/meta-data
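A minimal sketch (Python standard library only) of querying the metadata endpoint from inside an instance; it uses the simple IMDSv1-style GET, whereas IMDSv2 would first request a session token.

```python
# Minimal sketch: query the EC2 instance metadata service from inside an instance.
# IMDSv1-style GET for brevity; IMDSv2 would require fetching a session token first.
import urllib.request

BASE = "http://169.254.169.254/latest/meta-data/"

def get_metadata(path: str) -> str:
    """Return the metadata value at the given path (e.g. 'instance-id')."""
    with urllib.request.urlopen(BASE + path, timeout=2) as resp:
        return resp.read().decode()

print(get_metadata("instance-id"))
print(get_metadata("placement/availability-zone"))
```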
Section 6: EC2 - Solutions Architect Associate Level
Elastic IP (EIP)
- Keeps a fixed public IP across EC2 stop/start cycles
- 5 per account (soft limit)
Placement Groups
- Controls how EC2 instances are placed on the underlying hardware (within a single region)
- Cluster (same rack, single AZ): lowest latency
- Spread (cross-AZ, each instance on distinct hardware): max 7 instances per AZ
- Partition (each partition on a separate rack): up to 7 partitions per AZ, 100s of EC2 instances per partition
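A minimal boto3 sketch creating the three placement group strategies; the group names are made up for illustration.

```python
# Hedged sketch: creating the three placement group strategies with boto3.
# Group names are illustrative placeholders.
import boto3

ec2 = boto3.client("ec2")

# Cluster: pack instances close together on the same rack (low latency, single AZ)
ec2.create_placement_group(GroupName="my-cluster-pg", Strategy="cluster")

# Spread: each instance on distinct hardware (max 7 instances per AZ)
ec2.create_placement_group(GroupName="my-spread-pg", Strategy="spread")

# Partition: groups of instances on separate racks (up to 7 partitions per AZ)
ec2.create_placement_group(
    GroupName="my-partition-pg", Strategy="partition", PartitionCount=3
)
```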
Elastic Network Interfaces (ENI)
- Virtual network card
- Primary private IPv4 (eth0), one or more secondary IPv4
- One public IPv4 per private IPv4 (so possibly several public IPs)
- One EIP
- One or more Security groups
- Bound to AZ
Hibernate
- The in-memory (RAM) state is preserved on the root EBS volume, which must be encrypted
EC2 Nitro
- Nitro-based instances are required for 64,000 EBS IOPS (max 32,000 on non-Nitro)
EC2 – vCPU
- Each vCPU is a thread of a CPU core
- Possible to limit the number of CPU cores and the number of threads per core (useful for licensing costs)
Section 7: EC2 Instance Storage
EBS (Elastic Block Store) Volume
- Network drive (like a network USB stick)
- Locked to one AZ (even with multi-attach, which is io1/io2 only)
- Root EBS volume is deleted on instance termination by default (can be unchecked); other attached volumes are not deleted by default
EBS RAID configurations
- RAID 0 = performance
- RAID 1 = fault tolerance
EFS – Elastic File System
- Managed NFS 4.1, Linux - POSIX
- Multi AZ
- Highly available, scalable, more expensive (about 3x gp2)
- Performance mode: General purpose / Max IO
- Throughput mode: Bursting / Provisioned
- Lifecycle management (after N days): Infrequent access (EFS-IA)
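A minimal boto3 sketch creating a file system with explicit performance and throughput modes plus an EFS-IA lifecycle rule; all names and values are illustrative.

```python
# Hedged sketch: creating an EFS file system with explicit performance/throughput
# modes and an EFS-IA lifecycle policy. Names and values are illustrative.
import boto3

efs = boto3.client("efs")

fs = efs.create_file_system(
    CreationToken="demo-efs",            # idempotency token (illustrative)
    PerformanceMode="generalPurpose",    # or "maxIO"
    ThroughputMode="provisioned",        # or "bursting"
    ProvisionedThroughputInMibps=64,
    Encrypted=True,
)

# Lifecycle management: move files not accessed for 30 days to EFS-IA
efs.put_lifecycle_configuration(
    FileSystemId=fs["FileSystemId"],
    LifecyclePolicies=[{"TransitionToIA": "AFTER_30_DAYS"}],
)
```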
Section 8: High Availability and Scalability: ELB & ASG
Elastic Load Balancing (ELB) Overview
- Cross AZ, but same region
- Health check
- Sticky sessions via cookie at layer 7 (CLB & ALB, not NLB)
- Cross-Zone Load Balancing: each load balancer node distributes requests across all registered instances in all AZs
- always enabled on ALB (no inter-AZ data charge)
- disabled by default on CLB (no inter-AZ charge if enabled)
- disabled by default on NLB (inter-AZ data charges if enabled)
- Connection Draining (CLB) / Deregistration Delay (ALB, NLB): time to finish in-flight requests – default 300 seconds
- Server Name Indication (SNI) on ALB & NLB: serve multiple SSL certificates from a single listener
CLB - Classic Load Balancers (v1)
- TCP (Layer 4), HTTP & HTTPS (Layer 7)
- Fixed hostname XXX.region.elb.amazonaws.com
ALB - Application Load Balancer (v2)
- Layer 7 (HTTP & HTTPS)
- Routes to different target groups based on path, hostname, query string, or headers; possible targets:
- EC2 instance (can be managed by an Auto Scaling Group)
- ECS task (dynamic port)
- Lambda functions – HTTP request is translated into a JSON event
- IP addresses – must be private IPs
- Health check at target group level
- Fixed hostname (XXX.region.elb.amazonaws.com)
- The client is seen via headers: IP (X-Forwarded-For), port (X-Forwarded-Port) and protocol (X-Forwarded-Proto)
NLB - Network Load Balancer (v2)
- Layer4 (TCP/UDP)
- Millions of requests per second, very low latency (~100 ms vs. ~400 ms for ALB)
Auto Scaling Group (ASG)
- ASG defined with:
- Launch Configuration (AMI, user data, EBS volumes, security groups, SSH key pair) or Launch Template (newer)
- Min/Max size
- Subnets
- Load balancer (optional)
- If an EC2 instance is unhealthy (as seen by EC2 or the LB), it is terminated and a new instance is launched
- Scale out = increase – Scale in = decrease
- Scaling policies
- Simple / Step with CloudWatch Alarm
- Target tracking (e.g., maintain average CPU at 50%) – see the sketch after this list
- Scheduled
- Default termination policy: balance the number of instances across AZs, then terminate the instance with the oldest launch configuration, then the one closest to the next billing hour
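A minimal boto3 sketch of the target tracking policy mentioned above (keep average CPU at 50%); the ASG and policy names are placeholders.

```python
# Hedged sketch: a target tracking scaling policy that keeps average CPU at ~50%.
# The ASG name and policy name are illustrative.
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",
    PolicyName="keep-cpu-at-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,   # scale out/in to hold ~50% average CPU
    },
)
```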
Section 9: AWS Fundamentals: RDS + Aurora + ElastiCache
Amazon RDS Overview
- Automated backup:
- Retention 7 days (max 35 days)
- Storage Auto Scaling (with maximum storage threshold)
- Read replicas: ASYNC replication - same AZ, cross AZ (free) or cross regions (fees)
- Multi-AZ Disaster Recovery: SYNC replication, based on DNS name (automatic)
- IAM-based authentication for login (MySQL & PostgreSQL)
- Encryption at rest
- AWS KMS – AES 256,
- Transparent Data Encryption (TDE) Oracle and MS-SQL
- In-flight encryption: rds.force_ssl parameter (PostgreSQL & SQL Server), GRANT … REQUIRE SSL (MySQL)
- Access Management
- Who can manage DB? IAM policies
- Who can log in? Traditional username/password or IAM authentication with a 15-minute token (MySQL, PostgreSQL)
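A minimal boto3 sketch of generating the 15-minute IAM authentication token; the endpoint, port, and user are placeholders, and the token is then passed as the password to the database driver over SSL.

```python
# Hedged sketch: generating a 15-minute IAM authentication token for RDS
# (MySQL/PostgreSQL). Hostname, port, and user are illustrative placeholders.
import boto3

rds = boto3.client("rds", region_name="eu-west-1")

token = rds.generate_db_auth_token(
    DBHostname="mydb.abcdefgh.eu-west-1.rds.amazonaws.com",
    Port=3306,
    DBUsername="app_user",
)

# The token is used as the password in your DB driver, together with SSL/TLS
# (required for IAM authentication).
print(token[:60], "...")
```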
Amazon Aurora
- Supports MySQL & PostgreSQL
- 5x performance
- Read replicas up to 15 for Aurora (up to 5 for MySQL)
- Automatic storage scaling (up to 64 TB, in 10 GB increments)
- Instantaneous failover
- Price +20% (vs. RDS)
- Read-Replica autoscaling
- Writer endpoint + Reader Endpoint + optional custom Endpoint
- Aurora Serverless: via a proxy fleet (endpoint), automatically manages the number of instances; pay per second, good for unpredictable workloads
- Multi-Master: immediate failover for writes; the client manages multiple DB connections for failover
- Multi-regions:
- Aurora cross-regions read replica
- Aurora Global Database: 1 primary region (read/write), up to 5 secondary regions (read-only), up to 16 read replicas per secondary region; promoting another region has an RTO of < 1 minute; typical replication lag < 1 second
Feature | Amazon Aurora Replicas | MySQL Replicas |
---|---|---|
Number of Replicas | Up to 15 | Up to 5 |
Replication type | Asynchronous (milliseconds) | Asynchronous (seconds) |
Performance impact on primary | Low | High |
Act as failover target | Yes (no data loss) | Yes (potentially minutes of data loss) |
Automated failover | Yes | No |
Support for user-defined replication delay | No | Yes |
Support for different data or schema vs. primary | No | Yes |
ElastiCache
- ElastiCache = managed Redis or Memcached
- Requires heavy application code changes
- Redis (replication), Memcached (sharding)
- Redis: Multi-AZ, read-replicas, backup restore
- Does not support IAM authentication; Redis AUTH adds an extra layer of security (password/token) on top of security groups
Section 10: Route 53
- Possible records
- A (hostname to IP4),
- AAAA (hostname to IP6),
- CNAME (sub-domain hostname to hostname),
- A + alias or AAAA + alias (hostname to AWS service hostname)
- Alias = A with alias or AAAA with alias
- Route 53 can resolve public domain names I own (app.mydomain.org) or private domain names inside my VPC (app.company.internal)
- Route53 to ALB,
- via CNAME (ALB DNS name) not efficient
- via A + alias (IPv4) or AAAA + alias (IPv6)
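A minimal boto3 sketch of creating an A + alias record pointing at an ALB; the hosted zone IDs and DNS names are placeholders.

```python
# Hedged sketch: creating an "A + alias" record that points a domain at an ALB.
# Zone IDs and DNS names are placeholders.
import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z123EXAMPLE",                # my hosted zone (placeholder)
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "app.mydomain.org",
                "Type": "A",                   # alias records use A/AAAA, not CNAME
                "AliasTarget": {
                    "HostedZoneId": "ZALBEXAMPLE",  # the ALB's own hosted zone ID (placeholder)
                    "DNSName": "my-alb-123456.eu-west-1.elb.amazonaws.com",
                    "EvaluateTargetHealth": True,
                },
            },
        }]
    },
)
```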
- Need to create separately a health check:
- based on HTTP/HTTPS/TCP
- based on CloudWatch metric
- Routing policy:
- Simple: can return multiple values (all of them) to the client, which picks one at random; no health checks
- Weighted
- Latency
- Failover: one primary (health check mandatory) and one secondary
- Geo Location (based on user location)
- Geoproximity
- Multivalue answer
Section 11: Classic Solutions Architecture Discussions
Beanstalk
- Only the application code is the developer's responsibility
- 3 architecture models: Single instance (DEV), LB + ASG (PROD, web apps), ASG only (PROD, non-web apps)
- Beanstalk has 3 components: application, application version, environment name
- Supported platforms: Go, Java, Tomcat, .NET, Node.js, PHP, Python, Ruby, Docker
Section 12: Amazon S3 Introduction
Overview
- Bucket (like a directory): must have a globally unique name, created in a specific region
- Object (file) have a key: s3://my-bucket/my_folder1/another_folder/my_file.txt (key=prefix+object name)
- No concept of subdirectory (only keys with name including “/”)
- Max object size = 5TB
- Multipart upload required for uploads > 5 GB
- Versioning at bucket level
- Strong consistency: after a successful write, every subsequent read returns the latest version
Encryption
- SSE-S3: keys created and managed by S3 – set header "x-amz-server-side-encryption": "AES256"
- SSE-KMS: keys created and managed by KMS – set header "x-amz-server-side-encryption": "aws:kms"
- AWS managed key (aws/s3)
- User-created keys
- Keys created by users in other accounts
- SSE-C: the encryption key must be provided in HTTP headers (HTTPS mandatory)
- Client-side: the client encrypts/decrypts itself, e.g. with an S3 encryption client SDK
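A minimal boto3 sketch showing how the SDK sets the x-amz-server-side-encryption header through the ServerSideEncryption parameter; the bucket, keys, and KMS alias are placeholders.

```python
# Hedged sketch: the SDK sets the x-amz-server-side-encryption header via the
# ServerSideEncryption parameter. Bucket/key/KMS alias are placeholders.
import boto3

s3 = boto3.client("s3")

# SSE-S3: keys created and managed by S3 (header value AES256)
s3.put_object(
    Bucket="my-bucket", Key="report-sse-s3.txt",
    Body=b"hello", ServerSideEncryption="AES256",
)

# SSE-KMS: keys managed by KMS (header value aws:kms), optionally a specific CMK
s3.put_object(
    Bucket="my-bucket", Key="report-sse-kms.txt",
    Body=b"hello", ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/my-app-key",   # placeholder key alias
)
```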
S3 Security
- User based: IAM policy
- Resource based: bucket policy or object ACL (finer grain)
- An IAM principal can access an S3 object:
- if (the user-based OR the resource-based policy allows it) AND there is no explicit deny
- cross-account access requires both the user-based AND the resource-based policy to allow it
- By default, S3 blocks public access (prevents company data leaks)
- For an S3 static website, make sure the bucket policy allows public reads
- CORS is a web-browser mechanism; it can be allowed in S3 permissions via a JSON CORS configuration
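A minimal boto3 sketch of the "public reads for a static website" case above; the bucket name is a placeholder, and Block Public Access must also be disabled for the policy to take effect.

```python
# Hedged sketch: a bucket policy allowing public reads for static website hosting.
# Bucket name is a placeholder; "Block public access" must also be disabled.
import json
import boto3

s3 = boto3.client("s3")

public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-website-bucket/*",
    }],
}

s3.put_bucket_policy(
    Bucket="my-website-bucket",
    Policy=json.dumps(public_read_policy),
)
```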
S3 advanced
- MFA Delete (can only be enabled via the CLI by the root account, and versioning must be enabled); when enabled, MFA is required to:
- Permanent delete an object version
- Suspend versioning
- S3 access logs: any access request to the bucket is logged into another (target) bucket that you configure
- S3 Replication: add rules to copy objects to another bucket (not retroactive) – see the sketch after this list
- to another region (CRR) or same region (SRR) even in a different account
- must enable versioning
- no chaining (bucket 1 > bucket 2 > bucket 3)
- 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix
- To replicate existing objects, copy them from the source bucket to the destination bucket with the aws s3 sync command
- S3 Select & Glacier Select: server-side filtering
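The replication sketch referenced above, as a minimal boto3 example: bucket names and the IAM role ARN are placeholders, and versioning must already be enabled on both buckets.

```python
# Hedged sketch: enabling CRR/SRR with a single replication rule.
# Bucket names and the IAM role ARN are placeholders; versioning must already be
# enabled on both the source and destination buckets.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
        "Rules": [{
            "ID": "replicate-everything",
            "Priority": 1,
            "Filter": {},                 # empty filter = whole bucket
            "Status": "Enabled",
            "Destination": {"Bucket": "arn:aws:s3:::destination-bucket"},
            "DeleteMarkerReplication": {"Status": "Disabled"},
        }],
    },
)
# Replication is not retroactive: existing objects can be copied once with
# `aws s3 sync s3://source-bucket s3://destination-bucket`.
```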
S3 Events
- You can create as many "S3 event notifications" as desired (s3:ObjectCreated, s3:ObjectRemoved, s3:ObjectRestore, s3:Replication, etc.), directly from the S3 bucket > Properties
- But it is not possible to create 2 events of the same type – if you want to send a PUT event to two SQS queues, create an SNS topic and fan it out to the 2 SQS queues
- Publish events to:
- SQS
- SNS
- Lambda
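A minimal boto3 sketch wiring s3:ObjectCreated events to an SQS queue; the bucket name and queue ARN are placeholders, and the queue's access policy must allow S3 to send messages.

```python
# Hedged sketch: sending s3:ObjectCreated:* events to an SQS queue.
# Bucket name and queue ARN are placeholders; the SQS access policy must allow
# the S3 service to send messages to the queue.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="my-bucket",
    NotificationConfiguration={
        "QueueConfigurations": [{
            "Id": "new-objects-to-sqs",
            "QueueArn": "arn:aws:sqs:eu-west-1:123456789012:new-objects",
            "Events": ["s3:ObjectCreated:*"],
        }]
    },
)
```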
S3 storage classes
- Standard (general purpose), Intelligent-Tiering, Standard-IA, One Zone-IA, Glacier, Glacier Deep Archive
- Lifecycle rules can transition objects between classes
Section 15: CloudFront & AWS Global Accelerator
CloudFront
- Content Delivery Network (CDN)
- Origins
- S3 bucket, enhanced security with Origin Access Identity (OAI)
- Custom HTTP Origin
- Geo-restriction to whitelist/blacklist countries
- Signed-URL and signed-cookies
- Multi origin:
- Multi-origin based on path patterns (/API/*)
- Origin Groups for failover (one primary, one secondary)
Global accelerator
- Anycast IP: all servers hold the same IP address and the client is routed to the nearest one
- Traffic enters AWS at the closest Edge Location and travels over the AWS network instead of the public Internet (lower latency), using 2 static anycast IPs
- Works with Elastic IP, EC2 instances, ALB, NLB, public or private
Section 16 – AWS Storage Extra
AWS Snow Family Overview
- Snowball Edge 80TB of HDD (Storage Optimized) or 42TB (Compute Optimized)
- Snowmobile > 10PB
- Need CLI or AWS OpsHub
- Cannot import directly into Glacier (land the data in S3 first, then use a lifecycle policy)
Storage Gateway Overview
- For hybrid cloud: exposes cloud storage to on-premises applications
- Backed by S3
- 3 types of Storage Gateway:
- File Gateway: integrated with Active Directory, NFS or SMB
- Volume Gateway: iSCSI protocol, cache volume or stored volume
- Tape Gateway
| | DataSync | Storage Gateway |
---|---|---|
Description | AWS DataSync is an online data transfer service that simplifies, automates, and accelerates copying large amounts of data to and from AWS storage services over the Internet or over AWS Direct Connect. | AWS Storage Gateway is a hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage by linking it to S3. It provides 3 storage interfaces for on-premises applications: file, volume, and tape. |
How it works | Uses an agent – a virtual machine (VM) owned by the user – to read or write data from your storage systems. You activate the agent from the Management Console; it then reads from a source location and syncs your data to Amazon S3, Amazon EFS, or Amazon FSx for Windows File Server. | Uses a Storage Gateway appliance – a VM from Amazon – installed and hosted in your data center. After setup, you use the AWS console to provision your storage options (File Gateway, Cached Volumes, or Stored Volumes), and data is saved to Amazon S3. You can also purchase the hardware appliance instead of installing the VM. |
Protocols | DataSync connects to existing storage systems and data sources with standard storage protocols (NFS, SMB), or using the Amazon S3 API. | Storage Gateway provides a standard set of storage protocols such as iSCSI, SMB, and NFS. |
Storage | AWS DataSync can copy data between NFS shares, SMB file servers, or self-managed object storage. It can also move data between your on-premises storage and AWS Snowcone, Amazon S3, Amazon EFS, or Amazon FSx. | File Gateway lets you store and retrieve objects in Amazon S3 using file protocols such as NFS and SMB. Volume Gateway stores your data locally in the gateway and syncs it to Amazon S3; it also allows point-in-time copies of your volumes as EBS snapshots, which you can restore and mount to your appliance as iSCSI devices. Tape Gateway data is stored immediately in Amazon S3 and can be archived to Amazon S3 Glacier or S3 Glacier Deep Archive. |
Pricing | You are charged standard request, storage, and data transfer rates to read from and write to AWS services, such as Amazon S3, Amazon EFS, Amazon FSx for Windows File Server, and AWS KMS. | You are charged based on the type and amount of storage you use, the requests you make, and the amount of data transferred out of AWS. |
Combination | You can combine DataSync and File Gateway to minimize on-premises operational costs while seamlessly connecting on-premises applications to your cloud storage: DataSync automates and accelerates online data transfers to AWS storage, then File Gateway provides your on-premises applications with low-latency access to the migrated data. | |
Section 17: Decoupling applications: SQS, SNS, Kinesis, Active MQ
Amazon SQS – Standard Queue
- Default retention of messages: 4 days, maximum of 14 days
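A minimal boto3 sketch of raising retention from the 4-day default to the 14-day maximum; the queue name is illustrative.

```python
# Hedged sketch: raising message retention from the 4-day default to the
# 14-day maximum (the attribute value is in seconds). Queue name is illustrative.
import boto3

sqs = boto3.client("sqs")

sqs.create_queue(
    QueueName="orders-queue",
    Attributes={"MessageRetentionPeriod": str(14 * 24 * 3600)},  # 1,209,600 s
)
```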
Kinesis Data Streams
- Retention between 1 day (default) to 365 days
- Ability to reprocess (replay) data; once inserted, data can't be deleted (immutable) until it expires
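A minimal boto3 sketch of producing a record and extending retention beyond the 24-hour default; the stream name is illustrative.

```python
# Hedged sketch: producing a record and extending retention beyond the 24h default.
# Stream name is illustrative.
import json
import boto3

kinesis = boto3.client("kinesis")

kinesis.put_record(
    StreamName="clickstream",
    Data=json.dumps({"user": "42", "action": "click"}).encode(),
    PartitionKey="42",     # records with the same key land on the same shard
)

# Retention can be raised up to 365 days; data is immutable until it expires
kinesis.increase_stream_retention_period(
    StreamName="clickstream", RetentionPeriodHours=168,  # 7 days
)
```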
Section 18: Containers on AWS: ECS, Fargate, ECR & EKS
Three choices:
- ECS – EC2 Launch Type: Amazon's own container platform
- ECS – Fargate Launch type: Amazon’s own Serverless container platform
- EKS: Amazon’s managed Kubernetes (open source)
Section 22: AWS Monitoring & Audit: CloudWatch, CloudTrail & Config
AWS CloudWatch metrics
- CloudWatch provides metrics for every service in AWS
- Metrics every 5 minutes by default; every 1 minute with detailed monitoring (extra cost)
- You can define and send your own custom metrics to CloudWatch (PutMetricData API), standard resolution 60 seconds, high resolution 1 second (see the sketch after this list)
- CloudWatch Unified Agent for EC2:
- CPU, disk metrics, netstat (available out of the box, but with limited detail)
- RAM, processes, swap space (require the agent)
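The sketch referenced above: publishing a custom metric with PutMetricData, optionally at 1-second high resolution; the namespace and metric names are illustrative.

```python
# Hedged sketch: publishing a custom metric, optionally at 1-second high resolution.
# Namespace, metric, and dimension names are illustrative.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_data(
    Namespace="MyApp",
    MetricData=[{
        "MetricName": "ActiveSessions",
        "Dimensions": [{"Name": "Environment", "Value": "prod"}],
        "Value": 123,
        "Unit": "Count",
        "StorageResolution": 1,   # 1 = high resolution, 60 = standard
    }],
)
```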
AWS CloudWatch Dashboard
- Dashboards can display graphs from different regions
AWS CloudWatch Logs
- Applications can send logs to CloudWatch using the SDK
- From
- Beanstalk, ECS, Lambda, Flow logs, API Gateway, CloudTrail, Route 53,
- CloudWatch Logs Agent: collects logs from EC2 instances or on-premises servers and sends them to CloudWatch Logs
- CloudWatch logs can go to
- Batch exporter to S3
- Stream to ElasticSearch
- CloudWatch Logs Insights (interactive log queries)
CloudWatch Events
- React to event patterns from AWS services, or run on a schedule (cron); possible targets include:
- Auto-scaling
- EC2 Actions (start, terminate, reboot)
- Lambdas
- SQS / SNS / Kinesis Messages
AWS CloudTrail
- Get a history of events / API calls made within your AWS account (console, CLI, SDK, AWS services); visible in the CloudTrail console for 90 days
- CloudTrail events can be exported to:
- CloudWatch logs
- S3 (long term retention)
- CloudTrail events:
- Management events, enabled by default: operations done on AWS resources (configurations changed)
- Data events, disabled by default: S3 object-level activities, Lambda activities
- Insights events
AWS Config
- Record configuration changes
- Evaluate resources against compliance rules (e.g., is there unrestricted SSH access in my security groups?)
Section 23: Identity and Access Management (IAM) – Advanced
AWS STS – Security Token Service
- Allows granting limited and temporary access to AWS resources
- Tokens are valid for up to one hour (must be refreshed)
- 3 AssumeRole options: AssumeRole (within or across accounts), AssumeRoleWithSAML, AssumeRoleWithWebIdentity
- STS Get Tokens:
- GetFederationToken
- GetSessionToken
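A minimal boto3 sketch of AssumeRole and using the temporary credentials; the role ARN is a placeholder.

```python
# Hedged sketch: assuming a role with STS and using the temporary credentials.
# The role ARN is a placeholder; the session here lasts the default 1 hour.
import boto3

sts = boto3.client("sts")

resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/ReadOnlyAuditor",
    RoleSessionName="audit-session",
    DurationSeconds=3600,
)

creds = resp["Credentials"]
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_buckets()["Buckets"])
```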
Identity Federation in AWS
- SAML 2.0 Federation
- Use STS with AssumeRoleWithSAML
- Amazon Single Sign On (SSO) Federation is the new managed and simpler way
- Custom Identity Broker Application
- For identity providers that are not SAML 2.0 compatible
- Uses the STS API: AssumeRole or GetFederationToken
- Web Identity Federation
- Use STS AssumeRoleWithWebIdentity
- Not recommended by AWS – use Cognito instead
- AWS Cognito
- Provide direct access to AWS Resources from the Client Side (mobile, web app)
- OpenID : Facebook, Google, Cognito User Pools (CUP)
- AWS Single Sign-On (SSO)
- Integrated with AWS Organizations
- Supports SAML 2.0
Microsoft Active Directory (AD)
- AWS Managed Microsoft AD (trust between on-premises AD and AWS)
- AD Connector (proxy to the on-premises AD)
- Simple AD (standalone, AWS side only)
Section 24: AWS Security & Encryption: KMS, SSM Parameter Store, CloudHSM, Shield, WAF
Type of CMK | Can view | Can manage | Used only for my AWS account |
---|---|---|---|
AWS owned CMK | No | No | No |
AWS managed CMK | Yes | No | Yes |
Customer managed CMK | Yes | Yes | Yes |