AWS Well-Architected Framework and Whitepapers

(I) Overview of AWS:

  • Cloud Computing: It's the on-demand delivery of IT resources and applications via the Internet with pay-as-you-go pricing. Cloud computing provides a simple way to access servers, storage, databases, and a broad set of application services over the Internet. Cloud computing providers such as AWS own and maintain the network-connected hardware required for these application services, while you provision and use what you need via a web application.
    • Types of Cloud Computing:
      • Infrastructure as a Service (IaaS): Infrastructure as a Service, sometimes abbreviated as IaaS, contains the basic building blocks for cloud IT and typically provides access to computers (virtual or on dedicated hardware), data storage space and networking features. e.g. Amazon EC2, Windows Azure, Google Compute Engine, Rackspace.
      • Platform as a Service (PaaS): Platform as a Service removes the need for organizations to manage the underlying infrastructure (usually hardware and operating systems) and allows you to focus on the deployment and management of your applications. e.g. AWS RDS, Elastic Beanstalk, Windows Azure, Google App Engine
      • Software as a Service (SaaS): Software as a Service provides you with a complete product that is run and managed by the service provider. In most cases, people referring to Software as a Service are referring to end-user applications. e.g. Gmail, Microsoft Office 365.
    • Cloud Deployment Models:
      • Cloud: A cloud-based application is fully deployed in the cloud and all parts of the application run in the cloud.
      • Hybrid: A hybrid deployment is a way to connect infrastructure and applications between cloud-based resources and existing on-premises resources.
      • On-premises (private cloud): Deploying resources on-premises, using virtualization and resource management tools, is sometimes called “private cloud”.
  • Advantages:
    • Trade capital expense for variable expense.
    • Benefit from massive economies of scale.
    • Stop guessing about capacity.
    • Increase speed and agility.
    • Stop spending money running and maintaining data centers.
    • Go global in minutes.
  • Security and Compliance:
    • State of the art electronic surveillance and multi factor access control systems.
    • Staffed 24x7 by security guards
    • Access is authorized on a “least privilege basis”
    • SOC 1/SSAE 16/ISAE 3402 (formerly SAS 70 Type II), SOC 2, SOC3
    • FISMA, DIACAP, FedRAMP, PCI DSS Level 1, ISO 27001, ISO 9001, ITAR, FIPS 140-2
    • HIPAA, Cloud Security Alliance (CSA), Motion Picture Association of America (MPAA)

(II) Overview of Security Process:

  • AWS Shared Security Responsibilities Model:
    • It describes what AWS is responsible for and what the customer is responsible for when it comes to security. Amazon is responsible for securing the underlying infrastructure that supports the cloud (i.e. security OF the cloud), and you are responsible for anything you put in the cloud or connect to the cloud (i.e. security IN the cloud).
      • Infrastructure Services:
        • This includes AWS services like VPC, EC2, EBS and Auto Scaling.
        • Amazon is responsible for security of the cloud.
          • The Global Infrastructure (Regions, AZs, Edge Locations)
          • The Foundation Services (Compute, Storage, Database, Networking)
        • The customer is responsible for security in the cloud.
          • Platforms and Applications
          • OS and  Network configs (patching, security groups, network ACLs)
          • Customer Data
          • Customer IAM (password, access keys, permissions)
      • Container Services:
        • These include services like RDS, ECS and EMR.
        • AWS is responsible for:
          • Platforms and Applications
          • OS and network configs
          • The Global Infrastructure (Regions, AZs, Edge Locations)
          • The Foundation Services (Compute, Storage, Database, Networking)
        • The customer is responsible for:
          • Customer Data
          • Customer IAM (password, access keys, permissions)
      • Abstracted Services:
        • These include services like S3, DynamoDB and Lambda.
        • AWS is responsible for:
          • Platforms and Applications
          • OS and network configs
          • The Global Infrastructure
          • The Foundation Services
          • Network traffic protection
        • The customer is responsible for:
          • Customer IAM
          • Data in transit and client-side
        • Additional Services:
          • Data encryption
          • Data integrity
  • AWS Security Responsibilities:
    • Amazon is responsible for protecting the Compute, Storage, Database, Networking and Data Center facilities (i.e. Regions, Availability Zones, Edge Locations) that run all of the services in the AWS cloud.
    • AWS is also responsible for security configuration of its managed services such as RDS, DynamoDB, Redshift, Elastic MapReduce, WorkSpaces.
  • Customer Security Responsibilities:
    • Customer is responsible for Customer Data, IAM, Platform, Applications,  Operating System, Network & Firewall Configuration, Client and Server Side Data Encryption, Network Traffic Protection.
    • IaaS services such as VPC, EC2 and S3 are completely under your control and require you to perform all of the necessary security configuration and management tasks.
    • For Managed Services, AWS is responsible for patching, antivirus, etc.; however, you are responsible for account management and user access. It's recommended that MFA be implemented, that you connect to these services using SSL/TLS, and that API and user activity be logged using CloudTrail.
  • Storage Decommissioning:
    • When a storage device has reached the end of its useful life, AWS procedures include a decommissioning process that is designed to prevent customer data from being exposed to unauthorized individuals.
    • AWS uses the techniques detailed in DoD 5220.22-M (the National Industrial Security Program Operating Manual) or NIST 800-88 (Guidelines for Media Sanitization) to destroy data as part of the decommissioning process.
    • All decommissioned magnetic storage devices are degaussed and physically destroyed in accordance with industry standard practices.
  • Network Security:
    • Transmission Protection: You can connect to AWS  services using HTTP and HTTPS. AWS also offers Amazon VPC which provides a private subnet within AWS cloud, and the ability to use an IPsec VPN connection between AWS VPC and your on-premises data center.
    • Amazon Corporate Segregation: Logically, the AWS Production network is segregated from the Amazon Corporate network by means of a complex set of network security segregation devices.
  • Network Monitoring and Protection:
    • It protects from:
      • DDoS (Distributed Denial of Service)
      • Man in the middle attacks (MITM)
      • Port Scanning
      • Packet Sniffing by other tenants
      • IP Spoofing: AWS-controlled, host-based firewall infrastructure will not permit an instance to send traffic with a source IP or MAC address other than its own.
        • Unauthorized port scans by AWS customers on their EC2 instances are a violation of the AWS Acceptable Use Policy. You may request permission to conduct vulnerability scans as required to meet your specific compliance requirements.
        • You must request a vulnerability scan in advance by submitting an official form and declaring the time period and the instances to be scanned (t2.small and t2.micro instance types are not allowed).
        • These scans must be limited to your own instances and must not violate the AWS Acceptable Use Policy.
  • AWS Credentials:
    • Passwords: Used for AWS root account or IAM user account login to the AWS Management console. AWS passwords must be 6-128 chars.
    • Multi-Factor Authentication (MFA): It's a six-digit single-use code that's required in addition to your password to log in to your AWS root account or IAM user account.
    • Access Keys: Used to digitally sign programmatic requests to AWS APIs (via the AWS SDK, CLI or REST/Query APIs). They consist of an Access Key ID and a Secret Access Key (see the sketch after this list).
    • Key Pairs: Used for SSH login to EC2 instances and for CloudFront signed URLs. A key pair is required to connect to an EC2 instance launched from a public AMI. They are 1024-bit SSH-2 RSA keys. You can have AWS automatically generate a key pair when you launch an EC2 instance, or you can upload your own before launching the instance.
    • X.509 Certificates: Used for digitally signed SOAP requests to AWS APIs (for S3) and as SSL server certificates for HTTPS. You can have AWS create an X.509 certificate and private key, or you can upload your own certificate using the Security Credentials page.
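    • A minimal sketch of the access-key model, assuming boto3 is installed; the key values and region shown are placeholders, and in practice credentials are loaded from ~/.aws/credentials or the environment rather than hard-coded:

        import boto3

        # Every API call made through this session is signed (SigV4) with the access key pair below.
        session = boto3.session.Session(
            aws_access_key_id="AKIAEXAMPLEKEYID",            # placeholder Access Key ID
            aws_secret_access_key="exampleSecretAccessKey",  # placeholder Secret Access Key
            region_name="us-east-1",
        )
        s3 = session.client("s3")
        print([b["Name"] for b in s3.list_buckets()["Buckets"]])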
  • AWS Trusted Advisor: It analyzes your AWS environment and provides best practice recommendations in following five categories:
    • Cost Optimization
    • Performance
    • Security
    • Fault Tolerance
    • Service Limits
    • Available to all customers for access to seven core checks:
      • Security (security groups, IAM MFA on root account, EBS and RDS public snapshots)
      • Performance (service limits)
    • Available to Business and Enterprise support plans:
      • Access to full set of checks
      • Notifications (weekly updates)
      • Programmatic access (retrieve results from AWS Support API)
    • Trusted Advisor inspects your AWS environment and makes recommendations when opportunities may exist to save money, improve system performance or close security gaps.
    • It provides alerts on several of the most common security misconfigurations that can occur, including:
      • Leaving certain ports open that make you vulnerable to hacking and unauthorized access
      • Neglecting to create IAM accounts for your internal users
      • Allowing public access to S3 buckets
      • Not turning on user activity logging (AWS CloudTrail)
      • Not using MFA on your root AWS account.
  • Instance Isolation:
    • Different instances running on the same physical machines are isolated from each other via the Xen hypervisor. In addition, the AWS firewall resides within the hypervisor layer, between the physical network interface and the instance’s virtual interface.
    • All packets must pass through this layer, thus an instance's neighbors have no more access to that instance than any other host on the Internet, and they can be treated as if they are on separate physical hosts. The physical RAM is separated using similar mechanisms.
    • Customer instances have no access to raw disk devices, but instead are presented with virtualized disks. The AWS proprietary disk virtualization layer automatically resets every block of storage used by the customers, so that one customer’s data is never unintentionally exposed to another.
    • In addition, memory allocated to guests is scrubbed (set to zero) by the hypervisor when it is deallocated from a guest. The memory is not returned to the pool of free memory available for new allocations until the memory scrubbing is complete.
  • Guest Operating System: Virtual instances are completely controlled by the customer. You have full root or administrative access over accounts, services and applications. AWS doesn’t have any access rights to your instances or the guest OS.
    • Encryption of sensitive data is generally a good security practice, and AWS provides the ability to encrypt EBS volumes and their snapshots with AES-256 (a sketch follows).
    • In order to be able to do this efficiently and with low latency, the EBS encryption feature is only available on EC2's more powerful instance types (e.g. M3, C3, R3, G2).
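    • A minimal sketch of creating an encrypted EBS volume with boto3; the region and Availability Zone are placeholders, and the AES-256 encryption (plus encryption of any snapshots taken from the volume) is handled transparently by the service:

        import boto3

        ec2 = boto3.client("ec2", region_name="us-east-1")
        volume = ec2.create_volume(
            AvailabilityZone="us-east-1a",   # placeholder AZ
            Size=100,                        # GiB
            VolumeType="gp2",
            Encrypted=True,                  # uses the default AWS-managed key unless KmsKeyId is given
        )
        print(volume["VolumeId"], volume["Encrypted"])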
  • Firewall: Amazon EC2 provides a complete firewall solution; this mandatory inbound firewall is configured in a default deny-all mode, and AWS EC2 customers must explicitly open the ports needed to allow inbound traffic. All ingress traffic is blocked and egress traffic is allowed by default.
  • Elastic Load Balancing: SSL termination on the load balancer is supported. Allows you to identify the originating IP address of a client connecting to your servers, whether you are using HTTPS or TCP load balancing.
  • Direct Connect: Bypass Internet service providers in your network path. You can procure rack space within the facility housing the AWS Direct Connect location and deploy your equipment nearby. Once deployed, you can connect this equipment to AWS Direct Connect using a cross-connect.
    • Using industry standard 802.1q VLANs, the dedicated connection can be partitioned into multiple virtual interfaces. This allows you to use the same connection to access public resources (e.g. S3 buckets) using public IPs, and private resources (e.g. EC2 instances in a VPC) using private IPs, while maintaining network separation between the public and private environments.

(III) AWS Risk and Compliance:

  • Risk: AWS management has developed a strategic business plan which includes risk identification and the implementation of controls to mitigate or manage risks. AWS management re-evaluates the strategic business plan at least biannually.
    • This process requires management to identify risks within its areas of responsibility and to implement appropriate measures designed to address those risks.
    • AWS Security regularly scans all Internet-facing service endpoint IP addresses for vulnerabilities (these scans don't include customer instances). AWS Security notifies the appropriate parties to remediate any identified vulnerabilities. In addition, external vulnerability threat assessments are performed regularly by independent security firms.
    • Findings and recommendations resulting from these assessments are categorized and delivered to AWS leadership. These scans are done to maintain the health and viability of the underlying AWS infrastructure and are not meant to replace the customer's own vulnerability scans required to meet their specific compliance requirements.
    • Customers can request permission to conduct scans of their cloud infrastructure as long as they are limited to the customer’s instances and don’t violate the AWS Acceptable Use Policy.

(IV) Storage Options in the AWS cloud:

(V) Architecting for the AWS Cloud: Best Practices:

  • Business Benefits of Cloud:
    • Almost zero upfront infrastructure investment
    • Just-in-time Infrastructure
    • More efficient resource utilization
    • Usage-based costing
    • Reduced time to market
  • Technical Benefits of Cloud:
    • Automation – Scriptable infrastructure
    • Auto-scaling
    • Proactive scaling
    • More Efficient Development lifecycle
    • Improved Testability
    • Disaster Recovery and Business Continuity
    • Overflow the traffic to the cloud
  • Design For Failure:
    • Rule of thumb: Be a pessimist when designing architectures in the cloud, assume things will fail. In other words always design, implement and deploy for automated recovery from failure.
    • In particular, assume that your hardware or software will fail, outages will occur, some disaster will strike, requests will increase.
  • Decouple Your Components:
    • The key is to build components that don’t have tight dependencies on each other, so that if one component were to die, sleep or remain busy for some reason, the other components in the system are built so as to continue to work as if no failure is happening.
    • In essence, loose coupling isolates the various layers and components of your application so that each component interacts asynchronously with the others and treats them as black boxes.
  • Implement Elasticity:
    • The cloud brings a new concept of elasticity to your applications. Elasticity can be implemented in three ways (a sketch follows this list):
      • Proactive Cyclic Scaling: Periodic scaling that occurs at fixed interval (daily – during working hours, weekly – weekdays, monthly, quarterly)
      • Proactive Event-based Scaling: Scaling just when you are expecting a big surge of traffic requests due to a scheduled business event (new product launch, marketing campaigns, black friday sale)
      • Auto-scaling based on demand: By using a monitoring service, your system can send triggers to take appropriate actions so that it scales up or down based on metrics (CPU utilization or network I/O, for instance).
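      • A minimal sketch of the first and third patterns using boto3, assuming an existing Auto Scaling group; the group name, schedule and target value are illustrative placeholders:

          import boto3

          autoscaling = boto3.client("autoscaling", region_name="us-east-1")

          # Proactive cyclic scaling: scale out every weekday morning before working hours (UTC).
          autoscaling.put_scheduled_update_group_action(
              AutoScalingGroupName="web-asg",                  # placeholder group name
              ScheduledActionName="scale-out-weekday-mornings",
              Recurrence="0 8 * * 1-5",                        # cron-style recurrence
              MinSize=4,
              MaxSize=12,
              DesiredCapacity=6,
          )

          # Auto-scaling based on demand: keep average CPU utilization around 50%.
          autoscaling.put_scaling_policy(
              AutoScalingGroupName="web-asg",
              PolicyName="cpu-target-tracking",
              PolicyType="TargetTrackingScaling",
              TargetTrackingConfiguration={
                  "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
                  "TargetValue": 50.0,
              },
          )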

(VI) AWS Well-Architected Framework:

  • Well-Architected Framework is a set of questions that you can use to evaluate how well your architecture is aligned to AWS best practices. It consists of 5 pillars: Security, Reliability, Performance Efficiency, Cost Optimization and Operational Excellence.
  • General Design Principles:
    • Stop guessing your capacity needs.
    • Test systems at production scale.
    • Automate to make architectural experimentation easier.
    • Allow for evolutionary architectures.
    • Data-Driven architectures
    • Improve through game days (such as black Friday)
  • (1) Operational Excellence: It includes the operational practices and procedures used to manage production workloads. In addition, it covers how planned changes are executed, as well as responses to unexpected operational events.
    • Design Principles:
      • Perform operations with code
      • Annotate documentation
      • Make frequent, small, reversible changes
      • Refine operations procedures frequently
      • Anticipate failure
      • Learn from operational failures
    • Best Practices:
      • Prepare: AWS Config and rules can be used to create standards for workloads and to determine if environments are compliant with those standards before being put into production. 
        • Operational Priorities
        • Design for Operations
        • Operational Readiness
      • Operate: Amazon CloudWatch allows you to watch operational health of a workload.
        • Understanding Operational Health
        • Responding to Events
      • Evolve: Amazon  Elasticsearch Service (Amazon ES) allows you to analyze your log data to gain actionable insight quickly and securely.
        • Learning from Experience
        • Share Learnings
  • (2) Security: It  include the ability to protect information, systems and assets while delivering business value through risk assessments and mitigation strategies.
    • Design Principles:
      • Implement a strong identity foundation
      • Enable traceability
      • Apply security at all layers
      • Automate security best practices
      • Protect data in transit and at rest
      • Prepare for security events
    • Best Practices:
      • Identity and Access management
      • Detective Controls
      • Infrastructure Protection
      • Data Protection
      • Incident Response
  • (3) Reliability: It covers the ability of a system to recover from service or infrastructure outages/disruptions, as well as the ability to dynamically acquire computing resources to meet demand and mitigate disruptions such as misconfigurations or transient network issues.
    • Design Principles:
      • Test recovery procedures
      • Automatically recover from failure
      • Scale horizontally to increase aggregate system availability
      • Stop guessing capacity
      • Manage change in automation
    • Best Practices:
      • Foundations
      • Change Management
      • Failure Management
  • (4) Performance Efficiency: It focuses on how to use computing resources efficiently to meet your requirements and how to maintain that efficiency as demand changes and technology evolves.
    • Design Principles:
      • Democratize advanced technologies
      • Go global in minutes
      • Use serverless architectures
      • Experiment more often
      • Mechanical sympathy
    • Best Practices:
      • Selection
        • Compute
        • Storage
        • Database
        • Network
      • Review
      • Monitoring
      • Trade-Offs
  • (5) Cost Optimization: The Cost Optimization pillar includes the ability to avoid or eliminate unneeded cost or suboptimal resources.
    • Design Principles:
      • Adopt a consumption model
      • Measure overall efficiency
      • Stop spending money on data center operations
      • Analyze and attribute expenditure
      • Use managed services to reduce cost of ownership
    • Best Practices:
      • Cost-Effective Resources:
        • Appropriately Provisioned
        • Right Sizing
        • Purchasing Options
        • Geographic Selection
        • Managed Services
      • Matching Supply and Demand:
        • Demand-Based
        • Buffer-Based
        • Time-Based
      • Expenditure Awareness:
        • Stakeholders
        • Visibility and Controls
        • Cost Attribution
        • Tagging
        • Entity Lifecycle Tracking
      • Optimizing Over Time:
        • Measure, Monitor, and Improve
        • Staying Ever Green

AWS Monitoring, Management and Deployment Services

  1. CloudWatch (Monitoring):
    • It's AWS's proprietary, integrated performance monitoring service. It allows for comprehensive and granular monitoring of all AWS provisioned resources, with the added ability to trigger alarms/events based off metric thresholds.
    • It monitors operational and performance metrics for your AWS services (EC2, EBS, ELB and S3) and your applications.
    • You monitor your environment by configuring and viewing CloudWatch metrics.
    • Alarms can be created to trigger alerts based on thresholds you set on metrics.
    • You can create CloudWatch alarms to stop and start instances based off of status check results.
    • Auto Scaling heavily utilizes CloudWatch, relying on thresholds and alarms to trigger the addition or removal of instances from an Auto Scaling group.
    • Metrics are specific to each AWS service or resource, and such metrics include:
      • EC2 per-instance metrics: CPUUtilization, CPUCreditUsage
      • S3 Metrics: NumberOfObjects, BucketSizeBytes
      • ELB Metrics: RequestCount, UnhealthyHostCount
    • Detailed vs Basic level monitoring:
      • Basic/Standard (5 mins): With basic monitoring of EC2 instances, the CPU load, disk I/O and network I/O metrics are collected at 5-minute intervals and stored for 2 weeks.
      • Detailed (1 min): Data is available in 1-minute periods for an additional fee.
    • CloudWatch EC2 monitoring:
      • System (Hypervisor) status checks: Things that are outside of our control.
        • Loss of network connectivity or system power.
        • Hardware or Software issues on physical host
        • How to solve: Generally restarting the instance will fix the issue. This causes the instance to launch on different physical hardware.
      • Instance status checks: Software issues that we control.
        • Failed system status checks.
        • Misconfigured networking or startup configuration
        • Exhausted memory, corrupted filesystem or incompatible kernel
        • How to solve: Generally a reboot or solving the file system configuration issues.
      • Default metrics: CloudWatch will automatically monitor metrics that can be viewed at the host level (not the software level), such as:
        • CPU Utilization, DiskReadOps, DiskWriteOps, Network In/Out, StatusCheckFailed, Instance/System.
      • Custom metrics: OS-level metrics that require a third-party script (Perl, provided by AWS) to be installed (a put_metric_data sketch follows this list):
        • Memory utilization, memory used and available.
        • Disk swap utilization
        • Disk space utilization, disk space used and available.
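        • As an alternative to the AWS-provided scripts, a minimal sketch of publishing an OS-level memory metric with boto3; the namespace and instance ID are illustrative placeholders:

          import boto3

          cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
          cloudwatch.put_metric_data(
              Namespace="Custom/System",   # illustrative custom namespace
              MetricData=[{
                  "MetricName": "MemoryUtilization",
                  "Dimensions": [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
                  "Unit": "Percent",
                  "Value": 72.5,           # value gathered at the OS level
              }],
          )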
    • Alarms: Allow you to set alarms that notify you when particular thresholds are hit (a sketch follows this list).
      • Using CloudWatch alarm actions, you can create alarms that automatically stop, terminate, reboot or recover your EC2 instances.
      • You can use stop or terminate actions to help you save money when you no longer need an instance to be running.
      • You can use the reboot and recover actions to automatically reboot those instances or recover them onto new hardware if a system impairment occurs.
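      • A minimal sketch of an alarm that automatically recovers an instance when its system status check fails, assuming boto3 and a placeholder instance ID; the recover action uses the documented arn:aws:automate:<region>:ec2:recover ARN:

          import boto3

          cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
          cloudwatch.put_metric_alarm(
              AlarmName="recover-web-server",
              Namespace="AWS/EC2",
              MetricName="StatusCheckFailed_System",
              Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
              Statistic="Maximum",
              Period=60,
              EvaluationPeriods=2,
              Threshold=1.0,
              ComparisonOperator="GreaterThanOrEqualToThreshold",
              AlarmActions=["arn:aws:automate:us-east-1:ec2:recover"],  # built-in recover action
          )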
    • Events: Helps you to respond to state changes in your AWS resources. When your resources change state they automatically send events into an event stream. You can create rules that match selected events in the stream and route them to targets to take action. You can also use rules to take action on a pre-determined schedule. For example, you can configure rules to:
      • Automatically invoke a Lambda function to update DNS entries when an event notifies you that an Amazon EC2 instance enters the running state (a sketch of this rule follows this list)
      • Direct specific API records from CloudTrail to a Kinesis stream for detailed analysis of potential security or availability risks
      • Take a snapshot of an EBS volume on a schedule
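      • A minimal sketch of the first example above, assuming boto3; the rule name and Lambda function ARN are placeholders, and the function must also grant CloudWatch Events permission to invoke it (lambda add_permission):

          import json
          import boto3

          events = boto3.client("events", region_name="us-east-1")

          # Rule that fires whenever any EC2 instance transitions to "running".
          events.put_rule(
              Name="ec2-running-to-dns",
              EventPattern=json.dumps({
                  "source": ["aws.ec2"],
                  "detail-type": ["EC2 Instance State-change Notification"],
                  "detail": {"state": ["running"]},
              }),
              State="ENABLED",
          )

          # Route matching events to a Lambda function that updates DNS entries.
          events.put_targets(
              Rule="ec2-running-to-dns",
              Targets=[{"Id": "update-dns",
                        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:update-dns"}],  # placeholder
          )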
    • Logs: Helps you to aggregate, monitor, and store logs from EC2 instances, CloudTrail and other sources. e.g:
      • Monitor HTTP response codes in Apache logs
      • Receive alarms for errors in kernel logs
      • Count exceptions in application logs
    • Retention period:
      • 1 min datapoints are available for 15 days
      • 5 mins datapoints are available for 63 days
      • 1 hour datapoints are available for 455 days
    • VPC Flow Logs: Allow you to collect information about the IP traffic going to and from network interfaces in your VPC (a sketch of enabling them follows this list).
      • VPC Flow Log data is stored in a log group in CloudWatch and can be accessed from Logs in CloudWatch.
      • Flow logs can be created on:
        • VPC
        • Subnet (i.e include all network interfaces in it)
        • Network Interface and each interface have its own unique log stream.
      • The logs can be set on accepted, rejected or all traffic.
      • Flow logs are not captured in real time; data is captured in an approximately 10-minute window and then published.
      • They can be used to troubleshoot why certain traffic is not reaching an EC2 instance.
      • A VPC Flow Log record captures the specific 5-tuple of network traffic:
        1. Source IP address
        2. Source port number
        3. Destination IP address
        4. Destination port number
        5. Protocol
      • The following traffic is not captured by VPC Flow Logs:
        • Traffic between EC2 instances and AWS DNS server
        • Instance metadata (169.254.169.254) requests
        • DHCP traffic
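      • A minimal sketch of enabling flow logs on a VPC with boto3; the VPC ID, log group name and IAM role ARN are placeholders (the role must allow delivery to CloudWatch Logs):

          import boto3

          ec2 = boto3.client("ec2", region_name="us-east-1")
          ec2.create_flow_logs(
              ResourceType="VPC",
              ResourceIds=["vpc-0123456789abcdef0"],     # placeholder VPC ID
              TrafficType="ALL",                         # ACCEPT, REJECT or ALL
              LogGroupName="vpc-flow-logs",              # CloudWatch Logs log group
              DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-role",  # placeholder
          )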
  2. CloudTrail (Auditing):
    • It's an auditing service for compliance, which logs all API calls made via the AWS CLI, SDKs or console to your AWS resources. It provides centralized logging (stored in S3) so that we can log each action taken in our environment and store it for later use if needed.
    • With CloudTrail, you can view events for your AWS account. Create a trail to retain a record of these events. With a trail, you can also create event metrics, trigger alerts, and create event workflows.
    • The recorded logs include the following information:
      • identity of the user
      • start time of the AWS API call
      • source IP address
      • request parameters
      • response elements returned by the service.
    • CloudTrail logs are placed into a designated S3 bucket and are encrypted by default using SSE-S3 (AES-256) keys, but you may use an SSE-KMS key as well. In addition, you can apply S3 lifecycle rules to archive CloudTrail logs into Glacier or delete them. You may also set up SNS notifications about log delivery and validation.
    • CloudTrail logs help to address security concerns by allowing you to view what actions users on your AWS account have performed.
    • Since AWS is just one big API, CloudTrail can log every single action taken in your account.
    • You can now turn on a trail across all regions in your AWS account. CloudTrail will deliver log files from all regions to the S3 bucket and optional CloudWatch Logs log group you specify.
    • Additionally, when AWS launches a new region, CloudTrail will create the same trail in the new region. As a result, you will receive the log files containing API activity for the new region without taking any action.
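    • A minimal sketch of creating a multi-region trail and querying recent API activity with boto3; the trail and bucket names are placeholders, and the bucket must already have a policy that allows CloudTrail to write to it:

        import boto3

        cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

        # Record API calls from all regions into one S3 bucket.
        cloudtrail.create_trail(Name="org-audit-trail", S3BucketName="my-cloudtrail-bucket", IsMultiRegionTrail=True)
        cloudtrail.start_logging(Name="org-audit-trail")

        # Look up recent events for a particular API action.
        result = cloudtrail.lookup_events(
            LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "RunInstances"}],
            MaxResults=10,
        )
        for event in result["Events"]:
            print(event["EventTime"], event.get("Username"), event["EventName"])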
  3. AWS Config:
    • AWS Config provides an inventory of your AWS resources and a history of configuration changes to these resources. You can use AWS Config to define rules that evaluate these configurations for compliance.
    • AWS Config automatically discovers your AWS resources and starts recording configuration changes. You can create Config rules from a set of pre-built managed rules to get started.
    • You can configure any of the pre-built rules to suit your needs, or create your own rules using AWS Lambda to check configurations for compliance.
    • AWS Config continuously records configuration changes to resources and automatically evaluates these changes against relevant rules. You can use a dashboard to assess overall configuration compliance.
    • Evaluate resource configurations for desired settings
    • Get a snapshot of the current configurations associated with your account
    • You can retrieve current and historical configurations of resources in your account
    • Retrieve a notification for creations, deletions, and modifications
    • View relationships between resources (e.g. member of a security group)
    • Administering resources: Notification when a resource violates Config rules (e.g. a user launches an EC2 instance in a region).
    • Auditing and Compliance: Historical records of configs are sometimes needed in auditing.
    • Configuration management and troubleshooting: Configuration changes on one resource might affect others, can help find these issues quickly and restore last known good configurations.
    • Security Analysis: Allows for historical records of IAM policies and security group configurations. e.g what permissions a user had at the time of an issue.
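    • A minimal sketch of enabling one pre-built managed rule with boto3, assuming the configuration recorder is already running; ROOT_ACCOUNT_MFA_ENABLED is an AWS managed rule identifier, and the rule name is arbitrary:

        import boto3

        config = boto3.client("config", region_name="us-east-1")

        # Check whether the root account has MFA enabled.
        config.put_config_rule(
            ConfigRule={
                "ConfigRuleName": "root-mfa-enabled",
                "Source": {"Owner": "AWS", "SourceIdentifier": "ROOT_ACCOUNT_MFA_ENABLED"},
            }
        )

        # Later, check compliance results for that rule.
        result = config.describe_compliance_by_config_rule(ConfigRuleNames=["root-mfa-enabled"])
        print(result["ComplianceByConfigRules"])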
  4. Systems Manager:
    • With Systems Manager you can gain operational insight and take action on AWS resources. View operational data for groups of resources, so you can quickly identify and act on any issues that might impact applications that use those resources.
    • It allows for grouping resources and performing automation, patching, and running commands.
    • Resource Group: Allows you to group your resources logically (e.g PRD, STG, TST). Make sense out of your AWS footprint by grouping your resources into applications.
    • Insights & Dashboard: View account-level and group-related insights through operational dashboards. Aggregates CloudTrail, CloudWatch, TrustedAdvisor, and more into a single dashboard for each resource group.
    • Software Inventory: Collect software catalog and configuration for your instances. A listing of your instances and software installed on them. Can collect data on applications, files, network configs , services and more.
    • Automations: Use built-in automations or build your own to accomplish complex operational tasks at scale. Automate IT operations and management tasks through scheduling, triggering from an alarm or directly.
    • Run Command: Safe and secure remote execution across instances at scale without SSH or PowerShell. Secure remote management replacing need for bastion hosts or SSH.
    • Patch Manager: Helps deploy OS and software patches across EC2 or on-premises instances.
    • Maintenance Window: Allows for scheduling administrative and maintenance tasks.
    • State Manager and Parameter Store: Centralized hierarchical store for managing secrets or plain-text data. Used for configuration management.
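    • A minimal Run Command sketch with boto3; the instance ID is a placeholder and must be a managed instance (SSM agent installed plus an instance profile that allows Systems Manager):

        import boto3

        ssm = boto3.client("ssm", region_name="us-east-1")

        # Execute a shell command remotely, without SSH, via the AWS-RunShellScript document.
        response = ssm.send_command(
            InstanceIds=["i-0123456789abcdef0"],       # placeholder managed instance
            DocumentName="AWS-RunShellScript",
            Parameters={"commands": ["uptime", "df -h"]},
        )
        command_id = response["Command"]["CommandId"]

        # Fetch the output once the command has finished.
        output = ssm.get_command_invocation(CommandId=command_id, InstanceId="i-0123456789abcdef0")
        print(output["Status"], output["StandardOutputContent"])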
  5. CloudFormation:
    • CloudFormation allows you to quickly and easily deploy your infrastructure resources and applications on AWS. It allows you to turn infrastructure into code. This provides numerous benefits including quick deployments, infrastructure version control, and disaster recovery solutions.
    • You can convert your architecture into a JSON-formatted template, and that template can be used to deploy updated or replicated copies of that architecture into multiple regions.
    • It automates deployments and saves time by deploying architecture in multiple regions.
    • It can be used to version control your infrastructure. Allowing for rollbacks to previous versions of your infrastructure if a new version has issues.
    • Allows for backups of your infrastructure and its a great solution for disaster recovery.
    • There are no additional charges for CloudFormation. You only get charged for underlying resources that are created using CloudFormation template.
    • Stack: A stack is a group of related resources that you manage as a single unit. You can use one of the templates provided by AWS to get started quickly with applications like WordPress or Drupal or create your own template.
    • StackSet: A StackSet is a container for CloudFormation stacks that lets you provision stacks across AWS accounts and regions by using a single AWS CloudFormation template.
    • Template: Templates tell CloudFormation which AWS resources to provision and how to provision them. When you create a CloudFormation stack, you must submit a template (a minimal example follows this list).
      • If you already have AWS resources running, the CloudFormer tool can create a template from your existing resources. This means you can capture and redeploy applications you already have running.
      • To build and view templates, you can use the drag-and-drop tool called CloudFormation Designer. You drag-and-drop the resources that you want to add to your template and drag lines between resources to create connections.
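      • A minimal sketch of an inline template deployed with boto3; the stack name is a placeholder, and the same JSON body could instead be saved to a file and uploaded through the console:

          import json
          import boto3

          # A tiny template that provisions a single, versioned S3 bucket.
          template = {
              "AWSTemplateFormatVersion": "2010-09-09",
              "Resources": {
                  "LogsBucket": {
                      "Type": "AWS::S3::Bucket",
                      "Properties": {"VersioningConfiguration": {"Status": "Enabled"}},
                  }
              },
              "Outputs": {"BucketName": {"Value": {"Ref": "LogsBucket"}}},
          }

          cloudformation = boto3.client("cloudformation", region_name="us-east-1")
          cloudformation.create_stack(StackName="logs-bucket-stack", TemplateBody=json.dumps(template))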
  6. Elastic Beanstalk:
    • With Elastic Beanstalk, you can deploy, monitor, and scale an application quickly and easily.
    • Elastic Beanstalk is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS.
    • It allows for quick creation of simple single tier application infrastructure, and the deployment of code out into that infrastructure.
    • It's designed to make it easy to deploy less complex applications.
    • This helps to reduce the management required for building and deploying applications.
    • It's used to deploy simple, single-tier applications that take advantage of core services such as: EC2, Auto Scaling, ELB, RDS, SQS, CloudFront
    • It can quickly provision an AWS environment that requires little to no management, provided the application fits within the parameters of the Beanstalk service.
    • Can deploy from repositories or from uploaded code files.
    • Easily update applications by uploading new code files or requesting a pull from a repository.
    • It can be used to host Docker containers and supports the deployment of web application from Docker containers.
    • It stores your application files and optionally server log files in S3. You may configure (by editing environment config settings) Elastic Beanstalk to copy your server log files every hour to S3.
  7. WorkSpaces
    • Amazon WorkSpaces is a fully managed, secure Desktop-as-a-Service (DaaS) solution which runs on AWS.
    • With Amazon WorkSpaces, you can easily provision virtual, cloud-based Microsoft Windows 7 Experience (provided by Windows Server 2008 R2) desktops for your users, providing them access to the documents, applications, and resources they need, anywhere, anytime, from any supported device.
    • With Amazon WorkSpaces, you pay either monthly or hourly just for the Amazon WorkSpaces you launch, which helps you save money when compared to traditional desktops and on-premises Virtual Desktop Infrastructure (VDI) solutions.
    • You don't need an AWS account to log in to WorkSpaces, and you are given local administrator access by default.
    • WorkSpaces are persistent, and all data on the D:\ drive is backed up every 12 hours.
  8. OpsWorks:
    • AWS OpsWorks is a configuration management service that helps you build and operate highly dynamic applications, and propagate changes instantly.
    • AWS OpsWorks provides three solutions to configure your infrastructure:
      • OpsWorks Stacks: Define, group, provision, deploy, and operate your applications in AWS by using Chef in local mode.
        • It lets you manage applications and servers on AWS and on-premises.
        • With OpsWorks Stacks, you can model your application as a stack containing different layers, such as load balancing, database, and application servers. You can deploy and configure EC2 instances in each layer or connect other resources such as RDS databases.
      • OpsWorks for Chef Automate: Create Chef servers that include Chef Automate premium features, and use the Chef DK or any Chef tooling to manage them.
      • OpsWorks for Puppet Enterprise: Create Puppet servers that include Puppet Enterprise features. Inspect, deliver, update, monitor, and secure your infrastructure.

AWS Application Integration Services

  1. Simple Notification Service (SNS): It's a flexible, fully managed pub/sub messaging and mobile notifications service for coordinating the delivery of messages to subscribing endpoints and clients.
    • SNS is a push-based messaging service, whereas SQS is poll-based.
    • It's an integrated notification service that allows for sending messages to various endpoints. Generally these messages are used for alert notifications to sysadmins or to create automation.
    • It's integrated into many AWS services, so it's very easy to set up notifications based on events that occur in those services.
    • With CloudWatch and SNS, a full-environment monitoring solution can be created that notifies administrators of alerts , capacity issues, downtime, changes in the environment and more.
    • This service can also be used for publishing iOS/Android app notifications and creating automation based off of notifications.
    • In SNS there are two types of clients – Publishers (producers) and Subscribers (consumers). Publishers communicate asynchronously with subscribers by producing and sending a message to a Topic, which is a logical access point and communication channel.  Subscribers consume or receive the message or notification over one of the supported protocols when they are subscribed to  the topic.
    • Topic: It's a group of subscriptions that you send messages to. It's a logical access point and communication channel.
    • Subscription: Also known as a consumer (subscriber), it is an endpoint to which a message is sent. Available endpoints are:
      • HTTP, HTTPS, Email, Email-JSON, Application, SQS, Lambda, SMS
    • Publisher: Also known as a producer, it is an entity that triggers the sending of a message. SNS enables you to publish notifications to all subscriptions associated with a topic as well as to an individual endpoint associated with a platform application. e.g:
      • Human, S3 Event, Cloudwatch Alarm.
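    • A minimal publish/subscribe sketch with boto3; the topic name and e-mail address are placeholders, and an e-mail subscription must be confirmed by the recipient before messages are delivered:

        import boto3

        sns = boto3.client("sns", region_name="us-east-1")

        # Create the topic (logical access point) and subscribe an endpoint to it.
        topic_arn = sns.create_topic(Name="ops-alerts")["TopicArn"]
        sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="ops-team@example.com")  # placeholder

        # A publisher pushes a message to the topic; SNS fans it out to every subscription.
        sns.publish(
            TopicArn=topic_arn,
            Subject="High CPU utilization",
            Message="CPUUtilization exceeded 90% on i-0123456789abcdef0",  # placeholder instance
        )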
  2. Simple Queue Service (SQS): It's a reliable, scalable, fully managed message queuing service. It provides the ability to have hosted and highly available queues that can be used for messages being sent between servers.
    • It allows for highly available, distributed, decoupled application architectures. This is accomplished by using messages and queues; messages are retrieved by polling, so it's a pull-based system.
    • Message Retention: Messages (256 KB of text in any format) can be kept in the queue from 1 minute to 14 days; the default is 4 days.
    • Visibility Timeout is the duration of time (max 12 hrs) a message is hidden from other consumers once it has been read by a consumer, so that the message can't be read again while it is being processed.
    • Message Delay is set if you want to configure an individual message delay of up to 15 minutes. It helps when you need to schedule jobs with a delay.
    • In-flight messages are the ones that have been received/read from the queue by a consumer app but not yet deleted from the queue.
    • SQS can be used with RedShift, DynamoDB, EC2, ECS, RDS , S3 and Lambda to make distributed decoupled applications.
    • You can use IAM policies to control who can read/write message from/to an SQS queue.
    • Server-side encryption (SSE) using KMS managed keys, lets you transmit sensitive data in encrypted queues.
    • Standard CloudWatch metrics (5 mins) for your SQS queues are automatically collected and pushed to CloudWatch, whereas Detailed monitoring (1 min) is not available currently.
    • CloudTrail can be used to collect information about SQS, such as which requests are made to SQS, the source IP, and the timestamp.
    • Many EC2 instances can poll a single queue but to keep multiple instances from processing the same SQS message, your application must delete the SQS message after processing it.
    • SQS Long polling doesn’t return a response until a message arrives in the queue, reducing your overall cost over time. Short polling will return empty responses.
    • Polling (Message retrieval) Types:
      • Short Polling: It's the default for SQS. A request is returned immediately even if the queue is empty.
        • It doesn’t wait for messages to appear in the queue
        • It queries only a subset of available servers for messages
        • Increases API requests (over long polling) which increases costs.
        • ReceiveMessageWaitTimeSeconds is set to 0.
      • Long Polling (1-20 secs timeout): It's preferred over short polling because it uses fewer requests and reduces cost by eliminating false empty responses by querying all the servers.
        • It reduces the number of empty responses by allowing SQS to wait until a message is available before sending a response, or until the connection times out (1-20 secs). It queries all of the servers and returns messages as soon as any are available.
        • Long polling reduces API requests (over using short polling).
        • You can enable long polling by setting the ReceiveMessageWaitTimeSeconds value greater than 0 in the AWS console.
        • Don’t use long polling if your app expects an immediate response to receive message calls.
    • Queue types:
      • Standard Queue:
        • Unlimited throughput: Standard queues support a nearly unlimited number of transactions per second (TPS) per API action.
        • At-Least-Once Delivery: A message is delivered at least once, but occasionally more than one copy of a message is delivered.
        • Best-Effort Ordering: Occasionally, messages might be delivered in an order different from which they were sent.
        • Send data between applications when the throughput is important, for example:
          • Decouple live user requests from intensive background work: let users upload media while resizing or encoding it.
          • Allocate tasks to multiple worker nodes: process a high number of credit card validation requests.
          • Batch messages for future processing: schedule multiple entries to be added to a database.
      • First in First Out (FIFO) Queue:
        • High Throughput: FIFO queues support up to 300 messages per second (300 send, receive, or delete operations per second). When you batch 10 messages per operation (maximum), FIFO queues can support up to 3,000 messages per second. To request a limit increase, file a support request.
        • First-ln-First-out Delivery: The order in which messages are sent and received is strictly preserved.
        • Exactly-Once Processing: A message is delivered once and remains available until a consumer processes and deletes it. Duplicates are not introduced into the queue.
        • Send data between applications when the order of events is important, for example:
          • Ensure that user-entered commands are executed in the right order.
          • Display the correct product price by sending price modifications in the right order.
          • Prevent a student from enrolling in a course before registering for an account.
    • SQS Workflow: Generally a “worker” instance will poll the queue to retrieve waiting messages for processing. Auto Scaling can be applied based off of queue size, so that if a component of your application has an increase in demand, the number of worker instances can increase.
    • SQS Message: A set of instructions that will be relayed to the worker instances via the SQS queue. A message can contain up to 256 KB of text in any format. Each message is guaranteed to be delivered at least once, but order is not guaranteed and duplicates can occur.
    • SQS Queue: A single request can retrieve 1-10 messages, up to a total payload of 256 KB. Messages can be stored in the queue from 1 minute up to 14 days (default is 4 days) and are retrieved through polling. Queues allow components of your application to work independently of each other (a decoupled environment). A minimal send/receive sketch follows.
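    • A minimal producer/consumer sketch with boto3, using long polling; the queue name and message body are placeholders, and the consumer must delete each message after processing so no other worker re-processes it:

        import boto3

        sqs = boto3.client("sqs", region_name="us-east-1")

        # Standard queue with long polling (20 s) and a 4-day retention period.
        queue_url = sqs.create_queue(
            QueueName="work-queue",   # placeholder
            Attributes={"ReceiveMessageWaitTimeSeconds": "20", "MessageRetentionPeriod": "345600"},
        )["QueueUrl"]

        # Producer: send a message (instructions for a worker).
        sqs.send_message(QueueUrl=queue_url, MessageBody='{"job": "resize-image", "key": "photo.jpg"}')

        # Consumer: long-poll for work, process it, then delete it.
        response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20)
        for message in response.get("Messages", []):
            print("processing:", message["Body"])
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])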
  3. Simple Workflow Service (SWF): It's a fully managed workflow service that coordinates and manages the execution of activities. It manages a specific job from start to finish, while still allowing for a distributed, decoupled architecture.
    • It allows an architect/developer to implement distributed, asynchronous applications as workflows.
    • It has consistent execution (which can last up to 1 year) and guarantees the order in which tasks are executed (without duplicates).
    • SWF helps developers build, run, and scale background jobs that have parallel or sequential steps. You can think of SWF as a fully-managed state tracker and task coordinator in the Cloud.
      • If your app’s steps take more than 500 milliseconds to complete, you need to track the state of processing, and you need to recover or retry if a task fails, SWF can help you.
      • Maintains distributed app state
      • Tracks workflow execution
      • Ensures consistency of execution history
      • Provides visibility into executions
      • Holds and dispatches tasks
      • Provides control over task distribution
      • Retains workflow execution history
    • Workflow: A sequence of steps required to perform a task, also commonly referred to as a decider. It coordinates and manages the execution of activities that can be run asynchronously across multiple computing devices.
    • Activities: A single step (or unit of work) in the workflow.
    • Tasks: What interacts with the workers that are part of a workflow.
      • Activity task: Tells the worker to perform a function.
      • Decision task: Tells the decider the state of the workflow execution, by communicating (back to the decider) that a given task has been completed, which allows it to determine the next activity to be performed.
    • Worker: Responsible for receiving a task and taking action on it.
      • Can be any type of component such as an EC2 instance or even a person.
    • Actors:
      • Workflow Starters: An application that initiates/starts the workflow.
      • Deciders: Controls the flow of activity tasks in a workflow execution.
      • Activity Workers: Carry out the activity tasks.
  4. API Gateway:
    • It's a serverless component for managing access to APIs.
    • Its a fully-managed service that acts as a front door to your back-end services by allowing you to create and manage your own APIs for your application.
    • It helps developers to create and manage APIs to back-end systems running on Amazon EC2, AWS Lambda, or any publicly addressable web service.
    • With Amazon API Gateway, you can generate custom client SDKs for your APIs, to connect your back-end systems to mobile, web, and server applications or services.
    • All of the endpoints created with API Gateway are HTTPS.
    • AWS Query API provides HTTP or HTTPS requests that use the HTTP verb GET or POST and a Query parameter named Action.
    • Streamline API development: API Gateway lets you simultaneously run multiple versions and release stages of the same API, allowing you to quickly iterate, test, and release new versions.
    • Performance at scale: API Gateway helps you improve performance by managing traffic to your existing back-end systems, throttling API call spikes, and enabling result caching.
    • SDK generation: API Gateway can generate client SDKs for JavaScript, iOS, and Android, which you can use to quickly test new APIs from your applications and distribute SDKs to third-party developers.
    • It allows you to:
      • Build RESTful API with Resources, Methods (GET, POST) and Settings
      • Deploy APIs to a Stage (different environment i.e Dev, STG or PRD)
      • Create a new API version by cloning existing one and roll back to previous API deployments
      • Set throttling rules based on the number of requests per second; requests over the limit are throttled (HTTP 429 response)
      • Security using Signature v.4 to sign and authorize API calls, temporary credentials generated through Cognito or STS.
    • Benefits:
      • Ability to cache API responses
      • DDoS protection and reduced latency through CloudFront, by throttling requests (throttling is used to protect back-end systems from traffic spikes).
      • SDK generation for iOS, Android and JavaScript
      • Supports swagger and request/response data transformation
      • Robust, secure and scalable access to backend APIs and hosts multiple versions and release stages of your APIs.
      • Create and distribute API Keys to developers
      • Uses AWS SigV4 to authorize access to APIs (a signing sketch follows this list)
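    • A minimal sketch of calling an IAM-authorized API Gateway endpoint with a SigV4-signed request, using botocore's signing helpers and the standard library; the endpoint URL is a placeholder:

        import urllib.request
        import boto3
        from botocore.auth import SigV4Auth
        from botocore.awsrequest import AWSRequest

        credentials = boto3.Session().get_credentials()
        url = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/orders"  # placeholder endpoint

        # Sign the request against the "execute-api" service, then send it with the signed headers.
        request = AWSRequest(method="GET", url=url)
        SigV4Auth(credentials, "execute-api", "us-east-1").add_auth(request)
        signed = urllib.request.Request(url, headers=dict(request.headers.items()))
        with urllib.request.urlopen(signed) as response:
            print(response.status, response.read().decode())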
    • API Gateway Cache: It will cache API responses so that duplicate API requests don't have to hit your back-end applications.
      • This reduces load on back-end apps and speeds up response time.
      • You can configure a cache key and Time to Live (TTL) for the API response
      • Caching can be setup on a per API or per stage basis.
    • Monitoring:
      • Monitoring through dashboard: Amazon API Gateway provides you with a REST API dashboard to visually monitor calls to the services.
      • Monitoring through CloudWatch: It can monitor API Gateway activity and usage; throttling rules are also monitored through it
        • Monitoring metrics include Caching, Latency and Detected errors
        • You can create alarms based on these metrics
      • CloudTrail with API Gateway: AWS API Gateway is integrated with CloudTrail to give a full auditable history of the changes to your REST APIs.
    • Cross-Origin Resource Sharing (CORS): CORS is simply allowing the sharing of resources between different domains.
      • It's a mechanism that allows restricted resources (e.g. fonts) on a web page to be requested from another domain outside the domain from which the first resource was served.
      • When your API resources receive requests from a domain other than the API's own domain, you must enable cross-origin resource sharing (CORS) for selected methods on the resources.
      • Same Origin Policy: Its an important concept in the web application security model. Under the policy, a web browser permits scripts contained in a first web page to access data in a second web page, but only if both web pages have the same origin (domain).
      • Error: “Origin policy cannot be read at the remote resource” or “The same-origin policy disallows reading the remote resource…”
      • Solution: You need to enable CORS on API Gateway.
    • Permissions: You control access to API Gateway with IAM permissions by controlling access to the following two API Gateway component processes:
      • To create, deploy, and manage an API in API Gateway, you must grant the API developer permissions to perform the required actions supported by the API management component of API Gateway.
      • To call a deployed API or to refresh the API caching, you must grant the API caller permissions to perform the required IAM actions supported by the API execution component of API Gateway.

AWS Security, Identity & Compliance Services

Amazon Web Services (AWS): AWS is made up of Regions, which are groupings of independent, geographically separated data centers known as Availability Zones.

  • Regions: A grouping of AWS resources located in a specific geographic region. Designed to serve AWS customers that are located closest to a region. Regions are comprised of multiple Availability Zones.
    • Data in a specific region is not replicated into another region by default.
    • Use regions to manage compliance with regulations and manage network latency.
    • There are 16 regions currently in different countries across the world, and each region has two or more Availability Zones.
    • Availability Zones: Geographically isolated zones within a region that house AWS resources. Availability Zones are where separate, physical AWS data centers are located. Multiple AZs in each region provide redundancy for AWS resources in that region.
      • There are at least two AZs in a region, and they are connected through high-speed, low-latency links (LAN-type connectivity).
      • AZs are designed for fault isolation, but it's up to the user to configure their systems to take advantage of it.
      • Availability Zone names are unique per account and do not represent a specific set of physical resources.
    • Data Centers:
    • Edge Locations: An edge location is an AWS data center which doesn't host the full range of AWS services. Instead, it's used to deliver content to users around the world.
      • Using a network of edge locations around the world, CloudFront caches copies of your static content close to viewers, lowering latency when they download your objects and giving you the high, sustained data transfer rates needed to deliver large popular objects to end users at scale.
    • Endpoints: They are the methods used to access AWS services. AWS provides several different ways to connect to its services, which are called endpoints.
  • AWS Organization: AWS Organizations enables you to centrally apply policy-based controls across multiple accounts in the AWS Cloud.
    • You can consolidate all your AWS accounts into an organization, and arrange all AWS accounts into distinct organizational units.
    • Enable either Consolidated Billing or All Features
    • The paying account is independent, can't access resources of other accounts, and should be used for billing purposes only.
    • All linked accounts are independent. Currently you can have up to 20 linked accounts for consolidated billing.
  • Resource Groups: Find and group your AWS resources with tag queries
    • You can create unlimited, single-region groups in your account based on resource types and tag queries, use your groups to view group related insights, and automate tasks on group resources.
  • Security Assertion Markup Language (SAML) is an open standard for exchanging authentication and authorization data between security domains.
    • SAML 2.0 is an XML-based protocol that uses security tokens containing assertions to pass information about a principal (user) between a SAML authority (Identity Provider/IdP) and a SAML consumer (Service Provider).
    • It enables web-based, cross-domain Single Sign-On (SSO), which helps to reduce the administrative overhead of distributing multiple authentication tokens to the user.
  • Security Token Service (STS): It's a web service that enables you to request temporary, limited-privilege credentials for IAM users or federated users.
    • Allows you to create temporary security credentials that grant users (trusted) access to your AWS resources.
    • It grants users limited and temporary access to AWS resources.
    • Temporary security credentials (tokens) are short-term and can be configured to last from a few minutes to several hours (min 15 mins to max 36 hrs). After a token expires it can't be renewed, but you can request a new one.
    • Temporary credentials are the basis for IAM Roles and ID Federation.
    • It's a global service and all AWS STS requests go to a single endpoint at https://sts.amazonaws.com
    • You can’t generate STS tokens through AWS Console but you can use either AWS CLI, AWS SDKs or AWS Tools for PowerShell.
    • STS API actions return temporary credentials that consist of:
      • Access key (Access key ID and Secret Access key)
      • Session token
      • Expiration or Duration of validity of token (temporary security credentials)
      • Users (or an app that the user runs) can use these credentials to access resources.
    • STS API actions (an AssumeRole sketch follows this list):
      • AssumeRole: Cross-Account delegation and federation through a custom identity broker. Default expiration duration for token is 1 hr.
      • AssumeRoleWithWebIdentity: Federation through a web based identity provider. Default expiration duration for token is 1 hr.
      • AssumeRoleWithSAML: Federation through an enterprise identity provider compatible with SAML 2.0. Used for SSO or federation with AD.  Default expiration duration for token is 1 hr.
      • GetFederationToken: Federation through a custom identity broker. Default expiration duration for token is 1 hr.
      • GetSessionToken: Temporary credentials for users in un-trusted environments. Default expiration duration for token is 12 hrs.
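      • For illustration, a minimal AWS CLI sketch of some of these actions (the account ID, role and session names are placeholders; the caller must be allowed to assume the role):
        • aws sts get-caller-identity  (Shows which identity/credentials the CLI is currently using)
        • aws sts assume-role --role-arn arn:aws:iam::123456789012:role/MyRole --role-session-name mysession  (Returns a temporary access key, secret key and session token)
        • aws sts get-session-token --duration-seconds 3600  (Returns temporary credentials for the calling IAM user)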
    • When to use STS:
      • Enterprise Identity Federation:
        • Authenticate through your company network.
        • STS uses SAML 2.0 to grant temporary access based on the user's Active Directory credentials. The user doesn't need to exist in IAM.
        • Single sign-on allows users to log in to the AWS console without being assigned IAM credentials.
      • Web Identity Federation: Mobile apps can use third party identity providers such as Facebook/Amazon/Google or OpenID to log in.
      • Roles Cross-Account Access: Lets users from one AWS account access resources in another. Used by organizations that have more than one AWS account.
      • Roles for EC2 (or other AWS services): Grant an app running inside an EC2 instance access to other AWS services without having to embed credentials.
    • Benefits:
      • No distributing or embedding long term AWS security credentials in an app.
      • Grant access to AWS resources without having to create an IAM identity for them. The basis for IAM roles and identity federation.
      • Since credentials are temporary, you don’t have to rotate or revoke them. You decide how long they are active.
    • Terminology:
      • Federation: Combining or joining a list of users in one domain (such as IAM) with a list of users in another domain (such as Active Directory, Facebook etc)
      • Identity Broker: A service that allows you to take an identity from point A and join it (federate it) to point B.
      • Identity Store: Services like Active Directory, Facebook, Google etc.
      • Identities: A user of a service like Facebook etc.
      • Access Key, Secret Access Key, Token, Duration (1-36 hrs)
  • AWS Support:
    • Basic, Developer, Business, Enterprise
  • AWS SDKs:
    • Following languages are supported by AWS SDK: Java, .NET, Python, PHP, Node.js, C++, Ruby, Go.
  • AWS Billing:
    • AWS Budgets: lets you quickly create custom budgets that automatically alert you when your AWS costs or usage exceed, or are forecasted to exceed, the thresholds you set.
      • AWS Budgets uses data from Cost Explorer to provide a quick way to see your usage to date and current estimated charges from AWS, and to see how much your predicted usage will accrue in charges by the end of the month.
      • You can create budgets for different types of usage and different types of cost. For example, you can create a budget to see how many EC2 hours you have used, or how many GB you have stored in an S3 bucket.
  1. Identity and Access Management (IAM): AWS Identity and Access Management (IAM) enables you to manage access to AWS services and resources securely. Using IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources. You use IAM to control who is authenticated (signed in) and authorized (has permissions) to use resources.
    • IAM is the service where you manage your AWS users, groups and roles and their access to AWS accounts and services.
    • It's a universal, centralized control of your AWS account and of shared access to your AWS account; it's global and does not apply to regions.
    • It's used for granular permissions, Multi-Factor Authentication, and identity federation (e.g. Active Directory, Facebook, LinkedIn etc).
    • Provides temporary access for users/devices and services.
    • Allows you to set up your own password rotation policy.
    • The common use of IAM is to manage: Users, Groups, Roles, Access Policies, API Keys, specify Password Policy and MFA requirements on per user basis.
    • Users: Individual user accounts; when created they are assigned a username/password (console access) and an Access Key ID/Secret Access Key (programmatic access), which are used to access AWS resources via the CLI and APIs.
      • Root account: When you first create an AWS account, you begin with a single sign-in identity (email and password) that has complete access to all AWS services and resources in the account.
        • The AWS root account has full access to all AWS services by default, and it's not possible to restrict permissions on the root account.
        • AWS recommend root user should not be used for everyday tasks, even administrative ones.
        • The root user should have MFA enabled.
        • No access keys exist for the root user by default, and none should be created.
      • Principal: Its an entity that can take an action on an AWS resource.
        • IAM Users, Roles, Federated users and Applications are all principals.
      • AWS Root user need email address (username) and password to login to AWS console, whereas IAM users need Account ID (or alias) along with IAM username and password to login to AWS console.
      • A new IAM user has an implicit deny (NO access) for all AWS services; access must be granted explicitly through IAM policies.
      • An explicit deny always overrides an explicit allow from attached policies.
      • IAM users receive unique credentials which should not be shared with others, and should not be stored on an EC2 instance.
      • An IAM user can be an actual person or an application.
      • Federated users (authenticated with Google, Facebook)
      • IAM Access/API Keys (Access Key ID and Secret Access Key) are required to make API (programmatic) calls to AWS resources from the HTTPS API, AWS CLI, AWS SDK, PowerShell.
        • You can view and download secret access key only when you create the access key. If you lose your secret access key, then you can’t access or recover it, but you can create a new access key and secret access key pair.
        • Access keys are only associated with IAM users, and you can create up to two access keys per user. Roles don't have access keys.
        • Never create or store Access Keys on EC2 instance.
        • You can allow users to change their own access keys through IAM policy.
      • service account: When you create an IAM user to represent an app that needs credentials in order to make requests to AWS resources.
      • You can create up to 5000 users in a AWS account. For more you can consider using Temporary Security Credentials (STS).
      • Use the Amazon Resource Name (ARN) when you need to uniquely identify the user, such as when specifying the user as a Principal in an IAM policy for an S3 bucket.
        • arn:aws:iam::ID:user/myuser
      • AWS Console Sign In: IAM users sign in at https://<Account-ID or alias>.signin.aws.amazon.com/console with their IAM username and password.
    • Groups: They are collections of IAM users and are used to assign permissions/policies to multiple users at once (see the CLI sketch after this list).
      • Organize users by function (e.g admins, developers) and assign policies to groups, not individual users.
      • Group is not an identity in IAM because it can’t be identified as a Principal in a permission policy.
      • Groups can’t be nested, they contain only users.
      • You can create up to 300 groups per account.
      • A user can be a member of up to 10 groups.
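      • A minimal AWS CLI sketch for creating a user and a group (names are placeholders; requires IAM administrative permissions):
        • aws iam create-user --user-name bob  (Create an IAM user)
        • aws iam create-access-key --user-name bob  (Generate an Access Key ID/Secret Access Key for programmatic access)
        • aws iam create-group --group-name developers  (Create a group)
        • aws iam add-user-to-group --group-name developers --user-name bob  (Add the user to the group)
        • aws iam attach-group-policy --group-name developers --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess  (Attach an AWS managed policy to the group)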
    • Policies: A policy is an entity in AWS that, when attached to an identity or resource, defines their permissions. AWS evaluates these policies when a principal, such as a user, makes a request. Permissions in the policies determine whether the request is allowed or denied. (A CLI sketch for creating a policy follows this list.)
      • Policies are stored in AWS as JSON documents attached to principals (as Identity-Based Policies) or to resources (as Resource-Based Policies).
        • Identity based policies: Attached to users, groups, roles.
        • Resource-Based Policies: Attached to resources e.g S3 buckets. The following services support resource-based policies:
          • S3 buckets, Glacier vaults, SNS topics, SQS queues.
      • Permissions are always granted through policies (single or multiples) which are attached to users, groups or roles.
      • Policies can't be directly attached to AWS resources (e.g. an EC2 instance); instead, the policies (one or more) are attached to a role, and then the role (a single role only) is attached to the EC2 instance.
      • Any actions (permissions) that are not explicitly allowed are denied by default.
      • An explicit deny always overrides an explicit allow.
        • This allows you to quickly restrict ALL access that a user may have through multiple attached policies.
      • You can define tags on test and production servers, and add a condition to the IAM policy which allows access to specific tags.
      • AWS Managed Policies: An AWS managed policy is a standalone, pre-built policy that is created and administered by AWS.
        • AdministratorAccess: Full access to all AWS services.
        • PowerUserAccess: Full access except it doesn’t allow user/group management (IAM service).
        • ReadOnlyAccess: View only access to AWS resources.
      • Customer Managed Policies: You can create standalone policies that you administer in your own AWS account, which are referred as customer managed policies.
      • Inline Policies: An inline policy is embedded in a principal entity (a user, group, or role)—that is, the policy is an inherent part of the principal entity.
      • P.S. Standalone policy means that the policy has its own Amazon Resource Name (ARN) that includes the policy name.
      • Access Advisor: The main function of the Access Advisor tool is to assist in auditing permissions in an AWS environment. The audit can help determine what privileges should be set and what extra permissions can be removed.
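      • A minimal sketch of creating a customer managed policy with the CLI (the policy name, bucket and account details are placeholders):
        • aws iam create-policy --policy-name S3ListMyBucket --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":"s3:ListBucket","Resource":"arn:aws:s3:::mybucket"}]}'  (Creates a standalone policy with its own ARN)
        • aws iam attach-user-policy --user-name bob --policy-arn arn:aws:iam::123456789012:policy/S3ListMyBucket  (Attach the policy to a user, group or role)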
    • Roles: IAM roles are a secure way to grant permissions (temporary security credentials via STS) to entities that you trust. IAM roles issue keys that are valid for short durations, making them a more secure way to grant access.
      • Policies can’t be directly attached to AWS resources (e.g EC2 instances). But the policies (one or more) are attached to a role and then role (single only) is attached to the instances.
        • Only one role can be attached to an AWS service (e.g. an EC2 instance) at a time, but you can attach multiple policies to a single role, and the same role can be attached to multiple AWS services.
        • You can assign the role either during instance provision or after the instance is launched. You can also update the policies which are applied to the role anytime.
      • Roles provide temporary security credentials, which are issued and managed by the Security Token Service (STS).
      • A role is granted to:
        • Same Account Role: An IAM user in the same AWS account as the role.
        • Another Account Role: An IAM user in a different AWS account than the role.
        • Service Role: This role is assumed by a service (e.g EC2) to perform actions on your behalf.
        • Web Identity Role: An external user authenticated by an external identity provider (IDP) service that’s compatible with SAML 2.0 or OpenID Connect or a custom built identity broker.
        • SAML Federation role: Allows users that are federated with SAML 2.0 to assume this role to perform actions in your account.
      • Temporary Credentials (STS): Temporary credentials are primarily used with IAM roles to provide temporary access (through STS) to AWS resources. They expire automatically after a set period of time.
      • Instance Profiles: An instance profile is required to assign an IAM role and its associated permissions to an EC2 instance, and to make them available to apps running on the EC2 instance.
      • Role Delegation: To delegate permission to access a resource you create an IAM role that has two policies:
        • Permission policy (JSON): Where the actions and resources are defined that the role can use
        • Trust policy (JSON): Specifies which trusted accounts are allowed to grant their users permissions to assume the role.
      • Roles are best suited in the following situations (a minimal CLI sketch follows this list):
        • You are creating an app that runs on an EC2 instance and that app makes requests to AWS resources e.g S3 buckets. So you have to create an IAM role and attach it to the EC2 instance.
        • You are creating an app that runs on a mobile phone and that makes requests to AWS resources. You can use identity login via providers e.g Amazon Cognito or Google to authenticate users and map users to an IAM role.
        • Users in your company are authenticated in your corporate network and want to be able to use AWS without having to sign in again; that is, you want to allow users to federate into AWS. In this case you configure a federation relationship between your enterprise identity system and AWS.
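      • A minimal CLI sketch of the first situation above (a role for an EC2 instance; names and instance ID are placeholders):
        • aws iam create-role --role-name EC2S3ReadRole --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'  (Trust policy that lets EC2 assume the role)
        • aws iam attach-role-policy --role-name EC2S3ReadRole --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess  (Permission policy attached to the role)
        • aws iam create-instance-profile --instance-profile-name EC2S3ReadProfile  (Instance profile that carries the role)
        • aws iam add-role-to-instance-profile --instance-profile-name EC2S3ReadProfile --role-name EC2S3ReadRole
        • aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 --iam-instance-profile Name=EC2S3ReadProfile  (Attach the profile to a running instance)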
    • IAM Best Practices:
      • Lock away your AWS Root account user access keys (access key ID and secret access key).
      • Create individual IAM users
      • Use AWS Defined Policies to assign permissions whenever possible
      • Use Groups to assign permission to IAM users
      • Follow Principle of Least Privilege
      • Use access levels to review IAM permissions
      • Configure a strong password policy for users
      • Enable MFA for privileged users
      • Use roles for apps that run on EC2 instances
      • Delegate by using roles instead of sharing credentials
      • Rotate credentials regularly
      • Remove unnecessary credentials that are not used
      • Use policy conditions for extra security
      • Monitor activity in your AWS account.
    • Cross Account Access: Granting access to resources in one account to a trusted principal in a different account.
      • A user in one account can switch to a role in the same or a different account.
      • It allows you to work productively within a multi-account (or multi-role) AWS environment by making it easy for you to switch roles within AWS Management Console.
      • You can sign in to the console using your IAM user name, then switch the console to manage another account without having to enter other user credentials.
    • Key Management Service (KMS): (see the CLI sketch after this list)
      • Managed service that allows you to create and control your encryption keys (Customer Master Keys/CMKs).
      • Advantages of KMS over HSM are:
        • Can use IAM policies for KMS access
        • AWS services integrate directly with KMS.
        • KMS uses a single layer of envelope encryption and stores only the top-level key, the CMK.
        • The encrypted data key is stored with the data.
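      • A minimal CLI sketch (the key description and alias are placeholders; the caller needs KMS permissions):
        • aws kms create-key --description "App data key"  (Creates a CMK and returns its KeyId)
        • aws kms create-alias --alias-name alias/my-app-key --target-key-id <key-id>  (Friendly alias for the CMK)
        • aws kms generate-data-key --key-id alias/my-app-key --key-spec AES_256  (Returns a plaintext data key plus an encrypted copy to store with the data, i.e. envelope encryption)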
    • Identity Federation: You can allow users who already have credentials elsewhere, e.g in your corporate network (Active Directory) or with an internet identity provider (Facebook, LinkedIn, Google) to get temporary access to your AWS account.
      • SAML 2.0-based federation is used when you have user identities in your corporate directory (SAML 2.0 compatible); you can then configure single sign-on access to the AWS console for your users.
        • If your directory is not compatible with SAML 2.0, then you can develop an identity broker app to provide SSO.
      • Web identity (OpenID) based federation is used when you let users authenticate via internet identity providers such as Facebook and Google. AWS recommends using Cognito for identity federation with internet identity providers.
    • MFA (Multi-Factor Authentication): With MFA, you or your users must provide, along with a username and password or access key, a code from a specially configured device (or app on a mobile phone) to log in to and work with your account.
    • PCI DSS Compliance: IAM supports the processing, storage and transmission of credit card data by a merchant or service provider, and has been validated as being compliant with Payment Card Industry (PCI) Data Security Standard (DSS).
    • Single Sign On (SSO): SSO allows users to access the AWS console without needing an IAM identity. If your organization has an existing identity system, then you might want to create an SSO capability.
    • ARN (Amazon Resource Names): ARNs uniquely identify AWS resources and follow the general format arn:partition:service:region:account-id:resource, e.g. arn:aws:iam::123456789012:user/myuser.
    • Amazon CLI (awscli):
      • Install awscli:
        • pip install awscli --upgrade --user
        • aws configure (Access key ID & Secret access key from an IAM user)
          • AWS Access Key ID:
          • AWS Secret Access Key:                      (~user/.aws/credentials)
          • Default region name: us-east-1          (~user/.aws/config)
          • P.S. https://docs.aws.amazon.com/general/latest/gr/rande.html
      • S3: (The user/instance must have appropriate S3 Access policy assigned)
        • aws s3 ls [mybucket]
        • aws s3 cp --recursive  s3://mybucket  /tmp  [--region us-east-1]
      • aws ec2 describe-instances | grep -i instanceid
      • aws ec2 terminate-instances --instance-ids <instance-id>
  2. Directory Service:
    • It provides multiple directory choices for customers who want to use existing Microsoft Active Directory or Lightweight Directory Access Protocol (LDAP)-aware applications in the cloud.
    • AWS Directory Service includes the following services:
      • Microsoft AD: AWS-managed Microsoft Active Directory powered by Windows Server 2012 R2.
        • Microsoft AD is your best choice if you have more than 5000 users and/or need a trust relationship setup between an AWS hosted directory and your on-premises directories.
        • Microsoft AD or Simple AD support automatic and manual snapshots, that can be used to restore your directory data.
        • It supports AWS apps and services including RDS for Microsoft SQL Server, Amazon WorkSpaces, WorkDocs, QuickSight, Chime, and Connect.
        • It comes in two editions:
          • Standard Edition (5,000 users and 30,000 objects)
          • Enterprise Edition (5,000 users and 500,000 objects).
      • Simple AD: A Microsoft Active Directory-compatible directory powered by Samba 4 that provides a subset of Microsoft Active Directory features.
        • It's your best choice if you have fewer than 5000 users and don't need the more advanced Microsoft AD features.
        • Its available in two sizes:
          • Small (500 users and 2000 objects)
          • Large (5000 users and 20,000 objects).
        • Its not compatible with RDS SQL Server.
        • It doesn’t support trust relationship with other domains.
        • It creates two directory servers and a DNS server on your behalf.
      • AD Connector: Active Directory Connector is a gateway (proxy) that redirects directory requests from AWS services to your existing on-premises Microsoft Active Directory.
        • It simply connects your existing on-premises AD to AWS. No directory information is replicated into or cached in AWS.
        • It comes in two sizes:
          • Small AD Connector (500 users)
          • Large AD Connector (5,000 users).
        • It requires that a VPN or Direct Connect connection be established between the on-premises network and AWS.
        • AD Connector relies on IAM Roles to provide Security Token Service (STS) credentials to the on-premises Microsoft Active Directory-authenticated users.
        • Its not compatible with RDS SQL Server.
      • Amazon Cloud Directory: Its a cloud-native directory to store your application’s hierarchical data.
        • Cloud Directory scales to hundreds of millions of objects and offers support for multiple relationships and application-specific schemas.
        • Examples of application directories include device registries, catalogs, organization structures, and network topologies.
      • Amazon Cognito: With Amazon Cognito User Pools you can easily and securely add user sign-up and sign-in functionality to your mobile and web apps. This fully managed service scales to support hundreds of millions of users.
  3. Certificate Manager:
    • AWS Certificate Manager (ACM) makes it easy to  provision, manage, deploy, and renew SSL/TLS certificates on the AWS platform.
    • It's free and supports automatic certificate renewal.
    • You may import third-party certificates as well.
    • Its supported by the following AWS services:
      • Elastic Load Balancer (ELB)
      • API Gateway
      • CloudFront
      • CloudFormation
      • Elastic Beanstalk
  4. CloudHSM:
    • Dedicated physical hardware device managed by AWS, used for storage and management of secure (encryption) keys.
    • AWS have no access onto customer keys or credentials and therefore has no way to recover your keys if you lose your credentials.
    • It's in your VPC and separated from other networks for latency and security reasons.
    • Can be placed in multiple AZs and clustered (load balance and replicate keys).
    • Perfect solution if your organization requires keys to be kept on dedicated hardware.
    • Use case: Asymmetric handshakes (e.g. SSL/TLS) can increase processing load, which can be offloaded to CloudHSM.
    • In some HSMs, there can be any number of key encryption keys (KEKs), a process known as enveloping.
  5. AWS WAF and AWS Shield:
    • AWS WAF: Web Application Firewall (WAF) is a web application firewall service that helps protect your web apps from common exploits that could affect app availability, compromise security, or consume excessive resources.
      • Allows for conditions or rules to be set on web traffic on CloudFront or Application Load Balancer.
      • WAF can watch for cross-site scripting, IP addresses, location of requests, query strings and SQL injection.
      • Denial of Service (DoS) attacks: Flooding a system with traffic to overwhelm it and prevent legitimate traffic from accessing resources. Distributed DoS (DDoS) is the same attack launched from multiple sources or systems.
      • AWS provides resilience for network and transport layer attacks.
    • AWS Shield: It provides expanded DDoS attack protection for your AWS resources. Get 24/7 support from our DDoS response team and detailed visibility into DDoS events.
      • Web application attacks can be handled by AWS Shield.
      • AWS Shield Standard: As an AWS customer, you automatically have basic DDoS protection with the AWS Shield Standard plan, at no additional cost beyond what you already pay for WAF and your other AWS services
      • AWS Shield Advanced: Expands service protection to include Elastic Load Balancers, CloudFront Distribution, Route 53 hosted zones, and Elastic IPs.
        • Advanced DDoS protection:
        • 24/7 DDoS response team (DRT)
        • Visibility and reporting
        • $3,000 per month per organization, plus data transfer out usage fees.
    • AWS Firewall Manager: It simplifies your WAF administration and maintenance tasks across multiple accounts and resources.
  6. Amazon Inspector: It enables you to analyze the behavior of your AWS resources and helps you identify potential security issues.
    • Target: A collection of AWS resources
    • Assessment Template: Made up of security rules and produces a list of findings
    • Assessment Run: Applying the assessment template to a target
    • Features:
      • Configuration Scanning and Activity Monitoring Engine: Determine what a target looks like, its behavior and any dependencies it may have. Identifies security and compliance issues
      • Built-In Content Library: Rules and reports are built into Inspector. Best practice, common compliance standard, and vulnerability evaluations. Detailed recommendations for resolving issues.
      • API Automation: Allows security testing to be included in the development and design stages.
    • Using AWS Inspector: AWS Inspector console, API, SDK and AWS CLI.
    • Note: AWS doesn’t guarantee that following the provided recommendations will resolve every security issue.
  7. GuardDuty: Intelligent threat detection to protect your AWS accounts and workloads.
    • Continuous: Continuously monitor your AWS environment for suspicious activity and generate findings.
    • Comprehensive: Analyze multiple data sources, including AWS CloudTrail events and VPC Flow Logs.
    • Customizable: Customize GuardDuty by adding your own threat lists and trusted IP lists.
    • Uses threat intelligence feeds and machine learning to determine unauthorized or malicious activity, which may include escalation of privileges, exposed credentials, or communication with malicious sources (IPs, URLs, domains).
      • e.g EC2 instance mining bitcoin or serving malware can be detected.
    • Behavior Analysis: Monitor behavior for signs of compromise.
      • Unauthorized infrastructure deployments: Instance deployed in a region that has never been used.
      • Unusual API calls: Password policy change to reduce password strength.
    • Different models for pricing around the world, so consult AWS documentation.

AWS Storage Services

  1. Simple Storage Service (S3):
    • Amazon Simple Storage Service is object-based storage for the Internet. It is designed to make web-scale computing easier for developers. Amazon S3 has a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web.
    • It's object-based, i.e. it allows you to upload files into S3 buckets; it's not suitable for installing an OS or databases.
    • S3 is region specific but has a universal namespace; bucket names (all characters in lower case) must be unique across S3.
    • Files can be from 0 Bytes to 5 TB in size but S3 storage is unlimited.
    • S3 buckets in all regions provide read-after-write (immediate or strong) consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES (i.e. changes/updates to existing objects can take some time to propagate).
    • You are charged based upon Storage, Requests, Data Transfer, Storage Management Pricing, and Transfer Acceleration.
    • You pay a GB/month storage fee, plus for data transfer into/out of S3 across regions (over the internet) and for upload or download requests.
    • For high request rates (i.e. more than 100 PUT/LIST/DELETE or 300 GET per sec) in S3, introduce randomness to key names by adding hash keys or random strings as a prefix to the object key names. The partitions used to store the objects will then be better distributed, allowing better read/write performance for your objects.
    • By default an EC2 instance accesses S3 on a public endpoint (IP) over the internet, but you can use a VPC Endpoint to access S3 from an EC2 instance (running in a private subnet) on a private endpoint (IP). You can't connect to S3 over VPN.
    • Buckets: A bucket is a unique, global, root-level folder across all of S3 and is managed globally.
      • You can create up to 100 buckets in a single account (default limit).
      • Bucket names contain only lower-case characters, numbers and hyphens, must be 3-63 characters long, and can't be changed after the bucket is created.
      • By default bucket and all objects in it are private.
      • Access Logs: You can enable access logging to track access requests to your buckets. The logs can be stored in the same bucket or in another bucket. Each access log record provides details about a single access request, such as the requester, bucket name, request time, action and status, and error code, if any. Access log information is useful in security and access audits.
      • You can store unlimited number of objects in your bucket, but the size of an object should be less than 5 TB.
    • Folders: Any sub-folder created in a S3 bucket.
      • They are used for grouping objects, and S3 does this by using a key-name prefix for objects.
      • S3 has a flat structure; there's no hierarchy as you would see in a typical file system.
    • Objects: Files stored or uploaded into a bucket.
      • An object consists of: Key (name), Value (data), VersionID, Metadata, and Subresources (ACLs, Torrent).
      • Successfully uploaded files/objects generate an HTTP 200 status code.
      • Objects are cached for the life of the TTL.
      • Objects can be from 0 Bytes to 5 TB in size
      • Objects are stored in an S3 bucket in a region and are synced across all AZs in the region. The objects will never leave that region unless you explicitly move them to another region or enable Cross Region Replication.
      • Object can be made publicly available via public URL.
    • Permissions:
      • All buckets and objects are private by default.
      • A bucket owner can grant cross-account permissions, to another AWS account (or users in the same account) to upload objects.
      • The access can be granted either by IAM policy or Resource policy:
      • Resource based Policy:
        • Bucket Policy: Attached to an S3 bucket only; the permissions apply to all objects in the bucket (see the CLI sketch after this list).
        • Access Control Lists (ACLs): Grant access to other AWS accounts or to the public. Both buckets and objects have ACLs. An object ACL allows us to share an S3 object via a public URL.
      • User Access Policies:
        • You can use IAM to manage access to your S3 resources.
        • Using IAM you can create users, group and roles in your account and attach Access Policies to them, that allow them to access S3 or other AWS resources.
      • READ: Allows grantee to list objects in the bucket, and read object and its metadata
      • WRITE: Allows grantee to create, overwrite, and delete any object in the bucket (not applicable to objects)
      • READ_ACP: Allows grantee to read the bucket or object ACL
      • WRITE_ACP: Allows grantee to write the ACL for the bucket or object.
      • FULL_CONTROL: Allows grantee to have all above permissions on the bucket or object.
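      • A minimal CLI sketch of a resource-based bucket policy and an object ACL (bucket and key names are placeholders; the caller must administer the bucket):
        • aws s3api put-bucket-policy --bucket mybucket --policy '{"Version":"2012-10-17","Statement":[{"Sid":"PublicRead","Effect":"Allow","Principal":"*","Action":"s3:GetObject","Resource":"arn:aws:s3:::mybucket/*"}]}'  (Grants public read on all objects in the bucket)
        • aws s3api put-object-acl --bucket mybucket --key photo.jpg --acl public-read  (Shares a single object via its ACL)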
    • Storage Classes: These are divided based upon cost, durability and availability:
      • Standard (Durable, Immediately Available, Frequently Accessed):
        • It supports Availability of 99.99% and Durability of 99.999999999% (11*9’s).
        • It's the default, general, all-purpose storage class and is the most expensive.
        • It supports data encrypted in-transit and at rest in S3.
        • Its designed to sustain the concurrent loss of two facilities.
      • Reduced Redundancy Storage/S3-RRS (Not-Durable, Immediately Available, Frequently Accessed):
        • It has object Availability of 99.99% and Durability of 99.99%.
        • For non-critical reproducible objects such as thumbnails.
        • Designed to sustain the data loss in one facility only.
        • If object is lost, AWS will return 405 error and S3 can send notification when object is lost.
        • Less expensive than Standard class.
      • Infrequent Access / S3-IA (Durable, Immediately Available, Infrequently Accessed):
        • It has object Availability of 99.9% and Durability of 99.999999999%.
        • It has min 30-day retention period and 128 KB min object size.
        • Designed for less frequently accessible objects and backups.
        • Less expensive than Standard and RRS.
      • Glacier: Designed for long-term archival storage (not to be used for backups) that’s very rarely accessed and its cheapest among all.
        • Objects can only be moved to Glacier by Lifecycle Policies, from S3-Standard (60 days) or S3-IA (30 days).
        • May take from several mins to hours (3-5 hrs) to retrieve the objects and is the cheapest among all storage classes.
    • Encryption:
      • Server Side Encryption (SSE): Also known as encryption at rest (a CLI sketch for enabling SSE follows this Encryption list).
        • Data is encrypted by the S3 service before it's saved to S3 storage disks, and it is decrypted by the S3 service before you download it.
        • AES-256 (SSE-S3): Server Side Encryption using S3-managed data and master keys. The encryption key is stored along with the data.
        • AWS-KMS (SSE-KMS): Server Side Encryption in which AWS manages the data key but the customer manages the Customer Master Key (CMK) in KMS.
        • SSE-C: Server Side Encryption using Customer managed encryption key.
        • S3 bucket encryption policies override the settings of the folders within them. If you need to use separate encryption keys for some documents within a bucket, you will need to change the settings on each document individually.
      • Client Side Encryption: The data is encrypted by the client before being sent to S3, so it is protected in transit as well as at rest.
        • The client encrypts the data on the client side, then transfers the encrypted data to the S3 bucket.
        • Client side encryption with KMS managed Customer Master Key (CMK).
        • Client side encryption using Client-Side Master key.
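      • A minimal CLI sketch for server-side encryption (bucket, file and key names are placeholders):
        • aws s3api put-bucket-encryption --bucket mybucket --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'  (Default SSE-S3 encryption for all new objects in the bucket)
        • aws s3 cp file.txt s3://mybucket/ --sse aws:kms --sse-kms-key-id alias/my-app-key  (Upload a single object with SSE-KMS)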
    • Versioning: It's a feature to manage and store all old/new/deleted versions of objects, used to protect against accidental object updates or deletion. (A CLI sketch follows this list.)
      • It's set at the bucket level and applies to all objects.
      • By default it's disabled; once enabled, it can only be suspended, and it applies to newer objects.
      • Older objects’s class automatically changed to Reduced Redundancy Storage (RRS).
      • Versioning can be used with life cycle policies to create a great archiving and backup solution in S3.
      • MFA (Multifactor Authentication) to delete, is a versioning capability that adds another layer of security for:
        • Changing bucket’s versioning state
        • Permanently deleting an object version
      • Only S3 bucket owner can permanently delete objects once versioning is enabled.
      • When you delete an object, a DELETE marker is placed on the object. When you delete the DELETE marker, the object will be available again.
      • Suspending versioning only prevents new versions from being created. All object with existing versions will maintain their old versions.
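      • A minimal CLI sketch (bucket and key names are placeholders):
        • aws s3api put-bucket-versioning --bucket mybucket --versioning-configuration Status=Enabled  (Turn versioning on; use Status=Suspended to suspend it)
        • aws s3api list-object-versions --bucket mybucket --prefix photo.jpg  (List all versions and delete markers for an object)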
    • Lifecycle Policies (Object): A set of rules that automate the migration of objects to a different storage class (e.g. from Standard to IA) and their deletion based on time intervals. (A CLI sketch follows this list.)
      • By default lifecycle policies are disabled on a bucket/object.
      • Can be used with versioning (current and previous versions) to create a great archiving and backup solution in S3.
      • Transition actions:
        • S3-Standard to S3-IA: Minimum 30 days and 128 KB.
        • S3-Standard to Glacier: Minimum 0+ days.
        • S3-IA to Glacier: Minimum 0 days.
      • Expiration actions: Minimum 0+ days.
      • You can’t use life cycle policies to move an archived object from Glacier to S3-Standard or S3-IA.
      • You can’t change an object from S3-Standard or S3-IA into RRS.
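      • A minimal CLI sketch, assuming a hypothetical rule that moves objects under logs/ to S3-IA after 30 days, to Glacier after 90 days, and deletes them after 365 days:
        • aws s3api put-bucket-lifecycle-configuration --bucket mybucket --lifecycle-configuration '{"Rules":[{"ID":"archive-logs","Status":"Enabled","Filter":{"Prefix":"logs/"},"Transitions":[{"Days":30,"StorageClass":"STANDARD_IA"},{"Days":90,"StorageClass":"GLACIER"}],"Expiration":{"Days":365}}]}'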
    • Events: S3 Event Notifications allow you to set up automated communication between S3 and other AWS services when a selected event occurs in an S3 bucket.
      • Common event notification triggers include:
        • RRSObjectLost (used for automating the recreation of lost RRS objects)
        • ObjectCreated (Put, Post, Copy)
        • CompleteMultiPartUpload
      • Event notifications can be sent to SNS, Lambda, or an SQS queue.
      • CloudTrail captures all API requests made to S3. By default it logs bucket level actions but you can configure it to log object level actions as well. The CloudTrail logs are stored in a S3 bucket.
      • S3 metrics that can be monitored by CloudWatch include:
        • S3 Requests, Bucket Storage, Bucket Size, All Requests, HTTP 4XX messages, 5XX errors.
    • Static Web Hosting: Amazon S3 provides an option for a low-cost, highly reliable web hosting service for static web sites (see the CLI sketch after this list).
      • It's server-less, very cheap and scales automatically.
      • When enabled, it provides you a unique endpoint URL that you can point to a properly formatted file stored in an S3 bucket. It supports: Images, Videos, HTML, CSS, JavaScript.
      • Amazon Route 53 can also map human readable domain names to static web hosting buckets, which are ideal for DNS failover solution.
        • http://<bucketname>.s3-website-us-east-1.amazonaws.com/
      • It supports HTTP (NOT HTTPS) connections and publicly readable content only.
      • Enable website hosting to your bucket and must specify default Index document and Error document (optional).
      • You can redirect to another object in the same bucket or to an external URL.
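      • A minimal CLI sketch (the bucket name and files are placeholders; the content must be publicly readable):
        • aws s3 website s3://mybucket/ --index-document index.html --error-document error.html  (Enable static website hosting on the bucket)
        • aws s3 cp index.html s3://mybucket/ --acl public-read  (Upload the index document and make it public)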
    • Cross-Origin Resource Sharing (CORS): Its a method of allowing a web application located in one domain to access and use the resources in another domain.
      • This allows web applications running JavaScript and HTML5 to access resources in S3 buckets without using a proxy server.
      • If enabled, then a web app hosted in a S3 bucket can access resources in another S3 bucket.
    • Cross Region Replication (CRR): Cross-region replication is a bucket-level configuration that enables automatic, asynchronous copying of objects across buckets in different AWS Regions.
      • Versioning must be enabled on both source and destination buckets.
      • The source and destination buckets must be in different regions and replicating to multiple buckets is not allowed (one-to-one relationship).
      • The buckets can be owned by different accounts but S3  must have permissions (IAM Role to read objects and ACLs) to replicate objects from source to destination bucket.
      • You can replicate all or subsets of objects with specific key name prefixes.
      • The existing objects (those that existed before CRR was enabled) and the objects created with SSE-C or SSE-KMS will not be replicated.
      • Deleting individual versions or delete markers will not be replicated.
      • AWS will encrypt data in-transit across regions using SSL.
      • For existing objects you need to copy the objects yourself using the CLI:
        • pip install awscli --upgrade --user
        • aws configure
        • aws s3 cp --recursive s3://src  s3://dst  (Copy from source to destination bucket)
        • aws s3 ls dst
    • Transfer Acceleration: It utilizes the CloudFront Edge Network to accelerate your uploads to an S3 bucket. Instead of uploading files directly to your S3 bucket, you use a distinct URL to upload to an edge location close to you, which then transfers the file to the S3 bucket. (A CLI sketch follows this list.)
      • You will get a distinct URL to upload to, of the form <bucketname>.s3-accelerate.amazonaws.com
      • Once enabled it can only be suspended but can’t be disabled.
      • Bucket names must be DNS compliant and must not have period (.) in the bucket name.
      • If there’s no speed enhancement, then there would be no charge using it.
      • You can use multi-part uploads and no data is cached at Cloudfront edge locations.
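      • A minimal CLI sketch (bucket and file names are placeholders):
        • aws s3api put-bucket-accelerate-configuration --bucket mybucket --accelerate-configuration Status=Enabled  (Enable Transfer Acceleration; use Status=Suspended to suspend it)
        • aws s3 cp bigfile.zip s3://mybucket/ --endpoint-url https://s3-accelerate.amazonaws.com  (Upload through the accelerate endpoint)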
    • Pre-Signed URLs:
      • A pre-signed URL gives you access to the S3 object identified in the URL, provided that the creator of the pre-signed URL has permissions to access that object. That is, if you receive a pre-signed URL to upload an object, you can upload the object only if the creator of the pre-signed URL has the necessary permissions to upload that object.
      • All objects and buckets by default are private. The pre-signed URLs are useful if you want your user/customer to be able to upload a specific object to your bucket, but you don’t require them to have AWS security credentials or permissions.
      • When you create a pre-signed URL, you must provide your security credentials and then specify a bucket name, an object key, an HTTP method (PUT for uploading objects), and an expiration date and time. The pre-signed URLs are valid only for the specified duration.
      • The URLs can be generated using the SDKs for Java and .NET; they can be used to download and upload objects from/to S3 (see the CLI sketch below).
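      • A minimal CLI sketch for a download (GET) pre-signed URL (bucket and key are placeholders; upload URLs are typically generated with the SDKs):
        • aws s3 presign s3://mybucket/photo.jpg --expires-in 3600  (Returns a URL that is valid for one hour)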
    • File Upload options:
      • Single Operation Upload: Its a traditional method in which you upload a file in one part. Can upload a file up to 5 GB, however any file over 100 MB should use multipart upload.
      • Multipart Upload: It allows you to upload a single file as a set of parts and all parts can be uploaded concurrently.
        • After all parts are uploaded, AWS S3 assembles these parts and creates the object.
        • It must be used for objects over 5 GB and up to 5 TB in size, but it's recommended for any object over 100 MB.
    • AMZ Import/Export Disk: It accelerates moving large amounts of data (up to 16 TB) into and out of the AWS cloud using portable external disks for transport. You mail your on-premises data on a storage device to an AWS data center, where it is imported into S3 (it can also be exported) or EBS within one business day.
    • Snowball: AWS Snowball is a service used to transfer petabyte-scale data in and out of AWS cloud at faster-than-Internet speeds and harness the power of the AWS Cloud locally using AWS-owned appliances. Its used to import/export data from/to S3.
      • Snowball: An 80 TB data transfer device that uses secure appliances to transfer data into and out of S3.
      • Snowball Edge: A 100 TB data transfer device with on-board storage and compute capabilities.
      • Snowmobile: An exabyte-scale data transfer service used to move extremely large amounts of data to an AWS data center. You can transfer up to 100 PB per Snowmobile, a 45-foot long ruggedized shipping container pulled by a semi-trailer truck.
  2. Elastic Block Store (EBS):
    • Allows you to create block-based storage volumes and attach them to EC2 instances. Once attached, you can create a file system on top of these volumes or run a database on them.
    • EBS volumes (virtual hard disks) are persistent, block-based, network-attached storage.
    • EBS volume can only be attached to a single EC2 instance and both must be in the same AZ.
    • EBS by default stores two copies of each volume in the same AZ, that helps with hardware failures but is not intended to help AZ failure. AWS recommends to keep EBS volume snapshots (stored in S3) for backup and durability.
    • Delete on Termination is checked by default for the EBS root volume, but it is unchecked for data/additional volumes. So the root volume is deleted automatically when the instance is terminated.
    • EBS volume type and size can be changed while the instance is running, but best practice is to stop the instance.
    • A volume can be copied into another AZ by taking a snapshot of it, and then creating a new volume out of the snapshot in the desired AZ.
    • The root device for an instance launched from the AMI is an Amazon EBS volume created from an Amazon EBS snapshot.
    • EBS volume backed instances can be stopped and the volume remains attached to the instance, the data is not erased, but you will be charged for the volume.
    • It has 99.999% availability. 5,000 EBS volumes and 10,000 snapshots can be created per account.
    • Server-side Encryption:
      • EBS root volumes can't be encrypted from the console, but you can either:
        • Use MS Encrypted File System, MS BitLocker, Linux dm-crypt or any third-party tool to encrypt the root volume, or
        • First take a snapshot of the root volume, then create a new encrypted volume from it, and finally boot a new instance from the encrypted root volume.
      • Additional data volumes can be encrypted from the console using KMS.
      • All EBS volume types support server-side encryption, but not all EC2 instance types support it (e.g. the T2 Micro/free tier instance family doesn't support EBS volume encryption).
      • EBS volumes or snapshots are encrypted by Customer Master Keys (CMKs) which are managed by Key Management Service (KMS).
    • Snapshots:
      • EBS Snapshots are point-in-time images/copies of your EBS volumes (see the CLI sketch after this list).
      • EBS volumes are backed up by snapshots, which are asynchronous and incremental, and are stored in S3 (they can be accessed via the EC2 API only).
        • When you delete a snapshot, only the data unique to that snapshot is removed.
        • Each snapshot contains all of the information needed to restore your data (from the moment snapshot was taken) to new EBS volume.
      • Encryption:
        • A snapshot of an encrypted volume is automatically encrypted, and a volume restored from an encrypted snapshot is also encrypted automatically.
        • Likewise, snapshots of an unencrypted volume are unencrypted, and a volume restored from an unencrypted snapshot is also unencrypted.
        • Unencrypted snapshots can be shared with the AWS community by setting them as public.
        • Unencrypted private snapshots can be shared with other AWS accounts; encrypted private snapshots can be shared via cross-account permissions using a custom CMK.
        • You can't create an encrypted snapshot of an unencrypted volume or change an existing volume from unencrypted to encrypted directly. You have to create a new encrypted volume and transfer the data to the new volume.
        • The other option is to encrypt a volume's data by means of snapshot copying: first create a snapshot of your unencrypted EBS volume (which will be unencrypted); second, copy the snapshot and apply the encryption parameter, so the resulting target snapshot is encrypted; finally, restore the copied encrypted snapshot to a new volume, which will also be encrypted.
      • You can create an image or a volume (in same or separate AZ) from a snapshot.
      • You can copy a snapshot into a separate region (can also encrypt at the same time), and then create a volume or image from it, and finally launch the instance from it in the different region.
      • When you take a snapshot of an attached EBS volume that is in use, the snapshot excludes data cached by applications or the operating system.
      • To create a snapshot of a root EBS volume, the instance should be stopped, whereas for non-root volumes it's recommended to either pause I/O activity or unmount and detach the volume.
      • EBS volumes are AZ specific whereas Snapshots are Region specific.
      • You can create/restore a snapshot to an EBS volume of the same or larger size (but NOT smaller size) than the original volume size from which snapshot was initially created.
      • Volumes created from an EBS snapshot must be initialized. Initializing occurs the first time a storage block on the volume is read, and the performance impact can be up to 50%. You can avoid this impact in production environments by manually reading all the blocks.
      • To take snapshots of a RAID array you need to either freeze the filesystem or suspend disk I/O, unmount the RAID array, or shut down the associated instance.
      • Note that you can’t delete a snapshot of the root device of an EBS volume used by a registered AMI. You must first deregister the AMI before you can delete the snapshot.
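      • A minimal CLI sketch (volume/snapshot IDs and regions are placeholders; the caller needs EC2 permissions):
        • aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "Daily backup"  (Point-in-time snapshot of a volume)
        • aws ec2 copy-snapshot --source-region us-east-1 --source-snapshot-id snap-0123456789abcdef0 --encrypted --region us-west-2  (Copy the snapshot to another region and encrypt it during the copy)
        • aws ec2 create-volume --snapshot-id <new-snapshot-id> --availability-zone us-west-2a --region us-west-2  (Restore the copied snapshot to a new volume in the desired AZ)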
    • EBS Volume Types:
      • General Purpose SSD (GP2): Default general purpose, balances both price and performance, DEV/TST environments.
        • Min: 1 GiB, Max: 16384 GiB
        • Baseline of 3 IOPS per GiB, with a minimum of 100 IOPS, burstable to 3000.
      • Provisioned IOPS SSD (IO1): Designed for I/O intensive apps such as large Relational or NoSQL databases (10-20k  IOPS).
        • Min: 4 GiB, Max: 16384 GiB
        • Min: 100 IOPS, Max: 32000 IOPS (EBS-optimized instances)
        • Maximum ratio of 50:1 is permitted between IOPS and volume size e.g if volume size is 8 GB then you can have max 400 IOPS.
      • Throughput Optimized HDD (ST1): Used for frequently accessed workload e.g Big data, Data warehousing, Log processing, Streaming.
        • Min: 500 GiB, Max: 16384 GiB
        • Min: 100 IOPS, Max: 20000 IOPS
        • Can’t boot volume from it.
      • Cold HDD (SC1): Used for less frequently accessed data.
        • Min: 500 GiB, Max: 16384 GiB
        • Min: 100 IOPS, Max: 20000 IOPS
        • Can’t be boot volume.
      • Magnetic (Standard): Cheap, infrequently accessed storage.
        • Min: 1 GiB, Max: 1024 GiB
        • Min: 100 IOPS, Max: 20000 IOPS
        • Can be used to boot volume from it
    • Instance Store volume:
      • It's an ephemeral block storage device: a virtual hard disk that's allocated to the EC2 instance (guest) on the physical host. It exists for the duration of the instance life cycle and is limited to 10 GB in size.
      • You can attach additional instance store volumes during launch only. After the instance is launched, then you can attach EBS volumes only.
      • Instance store-backed (root volume) EC2 instances can't be stopped, but they can be rebooted (data is preserved) or terminated (data will be lost).
      • Instance store-backed EC2 instances boot from an AMI stored in S3.
      • Use Instance Store over EBS, if very high IOPS rate is required.
  3. Elastic File Systems (EFS):
    • Amazon EFS provides file-based storage (NFS) for use with your EC2 instances.
    • EFS storage capacity is elastic; it grows and shrinks automatically as you add and remove files.
    • It allows the file system to be mounted on and shared among multiple EC2 instances.
    • An EFS file system can be mounted on on-premises servers when they are connected to your VPC via Direct Connect.
    • It supports Network File System v4 (NFSv4) protocol and data is stored across multiple AZs within a region.
    • Best performance when using EC2 AMI with Linux kernel 4.0 or newer.
    • You only pay for the storage you use (unlike EBS, with EFS no pre-provisioning required).
    • It can scale up to Petabytes and can support thousands of concurrent NFS connections, and provide read after write consistency.
    • Amazon EBS is designed for application workloads that benefit from fine tuning for performance, cost and capacity.
    • Typical use cases include Big Data analytics engines (like the Hadoop/HDFS ecosystem and Amazon EMR clusters), relational and NoSQL databases (like Microsoft SQL Server and MySQL or Cassandra and MongoDB), stream and log processing applications (like Kafka and Splunk), and data warehousing applications (like Vertica and Teradata).
    • Mount Target: Instances connect to a file system by using a network interface called a mount target. Each mount target has an IP address, which AWS assigns automatically or you can specify. (A mount sketch follows below.)
      • P.S. You must assign the default security group (or one that allows inbound NFS traffic) to the instance to successfully connect to the EFS mount target from the instance.
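      • A minimal mount sketch from an EC2 Linux instance (the file system ID, region and mount point are placeholders; security groups must allow NFS/port 2049):
        • sudo mkdir -p /mnt/efs
        • sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs  (Mount the EFS file system via its DNS name)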
    • File Syncs: EFS File Sync provides a fast and simple way for you to securely sync data from existing on-premises or in-cloud file systems into EFS file systems.
      • To use EFS File Sync, download and deploy a File Sync agent into your IT environment, configure the source and destination file systems, and start the sync. Monitor progress of the sync in the EFS console or using AWS CloudWatch.
  4. Glacier:
    • Designed for long-term archival storage (not to be used for backups) that’s very rarely accessed. Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup.
    • It is designed to deliver 99.999999999% durability, but there’s no SLA on availability.
    • Glacier provides query-in-place functionality, allowing you to run powerful analytics directly on your archive data at rest.
    • Object are moved by Life Cycle Polices from Standard or IA to Glacier storage class.
    • Archived objects are not for real-time access; you need to submit a retrieval request, after which the data is copied into S3-RRS by AWS (which can take from minutes to hours). You can then download it from there within 24 hours (the time period can be specified in the retrieval request). A CLI sketch of a retrieval request appears after the retrieval options below.
      • You can’t use AWS Console for archive jobs retrieval.
      • SNS can be used to notify you, when a retrieval job is completed.
      • You pay for Glacier archive itself and the restored copy into S3-RRS for the duration you specify during retrieval request.
    • Its designed to sustain loss in two facilities.
    • You need to keep your data for a minimum of 90 days.
    • All data is encrypted automatically at rest using AES-256.
    • It doesn’t archive object metadata, you need to maintain a client-side database to maintain this information.
    • You can upload archives to Glacier from 1 byte to 40 TB. Files from 1 byte to 4 GB can be uploaded in a single operation, whereas for files larger than 100 MB it's recommended to use multipart upload.
    • Uploads are stored synchronously across multiple facilities, but retrieval (download) is asynchronous.
    • You can upload files directly from CLI, SDK or through APIs but not from AWS console.
    • It's recommended to group many smaller files into a single tar or zip file to reduce overhead charges (i.e. 32-40 KB per archive for indexing and archive metadata). If you need to access individual files inside an archive, make sure you use a compression technique that allows access to individual files.
    • If you delete your data from Glacier before 90 days from when it was archived, then you will be charged a deletion fee.
    • Expedited Retrieval (1-5 mins): More expensive, use for urgent requests only.
    • Standard Retrieval (3-5 hrs):  Less expensive, you get 10 GB data retrieval free per month.
    • Bulk Retrieval (5-12 hrs): Cheapest, use to retrieve large amounts up to Petabytes of data in a day.
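    • A minimal CLI sketch of restoring an S3 object that was archived to Glacier by a lifecycle policy (bucket and key are placeholders):
      • aws s3api restore-object --bucket mybucket --key archive/backup.zip --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'  (Request a temporary restored copy for 7 days using the Standard tier)
      • aws s3api head-object --bucket mybucket --key archive/backup.zip  (Check the Restore status/expiry of the object)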
  5. Storage Gateway:
    • Its a service that connects an on-premises software appliance with a cloud-based storage to provide seamless and secure integration between an organization’s on-premises IT environment and AWS’s storage infrastructure.
    • The service enable you to securely store data to the AWS cloud for scalable and cost-effective storage.
    • Connects local data center software appliances to cloud based storage such as S3, Glacier or EBS.
    • It's a software appliance, available for download as a VM image, that you install on a host in your data center. It supports VMware ESXi, Microsoft Hyper-V, and Amazon EC2 host platforms.
    • Once it's installed and activated with your AWS account, you can use the AWS Management Console to create the storage gateway option that fits your requirements.
    • Storage Gateway types:
      • File Gateway (NFS): Store files as objects in S3, with a local cache for low-latency access to your most recently used data.
        • Used for flat files which are stored directly in S3 buckets, and are accessed through a Network File System (NFS) mount point.
      • Volumes Gateway (iSCSI): Block storage in S3 with point-in-time backups as EBS snapshots.
        • It provides an iSCSI target, which enables you to create volumes and mount them as iSCSI devices from your on-premises or EC2 application servers.
        • The volume interface presents your applications with disk volumes using the iSCSI protocol. Its like a virtual hard disk which is based on block storage.
        • Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize storage cost.
        • Cached Volumes: Low-latency access to your most recently used data. The entire data set is stored in S3 and the most frequently accessed data is cached on on-premises storage devices. Cached volumes can be 1 GB to 32 TB in size.
        • Stored Volumes: On-premises data with scheduled offsite backups. The entire data set is stored on on-premises storage devices and asynchronously backed up to S3 as incremental snapshots. Stored volumes can be 1 GB to 16 TB in size.
      • Tape Gateway: Also known as Virtual Tape Library (VTL). Back up your data to S3 and archive in Glacier using your existing tape-based processes.
        • Used for backup and uses popular backup applications like NetBackup, Backup Exec, Veeam etc.

AWS Database and Analytics Services

  1. Relational Database Services (RDS/Full-managed SQL Databases):
    • RDS provides you with multiple options for hosting a fully-managed relational database on AWS. RDS provides many advantages over hosting your own database server, including automated backups, multi-AZ failover, and read replicas.
    • It's an Online Transaction Processing (OLTP), SQL database service that offers six database engines:
      • AWS Aurora
      • MySQL
      • MariaDB
      • PostgreSQL
      • Oracle
      • Microsoft SQLServer
    • It doesn't allow access to the underlying operating system and is fully managed by Amazon. AWS is responsible for:
      • Security, patching and update of the database and underlying OS
      • Automated backup for your DB instance
    • You are responsible for:
      • Managing DB settings
      • Building a relational DB schema
      • DB performance tuning
    • Two licensing models: Bring Your Own License (BYOL), License Included.
    • Up to 40 database instances per account; 10 of these can be Oracle or MS SQL Server under the License Included model. Under the BYOL model, all 40 can be any DB engine.
    • Maximum storage capacity for DB instances is 16 TB, except for Aurora (64 TB) and MS SQL Server (4 TB).
    • RDS uses EBS volumes for DB and log storage.
    • It has the ability to provision/resize hardware on demand for scaling.
    • Automated minor updates and backups, and recovery in event of a failover.
    • When RDS (db) instance is deleted , all automated backups, system snapshots and point-in-time recovery are removed.
    • Use IAM policies on users, groups or roles to limit access (Least Privilege Rule).
    • Sharding: Its a common concept used to improve performance by splitting data across multiple tables in a database.
    • Parameter Groups: You manage your DB engine configuration through the use of parameters in a DB parameter group. DB parameter groups act as a container for engine configuration values that are applied to one or more DB instances.
    • Multi-AZ Failover: Synchronous replication from the production (primary) database to a standby database (in a different AZ in the same Region), used for Disaster Recovery (DR)/failover. (A CLI sketch follows this list.)
      • AWS will automatically fail over (within a few minutes) to the standby database by updating the alias CNAME DNS record (endpoint) from the primary to the standby instance IP, in the event of:
        • Service outage in an AZ
        • Primary DB instance failure
        • Instance server type is changed
        • Updating software version
        • Manual failover being initiated on the primary (reboot with failover), which promotes the standby to primary
      • Note that Multi-AZ deployments don’t failover automatically in response to database operations such as long running queries, deadlocks or database corruption errors.
      • You can’t read/write onto standby RDS DB instance in Multi-AZ.
      • In order for Multi-AZ to work, your primary database instance must be launched into a subnet group, just like EC2 instance. So same security/connectivity rules, and highly available/fault tolerant concepts apply.
      • Automatic backups and snapshots are taken against the standby instance to reduce I/O freeze and slow down on primary.
      • For Aurora, Multi-AZ is turned-on by default but for other AWS supported database types you have to turn it on explicitly.
      • You will be alerted by DB instance event when a fail-over occurs via SNS notification.
      • Through CLI or API you can list RDS events in past 14 days but from AWS Console you can view events of last 1 day only.
      • You can manually upgrade a DB instance to a supported DB engine version from the AWS console, but the upgrade is applied to both primary and standby at the same time, so it's advised to do it during a change/maintenance window.
      • Make sure security group and Network ACLs on both primary and standby instances are allowed to be accessed by the app.
      • Multi-AZ deployments for the MySQL, MariaDB, Oracle, and PostgreSQL engines utilize synchronous physical replication to keep data on the standby up-to-date with the primary. Multi-AZ deployments for the SQL Server engine use synchronous logical replication to achieve the same result, employing SQL Server-native Mirroring technology.
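      • A minimal boto3 sketch of creating a Multi-AZ MySQL instance as described above (the identifier, credentials and subnet group name are placeholder assumptions, not values from these notes):

          import boto3

          rds = boto3.client('rds')

          # MultiAZ=True provisions a synchronous standby in another AZ of the same Region.
          rds.create_db_instance(
              DBInstanceIdentifier='mydb',            # placeholder name
              Engine='mysql',
              DBInstanceClass='db.t3.medium',
              MasterUsername='admin',
              MasterUserPassword='change-me-please',  # placeholder credential
              AllocatedStorage=20,                    # GiB
              MultiAZ=True,                           # enables Multi-AZ failover
              DBSubnetGroupName='my-db-subnet-group', # placeholder subnet group
              BackupRetentionPeriod=7,                # days; >0 keeps automated backups on
          )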
    • RDS Read Replicas: It's asynchronous replication from the production (master/primary) database to a read replica (in another AZ or Region), used to improve read performance; see the sketch after this list.
      • A read replica is a replica of the primary RDS DB instance, but it can only be used for read actions (NO write queries). You can have 5 read replicas by default on a production database, and currently it's supported only on Aurora, MySQL, MariaDB and PostgreSQL (NOT supported on Oracle and SQL Server).
      • Must have automated backup turned on in order to deploy a read replica.
      • You can have read replica in another region.
      • You can have read replicas that have Multi-AZ enabled.
      • And you can promote read replica to primary database in case of disruption.
      • Each read replica has its own DNS endpoint.
      • CloudWatch can be used to monitor replication lag.
      • The additional read replicas can be created from existing replica, and you can have read replicas of read replicas.
      • It should be used for high-volume, non-cached database read traffic (elasticity), data warehousing workloads, rebuilding indexes and importing/exporting data into RDS.
      • Read replicas are supported with transactional DB storage engines; for MySQL/MariaDB they are supported on InnoDB, not MyISAM (MySQL, MariaDB, PostgreSQL).
      • If the primary instance (source in a Multi-AZ deployment) fails over to the secondary, any associated Read Replicas are switched to use the secondary as their replication source.
      • When you delete the primary (source) DB instance, you need to delete the read replicas manually; if you don't, each read replica is promoted to a stand-alone, single-AZ DB instance.
      • You can only scale up the compute and storage capacity and type of your existing RDS DB instance.
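      • A minimal boto3 sketch of adding a read replica to an existing instance (the names and Region are placeholder assumptions):

          import boto3

          rds = boto3.client('rds', region_name='us-east-1')

          # Requires automated backups to be enabled on the source instance.
          rds.create_db_instance_read_replica(
              DBInstanceIdentifier='mydb-replica-1',   # placeholder replica name
              SourceDBInstanceIdentifier='mydb',       # placeholder source instance
              DBInstanceClass='db.t3.medium',
          )

          # Promote the replica to a stand-alone instance (e.g. after a disruption).
          # rds.promote_read_replica(DBInstanceIdentifier='mydb-replica-1')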
    • Automated Backups: AWS provides automated point-in-time backups of the RDS DB instance (a restore sketch follows this list).
      • Automated backups are enabled by default and are stored in S3, you get free storage space equal to the size of the DB.
      • Automatic backups are taken in a defined backup window.  If you don’t specify a preferred backup window when you create the DB instance, RDS assigns a default 30-minute backup window which is selected at random from an 8-hour block of time per region.
      • Automated backups allows you to recover your database to any point in time within a retention period of 1 (i.e default) to 35 days. You can disable automatic backups by setting Retention Period to 0.
      • Automatic or manual backup will create a storage volume snapshot of entire DB instance not just individual DBs.
      • The DB instance must be in active and transactional state for the automatic backups to happen.
      • All automated backups are removed when RDS DB instance is deleted.
      • They take a full daily snapshot and also store transaction logs throughout the day. When you do a recovery, AWS first restores the most recent daily backup and then applies the transaction logs for that day. This allows point-in-time recovery down to a second.
      • RDS automated backups and DB Snapshots for MySQL database engine are supported for InnoDB storage engine only.
      • Database Snapshots: DB Snapshots are done manually by the user and are stored in S3.
        • They are not deleted automatically when you delete the RDS DB instance. It's recommended to take a final snapshot before deleting an RDS DB instance.
        • Whenever you restore either Automatic Backup or a manual Snapshot, the restored version of the database will be a new RDS instance with a new DNS endpoint.
          • You need to apply custom DB parameters and security group settings after restore is complete.
          • You can change storage type (magnetic, Provisioned IOPS, General purpose) during restore process.
        •  I/O operations are suspended when the snapshot is being taken.
        • DB manual snapshots are not used for point-in-time recovery.
        • DB snapshots can be shared with other AWS accounts directly.
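      • A minimal boto3 sketch of the two restore paths described above; both create a new instance with a new DNS endpoint (identifiers and the timestamp are placeholder assumptions):

          import datetime
          import boto3

          rds = boto3.client('rds')

          # Point-in-time restore from automated backups + transaction logs.
          rds.restore_db_instance_to_point_in_time(
              SourceDBInstanceIdentifier='mydb',
              TargetDBInstanceIdentifier='mydb-pitr',
              RestoreTime=datetime.datetime(2024, 1, 15, 10, 30, 0),  # placeholder time
          )

          # Restore from a manual snapshot.
          rds.restore_db_instance_from_db_snapshot(
              DBInstanceIdentifier='mydb-from-snap',
              DBSnapshotIdentifier='mydb-final-snapshot',  # placeholder snapshot name
          )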
    • DB Subnet Groups: It's a collection of subnets in a VPC that you designate for your DB instances to be launched into.
      • Each subnet group should have subnets in at least two AZs in a Region.
      • When creating an RDS instance you can select a preferred AZ and specify which subnet group (and subnet within that group) to use.
    • Encryption: Encryption at rest is done with the Key Management Service (KMS) and is supported by all RDS database engines (except on t2.micro instance types).
      • Once your RDS instance is encrypted, the data in the underlying storage, its automated backups, snapshots and read replicas are also encrypted.
      • At present you can't encrypt an existing un-encrypted DB instance, but you can take a snapshot of it, copy that snapshot with encryption enabled, and restore from the encrypted copy; see the sketch after this list.
      • RDS supports SSL encryption between App Instance and RDS DB instance. RDS generates the certificate for the instance.
      • MySQL, Oracle and MS SQL Server have cryptographic functions at the platform level, with keys managed at the application level, so queries on encrypted database fields must reference the encryption and key.
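      • A minimal boto3 sketch of the copy-and-encrypt workflow described above (snapshot names and the KMS key alias are placeholder assumptions):

          import boto3

          rds = boto3.client('rds')

          # 1. Snapshot the un-encrypted instance.
          rds.create_db_snapshot(
              DBInstanceIdentifier='mydb',
              DBSnapshotIdentifier='mydb-unencrypted-snap',
          )

          # 2. Copy the snapshot with encryption enabled (a KMS key is required).
          #    (In practice, wait for each snapshot to become 'available' before the next step.)
          rds.copy_db_snapshot(
              SourceDBSnapshotIdentifier='mydb-unencrypted-snap',
              TargetDBSnapshotIdentifier='mydb-encrypted-snap',
              KmsKeyId='alias/aws/rds',   # placeholder key alias
          )

          # 3. Restore a new, encrypted instance from the encrypted copy.
          rds.restore_db_instance_from_db_snapshot(
              DBInstanceIdentifier='mydb-encrypted',
              DBSnapshotIdentifier='mydb-encrypted-snap',
          )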
    • Aurora:
      • It is a MySQL- and PostgreSQL-compatible, enterprise-class relational database engine that provides up to 5x the throughput of MySQL and 3x that of PostgreSQL, at 1/10th the cost of commercial databases, and offers greater than 99.99% availability.
      • Starts with 10 GB and auto-scales up to 64 TB of SSD storage (in 10 GB increments).
      • Compute resources can scale up to 32 vCPUs and 244 GB of memory (applied during the maintenance window).
      • 6-way replication across three Availability Zones (2 copies of data storage in each AZ).
      • Up to 15 Aurora Read Replicas and up to 5 MySQL Read Replicas with sub-10 ms replica lag.
      • Automatic monitoring and fail-over in less than 30 seconds, during fail-over RDS will promote the replica with highest priority to primary. Priority tier logic: tier-0 > .. > tier-15.
      • Designed to transparently handle the loss of up to 2 copies of data without affecting write availability and up to 3 copies without affecting read availability.
      • Aurora storage is self-healing. Data blocks and disks are continuously scanned for errors and repaired automatically.
    • Storage Type: (Applicable only for MySQL, MariaDB and PostgreSQL)
      • General Purpose (SSD) storage is suitable for a broad range of database workloads. Provides baseline of 3 IOPS/GB and ability to burst to 3,000 IOPS.
      • Provisioned IOPS (SSD) storage is suitable for I/O-intensive database workloads. Provides flexibility to provision I/O ranging from 1,000 to 30,000 IOPS.
    • Database Migration Service (DMS): It helps you to migrate databases to AWS easily and securely.
      • The source database remains fully operational during migration, minimizing downtime to applications that rely on the databases.
      • The service supports homogenous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle to Aurora or Microsoft SQL Server to MySQL.
      • DMS tasks require at least a source, a target, and a replication instance.
        • Your source is the database you wish to move data from and the target is the database you’re moving data to.
        • The replication instance processes the migration tasks and requires access to your source and target endpoints inside your VPC.
      • If you’re migrating to a different database engine, AWS Schema Conversion Tool can generate the new schema for you.
  2. DynamoDB (Serverless NoSQL Database):
    • Amazon DynamoDB is a fully managed fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale.
    • Non-relational (NoSQL) databases use a variety of data models, i.e document, key-value, graph and columnar. DynamoDB supports the document and key-value data models. It's similar to MongoDB but is a home-grown AWS solution.
    • It synchronously replicates data across three facilities in a Region to provide fault tolerance.
    • DynamoDB is a schema-less database that only requires a table name and primary key.
      • The table’s primary key is made up of one or two attributes that uniquely identify items, partition the data, and sort data within each partition.
      • Aggregate size of an item can’t exceed 400 KB including primary key and all attributes.
      • Items larger than 400 KB can be stored in S3 and their pointers can be used in DynamoDB.
      • One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size (a capacity worked example follows this list).
      • One write capacity unit represents one write per second for an item up to 1 KB in size.
      • Reads are cheaper than writes when using DynamoDB.
      • There’s no limit to the number of items (data) you can store in a DynamoDB table.
      • 10,000 write and read capacity units/sec per table and 20,000 write and read capacity units/sec per account. DynamoDB can throttle exceeded requests.
      • 256 tables per account per region. No limit on size of any table.
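      • A worked capacity example plus a minimal boto3 table-creation sketch (the table and attribute names are placeholder assumptions): reading one 6 KB item per second with strong consistency needs ceil(6/4) = 2 read capacity units; writing one 1.5 KB item per second needs ceil(1.5/1) = 2 write capacity units.

          import boto3

          dynamodb = boto3.client('dynamodb')

          dynamodb.create_table(
              TableName='GameSessions',                               # placeholder table
              KeySchema=[
                  {'AttributeName': 'user_id',    'KeyType': 'HASH'},   # partition key
                  {'AttributeName': 'session_ts', 'KeyType': 'RANGE'},  # sort key
              ],
              AttributeDefinitions=[
                  {'AttributeName': 'user_id',    'AttributeType': 'S'},
                  {'AttributeName': 'session_ts', 'AttributeType': 'N'},
              ],
              ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5},
          )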
    • You specify the required throughput (read and write) capacity and DynamoDB does the rest being fully-managed.
      • Service manages all provisioning and scaling of underlying hardware. Fully distributed and scales automatically with demand and growth.
      • It provides push-button scaling on AWS where you can increase read/write throughput and AWS will scale it for you (up or down) without downtime or performance degradation.
        • You can scale up any number of times per day, but you can scale down only 4 times during a calendar day.
    • DynamoDB offers fully managed encryption at rest using KMS managed encryption keys.
    • Best practice to use raw binary or Base64-encoded fields when storing encrypted fields.
    • Use IAM policies on users, groups or roles to limit access (Least Privilege Rule).
    • Stored on SSD storage and spread across 3 geographically distinct data centers in a region.
    • Easily integrates with other AWS services, such as Elastic MapReduce, and can move data to a Hadoop cluster in Elastic MapReduce.
    • It's a web service that uses HTTPS as the transport and JSON as the message serialization format.
    • Popular use cases include:
      • IoT (storing meta data)
      • Gaming (storing session information, leaderboards)
      • Mobile (Storing user profiles, personalization)
    • Eventually Consistent Reads (default): Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data. (Best read performance; see the sketch after the next item.)
    • Strongly Consistent Reads: A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.
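    • A minimal boto3 sketch contrasting the two read modes on a single item (the table name and key values are placeholder assumptions):

        import boto3

        dynamodb = boto3.client('dynamodb')

        # Default: eventually consistent read (cheapest, may briefly return stale data).
        dynamodb.get_item(
            TableName='GameSessions',
            Key={'user_id': {'S': 'u-123'}, 'session_ts': {'N': '1700000000'}},
        )

        # Strongly consistent read: reflects all writes acknowledged before the read
        # (uses twice the read capacity of an eventually consistent read).
        dynamodb.get_item(
            TableName='GameSessions',
            Key={'user_id': {'S': 'u-123'}, 'session_ts': {'N': '1700000000'}},
            ConsistentRead=True,
        )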
    • Reserved Capacity: Its a billing feature that allows you to obtain discounts on your provisioned throughput capacity in exchange for a one-time-up-front payment and commitment to a minimum monthly usage level. It applies to a single AWS region and can be purchased with a 1-year or 3-year terms.
    • Default settings:
      • No secondary indexes.
      • Provisioned capacity set to 5 reads and 5 writes.
      • Basic alarms with 80% upper threshold using SNS topic “dynamodb”.
    • DynamoDB stores Unstructured data (audio, video, docs) and semi-structured data (JSON, XML). Whereas structured (schema based) data is stored by RDS.
    • When you copy data from DynamoDB table into RedShift, you can perform complex data analysis queries on that data, including joins with other tables in your RedShift cluster.
    • It's normally used in conjunction with S3, so after storing images in S3 you can store their metadata in a DynamoDB table. You can also create secondary indexes for DynamoDB tables.
    • DynamoDB is integrated with Apache Hive, a data warehousing application that runs on EMR. Hive can read and write data in DynamoDB tables, allowing you to:
      • Query live DynamoDB data using a SQL-like language (HiveQL)
      • Copy data from DynamoDB table to S3 bucket and vice versa.
      • Copy data from DynamoDB table into Hadoop Distributed File System (HDFS), and vice versa.
      • Perform join operations on DynamoDB tables.
    • Global Table: global tables provide a fully managed solution for deploying a multi-region, multi-master database, without having to build and maintain your own replication solution. When you create a global table, you specify the AWS regions where you want the table to be available. DynamoDB performs all of the necessary tasks to create identical tables in these regions, and propagate ongoing data changes to all of them.
    • DynamoDB Streams: It's an ordered flow of information about changes to items in a DynamoDB table. When you enable a stream on a table, DynamoDB captures information about every modification to data items in the table.
      • After you enable DynamoDB stream on a table, you can associate the stream ARN with a Lambda function that you write.
      • Immediately after an item is modified in the table, a new record appears in the table’s stream, and Lambda polls the stream and invokes your Lambda function synchronously.
      • You can configure the stream so that the stream records capture additional information, such as before and after images of modified items.
  3. ElastiCache (In Memory Cache Engine)
    • It's a web service that makes it easier to launch, manage, and scale a distributed in-memory cache in the cloud, and is powered by the Memcached and Redis engines.
    • It's a fully managed in-memory cache engine and is used to improve performance by caching the results of database queries.
    • The service improves the performance of web apps by allowing you to retrieve information from fast, managed, in-memory caches instead of relying entirely on slower disk-based databases.
    • Its designed for large, high-performance or taxing queries. It can store the queries to reduce hits to the database.
    • It allows for managing web sessions and also caching dynamic generated data.
    • A cluster is a collection of one or more nodes using the same engine (i.e either Memcached or Redis).
    • ElastiCache nodes can only be accessed from within the same VPC; they can be accessed neither from the internet nor by EC2 instances in other VPCs.
      • Can be On-demand or Reserved instances but not spot instances.
    • The caching strategy you want to implement for populating and maintaining your cache depends upon what data you are caching and the access patterns to that data: Lazy loading, Write through, Adding TTL (a lazy-loading sketch follows this item).
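    • A minimal lazy-loading sketch against a Redis-backed ElastiCache cluster (the endpoint, the 300 s TTL and the fetch_user_from_db helper are placeholder assumptions; requires the redis-py package):

        import json
        import redis

        cache = redis.Redis(host='my-cluster.abc123.cache.amazonaws.com', port=6379)  # placeholder endpoint

        def fetch_user_from_db(user_id):
            """Placeholder for the real (slow) database query."""
            return {'id': user_id, 'name': 'example'}

        def get_user(user_id):
            key = f'user:{user_id}'
            cached = cache.get(key)
            if cached is not None:                        # cache hit: skip the database
                return json.loads(cached)
            user = fetch_user_from_db(user_id)            # cache miss: load from the DB
            cache.setex(key, 300, json.dumps(user))       # write back with a 300 s TTL
            return user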
    • Memcached: It's a widely adopted memory object caching system, but it's not persistent and can't be used as a data store. ElastiCache is protocol-compliant with Memcached, so popular tools that you use today with existing Memcached environments will work seamlessly with ElastiCache.
      • It can be used to cache contents of DB, cache data from dynamically generated webpages, transient session data, high frequency counters for admission control in high volume web apps.
      • Max 100 nodes per region, 1-20 nodes per cluster (soft limits).
      • Does not support Multi-AZ failover, replication, nor does it support snapshots for backup/restore. So nodes failure means data loss.
    • Redis: It's a popular open-source in-memory key-value store that supports data structures such as sorted sets and lists. It is persistent and can be used as a persistent NoSQL data store.
      • Supports automatic and manual snapshots to S3; automatic snapshots are deleted when the Redis cluster is deleted.
      • Backup can be used to restore a cluster or seed a new cluster.
      • ElastiCache supports Master/Slave replication (asynchronous) and Multi-AZ is done by creating read replica(s) in another AZ.
      • Shard: It's a primary (read/write) node plus 0-5 read replica nodes.
  4. Amazon Redshift (Petabyte-Scale Data Warehouse)
    • Amazon Redshift is a SQL based data warehouse Online Analytical Processing (OLAP) service.
    • Its fully managed petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze all your structured data using your existing business intelligence tools.
    • Generally used for big-data analytics and it can integrate with most popular business intelligence tools that include: Jaspersoft, Microstrategy, Pentaho, Tableau, Business Object and Cognos.
    • It uses replication and continuous automatic backups (in S3) to enhance availability and improve data durability, and can automatically recover from component and node failures.
    • Redshift always keeps three copies of your data:
      • The original one
      • A replica on compute nodes (within the cluster)
      • A backup copy on S3
    • You can start with a single node that has 160 GB of storage, but it doesn't support data replication.
    • For a multi-node deployment (cluster), you need the following (a boto3 sketch follows this list):
      • Leader Node: Manages client connections and receives the queries.
      • Compute Node(s): Stores data, perform queries and computations.
      • AWS recommends using at least two nodes in production. There can be up to 128 compute nodes.
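      • A minimal boto3 sketch of provisioning a two-node cluster as described above (the identifier, credentials and node type are placeholder assumptions):

          import boto3

          redshift = boto3.client('redshift')

          redshift.create_cluster(
              ClusterIdentifier='analytics-cluster',   # placeholder name
              ClusterType='multi-node',                # leader node + compute nodes
              NodeType='dc2.large',
              NumberOfNodes=2,                         # compute nodes
              MasterUsername='awsuser',
              MasterUserPassword='Change-me-1',        # placeholder credential
              DBName='analytics',
          )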
    • Columnar Data Storage: Redshift organizes the data by column. Unlike row-based systems which are ideal for transaction processing, column-based systems are ideal for data warehousing and analytics, where queries often involve aggregates performed over large data sets. Redshift uses 1024 KB (1 MB) block size to store its data in columnar storage.
    • Advanced Compression: Columnar data stores can be compressed much more than row-based data stores because similar data is stored sequentially on disk. It does not require indexes or materialized views, so uses less space than traditional relational database systems.
    • Massively Parallel Processing (MPP): Redshift automatically distributes data and query load across all nodes.
    • Security: Encrypted in transit using SSL and at rest using AES-256. It uses a hierarchy of encryption keys to encrypt the database. You can use either KMS or HSM to manage the top-level encryption keys in this hierarchy.
    • Currently supports only one AZ but can restore snapshots to a different AZ in the event of an outage.
    • A data warehouse is a relational database that's designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources.
      • OLAP database is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations.
      • OLTP database is characterized by detailed and current data, and a schema used to store transactional data.
    • When you provision a Redshift cluster, it's locked down by default so nobody has access to it. To grant other users inbound access to the Redshift cluster, you need to associate the cluster with a security group.
    • If you intend to keep your Redshift cluster running continuously for a prolonged period, you should consider purchasing reserved node offerings (1 or 3 year duration) which provide significant savings over on-demand pricing.
    • Redshift achieves efficient storage and optimum performance through a combination of massively parallel processing, columnar data storage, and very efficient targeted data compression encoding schemes.
    • To enable access to the cluster from SQL client tools via JDBC or ODBC from inside EC2 instance, you use VPC security groups.
    • Redshift Enhanced VPC Routing provides Redshift access to VPC resources. Redshift will not be able to access the VPC endpoint for S3 without Enhanced VPC Routing. If enhanced routing is not enabled, Redshift routes traffic through the internet, including traffic to AWS services (e.g S3).
    • Backup (Snapshot): Snapshots (Automatic or Manual) are point-in-time backup of Redshift cluster and are stored in S3. If you need to restore from a snapshot, AWS creates a new cluster and imports data from the snapshot that you specify.
    • Cross-Region Snapshots: You can configure cross-region snapshots when you want Redshift to automatically copy snapshots (automated or manual) to another region for backup. Copying snapshots from the source to the destination region incurs data transfer charges.
    • Monitoring (CloudWatch): CloudWatch metrics help you monitor physical aspects of your Redshift cluster, such as CPU utilization, latency and throughput. Metric data is displayed directly in the Redshift console, and you can also view it in the CloudWatch console.
  5. Kinesis:
    • It allows you to easily collect, process, and analyze video and data streams in real time, so you can get timely insights and react quickly to new information.
    • Kinesis can route related data records to the same record processor: using the Kinesis Client Library (KCL), records that share a partition key are delivered to the same record processor.
    • It's a real-time data processing service that continuously captures and stores large amounts of data that can power real-time streaming dashboards. Its components are as follows:
      • Kinesis is a managed streaming data service that provides a platform for streaming data on AWS and is used for IoT and Big Data analytics. It offers powerful services to make it easy to load and analyze streaming data.
      • Stream Data: Streaming Data is generated continuously by thousands of data sources, which typically send in the data records simultaneously, and in small sizes (order of Kilobytes).
      • Producers: Devices that collect data and input it into Kinesis. Producers (data sources) include IoT sensors, mobile devices, EC2 instances, eCommerce purchases, in-game player activity, social media networks, stock markets and telemetry.
      • Consumers: Generally EC2 instances that consume the streaming data concurrently. The results may then be sent to real-time dashboards, S3 (storage), Redshift (big data), EMR (analytics) or Lambda (event-driven actions).
      • Shards (processing power): A shard is the base throughput unit of a stream, and a stream is composed of one or more shards. It's a uniquely identifiable group of data records in a stream. Each shard supports 2 MB/s of output (read) and 1 MB/s of input (write). Data records are stored temporarily in the shards of your stream (a put_record sketch follows this list).
      • Retention Period: The time period from when a data record is added to the stream until it is no longer accessible.
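      • A minimal boto3 producer sketch; records sharing a PartitionKey land on the same shard, and the KCL delivers each shard's records to one record processor (the stream name and payload are placeholder assumptions):

          import json
          import boto3

          kinesis = boto3.client('kinesis')

          # One-time setup: a stream with 2 shards (~2 MB/s in, 4 MB/s out in total).
          # kinesis.create_stream(StreamName='sensor-stream', ShardCount=2)

          reading = {'device_id': 'sensor-42', 'temp_c': 21.7}
          kinesis.put_record(
              StreamName='sensor-stream',                 # placeholder stream
              Data=json.dumps(reading).encode('utf-8'),
              PartitionKey=reading['device_id'],          # keeps one device on one shard
          )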
    • Kinesis offers following managed services:
      • Data Streams: It's used to collect and process large streams of data records in real time; ingest and process streaming data with custom applications. Retention period is from 24 to 168 hrs (1 to 7 days). The data is consumed by EC2 instances and stored into S3, DynamoDB, Redshift and Elastic MapReduce.
        • Custom data-processing apps which are known as Kinesis Streams apps read data from a Kinesis stream as data records.
        • These apps use Kinesis Client Library and run on EC2 instances.
        • It replicates data synchronously across three availability zones, providing high availability and durability.
        • Kinesis Data Streams can continuously capture and store terabytes of data per hour from hundreds of thousands of sources such as website clickstreams, financial transactions, social media feeds, IT logs and location-tracking events.
        • A Kinesis stream is an ordered sequence of data records meant to be written to or read from in real time. Data records are therefore stored temporarily in the shards of your stream.
      • Data Firehose: Its a fully managed service used for automatically capturing real-time data stream from producers (sources) and delivering (saving) them to destinations such as S3, Redshift, Elasticsearch Service, Splunk.
        • Kinesis streams can be used as the source to Kinesis Firehose.
        • With Kinesis Firehose you don't need to write applications or manage resources.
        • It synchronously replicate data across three facilities in a Region.
        • Each delivery stream stores data records for up to 24 hrs in case delivery destination is unavailable.
        • You can use server-side encryption if using Kinesis stream as your data source.
      • Data Analytics: Its used to process and analyze streaming data in real-time from Kinesis Streams and Firehose using SQL queries. The data can be stored in the S3, Redshift, Elasticsearch cluster.
      • Video Streams: Capture, process, and store video streams for analytics and machine learning.
    • Benefits of Kinesis include:
      • Real-time and parallel processing, fully managed, and scalable.
    • Applications of Kinesis include:
      • Gaming, real-time analytics, application alerts, log/event data collection, and mobile data capture.
  6. Elastic MapReduce (EMR / BigData framework):
    • It's a web service (managed Hadoop framework) that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data.
    • It's a service which deploys EC2 instances based on the Hadoop Big Data framework to analyze and process vast amounts of data.
    • It also supports other distributed frameworks such as: Apache Spark, HBase, Presto, Flink
    • EMR Workflow is divided into following four steps:
      1. Storage: Data that’s stored in S3, DynamoDB, Redshift is sent to EMR.
      2. Mapped: Then data is mapped to a Hadoop cluster of Master/Slave nodes for processing.
        • Mapping phase defines the process which splits the large data in file for processing. The data is split in 128 MB chunks.
      3. Computations (coded by developers): are used to process the data.
      4. Reduced: The processed data is then reduced to a single out set of return information.
        • Reduce phase aggregates the split data back into one data source. Reduced data needs to be stored (e.g in S3) because data processed by EMR cluster is not persistent.
    • Master Node: A single node that coordinates the distribution of data and tasks among the other (slave) nodes for processing. It also tracks the status of tasks and monitors the health of the cluster.
    • Slave Nodes: There are two types of slave nodes:
      • Core node: Core nodes run tasks and store data in the Hadoop Distributed File System (HDFS) on the cluster.
      • Task node: Task nodes are optional and only run tasks.
    • You have the ability to access the underlying OS of the EC2 instances and can add user data via bootstrapping to EC2 instances launched into the cluster.
    • You can also resize a running cluster at any time and deploy multiple clusters. EMR takes advantage of parallel processing for faster processing of data (a cluster-launch sketch appears at the end of this list).
    • AWS provides the AMIs (no custom AMIs).
    • You can use EMR to transform and move large amounts of data into and out of other AWS data stores and databases such as S3 and DynamoDB. e.g it can be used to process log files stored in S3.
    • EMR instances don’t encrypt the data at rest.
    • Data Store:
      • S3 or DynamoDB
        • S3 server-side encryption
      • HDFS (Hadoop Distributed File System)
        • If HDFS, AWS defaults to Hadoop KMS
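    • A minimal boto3 sketch of launching a transient EMR cluster for a single processing run (the names, S3 bucket, release label and instance sizing are placeholder assumptions; the default EMR IAM roles must already exist in the account):

        import boto3

        emr = boto3.client('emr')

        emr.run_job_flow(
            Name='log-processing',                       # placeholder cluster name
            ReleaseLabel='emr-5.30.0',                   # placeholder EMR release
            Applications=[{'Name': 'Hadoop'}, {'Name': 'Spark'}],
            LogUri='s3://my-bucket/emr-logs/',           # placeholder bucket
            Instances={
                'MasterInstanceType': 'm5.xlarge',       # master node
                'SlaveInstanceType': 'm5.xlarge',        # core/task nodes
                'InstanceCount': 3,                      # 1 master + 2 core
                'KeepJobFlowAliveWhenNoSteps': False,    # terminate when the work is done
            },
            JobFlowRole='EMR_EC2_DefaultRole',
            ServiceRole='EMR_DefaultRole',
        )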

AWS Compute Services

  1. Elastic Compute Cloud (EC2/Virtual Server Based Computing): Provides scalable virtual servers in the cloud. An EC2 virtual server is known as an instance and can be of different instance types and sizes. The most common instance operating systems are Linux and Windows flavors.
    • Instance Termination Protection is turned off by default, you must turn it on.
    • EBS root volume is deleted by default when the instance is terminated, but the delete flag can be turned off.
    • An EC2 instance can be moved from one region to another by taking a snapshot or image of it, copying that snapshot/image into the other region, and finally launching a new instance from that image/snapshot.
    • You can create AMI images from EC2 instances.
    • The EC2 SLA is 99.95%, i.e about 22 minutes of downtime per month.
    • A soft limit of 20 EC2 instances per account is set by default.
    • EC2 instance can have root volume (boot) from either EBS or Instance-Store backed block storage devices.
    • EC2 Instances are primarily composed of the following components:
      • Amazon Machine Image (AMI): The OS (and other settings)
      • Instance Type: The Hardware (compute, ram, network bandwidth etc)
      • Network Interface: Public, Private or Elastic IP addresses
      • Storage: The instances hard drive, that include two options:
        • Elastic Block Store (EBS): Persistent (network) storage
        • Instance Store: Ephemeral storage
    • ec2config: Allows you to decrypt the Administrator password for a Windows instance.
      • Remote Desktop Protocol (RDP/3389) accessible servers should use X.509 certificates to prevent identity spoofing.
    • cloud-init: Used for copying public keys on to Linux based EC2 instances.
      • SSH is preferred for administrative connections to Linux servers
    • Amazon Machine Image (AMI): AMIs dictate the instance's operating system and other software settings.
      • Its a preconfigured package (template) that’s required to launch an EC2 instance, that include:
        • Operating system
        • Software packages or applications
        • Other required settings (root storage type, virtualization type, authorized_keys, local accounts, file and directory permissions)
      • AMIs are only available to the region they are created in, but can be copied into other regions, and will receive a new or distinct ID. Each of its backing snapshots is by default copied to an identical but distinct target snapshot.
      • AMI image can be created from EC2 instance and EBS volume or a Snapshot.
      • AMIs can be public or private, but encrypted AMIs can’t be made public.
      •  To create an AMI image for EBS backed instances, stop the instance to ensure data consistency and integrity.
      • For AMIs created from EBS-backed instances, AWS registers the newly created AMI automatically, whereas an AMI created from an instance store-backed instance has to be registered manually.
      • During the AMI creation process, EC2 creates a snapshot (in S3) of your instance's root volume and any data volumes attached to the instance.
      • To remove an AMI, de-register the image and then delete its backing snapshot(s).
      • AMI image are point in time snapshots of the instance, so they need to be updated frequently to include new and changed configuration standards.
      • AMIs Categories:
        • Community AMIs: Free, generally with these AMIs you are selecting the OS you want.
        • AWS Marketplace AMIs: Paid, generally comes packaged with additional licensed softwares.
        • My AMIs: AMIs that you create for yourself.
      • Types of AMIs:
        • HVM AMI (Hardware Virtual Machine): Runs as if on bare-metal hardware.
          • This type of virtualization provides the ability to run an OS directly on top of a virtual machine without any modification, as if it were running on bare-metal hardware.
          • The EC2 host system emulates some or all of the underlying hardware that's presented to the guest.
          • Unlike Paravirtual guests, HVM guests take advantage of hardware extensions that provide fast access to the underlying hardware on the host system.
        • PV AMI (Paravirtual): Can run on host hardware that doesn't have virtualization support.
          • Guests can run on host hardware that doesn’t have explicit support for virtualization, but they can’t take advantage of special hardware extensions such as enhanced networking or GPU processing.
          • Historically, PV guests had better performance than HVM in many cases, but because of enhancements in HVM virtualization and the availability of PV drivers for HVM AMIs, this is no longer true.
          • Supported by only M3 (General Purpose) and C3 (Compute Optimized) instance type families.
    • Instance Types (Flavors): Various configurations of vCPU, memory, storage, and networking capacity for your instances.
      • General Purpose: Balanced memory and CPU
        • T2: General Purpose; Lowest cost/Web servers/Small DBs
        • M3/M4: General Purpose; Application servers
      • Compute Optimized: More CPU than memory, compute intensive use
        • C3/C4: Compute Optimized; CPU Intensive Apps/DBs
      • Memory Optimized: More memory, memory intensive, DB & cache
        • R3/R4: RAM/Memory Optimized; Memory Intensive Apps / DBs
        • X1: Extreme/Memory Optimized; SAP HANA/Apache Spark
      • GPU Compute instances: Graphic optimized, High performance and parallel computing
        • G2: Graphic Intensive; Video Encoding/3D App streaming
        • P2: General Purpose GPU; Machine learning/Bitcoin mining
      • Storage Optimized: Very High, low latency & intensive IO, IO intensive apps, data warehousing, Hadoop.
        • D2: Dense Storage; Fileservers / Data Warehousing / Hadoop
        • I2: IOPS/High Speed Storage; NoSQL DBs, OLTPs, Data Warehousing
      • F1: Field Programmable Gate Array; HW acceleration for code
    • Elastic Network Interfaces: Public IP, Private IP or Elastic IP
      • Public IPs are configured on the IGW of the VPC and, through NAT, are mapped to the private IP of the EC2 instance.
      • When the instance is stopped, its public IP/DNS is not retained (for a reboot it is retained), whereas the private IP/DNS and Elastic IP remain the same.
      • By default, eth0 is the primary network interface and the only ENI created during instance launch, and you can't detach it.
      • An ENI is bound to an Availability Zone. You can specify which subnet/AZ you want an additional ENI to be added in.
      • Security group applies to ENI, so if it has multiple IPs then it applies to all those IPs.
      • Public IPs can only be assigned to instances with one network interface. But you can assign Elastic IP manually when you assign additional network interfaces.
      • To attach a network interface in a subnet to EC2 instance (in another subnet), both must be in the same Region and AZ.
      • hot attach: Attaching an ENI when instance is Running
      • warm attach: Attaching an ENI when instance is Stopped
      • cold attach: Attaching an ENI when instance is Launching
    • Key Pairs: Secure login information for your instances.
      • It consists of a public key that AWS stores on instance, and a private key file that you store.
      • For EC2 instance running Windows, the private key file is required to decrypt Administrator password, that’s used to connect to the instance via Remote Desktop Connect (RDP:3389).
      • For EC2 instances running Linux, the private key file (chmod 400) allows you to SSH into EC2 instances, whereas the public key is copied onto the instance.
    • Bootstraping & User-Data/Meta-Data:
      • With EC2 we can bootstrap the instance (during the creation process) with custom commands such as installing software packages or running updates.
      • User-Data: A step/section during the EC2 instance creation process where you can include your own commands via a bash script (16 KB max size) or cloud-init directives (a boto3 sketch follows this list).
      • Instance Metadata: Once you are logged into an EC2 instance, you can query the instance metadata service to view the instance's user-data or meta-data, for example:
        • curl http://169.254.169.254/latest/meta-data/
        • curl http://169.254.169.254/latest/user-data
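      • A minimal boto3 sketch of bootstrapping an instance with user-data at launch (the AMI ID, key pair and security group are placeholder assumptions; it assumes a default VPC, otherwise also pass a SubnetId):

          import boto3

          ec2 = boto3.client('ec2')

          user_data = """#!/bin/bash
          yum -y update
          yum -y install httpd
          systemctl enable --now httpd
          """

          ec2.run_instances(
              ImageId='ami-0123456789abcdef0',            # placeholder Amazon Linux AMI
              InstanceType='t2.micro',
              MinCount=1,
              MaxCount=1,
              KeyName='my-keypair',                       # placeholder key pair
              SecurityGroupIds=['sg-0123456789abcdef0'],  # placeholder security group
              UserData=user_data,                         # runs as root on first boot
          )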
    • Placement Group: It's a clustering/grouping of EC2 instances within a single Availability Zone that provides low latency and high network throughput (10 Gbps) between the instances.
      • This service is used for applications (e.g Hadoop cluster, Grid computing) that need extremely low-latency and high throughput network connection between instances in the same AZ.
      • It's always in one AZ and never spans multiple AZs.
      • It can span Peered VPCs that are in the same Region, however, you do not get full-bisection bandwidth between instances in peered VPCs.
      • The placement group name must be unique within your AWS account and AWS recommend homogenous (same size and family) instances within placement groups.
      • Only certain types of instances (Compute/Memory/Storage/GPU optimized) can be launched in a placement group.
      • You can't move an existing instance into a placement group, but you can create an AMI image (snapshot) from the existing instance and then launch a new instance from it into a placement group (a boto3 sketch follows this list).
      • To guarantee the availability of the instances in a single AZ, try to launch all the required instances at the same time.
      • If you receive a Capacity Error when adding a new instance to a placement group, stop and start all the instances in the placement group to resolve it (AWS will try to locate/launch them as close together as possible).
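      • A minimal boto3 sketch of creating a cluster placement group and launching instances into it (the group name, AMI and sizing are placeholder assumptions):

          import boto3

          ec2 = boto3.client('ec2')

          ec2.create_placement_group(GroupName='hpc-group', Strategy='cluster')

          # Launch the whole homogenous fleet in one call to improve the odds
          # that capacity is found close together.
          ec2.run_instances(
              ImageId='ami-0123456789abcdef0',      # placeholder AMI
              InstanceType='c5.large',
              MinCount=4,
              MaxCount=4,
              Placement={'GroupName': 'hpc-group'},
          )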
    • VM Import/Export:
      • Its used to migrate VMWare ESX or WorkStation (VMDK/OVA), Microsoft Hyper-V (VHD), Citrix Xen (VHD) virtual machines to and from EC2.
      • Supports both Windows and Linux VMs.
      • Its supported via API or CLI but not AWS Console.
      • Before generating the VMDK or VHD, make sure VM is stopped (Not in suspended or paused states).
      • For VMware, AWS has a VM Connector which is a plugin to VMware vCenter that allows the migration of VMs to S3 and then convert it to EC2 AMI.
    • EC2 Shared Responsibility Model:
      • The customer is responsible for managing software level security on instances, including:
        • Security groups
        • Firewall (IP tables, Firewalld etc)
        • EBS encryption utilizing KMS and encrypting file system by using different encryption methods.
        • Applying SSL certificates to ELB
      • AWS is responsible for managing the hypervisor and physical layer of security for EC2:
        • DDOS protection
        • Port scanning protection (Not allowed even in your own VPC without pre-approval from AWS)
        • Ingress network filtering
    • EC2 Purchasing Options:
      • On-Demand: Allows you to choose any instance type at any time and pay a fixed rate by the hour or second (per-second billing on Linux only) with no commitment.
        • The most expensive but flexible purchasing option.
        • You are mostly charged when instance is running:
          • Per second pricing: Amazon Linux, Ubuntu AMIs (60 secs min)
          • Hourly pricing: Windows, RHEL or any other AMI (where per second pricing is not indicated).
          • If you stop and start the instance you will be charged for an extra hour.
      • Reserved: Allows you to purchase an instance for set time period of one or three years.
        • This allows significant price discount over using on-demand.
        • You can select to pay upfront, partial upfront or no upfront.
        • Once you buy reserved instance, you own it for selected time period and are responsible for entire price, regardless of how often you use it.
        • Reserved instances scope can be either Region (default) or Availability Zone. AZ scope reserve instances can be sold in AWS Marketplace.
        • After recent AWS updates you can now migrate reserved instances between AZs in the same region, without having to sell and repurchase.
        • Standard Reserved Instances: Up to 75% off on demand
        • Convertible Reserved Instances: Up to 54% off on-demand. Capability to change the attributes of the instances.
        • Scheduled Reserved Instances: Are available to launch within the time windows you reserve.
      • Spot: Allows you to bid on an instance type, and only pay for and use that instance when the Spot price is equal to or below your bid price.
        • This option allows Amazon to sell unused instance capacity, for short amounts of time, at a substantial discount.
        • Spot prices fluctuate based on supply and demand in the Spot market.
        • You are charged by minute or hour (same on-demand AMI conditions).
        • When you have an active bid, an instance is provisioned for you when the spot price is equal to or less than your bid price.
        • A provisioned instance will automatically be terminated when the Spot price goes higher than your bid price. You will not be charged for the hour in which AWS terminated the instance.
        • But if you terminate the instance yourself, you are charged for the hour in which the instance was terminated.
        • Enable you to bid whatever price you want for instance capacity, providing for even greater savings if your application have flexible start and end times.
        • Not all instances families are available for spot instances
        • Encrypted EBS volumes are not supported in Spot instances.
        • Suitable for data analysis, batch jobs, background processing and optional tasks.
        • You can run and scale apps such as stateless web services, image rendering, big data analytics, batch processing jobs and massively parallel computation on Spot instances.
      • Dedicated Hosts: Physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses.
    • Auto Scaling (Scalable and Elastic Architecture): It's an AWS feature that allows your fleet of EC2 instances to grow (scale out) or shrink (scale in) depending upon your workload.
      • You can use Auto Scaling to manage Amazon EC2 capacity automatically, maintain the right number of on-demand instances for your application (based upon CloudWatch metrics), operate a healthy group of instances, and scale it according to your needs.
      • It can’t span across multiple Regions.
      • You can add (instance should be running and AMI must still exist) or remove EC2 instances manually to ASG, the instance can be part of only one ASG at a time.
      • When you delete ASG, the EC2 instances attached with it will also be deleted. But if you want to keep the instances, then detach the instances from ASG first, and then delete it.
      • You can attach one or more ELBs to your ASG but they must be in the same VPC and Region.
      • When you have ELBs defined with ASG, you can configure ASG to use both EC2 and ELB Health Checks to determine the instance health status. But only one source reporting the instance as unhealthy is enough for ASG to mark it for replacement.
      • Elastic IPs and EBS volumes gets detached from the terminated instances, you need to manually attach them to the new instances.
      • During a very limited time, you can use the AWS CLI command as-set-instance-health to mark the instance as healthy.
      • Auto Scaling is not meant to handle instant load spikes but is built to grow with a gradual increase in usage over a short time period.
      • Launch Configuration (Reusable Instance Templates): Its the configuration template used to create new EC2 instances for ASG. It include the parameters such as Instance family/type, AMI, Key pair, Block devices and Security groups.
        • You can’t update/change launch configuration, you always have to create new one with required changes and then attach it with ASG.
        • You can use Reserve, On-demand and Spot instances in launch configuration. But you can’t mix on-demand with spot instances in your launch configuration.
        • From the console you get Basic Monitoring by default, but from the CLI Detailed Monitoring is enabled by default.
        • If EC2 monitoring is detailed (1 min), then set your ASG alarm period to 60 secs (1 min).
      • Auto Scaling Group (Automated Provisioning): Its a logical grouping of EC2 instances managed by Auto Scaling Policy.
        • It defines all the rules that govern if/when an EC2 instance is automatically provisioned or terminated.
        • Keep your Auto Scaling group healthy and balanced, whether you need one instance or 1,000.
      • Scaling Policy (Adjustable Capacity): Maintains a fixed group size or adjusts it dynamically based on Amazon CloudWatch metrics. It determines when and how the ASG scales out or in (a target tracking sketch follows this list).
        • P.S. For an architecture to be considered highly available and fault tolerant, it must have an ELB serving traffic to an ASG with a minimum of two instances located in separate availability zones.
        • Manual Scaling: Manually increase the size of the group. Maintain a current number of instances all the time.
        • Dynamic Scaling (On-Demand or Event based) : You create a scaling policy to automatically increase the size of the group based on a specified increase in demand. Scaling in response to an alarm or event.
          • Simple Scaling: After a scaling activity is started, the policy must wait for the scaling activity or health check replacement to complete and the cooldown period to expire before it can respond to additional alarms.
          • Step Scaling: After a scaling activity is started, the policy continues to respond to additional alarms, even while a scaling activity or health check replacement is in progress.
        • Scheduled Scaling (Cyclic based): You set up scaling by schedule to increase the size of the group at a specific time. Used for predictable load change and occurs at a fixed interval (daily, weekly, monthly, quarterly).
        • Target Tracking Scaling: Its used when scaling is based on a metric, which is an utilization metric that increases or decreases proportionally to the number of instances in ASG. You select a predefined metric or configure a customized metric, and set a target value.
      • Scaling Cooldown: The cooldown period is a configurable setting for your Auto Scaling group that helps to ensure that it doesn’t launch or terminate additional instances before the previous scaling activity takes effect. EC2 Auto Scaling does not support cooldown periods for step scaling or scheduled scaling policies.
      • Instance Warmup: ASG waits for given number of seconds before allowing another scaling activity. With step scaling policies, you can specify the number of seconds that it takes for a newly launched instance to warm up. Until its specified warm-up time has expired, an instance is not counted toward the aggregated metrics of the Auto Scaling group.
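      • A minimal boto3 sketch of an ASG with a target tracking policy, as referenced above (the launch configuration name, subnets and the 50% CPU target are placeholder assumptions):

          import boto3

          autoscaling = boto3.client('autoscaling')

          autoscaling.create_auto_scaling_group(
              AutoScalingGroupName='web-asg',                    # placeholder name
              LaunchConfigurationName='web-lc-v1',               # placeholder launch config
              MinSize=2,                                         # two instances across two AZs
              MaxSize=6,
              VPCZoneIdentifier='subnet-0aaa,subnet-0bbb',       # placeholder subnets in 2 AZs
              HealthCheckType='ELB',
              HealthCheckGracePeriod=300,
          )

          # Keep average CPU around 50%; Auto Scaling adds/removes instances as needed.
          autoscaling.put_scaling_policy(
              AutoScalingGroupName='web-asg',
              PolicyName='target-cpu-50',
              PolicyType='TargetTrackingScaling',
              TargetTrackingConfiguration={
                  'PredefinedMetricSpecification': {'PredefinedMetricType': 'ASGAverageCPUUtilization'},
                  'TargetValue': 50.0,
              },
          )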
    • Elastic Load Balancer: (Distributing traffic) Elastic Load Balancing automatically distributes incoming application traffic across multiple targets, such as EC2 instances, Containers, and IP addresses. It can handle the varying load of your application traffic in a single Availability Zone or across multiple Availability Zones.
      • They don’t have a pre-defined IPv4 addresses, you access them using a publicly resolvable DNS name.
      • At least two AZs must be selected to create the load balancer (For internet facing LB each AZ must have a public subnet).
      • The subnet defined for load balancer should be at least /27 in size and has at least 8 IPs available for ELB nodes to scale.
      • ELB is region specific so all registered EC2 instances must be within the same region. (Use Route53 to load balance across multiple regions).
      • It supports Layer 4 (TCP/SSL) and Layer 7 (HTTP/HTTPS) listeners for both front-end and back-end.
      • Load balancer (front-end) protocol and instance (back-end) protocol must be at the same layer (i.e Layer 7 HTTP/HTTPS or Layer 4 TCP/SSL).
      • To allow the backend EC2 instances to know the actual requester details (Source IP, Source Port, Destination IP…etc):
        • Enabling the X-Forwarded-For header on the ELB for HTTP/HTTPS requests
        • Enabling the Proxy Protocol on the ELB for TCP/SSL requests
      • Access Logs: delivers detailed logs of all requests made to ELB. The logs are stored in S3 bucket and it must be in the same region as the ELB. Access logs contains information such as request time, clients’ IP, latencies, request path and server response.
      • Session Stickiness (Session Affinity): ELB binds client requests/sessions to a specific EC2 instance. It requires HTTPS, i.e an SSL certificate configured on the ELB, and SSL should be terminated on it.
        • You can upload SSL certificate in IAM and must be in same region as ELB.
      • It supports TLS 1.0, 1.1, 1.2, SSL 3.0 (TLS 1.3, 2.0 are not supported)
      • ELB doesn’t support Client side certificates with HTTPS (two way authentication).
        • For client side certificates use TCP on ELB for both front and back end, and enable proxy protocol, such that EC2 instances will handle authentication and SSL termination (i.e don’t use HTTPS on ELB).
      • ELB monitoring can be achieved by CloudWatch (1 min frequency) in which you get amount of read requests and latency out of box. Access Logs are disabled by default and are stored in S3, CloudTrail can be stored in S3.
      • ELB is not designed to queue requests, it will return HTTP Error 503 for any requests above its capacity. It may take 1-7 mins to scale more ELB nodes.
      • Re-Resolving DNS: If using a single client for testing ensure your testing tool will enforce the Re-Resolving DNS with each testing request.
      • By default ELB has an idle connection timeout of 60 secs.
      •  Following components are part of the security policy required to negotiate the SSL connection parameters for SSL between a client and the ELB:
        • SSL Protocols
        • SSL Ciphers
        • Server Order Preference
      • External ELB: For Internet facing or external  ELB, you need at least one public subnet in each AZ where ELB is defined.
        •  e.g <elbname>.elb.amazonaws.com
      • Internal ELB: It will have private IP addresses, to which the DNS resolves ELB DNS name.
      • Classic Load Balancer: Classic Load Balancer provides basic and even load balancing across multiple EC2 instances, and operates at both the request level and connection level.
        • Choose a Classic Load Balancer when you have an existing application running in the EC2-Classic network or when all instances contain the same data.
        • It listens for client connection on protocols HTTP, HTTPS (Layer 7), TCP, SSL (Layer 4).
        • Up to 100 listeners can be configured with CLB.
        • 1:1 static mapping between front-end and back-end listeners.
        • ELB forwards the traffic to eth0 on primary IP address.
        • Traffic from your clients can be routed from any load balancer port to any port (1-65535) on your EC2 instances.
        • Enable Cross-Zone: Enabled by default; the ELB distributes traffic evenly across all targets in the enabled Availability Zones.
        • Enable Connection Draining: When enabled, the ELB stops sending new requests to a backend EC2 instance that is being de-registered, without terminating the in-flight sessions to that instance, while the ELB takes the unhealthy instance out of service.
        • HealthCheck timeout must be less than Interval.
        • For an ELB to serve the traffic for a web server in a private subnet, they must be in the same availability zone as a public subnet that’s going to be associated with ELB.
        • The load balancer will automatically perform health checks on the EC2 instances and only route traffic to instances that pass the health check. Instances that fail the health check are automatically removed from the load balancer.
      • Application Load Balancer: Its designed for complex load balancing of traffic to multiple EC2 instances using Content-based rules. Choose an Application Load Balancer when you need a flexible feature set for your web applications with HTTP and HTTPS traffic.
        • Operating at the request level, Application Load Balancers provide advanced routing, TLS termination and visibility features targeted at application architectures, including microservices and containers.
        • It functions at application layer (layer 7). It supports HTTP, HTTPS, HTTP/2 and WebSockets.
        • Cross zone load balancing is enabled by default.
        • It supports load balancer generated cookies only and its name is AWSALB.
        • Listeners: It checks for connection requests from client using the configured protocol and port, and forwards the requests to one or more target groups based on the defined rules.
          • Each ALB requires at least one listener and it can support up to 50 listeners.
          • Each listener has a default rule that can’t be deleted.
        • Target Group: Its a logical grouping of targets in a single region.
          • Each target group can be associated with only one load balancer.
          • You can define one protocol and one port per target group.
          • Load balancer routes the requests to the targets in the target group using the protocol and port that you specify and performs the health checks on the targets.
        • Targets: Targets can be EC2 instances, a Microservice/app on an ECS containers or IP addresses (Not public IPs).
          • You can’t mix target of different types in one target group i.e EC2 with ECS or IPs in the same target group.
          • You can register an EC2 instance in a target group multiple times using multiple ports.
          • You can register a target with multiple target groups.
          • If no AZ contains a healthy target, then load balancers nodes route requests to all targets.
        • Rules: They consists of conditions and actions, and provides a link between listeners and target group.
          • Rules are defined on listeners and there can be up to 100 rules in a ALB.
          • Each rule specifies a (optional) condition, target group and a priority. When a condition is met, the traffic is forwarded to the target group.
        • Content-Based Routing: An ALB can route a request to a service based on the content of the request (a boto3 rule sketch appears at the end of these ALB notes).
          • Host-based (Domain) Routing: Route traffic based upon host field of HTTP header. You can create ALB rules to route a client request based on the domain name i.e Host (domain and port) field of the HTTP header allowing you to route to multiple domain from the same load balancer.
          • Path-based Routing: Route traffic based upon URL path of HTTP request. It routes incoming HTTP and HTTPS traffic based on the path element of the URL in the request. e.g route requests /images to one and /videos to another target group.
        • It allows containers to use dynamic host port mapping (by using instance ID and port) that support multiple tasks (using the same port) from a single service on the same container instance.
        • You can use Request Tracing (via X-Amzn-Trace-Id header) to track HTTP requests from client to targets or other services.
        • Access log is optional feature that’s disabled by default, by enabling it you can log client’s IP, latency, request path, server response, which will be stored in S3 bucket.
        • It supports for monitoring of health of each service independently, as health checks are defined at the target group level not instance level.
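        • A minimal boto3 sketch of the path-based routing described above (the ARNs are placeholder assumptions; the listener and target groups are created beforehand):

            import boto3

            elbv2 = boto3.client('elbv2')

            # Send /images/* to the images target group; everything else follows
            # the listener's default rule.
            elbv2.create_rule(
                ListenerArn='arn:aws:elasticloadbalancing:region:111122223333:listener/app/my-alb/abc/def',  # placeholder
                Priority=10,
                Conditions=[{'Field': 'path-pattern', 'Values': ['/images/*']}],
                Actions=[{
                    'Type': 'forward',
                    'TargetGroupArn': 'arn:aws:elasticloadbalancing:region:111122223333:targetgroup/images-tg/ghi',  # placeholder
                }],
            )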
      • Network Load Balancer: Network Load Balancer is best suited for load balancing of TCP traffic where extreme performance is required.
        • Choose a Network Load Balancer when you need ultra-high performance and static IP addresses for your application.
        • Operating at the connection level, Network Load Balancers are capable of handling millions of requests per second while maintaining ultra-low latencies (a short sketch of creating an NLB with static IPs follows below).
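        • A minimal boto3 sketch of creating an NLB with static IP addresses by mapping pre-allocated Elastic IPs to each subnet (all IDs below are hypothetical placeholders):

```python
# Hypothetical sketch: Network Load Balancer with one Elastic IP per subnet.
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

nlb = elbv2.create_load_balancer(
    Name="my-nlb",                               # hypothetical name
    Type="network",
    Scheme="internet-facing",
    SubnetMappings=[                             # static IPs come from the EIP allocations
        {"SubnetId": "subnet-0aaa11112222bbbb3", "AllocationId": "eipalloc-0aaa11112222bbbb3"},
        {"SubnetId": "subnet-0ccc44445555dddd6", "AllocationId": "eipalloc-0ccc44445555dddd6"},
    ],
)["LoadBalancers"][0]

print(nlb["DNSName"])  # this DNS name resolves to the fixed Elastic IPs above
```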
  2. Lambda (Serverless computing): It's a serverless compute platform, which means you can run code without provisioning or managing servers. You pay only for the compute time you consume; there's no charge when your code isn't running.

    • You can just create a Lambda function, drop your code in it and execute it, and it scales out (not up) automatically.
    • For each request, a new Lambda function instance is invoked (i.e. 1 event = 1 function invocation).
    • Each Lambda function has an IAM role (execution role) associated with it. You specify the IAM role when you create the Lambda function. The permissions you grant to this role determine what the Lambda function can do when it assumes this role, such as reading an object from an S3 bucket, writing logs to CloudWatch Logs, or polling a Kinesis Data Stream or DynamoDB stream.
    • Lambda Resource limits per invocation:
      • Memory allocation range: 128 MB-3008 MB (with 64 MB increments)
      • Ephemeral disk capacity (/tmp space): 512 MB
      • Max execution duration per request: 300 secs
      • Number of file descriptors: 1024
      • Number of processors and threads (combined): 1024
      • Invoke request body payload (RequestResponse/Synchronous): 6 MB
      • Invoke request body payload (Event/Asynchronous): 128 KB
    • You can set up your code to trigger automatically in response to events in other AWS services, e.g:
      • Changes in S3 bucket
      • Updates in DynamoDB table
      • Call it directly from any web or mobile app
      • Custom events generated by your applications or devices
    • For a Lambda function to access resources inside your private VPC, you need to provide the VPC subnet IDs and security group IDs.
    • AWS X-Ray allows you to detect, analyze and optimize performance issues with your Lambda functions.
    • Languages supported by Lambda:
      • Java, C#, Python, Node.js, Go
    • Lambda event triggers: S3, Kinesis Streams/Firehose, DynamoDB,  SNS, API Gateway, CloudFront, CloudFormation, CloudWatch logs/events, Alexa, AWS IoT, AWS SDKs.
    • Lambda function configs:
      • Memory allocated to the Lambda function, i.e. 128 MB to 3008 MB (3 GB).
      • Maximum execution timeout for the Lambda function, i.e. 3 to 300 seconds.
      • IAM role (execution role).
      • Handler name that refers to the method in your code where the Lambda function begins execution (a minimal handler sketch follows at the end of this Lambda section).
    • Components of Lambda:
      • Lambda Function: It's your custom code and any dependent libraries.
      • Event Source: An AWS service (e.g SNS) or a custom service that triggers your Lambda function. Event source mapping is maintained by Lambda for DynamoDB and Kinesis.
      • Downstream Resources: An AWS service (e.g DynamoDB tables or S3 buckets) that your Lambda function calls once it's triggered.
      • Log Streams: You can annotate your Lambda function code with custom logging statements that allow you to analyze the execution flow and performance of your Lambda function.
      • AWS Serverless Application Model (AWS SAM): AWS SAM is natively supported by CloudFormation and defines simplified syntax for expressing serverless resources.
    • How are you charged for Lambda:
      • Requests (to execute the code)
        • First 1 million requests are free, $0.20 per 1 million requests thereafter.
      • Duration (the length of time it takes your code to execute, metered in 100 ms increments); the maximum is 5 minutes.
      • Data accessed from other AWS services/resources.
    • AWS Serverless services: Lambda, S3, DynamoDB, API Gateway
    • AWS X-Ray can be used to debug Lambda functions.
    • To help you troubleshoot failures in a Lambda function, Lambda logs all requests handled by your function and automatically stores logs generated by your code in CloudWatch Logs.
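    • A minimal sketch of a Python handler for an S3 event trigger (the file/handler name and the processing shown are illustrative assumptions, not prescribed by this document). The Handler config for this file would be "handler.lambda_handler":

```python
# handler.py - hypothetical minimal Lambda handler for an S3 event trigger.
import json


def lambda_handler(event, context):
    # An S3 notification event delivers one or more records describing the change.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # print() output is captured automatically in CloudWatch Logs,
        # provided the execution role allows writing log events.
        print(f"Object {key} changed in bucket {bucket}")
    return {"statusCode": 200, "body": json.dumps("processed")}
```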
  3. Elastic Container Service (ECS / Container Service): Amazon ECS makes it easy to deploy, manage, and scale Docker containers running applications, services, and batch processes. ECS places containers across your cluster based on your resource needs and is integrated with familiar features like Elastic Load Balancing, EC2 security groups, EBS volumes and IAM roles.
    • It's a Docker-compatible container service that allows for easy and fast container deployment onto fleets of EC2 instances.
    • Use cases:
      • Create distributed applications and microservices
      • Batch and ETL (Extract, Transform and Load) jobs
      • Continuous Integration and Deployment
    • Amazon ECS is a regional service that’s highly available in multiple AZs within a region.
    • An ECS service can only specify a single load balancer or target group.
    • Dynamic port mapping allows you to have multiple tasks from a single service on the same container instance.
    • With IAM roles for ECS tasks, you can specify an IAM role to be used by the containers in a task.
    • Docker Image: It's a read-only template with instructions for creating a Docker container. It's created from a Dockerfile and stored in a Docker registry or EC2 Container Registry (ECR). It's essentially a snapshot of a container.
      • A Dockerfile is a plain-text file (script) that specifies all of the components that are included in the container.
    • Container: A method of operating-system virtualization that allows you to run an app and its dependencies in resource-isolated processes. It's built from a Dockerfile and contains all the downloaded software, code, runtime, system tools and libraries.
    • Layers / Union File System: A Docker image is built up from read-only layers; a union file system stacks these layers and adds a thin writable layer on top when a container runs.
    • Docker Daemon/Engine: The background service on the host that builds images and runs and manages containers.
    • Docker Client: The command-line tool (docker) that sends commands to the Docker daemon.
    • Container Registry/Hub: It's a repository where container/Docker images are stored and retrieved when needed. It can be EC2 Container Registry (ECR), Docker Hub or a self-hosted registry.
    • ECS Task Definition: It's required to run Docker containers (up to 10 per task definition) in ECS. It's a JSON-formatted text file (like CloudFormation templates) that contains the blueprint for your application (a boto3 sketch appears at the end of this ECS section), including:
      • Which Docker image to use
      • The repository where the image is located
      • Which ports should be open on the container instance
      • What data volumes should be used with the containers.
    • ECS Task: An ECS task is the actual running instantiation of a task definition on a container instance within your cluster. The ECS agent starts/stops these tasks based on instructions or a schedule.
      • It supports the Service Scheduler and custom schedulers.
    • Fargate Launch Type: It allows you to run your containerized apps without the need to provision and manage the back-end infrastructure. Just register your task definition and Fargate launches the containers for you (see the run_task sketch at the end of this section).
      • It supports using container images hosted in ECR or publicly on Docker Hub.
      • By launching your services or tasks with the Fargate launch type, you host your cluster on serverless infrastructure that's managed by ECS.
      • It gives you the least amount of administrative overhead when launching containers.
    • EC2 Launch Type: It allows you to run your containerized apps on a cluster of EC2 instances that you manage.
      • Provides more control.
      • You need to manage the EC2 fleet (cluster).
      • It supports private image repositories.
      • ECS Agent: It runs on each EC2 instance in the ECS cluster and communicates information about the running tasks and resource utilization on the EC2 instance to ECS. It's also responsible for starting/stopping tasks.
      • ECS Container Instances: An ECS container instance is an EC2 instance that's running the ECS container agent and has been registered into a cluster. You must create and assign the required IAM policy and role before launching container instances.
      • IAM Roles for Tasks: You can specify an IAM role that can be used by the containers in a task.
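    • A minimal boto3 sketch of registering a task definition and running it with the Fargate launch type (all names, ARNs, image URIs and network IDs are hypothetical placeholders):

```python
# Hypothetical sketch: task definition blueprint + Fargate launch.
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# The task definition is the JSON blueprint: image, ports, volumes, resources.
task_def = ecs.register_task_definition(
    family="web-task",                              # hypothetical family name
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",                           # required for Fargate tasks
    cpu="256",
    memory="512",
    executionRoleArn="arn:aws:iam::111122223333:role/ecsTaskExecutionRole",   # hypothetical
    containerDefinitions=[
        {
            "name": "web",
            "image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/web:latest",  # hypothetical ECR image
            "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
            "essential": True,
        }
    ],
)["taskDefinition"]

# Launch the task on serverless (Fargate) infrastructure - no EC2 fleet to manage.
ecs.run_task(
    cluster="my-cluster",                           # hypothetical cluster name
    launchType="FARGATE",
    taskDefinition=task_def["taskDefinitionArn"],
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],        # hypothetical subnet
            "securityGroups": ["sg-0123456789abcdef0"],     # hypothetical security group
            "assignPublicIp": "ENABLED",
        }
    },
)
```
    • With the EC2 launch type (bridge network mode), omitting hostPort or setting it to 0 in portMappings gives the dynamic host port mapping mentioned above.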