AWS CSA Whitepapers

(I) Overview of AWS:

  • Cloud Computing: It's the on-demand delivery of IT resources and applications via the Internet with pay-as-you-go pricing. Cloud computing provides a simple way to access servers, storage, databases, and a broad set of application services over the Internet. Cloud computing providers such as AWS own and maintain the network-connected hardware required for these application services, while you provision and use what you need via a web application.
    • Types of Cloud Computing:
      • Infrastructure as a Service (IaaS): Infrastructure as a Service, sometimes abbreviated as IaaS, contains the basic building blocks for cloud IT and typically provides access to computers (virtual or on dedicated hardware), data storage space and networking features. e.g. Amazon EC2, Windows Azure, Google Compute Engine, Rackspace.
      • Platform as a Service (PaaS): Platforms as a service remove the need for organizations to manage the underlying infrastructure (usually hardware and operating systems) and allow you to focus on the deployment and management of your applications. e.g AWS RDS, Elastic Beanstalk, Windows Azure, Google App Engine
      • Software as a Service (SaaS): Software as a Service provides you with a completed product that is run and managed by the service provider. In most cases, people referring to Software as a Service are referring to end-user applications. e.g Gmail, Microsoft Office 365.
    • Cloud Deployment Models:
      • Cloud: A cloud-based application is fully deployed in the cloud and all parts of the application run in the cloud.
      • Hybrid: A hybrid deployment is a way to connect infrastructure and applications between cloud-based resources and existing on-premises resources.
      • On-premises (private cloud): Deploying resources on-premises, using virtualization and resource management tools, is sometimes called “private cloud”.
  • Advantages:
    • Trade Capital Expenses for variable expenses.
    • Benefit from massive economies of scale.
    • Stop guessing about capacity.
    • Increase speed and agility.
    • Stop spending money running and maintaining data centers.
    • Go global in minutes.
  • Security and Compliance:
    • State of the art electronic surveillance and multi factor access control systems.
    • Staffed 24/7 by security guards
    • Access is authorized on a “least privilege basis”
    • SOC 1/SSAE 16/ISAE 3402 (formerly SAS 70 Type II), SOC 2, SOC3
    • FISMA, DIACAP, FedRAMP, PCI DSS Level 1, ISO 27001, ISO 9001, ITAR, FIPS 140-2
    • HIPAA, Cloud Security Alliance (CSA), Motion Picture Association of America (MPAA)

(II) Overview of Security Process:

  • AWS offers the Shared Security Responsibility Model, i.e. Amazon is responsible for securing the underlying infrastructure that supports the cloud, and you are responsible for anything you put on the cloud or connect to the cloud.
  • AWS Security Responsibilities:
    • Amazon Web Services is responsible for protecting the Compute, Storage, Database, Networking and Data Center facilities (i.e. Regions, Availability Zones, Edge Locations) that run all of the services in the AWS cloud.
    • AWS is responsible for security configuration of its managed services such as Amazon DynamoDB, RDS, Redshift, Elastic MapReduce, WorkSpaces.
  • Customer Security Responsibilities:
    • Customer is responsible for Customer Data, Platform, Applications, IAM, Operating System, Network & Firewall Configuration, Client and Server Side Data Encryption, Network Traffic Protection.
    • IaaS services such as Amazon VPC, EC2, and S3 are completely under your control and require you to perform all of the necessary security configuration and management tasks.
    • Managed Services: AWS is responsible for patching, antivirus, etc.; however, you are responsible for account management and user access. It's recommended that MFA be implemented, that you connect to these services using SSL/TLS, and that API and user activity be logged using CloudTrail.
  • Storage Decommissioning:
    • When a storage device has reached the end of its useful life, AWS procedures include a decommissioning process that is designed to prevent customer data from being exposed to unauthorized individuals.
    • AWS uses the techniques detailed in DoD 5220.22-M (National Industrial Security Program Operational Manual) or NIST 800-88 (Guidelines for Media Sanitization) to destroy data as part of the decommissioning process.
    • All decommissioned magnetic storage devices are degaussed and physically destroyed in accordance with industry standard practices.
  • Network Security:
    • Transmission Protection: You can connect to AWS services using HTTP and HTTPS. AWS also offers Amazon VPC, which provides a private subnet within the AWS cloud, and the ability to use an IPsec VPN connection between Amazon VPC and your on-premises data center.
    • Amazon Corporate Segregation: Logically, the AWS Production network is segregated from the Amazon Corporate network by means of a complex set of network security segregation devices.
  • Network Monitoring and Protection:
    • It protects from:
      • DDoS
      • Man in the middle attacks (MITM)
      • Port Scanning
      • Packet Sniffing by other tenants
      • IP Spoofing: AWS-controlled, host-based firewall infrastructure will not permit an instance to send traffic with a source IP or MAC address other than its own.
        • Unauthorized port scans by Amazon EC2 customers are a violation of the AWS Acceptable Use Policy. You may request permission to conduct vulnerability scans as required to meet your specific compliance requirements.
        • These scans must be limited to your own instances and must not violate the AWS Acceptable Use Policy, and you must request the vulnerability scan in advance.
  • AWS Credentials:
    • Passwords: Used for AWS root account or IAM user account login to the AWS Management console. AWS passwords must be 6-128 chars.
    • Multi-Factor Authentication (MFA): It's a six-digit single-use code that's required in addition to your password to log in to your AWS root account or IAM user account.
    • Access Keys: Digitally signed requests to AWS APIs (using the AWS SDK, CLI or REST/Query APIs). Include an Access key ID and a Secret Access Key. You use access keys to digitally sign programmatic requests that you make to AWS.
    • Key Pairs: Used for SSH login to EC2 instances and for CloudFront signed URLs. A key pair is required to connect to an EC2 instance launched from a public AMI. They are 1024-bit SSH-2 RSA keys. You can have AWS automatically generate a key pair when you launch an EC2 instance, or you can upload your own before launching the instance.
    • X.509 Certificates: Used for digitally signed SOAP requests to AWS APIs (for S3) and as the SSL server certificate for HTTPS. You can have AWS create an X.509 certificate and private key, or you can upload your own certificate using the Security Credentials page.
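    • As a minimal CLI sketch of working with these credentials (the user name and key file path below are placeholders), you can create an access key for programmatic access and import your own SSH public key as an EC2 key pair:
      • aws iam create-access-key --user-name myuser    (returns the Access Key ID and Secret Access Key; the secret is shown only once)
      • aws ec2 import-key-pair --key-name my-key --public-key-material fileb://~/.ssh/id_rsa.pub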
  • AWS Trusted Advisor:
    • Trusted Advisor inspects your AWS environment and makes recommendations when opportunities may exist to save money, improve system performance or close security gaps.
    • It provides alerts on several of the most common security misconfigurations that can occur, including:
      • Leaving certain ports open that make you vulnerable to hacking and unauthorized access
      • Neglecting to create IAM accounts for your internal users
      • Allowing public access to Amazon S3 buckets
      • Not turning on user activity logging (AWS CloudTrail)
      • Not using MFA on your root AWS account
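    • If your account has a Business or Enterprise support plan, the same checks can be pulled programmatically via the Support API (a hedged sketch; the exact output fields may vary):
      • aws support describe-trusted-advisor-checks --language en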
  • Instance Isolation:
    • Different instances running on the same physical machines are isolated from each other via the Xen hypervisor. In addition, the AWS firewall resides within the hypervisor layer, between the physical network interface and the instance’s virtual interface.
    • All packets must pass through this layer, thus an instance's neighbors have no more access to that instance than any other host on the Internet and can be treated as if they are on separate physical hosts. The physical RAM is separated using a similar mechanism.
    • Customer instances have no access to raw disk devices, but instead are presented with virtualized disks. The AWS proprietary disk virtualization layer automatically resets every block of storage used by the customers, so that one customer’s data is never unintentionally exposed to another.
    • In addition, memory allocated to guests is scrubbed (set to zero) by the hypervisor when it's deallocated from a guest. The memory is not returned to the pool of free memory available for new allocations until the memory scrubbing is complete.
  • Guest Operating System: Virtual instances are completely controlled by the customer. You have full root or administrative access over accounts, services and applications. AWS doesn’t have any access rights to your instances or the guest OS.
    • Encryption of sensitive data is generally a good security practice, and AWS provides the ability to encrypt EBS volumes and their snapshots with AES-256.
    • In order to do this efficiently and with low latency, the EBS encryption feature is only available on EC2's more powerful instance types (e.g. M3, C3, R3, G2).
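    • For example, an encrypted EBS volume can be created with a single CLI flag (a minimal sketch; the size, type, and Availability Zone are arbitrary placeholders):
      • aws ec2 create-volume --availability-zone us-east-1a --size 100 --volume-type gp2 --encrypted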
  • Firewall: Amazon EC2 provides a complete firewall solution; this mandatory inbound firewall is configured in a default deny-all mode, and Amazon EC2 customers must explicitly open the ports needed to allow inbound traffic. All ingress traffic is blocked and egress traffic is allowed by default.
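    • For instance, explicitly opening inbound HTTPS on a security group looks like this (a sketch; the group ID is a placeholder):
      • aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 443 --cidr 0.0.0.0/0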
  • Elastic Load Balancing: SSL termination on the load balancer is supported. It allows you to identify the originating IP address of a client connecting to your servers, whether you are using HTTPS or TCP load balancing.
  • Direct Connect: Bypass Internet service providers in your network path. You can procure rack space within the facility housing the AWS Direct Connect location and deploy your equipment nearby. Once deployed, you can connect this equipment to AWS Direct Connect using a cross-connect.
    • Using industry standard 802.1q VLANs, the dedicated connection can be partitioned into multiple virtual interfaces. This allows you to use the same connection to access public resources such as objects stored in Amazon S3 using public IP address space, and private resources such as EC2 instances running within a VPC using private IP space, while maintaining network separation between the public and private environment.

(III) AWS Risk and Compliance:

  • Risk: AWS management has developed a strategic business plan which includes risk identification and the implementation of controls to mitigate or manage risks. AWS management re-evaluates the strategic business plan at least biannually.
    • This process requires management to identify risks within its areas of responsibility and to implement appropriate measures designed to address those risks.
    • AWS Security regularly scans all Internet-facing service endpoint IP addresses for vulnerabilities (these scans don't include customer instances). AWS Security notifies the appropriate parties to remediate any identified vulnerabilities. In addition, external vulnerability threat assessments are performed regularly by independent security firms.
    • Findings and recommendations resulting from these assessments are categorized and delivered to AWS leadership. These scans are done in a manner that ensures the health and viability of the underlying AWS infrastructure and are not meant to replace the customer's own vulnerability scans required to meet their specific compliance requirements.
    • Customers can request permission to conduct scans of their cloud infrastructure as long as they are limited to the customer’s instances and don’t violate the AWS Acceptable Use Policy.

(IV) Storage Options in the AWS cloud:

(V) Architecting for the AWS Cloud: Best Practices:

  • Business Benefits of Cloud:
    • Almost zero upfront infrastructure investment
    • Just-in-time Infrastructure
    • More efficient resource utilization
    • Usage-based costing
    • Reduced time to market
  • Technical Benefits of Cloud:
    • Automation – Scriptable infrastructure
    • Auto-scaling
    • Proactive scaling
    • More Efficient Development lifecycle
    • Improved Testability
    • Disaster Recovery and Business Continuity
    • Overflow the traffic to the cloud
  • Design For Failure:
    • Rule of thumb: Be a pessimist when designing architectures in the cloud; assume things will fail. In other words, always design, implement and deploy for automated recovery from failure.
    • In particular, assume that your hardware or software will fail, outages will occur, some disaster will strike, requests will increase.
  • Decouple Your Components:
    • The key is to build components that don’t have tight dependencies on each other, so that if one component were to die, sleep or remain busy for some reason, the other components in the system are built so as to continue to work as if no failure is happening.
    • In essence, loose coupling isolates the various layers and components of your application so that each component interacts asynchronously with the others and treats them as black boxes.
  • Implement Elasticity:
    • The cloud brings a new concept of elasticity to your applications. Elasticity can be implemented in three ways, as sketched after this list:
      • Proactive Cyclic Scaling: Periodic scaling that occurs at fixed intervals (daily, weekly, monthly, quarterly)
      • Proactive Event-based Scaling: Scaling just when you are expecting a big surge of traffic requests due to a scheduled business event (new product launch, marketing campaigns, Black Friday sale)
      • Auto-scaling based on demand: By using a monitoring service, your system can send triggers to take appropriate actions so that it scales up or down based on metrics (utilization of servers or network I/O, for instance)
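    • As a sketch, the first and third patterns map directly onto Auto Scaling CLI calls (all names and values below are placeholders):
      • aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name my-asg --scheduled-action-name business-hours --recurrence "0 8 * * MON-FRI" --desired-capacity 10    (proactive cyclic scaling)
      • aws autoscaling put-scaling-policy --auto-scaling-group-name my-asg --policy-name cpu-50 --policy-type TargetTrackingScaling --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'    (auto-scaling based on demand)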

(VI) AWS Well-Architected Framework:

  • Well-Architected Framework is a set of questions that you can use to evaluate how well your architecture is aligned to AWS best practices and it consists of 5 pillars: Security, Reliability, Performance Efficiency, Cost Optimization and Operational Excellence.
  • General Design Principles:
    • Stop guessing your capacity needs.
    • Test systems at production scale.
    • Automate to make architectural experimentation easier.
    • Allow for evolutionary architectures.
    • Data-Driven architectures
    • Improve through game days (such as Black Friday)
  • WAF Security Pillar:
    • Design Principles:
      • Apply Security at all layers.
      • Enable traceability
      • Automate responses to security events.
      • Focus on securing your system
      • Automate security best practices
    • Definition: Security in the cloud consists of five areas:
      1. Identity and Access management: It ensures that only authorized and authenticated users are able to access your resources, and only in a manner that’s intended. It includes:
        • Protecting AWS Credentials: AWS Security Token Service(STS), IAM instance profiles for EC2 instances.
        • Fine-Grained Authorization: AWS Organizations.
        • How are you protecting access to and use of the AWS root account credentials?
        • How are you defining roles and responsibilities of system users to control human access to the AWS Management Console and APIs?
        • How are you limiting automated access (such as apps, scripts or third-party tools) to AWS resources?
        • How are you managing keys and credentials?
        • Key AWS Services: IAM, MFA
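        • For instance, a virtual MFA device can be created and attached to a user from the CLI (a hedged sketch; the user name, account ID, and authentication codes are placeholders):
          • aws iam create-virtual-mfa-device --virtual-mfa-device-name myuser --outfile qr.png --bootstrap-method QRCodePNG
          • aws iam enable-mfa-device --user-name myuser --serial-number arn:aws:iam::123456789012:mfa/myuser --authentication-code1 123456 --authentication-code2 789012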
      2. Detective controls: It can be used to detect or identify a security breach.
        • Capture and Analyze Logs: AWS Config, Elasticsearch Service, CloudWatch Logs,  EMR, S3 and Glacier, Athena
        • Integrate Auditing Controls with Notification and Workflow: Config Rules, CloudWatch API, AWS SDKs, AWS Inspector.
        • How are you capturing and analyzing AWS logs?
        • Key AWS Services: AWS CloudTrail, CloudWatch, AWS Config, S3, Glacier
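        • As an illustration, turning on account-wide API auditing is two CLI calls (a sketch; the trail and bucket names are placeholders, and the bucket must carry a CloudTrail bucket policy):
          • aws cloudtrail create-trail --name my-trail --s3-bucket-name my-log-bucket --is-multi-region-trail
          • aws cloudtrail start-logging --name my-trail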
      3. Infrastructure protection: Outside of the cloud, this is how you protect your data center: RFID controls, security guards, lockable cabinets, CCTV, etc. Within the AWS cloud, all of this is handled by Amazon, so as a customer your infrastructure protection exists at the VPC level only.
        • Protecting Network and Host-Level Boundaries:
        • System Security Configuration and Maintenance:
        • Enforcing Service-Level Protection:
        • How are you enforcing network and host-level boundary protection?
        • How are you enforcing AWS service level protection?
        • How are you protecting the integrity of the operating systems on your EC2 instances?
        • Key AWS Services: VPC
      4. Data Protection:
        • Data Classification:
        • Encryption/Tokenization:
        • Protecting Data at Rest:
        • Protecting Data in Transit:
        • Data Backup/Replication/Recovery:
        • Basic data classification should be in place.
        • Implement a least privilege access system.
        • Encrypt everything where possible i.e at rest or in transit.
        • AWS customers maintain full control over their data.
        • AWS makes it easier for you to encrypt your data and manage keys.
        • Detailed logging is available that contains important content, such as file access and changes.
        • AWS has designed storage systems for exceptional resiliency.
        • Versioning, which can be part of a larger data lifecycle-management process, can protect against accidental overwrites, deletes and similar harms.
        • AWS never initiates the movement of data between regions; customers have to explicitly enable a feature or leverage a service that provides that functionality.
        • How are you encrypting and protecting your data at rest and in transit?
        • Key AWS Services: ELB, EBS, S3 & RDS.
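        • For example, default encryption at rest can be enforced on an S3 bucket from the CLI (a sketch; the bucket name is a placeholder):
          • aws s3api put-bucket-encryption --bucket my-bucket --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms"}}]}'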
      5. Incident response: Your organization should implement a plan to respond to and mitigate security incidents.
        • Clean Room: By using tags to properly describe your AWS resources, incident responders can quickly determine the potential impact of an incident.
        • Key AWS Services: IAM, AWS CloudFormation, EC2 APIs, AWS Step Functions.
  • WAF Reliability Pillar: It covers the ability of a system to recover from service or infrastructure outages/disruptions as well as the ability to dynamically acquire computing resources to meet demand.
    • Design Principles:
      • Test Recovery Procedures.
      • Automatically recover from failure
      • Scale horizontally to increase aggregate system availability
      • Stop guessing capacity
    • Definition:
      • Foundation – Networking:
      • Application Design for High Availability:
        • Understanding Availability Needs
        • Application Design for Availability
        • Operational Considerations for Availability
        • Key AWS Services: AWS CloudTrail
      • Example Implementations for Availability Goals:
        • Dependency Selection
        • Single Region Scenarios
        • Multi-Region Scenarios
        • Key AWS Services: AWS CloudFormation
  • WAF Performance Efficiency Pillar: It focuses on how to use computing resources efficiently to meet your requirements and how to maintain that efficiency as demand changes and technology evolves.
    • Design Principles:
      • Democratize advanced technologies
      • Go global in minutes
      • Use serverless architectures
      • Experiment more often
    • Definitions:
      • Selection:
        • Compute: Autoscaling
        • Storage: EBS, S3, Glacier
        • Database: RDS, DynamoDB, Redshift
        • Network:
      • Review:
        • Benchmarking
        • Load Testing
      • Monitoring:
        • Active and Passive
        • Phases
      • Trade-Offs:
        • Caching: ElastiCache, CloudFront, DirectConnect, RDS Read Replicas
        • Partitioning or Sharding
        • Compression
        • Buffering
  • WAF Cost Optimization Pillar: You should use cost optimization pillar to reduce your costs to a minimum and use those savings for other parts of your business. A cost-optimized system allows you to pay the lowest price possible while still achieving your business objectives.
    • Design Principles:
      • Transparently attribute expenditure
      • Use managed services to reduce cost of ownership
      • Trade capital expense for operating expense
      • Benefit from economies of scale
      • Stop spending money on data center operations
    • Definitions:
      • Cost-Effective Resources: EC2 (reserved instances), AWS Trusted Advisor
        • Appropriately Provisioned
        • Right Sizing
        • Purchasing Options
        • Geographic Selection
        • Managed Services
      • Matching Supply and Demand: Autoscaling
        • Demand-Based
        • Buffer-Based
        • Time-Based
      • Expenditure Awareness: CloudWatch Alarms, SNS
        • Stakeholders
        • Visibility and Controls
        • Cost Attribution
        • Tagging
        • Entity Lifecycle Tracking
      • Optimizing Over Time: AWS Blog, AWS Trusted Advisor
        • Measure, Monitor, and Improve
        • Staying Ever Green
  • WAF Operational Excellence Pillar: It includes operational practices and procedures used to manage production workloads. In addition, it covers how planned changes are executed, as well as responses to unexpected operational events.
    • Design Principles:
      • Perform operations with code
      • Align operations processes to business objectives
      • Make regular, small, incremental changes
      • Test for responses to unexpected events
      • Learn from operational events and failures
      • Keep operations procedures current
    • Definitions:
      • Prepare
        • Operational Priorities
        • Design for Operations
        • Operational Readiness
      • Operate
        • Understanding Operational Health
        • Responding to Events
      • Evolve
        • Learning from Experience
        • Share Learnings

Amazon Miscellaneous Services

  1. Amazon Game Development:
    1. Amazon GameLift
  2. Amazon Internet of Things:
    1. AWS IoT
    2. IoT Analytics
    3. IoT Device Management
    4. Amazon FreeRTOS
    5. AWS Greengrass
  3. Amazon Desktop & App Streaming
    1. WorkSpaces: Amazon WorkSpaces is a fully managed, secure Desktop-as-a-Service (DaaS) solution which runs on AWS. With Amazon WorkSpaces, you can easily provision virtual, cloud-based Microsoft Windows 7 Experience (provided by Windows Server 2008 R2) desktops for your users, providing them access to the documents, applications, and resources they need, anywhere, anytime, from any supported device.
      • With Amazon WorkSpaces, you pay either monthly or hourly just for the Amazon WorkSpaces you launch, which helps you save money when compared to traditional desktops and on-premises Virtual Desktop Infrastructure (VDI) solutions.
      • WorkSpaces are persistent and you are given local administrator access by default. All data on the D:\ drive is backed up every 12 hrs. You don't need an AWS account to log in to WorkSpaces.
    2. AppStream 2.0
  4. Amazon Business Productivity
    1. Alexa for Business
    2. Amazon Chime
    3. WorkDocs
    4. WorkMail
  5. Amazon Customer Engagement
    1. Amazon Connect
    2. Simple Email Service
    3. Pinpoint
  6. Amazon AR & VR
    1. Amazon Sumerian (Augmented Reality and Virtual Reality)
  7. Amazon Mobile Service
    1. Mobile Hub
    2. AWS AppSync
    3. Device Farm
    4. Mobile Analytics
  8. Amazon Machine Learning
    1. Amazon SageMaker
    2. Amazon Comprehend
    3. AWS DeepLens
    4. Amazon Lex
    5. Machine Learning
    6. Amazon Polly
    7. Rekognition
    8. Amazon Transcribe
    9. Amazon Translate
  9. Amazon Media Services
    1. Elastic Transcoder: It's a media transcoder in the cloud that's used to convert media files from their original source format into a different format that will play on smartphones, tablets and PCs.
      • You pay based on the minutes and resolution at which you transcode.
    2. Kinesis Video Streams:
    3. MediaConvert
    4. MediaLive
    5. MediaPackage
    6. MediaStore
    7. MediaTailor
  10. Amazon Developer Tools:
    1. CodeStar
    2. CodeCommit
    3. CodeBuild
    4. CodeDeploy
    5. CodePipeline
    6. Cloud9
    7. X-Ray
  11. Amazon Migration Services
    1. AWS Migration Hub
    2. Application Discovery Service
    3. Database Migration Service
    4. Server Migration Service
    5. Snowball

Amazon Analytics Services

  1. Kinesis: It allows you to easily collect, process, and analyze video and data streams in real time, so you can get timely insights and react quickly to new information. It's a real-time data processing service that continuously captures and stores large amounts of data that can power real-time streaming dashboards. Its components are as follows:
    • Kinesis is a managed streaming data service that provides a platform for streaming data on AWS, used for IoT and big data analytics. It offers powerful services to make it easy to load and analyze streaming data.
    • Stream Data: Streaming data is data that is generated continuously by thousands of data sources, which typically send in the data records simultaneously, and in small sizes (on the order of kilobytes).
    • Producers: Devices that collect data and input it into Kinesis. Producers (data sources) include IoT sensors, mobile devices, EC2 instances, eCommerce purchases, in-game player activities, social media networks, stock markets, telemetry.
    • Consumers: Generally EC2 instances consume the streaming data concurrently, and then it may be stored into real-time dashboards, S3 (storage), Redshift (big data), EMR (analytics), Lambda (event-driven actions).
    • Shards (processing power): A shard is the base throughput unit of a stream, and a stream is composed of one or more shards. Each shard can process 1 MB/s input (write) and 2 MB/s output (read).
    • Kinesis offers following managed services:
      • Data Streams: Used to collect and process large streams of data records in real time. Ingest and process streaming data with custom applications. The retention period is 24 hours by default, extendable up to 7 days. The data is consumed by EC2 instances and stored into S3, DynamoDB, Redshift and Elastic MapReduce.
        • Custom data-processing apps, known as Kinesis Streams applications, read data from a Kinesis stream as data records.
        • These apps use Kinesis Client Library and run on EC2 instances.
        • It replicates data synchronously across three availability zones, providing high availability and durability.
      • Data Firehose: It's a fully managed service used for automatically capturing real-time data streams from producers (sources) and delivering (saving) them to destinations such as S3, Redshift, Elasticsearch Service, Splunk.
        • Kinesis streams can be used as the source to Kinesis Firehose.
        • With Kinesis Firehose you don't need to write applications or manage resources.
        • It synchronously replicates data across three facilities in a Region.
        • Each delivery stream stores data records for up to 24 hrs in case delivery destination is unavailable.
        • You can use server-side encryption if using Kinesis stream as your data source.
      • Data Analytics: Used to process and analyze streaming data in real time from Kinesis Streams and Firehose using SQL queries. The results can be stored in S3, Redshift, or an Elasticsearch cluster.
      • Video Streams: Capture, process, and store video streams for analytics and machine learning.
    • Benefits of Kinesis include:
      • Real-time and parallel processing, fully managed and scalable
    • Applications of Kinesis include:
      • Gaming, real-time analytics, application alerts, log/event data collection, mobile data capture.
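    • A minimal stream can be exercised from the CLI (a sketch; the stream name, partition key, and record payload are placeholders; newer CLI versions may additionally require --cli-binary-format raw-in-base64-out for the data blob):
      • aws kinesis create-stream --stream-name my-stream --shard-count 2
      • aws kinesis describe-stream --stream-name my-stream    (wait for status ACTIVE)
      • aws kinesis put-record --stream-name my-stream --partition-key sensor-42 --data "temp=21.5"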
  2. Elastic MapReduce (EMR): It's a web service (managed Hadoop framework) that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data.
    • It's a service which deploys EC2 instances based on the Hadoop big data framework to analyze and process vast amounts of data.
    • It also supports other distributed frameworks such as: Apache Spark, HBase, Presto, Flink
    • EMR Workflow is divided into following four steps:
      1. Storage: Data stored in S3, DynamoDB, or Redshift is sent to EMR.
      2. Mapped: Then data is mapped to a Hadoop cluster of Master/Slave nodes for processing.
        • The mapping phase defines the process which splits the large data file for processing. The data is split into 128 MB chunks.
      3. Computations (coded by developers): are used to process the data.
      4. Reduced: The processed data is then reduced to a single output set of return information.
        • The reduce phase aggregates the split data back into one data source. Reduced data needs to be stored (e.g. in S3) because data processed by the EMR cluster is not persistent.
    • Master Node: This is a single node that coordinates the distribution of data and tasks among the other (slave) nodes for processing. It also tracks the status of the tasks and monitors the health of the cluster.
    • Slave Nodes: There are two types of slave nodes:
      • Core node: Core nodes run tasks and store data in the Hadoop Distributed File System (HDFS) on the cluster.
      • Task node: Task nodes are optional and only run tasks.
    • You have the ability to access the underlying operating system of the EC2 instances and can add user data to EC2 instances launched into the cluster via bootstrapping. You can also resize a running cluster at any time, and you can deploy multiple clusters. EMR takes advantage of parallel processing for faster processing of data.
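    • As a sketch, a small cluster can be launched from the CLI (the name, release label, instance type, and count are placeholder choices; --use-default-roles assumes the default EMR roles already exist in the account):
      • aws emr create-cluster --name my-cluster --release-label emr-5.20.0 --applications Name=Hadoop Name=Spark --instance-type m4.large --instance-count 3 --use-default-roles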
  3. Athena
  4. CloudSearch
  5. Elasticsearch Service
  6. QuickSight
  7. Data Pipeline
  8. AWS Glue

Amazon Management Tools

  1. CloudWatch: It's AWS's proprietary, integrated performance monitoring service. It allows for comprehensive and granular monitoring of all AWS provisioned resources, with the added ability to trigger alarms/events based off metric thresholds. It monitors operational and performance metrics for your AWS cloud resources and applications.
    • It’s used to monitor AWS services such as EC2, EBS, ELB and S3.
    • You monitor your environment by configuring and viewing CloudWatch metrics.
    • Alarms can be created to trigger alerts based on thresholds you set on metrics.
    • Auto Scaling heavily utilizes CloudWatch, relying on thresholds and alarms to trigger the addition or removal of instances from an auto scaling group.
    • Metrics are specific to each AWS service or resource, and include such metrics:
      • EC2 per-instance metrics: CPUUtilization, CPUCreditUsage
      • S3 Metrics: NumberOfObjects, BucketSizeBytes
      • ELB Metrics: RequestCount, UnhealthyHostCount
    • Detailed vs Basic level monitoring:
      • Basic/Standard: Data is available automatically in 5-minute periods at no charge.
      • Detailed: Data is available in 1-minute periods.
    • CloudWatch EC2 monitoring:
      • System (Hypervisor) status checks: Things that are outside of our control
        • Loss of network connectivity or system power.
        • Hardware or Software issues on physical host
        • How to solve: Generally restarting the instance will fix the issue. This causes the instance to launch on a different physical hardware device.
      • Instance status checks: Software issues that we control.
        • Failed system status checks.
        • Misconfigured networking or startup configuration
        • Exhausted memory, corrupted filesystem or incompatible kernel
        • How to solve: Generally a reboot or solving the file system configuration issues.
      • Default metrics: CloudWatch will automatically monitor metrics that can be viewed at the host level (not the software level), such as: CPU Utilization, DiskReadOps, NetworkIn/Out, StatusCheckFailed_Instance/System.
      • Custom metrics: OS-level metrics that require a third-party script (Perl, provided by AWS) to be installed:
        • Memory utilization, memory used and available.
        • Disk swap utilization
        • Disk space utilization, disk space used and available.
    • Alarms: Allows you to set alarms that notify you when particular thresholds are hit.
    • Events: Helps you to respond to state changes in your AWS resources.
    • Logs: Helps you to aggregate, monitor, and store logs.
    • VPC Flow Logs: Allows you to collect information about the IP traffic going to and from network interfaces in your VPC.
      • VPC Flow Log data is stored in a log group in CloudWatch and can be accessed from Logs in CloudWatch.
      • Flow logs can be created on:
        • VPC
        • Subnet (i.e include all network interfaces in it)
        • Network Interface and each interface have its own unique log stream.
      • The logs can be set on accepted, rejected or all traffic.
      • Flow logs are not captured in real time; data is captured in an approximately 10-minute window and then published.
      • They can be used to troubleshoot why certain traffic is not reaching an EC2 instance.
      • A VPC Flow Log record consists of the specific 5-tuple of network traffic:
        1. Source IP address
        2. Source port number
        3. Destination IP address
        4. Destination port number
        5. Protocol
      • Following traffic is not captured by VPC Flow logs:
        • Traffic between EC2 instances and Amazon DNS server
        • Traffic generated by requests for instance metadata (169.254.169.254)
        • DHCP traffic
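    • As a short sketch, both an alarm and a flow log can be created from the CLI (all IDs and ARNs below are placeholders; the flow-log role must allow delivery to CloudWatch Logs):
      • aws cloudwatch put-metric-alarm --alarm-name high-cpu --namespace AWS/EC2 --metric-name CPUUtilization --dimensions Name=InstanceId,Value=i-0123456789abcdef0 --statistic Average --period 300 --evaluation-periods 2 --threshold 80 --comparison-operator GreaterThanOrEqualToThreshold --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts
      • aws ec2 create-flow-logs --resource-type VPC --resource-ids vpc-0123456789abcdef0 --traffic-type ALL --log-group-name vpc-flow-logs --deliver-logs-permission-arn arn:aws:iam::123456789012:role/flow-logs-role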
  2. CloudFormation: AWS CloudFormation allows you to quickly and easily deploy your infrastructure resources and applications on AWS. It allows you to turn infrastructure into code. This provides numerous benefits including quick deployments, infrastructure version control, and disaster recovery solutions.
    • You can convert your architecture into a JSON-formatted template, and that template can be used to deploy out updated or replicated copies of that architecture into multiple regions.
    • It automates and saves time by deploying architecture in multiple regions.
    • It can be used to version control your infrastructure, allowing for rollbacks to previous versions of your infrastructure if a new version has issues.
    • It allows for backups of your infrastructure and is a great solution for disaster recovery.
    • Stack: A stack is a group of related resources that you manage as a single unit. You can use one of the templates we provide to get started quickly with applications like WordPress or Drupal, one of the many sample templates or create your own template.
    • StackSet: A StackSet is a container for AWS CloudFormation stacks that lets you provision stacks across AWS accounts and regions by using a single AWS CloudFormation template.
    • Template: Templates tell AWS CloudFormation which AWS resources to provision and how to provision them. When you create a CloudFormation stack, you must submit a template.
      • If you already have AWS resources running, the CloudFormer tool can create a template from your existing resources. This means you can capture and redeploy applications you already have running.
      • To build and view templates, you can use the drag-and-drop tool called AWS CloudFormation Designer. You drag-and-drop the resources that you want to add to your template and drag lines between resources to create connections.
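    • A minimal template and stack lifecycle looks like this (a sketch; the stack provisions only a placeholder S3 bucket):
      • printf '{"AWSTemplateFormatVersion":"2010-09-09","Resources":{"LogBucket":{"Type":"AWS::S3::Bucket"}}}' > template.json
      • aws cloudformation create-stack --stack-name demo-stack --template-body file://template.json
      • aws cloudformation describe-stacks --stack-name demo-stack    (shows stack status, e.g. CREATE_COMPLETE)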
  3. CloudTrail: It's an auditing service which logs all API calls made to AWS via the AWS CLI, an SDK, or the Console. It provides centralized logging so that we can log each action taken in our environment and store it for later use if needed.
    • With CloudTrail, you can view events for your AWS account. Create a trail to retain a record of these events. With a trail, you can also create event metrics, trigger alerts, and create event workflows.
    • The created logs are placed into a designated S3 bucket, so they are highly available by default.
    • CloudTrail logs help to address security concerns by allowing you to view what actions users on your AWS account have performed.
    • Since AWS is just one big API, CloudTrail can log every single action taken in your account.
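    • Recorded events can be queried back from the CLI, e.g. to see who terminated an instance (a sketch; the attribute value is a placeholder):
      • aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=TerminateInstances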
  4. Config: AWS Config is a service that enables you to assess, audit, and evaluate the configurations of your AWS resources. Config continuously monitors and records your AWS resource configurations and allows you to automate the evaluation of recorded configurations against desired configurations. With Config, you can review changes in configurations and relationships between AWS resources, dive into detailed resource configuration histories, and determine your overall compliance against the configurations specified in your internal guidelines. This enables you to simplify compliance auditing, security analysis, change management, and operational troubleshooting.
  5. OpsWorks: AWS OpsWorks is a configuration management service that provides managed instances of Chef and Puppet. Chef and Puppet are automation platforms that allow you to use code to automate the configurations of your servers. OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your Amazon EC2 instances or on-premises compute environments. OpsWorks has three offerings, AWS Opsworks for Chef Automate, AWS OpsWorks for Puppet Enterprise, and AWS OpsWorks Stacks.
  6. Service Catalog: AWS Service Catalog allows organizations to create and manage catalogs of IT services that are approved for use on AWS. These IT services can include everything from virtual machine images, servers, software, and databases to complete multi-tier application architectures. AWS Service Catalog allows you to centrally manage commonly deployed IT services, and helps you achieve consistent governance and meet your compliance requirements, while enabling users to quickly deploy only the approved IT services they need.
  7. Systems Manager: AWS Systems Manager gives you visibility and control of your infrastructure on AWS. Systems Manager provides a unified user interface so you can view operational data from multiple AWS services and allows you to automate operational tasks across your AWS resources. With Systems Manager, you can group resources, like Amazon EC2 instances, Amazon S3 buckets, or Amazon RDS instances, by application, view operational data for monitoring and troubleshooting, and take action on your groups of resources. Systems Manager simplifies resource and application management, shortens the time to detect and resolve operational problems, and makes it easy to operate and manage your infrastructure securely at scale.
  8. Trusted Advisor: An online resource to help you reduce cost, increase performance, and improve security by optimizing your AWS environment, Trusted Advisor provides real time guidance to help you provision your resources following AWS best practices.
  9. Managed Services: AWS Managed Services provides ongoing management of your AWS infrastructure so you can focus on your applications. By implementing best practices to maintain your infrastructure, AWS Managed Services helps to reduce your operational overhead and risk. AWS Managed Services automates common activities such as change requests, monitoring, patch management, security, and backup services, and provides full-lifecycle services to provision, run, and support your infrastructure. Our rigor and controls help to enforce your corporate and security infrastructure policies, and enable you to develop solutions and applications using your preferred development approach. AWS Managed Services improves agility, reduces cost, and unburdens you from infrastructure operations so you can direct resources toward differentiating your business.

Amazon Application Integration

  1. Simple Notification Service (SNS): It's a flexible, fully managed pub/sub messaging and mobile notifications service for coordinating the delivery of messages to subscribing endpoints and clients.
    • It's an integrated notification service that allows for sending messages to various endpoints. Generally these messages are used for alert notifications to sysadmins or to create automation. It's push-based, whereas SQS is a poll-based messaging service.
    • It's integrated into many AWS services, so it's very easy to set up notifications based on events that occur in those services.
    • With CloudWatch and SNS, a full-environment monitoring solution can be created that notifies administrators of alerts, capacity issues, downtime, changes in the environment and more.
    • This service can also be used for publishing iOS/Android app notifications and creating automation based off of notifications.
    • Topic: A group of subscriptions that you send messages to.
    • Subscription: An endpoint to which a message is sent. Available endpoints are: HTTP, HTTPS, Email, Email-JSON, SQS, Application, Lambda, SMS.
    • Publisher: The entity that triggers the sending of a message, such as a human, an S3 event, or a CloudWatch alarm.
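    • The topic/subscription/publish flow maps one-to-one onto the CLI (a sketch; the topic ARN, account ID, and email address are placeholders):
      • aws sns create-topic --name ops-alerts
      • aws sns subscribe --topic-arn arn:aws:sns:us-east-1:123456789012:ops-alerts --protocol email --notification-endpoint admin@example.com
      • aws sns publish --topic-arn arn:aws:sns:us-east-1:123456789012:ops-alerts --subject "Disk alert" --message "Disk usage above 90%"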
  2. Simple Queue Service (SQS): It's a reliable, scalable, fully managed message queuing service. It provides the ability to have hosted and highly available queues that can be used for messages being sent between servers.
    • It allows for highly available and distributed decoupled application architectures. This is accomplished through the use of messages and queues, and messages are retrieved by polling, so it's a pull-based system.
    • Each message can contain up to 256 KB of text in any format.
    • Messages can be kept in the queue from 1 min to 14 days; the default is 4 days.
    • Visibility Timeout is the duration of time (max is 12 hrs) a message is locked from reads by other consumers once it's already been read by a consumer, so that the message can't be read again by another consumer.
    • Message Delay is set if you want to configure individual message delay of up to 15 mins. It helps when need to schedule jobs with a delay.
    • In-Flight messages are the one which are received/read from the queue by a consumer app but not yet deleted from the queue.
    • SQS can be used with Redshift, DynamoDB, EC2, ECS, RDS, S3 and Lambda to make distributed decoupled applications.
    • You can use IAM policies to control who can read/write message from/to an SQS queue.
    • Server-side encryption (SSE) using KMS managed keys, lets you transmit sensitive data in encrypted queues.
    • Standard CloudWatch metrics (5 mins) for your SQS queues are automatically collected and pushed to CloudWatch, detailed monitoring (1 min) is not available currently.
    • CloudTrail can be used to collect information about SQS, such as which requests are made to SQS, the source IP, and the timestamp.
    • Polling Types:
      • Short Polling: It's the default for SQS. A request is returned immediately even if the queue is empty.
        • It doesn’t wait for messages to appear in the queue
        • It queries only a subset of available servers for messages
        • Increases API requests (over long polling) which increases costs.
        • ReceiveMessageWaitTime is set to 0.
      • Long Polling (1-20 secs): It's preferred over short polling because it uses fewer requests and reduces cost by eliminating false empty responses by querying all the servers.
        • It reduces the number of empty responses by allowing SQS to wait until a message is available before sending the response, or until the connection times out (1-20 secs).
        • Allows the SQS service to wait until a message is available in a queue before sending a response, and will return messages from all SQS servers.
        • Long polling reduces API requests (over using short polling).
        • You can enable long polling by setting Receive Message Wait Timeout value greater than 0 in AWS console.
        • Don’t use long polling if your app expects an immediate response to receive message calls.
    • Queue types:
      • Standard Queue:
        • Provides high (unlimited) throughput
        • Guarantees delivery of each message at least once
        • Duplicates are possible (can’t guarantee no duplication)
        • Best effort ordering
        • Supports a nearly unlimited number of transactions per second.
      • First in First Out (FIFO) Queue:
        • Limited throughput, i.e. 300 transactions per second.
        • Each message is processed exactly once
        • Guarantees that no duplicates are introduced into the queue
        • Strict ordering i.e First In First Out
    • SQS Workflow: Generally a “worker” instance will “poll” the queue to retrieve waiting messages for processing. Auto scaling can be applied based off of queue size, so that if a component of your application has an increase in demand, the number of worker instances can increase.
    • SQS Message: A set of instructions that will be relayed to the worker instances via the SQS queue. The message can be from 1 KB up to 256 KB of text in any format. Each message is guaranteed to be delivered at least once, but order is not guaranteed and duplicates can occur.
    • SQS Queue: A single poll can return 1-10 messages, up to a total payload size of 256 KB. Messages can be stored in the queue from 1 min up to 14 days (default is 4 days) and are retrieved through polling. Queues allow components of your application to work independently of each other (decoupled environment).
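    • A decoupled producer/consumer round trip looks like this from the CLI (a sketch; the queue URL is a placeholder, and the 20-second wait time enables long polling):
      • aws sqs create-queue --queue-name my-queue --attributes ReceiveMessageWaitTimeSeconds=20
      • aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue --message-body "job-42"
      • aws sqs receive-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue --wait-time-seconds 20
      • aws sqs delete-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue --receipt-handle <handle-from-receive-message>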
  3. Simple Workflow Service (SWF): It's a fully managed workflow service that coordinates and manages the execution of activities. It manages a specific job from start to finish, while still allowing for distributed decoupled architecture.
    • It allows an architect/developer to implement distributed, asynchronous applications as workflows.
    • It has consistent execution and guarantees the order in which tasks are executed and there are no duplicate tasks.
    • A workflow execution can last up to 1 year.
    • Workflow: A sequence of steps required to perform a task. A workflow coordinates and manages the execution of activities that can be run asynchronously across multiple computing devices.
    • Activities: A single step (or unit of work) in the workflow.
    • Tasks: What interacts with the workers that are part of a workflow.
      • Activity task: Tells the worker to perform a function.
      • Decision task: Tells the decider the state of the workflow execution by communicating (back to the decider) that a given task has been completed, which allows it to determine the next activity to be performed.
    • Worker: Responsible for receiving a task and taking action on it.
      • Can be any type of component such as an EC2 instance or even a person.
    • Actors:
      • Workflow Starters: An application that initiates/starts the workflow.
      • Deciders: Controls the flow of activity tasks in a workflow execution.
      • Activity Workers: Carry out the activity tasks.
  4. Step Functions:
  5. Amazon MQ

Amazon Security, Identity & Compliance Services

Amazon Web Services (AWS): AWS is made up of Regions; each Region is a grouping of independently separated data centers, known as Availability Zones, in a specific geographic area.

  • Regions: A grouping of AWS resources located in a specific geographic region, designed to serve AWS customers located closest to that region. Regions are comprised of multiple Availability Zones.
    • Availability Zones: Geographically isolated zones within a region that house AWS resources. Availability Zones are where separate, physical AWS data centers are located. Multiple AZs in each Region provide redundancy for AWS resources in that region.
    • Data Centers:
    • Edge Locations: It’s an AWS data center which doesn’t contain AWS services. Instead, it’s used to deliver content to users in different parts of the world.
  • AWS Organization: AWS Organizations enables you to centrally apply policy-based controls across multiple accounts in the AWS Cloud.
    • You can consolidate all your AWS accounts into an organization, and arrange all AWS accounts into distinct organizational units.
    • Enable either Consolidated Billing or All Features
    • The paying account is independent and can’t access resources of other accounts, and should be used for billing purposes only.
    • All linked accounts are independent. Currently you can have up to 20 linked accounts for consolidated billing.
  • Resource Groups: Find and group your AWS resources with tag queries
    • You can create unlimited, single-region groups in your account based on resource types and tag queries, use your groups to view group related insights, and automate tasks on group resources.
  • Security Token Service (STS): Grants users limited and temporary access to AWS resources. Users can come from three sources:
    • Federation (Active Directory):
      • Uses Security Assertion Markup Language (SAML)
      • Grants temporary access based off the user’s Active Directory credentials. Doesn’t need to be a user in IAM.
      • Single sign-on allows users to log in to the AWS console without assigning IAM credentials.
      • Federation with Mobile Apps: Use Facebook/Amazon/Google or OpenID providers to log in.
      • Cross Account Access: Lets users from one AWS account access resources in another.
    • Federation: Combining or joining a list of users in one domain (such as IAM) with a list of users in another domain (such as Active Directory, Facebook etc)
    • Identity Broker: A service that allows you to take an identity from point A and join it (federate it) to point B.
    • Identity Store: Services like Active Directory, Facebook, Google etc.
    • Identities: A user of a service like Facebook etc.
    • Access Key, Secret Access Key, Token, Duration (1-36 hrs)
    • Flow: Employee → (1) App → (2) Identity Broker → (3) LDAP → (4) AWS STS → (5) Identity Broker → (6) App → (7) AWS S3 → (8) IAM → (9) S3
    • Steps:
      1. Develop an Identity Broker to communicate with LDAP and AWS STS.
      2. Identity Broker always authenticates with LDAP first, then with AWS STS.
      3. The app then gets temporary access to AWS resources.
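    • The broker’s call to STS in step 1 typically maps onto one of these CLI calls (a sketch; the user name, role ARN, and session name are placeholders):
      • aws sts get-federation-token --name demo-user --duration-seconds 3600    (classic identity-broker pattern; returns Access Key, Secret Access Key, and Token)
      • aws sts assume-role --role-arn arn:aws:iam::123456789012:role/ReadOnly --role-session-name demo-session --duration-seconds 3600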
  • AWS Support:
    • Basic, Developer, Business, Enterprise
  1. Identity and Access Management (IAM): AWS Identity and Access Management (IAM) enables you to manage access to AWS services and resources securely. Using IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources. You use IAM to control who is authenticated (signed in) and authorized (has permissions) to use resources.
    • Root account: When you first create an AWS account, you begin with a single sign-in identity (email and password) that has complete access to all AWS services and resources in the account.
      • AWS recommends not using the root user for everyday tasks, even administrative ones. Instead, create other IAM user accounts to perform those tasks.
      • It’s not possible to restrict permissions on the AWS root account.
    • IAM is the service where you manage your AWS users and their access to AWS accounts and services.
    • It provides universal and centralized control of your AWS account and shared access to your AWS account; it’s global and does not apply to regions.
    • It’s used for granular permissions, multi-factor authentication, and identity federation (e.g. Active Directory, Facebook, LinkedIn etc.)
    • Provides temporary access for users/devices and services
    • Allows you to set up your own password rotation policy
    • Principal: Its an entity that can take an action on an AWS resource.
      • IAM Users, Roles, Federated users and Applications are all principals.
    • Users: Individual user accounts; when created they are assigned a username/password and an Access Key ID/Secret Access Key (programmatic access) which are used to access AWS resources via the CLI and APIs.
      • IAM users can be an actual person or an application
      • Federated users (authenticated with Google, Facebook)
      • By default users have no permissions, so you have to grant the permissions by attaching policies to the user to access the resources.
      • Access keys (access key ID and secret access key) are required to make programmatic calls to AWS resources from the HTTPS API, CLI, SDKs, or Tools for Windows PowerShell.
        • You can view and download secret access key only when you create the access key. If you lose your secret access key, you can create a new access key.
        • A user can have two access keys and you can disable it from AWS console.
        • You can allow users to change their own access keys through IAM policy.
      • service account: When you create an IAM user to represent an app that needs credentials in order to make requests to AWS resources.
      • You can create up to 5000 users in an AWS account. For more, consider using Temporary Security Credentials (STS).
      • Use the Amazon Resource Name (ARN) when you need to uniquely identify the user, such as when specifying the user as a Principal in an IAM policy for an S3 bucket.
        • arn:aws:iam::ID:user/myuser
      • AWS Console Sign In:
    • Groups: Collections of IAM users, used to assign permissions/policies to multiple users at once.
      • A group is not an identity in IAM because it can’t be identified as a Principal in a permission policy.
      • Groups can’t be nested, they contain only users.
      • You can create up to 300 groups per account.
      • A user can be a member of up to 10 groups.
    • Policies: A policy is an entity in AWS that, when attached to an identity or resource, defines their permissions. AWS evaluates these policies when a principal, such as a user, makes a request. Permissions in the policies determine whether the request is allowed or denied.
      • Policies are stored in AWS as JSON documents attached to principals (as Identity-Based Policies) or to resources (as Resource-Based Policies).
        • Identity based policies: Attached to users, groups, roles.
        • Resource-Based Policies: Attached to resources e.g S3 buckets.
      • Any actions that are not explicitly allowed are denied by default.
      • Permissions are granted through policies that are then attached to users, groups or roles.
      • To assign permissions to federated users, you can create an entity referred to as a role and define permissions for the role.
      • AWS Managed Policies: An AWS managed policy is a standalone policy that is created and administered by AWS.
      • Customer Managed Policies: You can create standalone policies that you administer in your own AWS account, which we refer to as customer managed policies.
      • Inline Policies: An inline policy is a policy that’s embedded in a principal entity (a user, group, or role)—that is, the policy is an inherent part of the principal entity.
      • P.S. Standalone policy means that the policy has its own Amazon Resource Name (ARN) that includes the policy name.
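      • For example, a customer managed policy is just a JSON document registered with IAM (a sketch; the policy name, user name, account ID, and bucket are placeholders):
        • aws iam create-policy --policy-name S3ReadMyBucket --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":["s3:GetObject","s3:ListBucket"],"Resource":["arn:aws:s3:::my-bucket","arn:aws:s3:::my-bucket/*"]}]}'
        • aws iam attach-user-policy --user-name myuser --policy-arn arn:aws:iam::123456789012:policy/S3ReadMyBucket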
    • Roles: IAM roles are a secure way to grant permissions to entities that you trust. IAM roles issue keys that are valid for short durations, making them a more secure way to grant access. Roles are created and policies are attached to them to grant access to AWS services.
      • A role is very similar to an IAM user, but it doesn’t have any credentials (password or access keys) associated with it.
      • Roles are more secure than storing your access key and secret key on an EC2 instance.
      • You can apply only one role to an EC2 instance at a time, and it can be assigned after the instance is provisioned. You can also update the policies attached to the role at any time.
      • AWS Service role: Allows AWS services to perform actions on your behalf.
      • Another AWS account role: Allows entities in other accounts to perform actions in this account.
      • Web identity role: Allows users federated by the specified external web identity or OpenID Connect (OIDC) provider to assume this role to perform actions in your account.
      • Temporary Credentials: Temporary credentials are primarily used with IAM roles to provide temporary access (through STS) to AWS resources. They expire automatically after a set period of time.
      • SAML 2.0 federation role: Allows users that are federated with SAML 2.0 to assume this role to perform actions in your account.
    • MFA (Multi-Factor Authentication): With MFA, you or your users must provide a code from a specially configured device (or an app on a mobile phone), in addition to a username and password or access key, to log in to and work with your account.
    • Identity Federation: You can allow users who already have passwords elsewhere, e.g. in your corporate network (Active Directory) or with an Internet identity provider (Facebook, LinkedIn, Google), to get temporary access to your AWS account.
      • Security Assertion Markup Language (SAML) 2.0 based federation is used when you have user identities in your corporate directory (SAML 2.0 compatible); you can then configure single sign-on access to the AWS console for your users.
        • If your directory is not compatible with SAML 2.0, then you can develop an identity broker app to provide SSO.
      • Web identity (OpenID Connect) based federation is used when you let users authenticate via Internet identity providers such as Facebook or Google. AWS recommends using Amazon Cognito for identity federation with Internet identity providers.
    • PCI DSS Compliance: IAM supports the processing, storage and transmission of credit card data by a merchant or service provider, and has been validated as being compliant with Payment Card Industry (PCI) Data Security Standard (DSS).
    • ARN (Amazon Resource Name): A name that uniquely identifies any AWS resource, e.g. arn:aws:s3:::mybucket.
    • STS (Security Token Service): Allows you to create temporary security credentials that grant trusted users access to your AWS resources.
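      • For example, temporary credentials can be requested by assuming a role (a minimal sketch; the account ID, role name, and session name are illustrative):
        • aws sts assume-role --role-arn arn:aws:iam::123456789012:role/MyRole --role-session-name demo-session
        • P.S. The response contains a temporary AccessKeyId, SecretAccessKey, and SessionToken that expire automatically.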
    • Single Sign On (SSO): SSO allows users to access the AWS console without having an IAM identity. If your organization has an existing identity system, then you might want to set up SSO.
    • IAM API Keys: They are required to make programmatic calls to AWS resources via:
      • AWS Command Line Interface (CLI)
      • Tools for Windows PowerShell
      • AWS SDKs
      • Direct HTTP calls using the APIs for individual AWS services
    • Cross Account Access: It allows you to work productively within a multi-account (or multi-role) AWS environment by making it easy for you to switch roles within AWS Management Console.
      • You can sign in to the console using your IAM user name, then switch the console to manage another account without having to enter another set of credentials.
    • Amazon CLI (awscli):
      • Install awscli:
        • pip install awscli --upgrade --user
        • aws configure (Access key ID & Secret access key from IAM user)
          • AWS Access Key ID:
          • AWS Secret Access Key:                      (~user/.aws/credentials)
          • Default region name: us-east-1          (~user/.aws/config)
          • P.S. https://docs.aws.amazon.com/general/latest/gr/rande.html
      • S3: (The user/instance must have appropriate S3 Access policy assigned)
        • aws s3 ls [mybucket]
        • aws s3 cp --recursive s3://mybucket /tmp [--region us-east-1]
      • aws ec2 describe-instances | grep -i instanceid
      • aws ec2 terminate-instances --instance-ids <instance-id>
  2. Directory Service:
    • It provides multiple directory choices for customers who want to use existing Microsoft AD or Lightweight Directory Access Protocol (LDAP)-aware applications in the cloud.
    • AWS Directory Service includes the following services:
      • Microsoft AD: AWS-managed Microsoft Active Directory powered by Windows Server 2012 R2.
        • Microsoft AD is your best choice if you have more than 5000 users and/or need a trust relationship setup between an AWS hosted directory and your on-premises directories.
        • Microsoft AD or Simple AD support automatic and manual snapshots that can be used to restore your directory data.
        • It supports AWS apps and services including RDS for Microsoft SQL Server, Amazon WorkSpaces, WorkDocs, QuickSight, Chime, and Connect.
        • It comes in two editions: Standard Edition (5,000 users and 30,000 objects), Enterprise Edition (500,000 objects).
      • Simple AD: A Microsoft Active Directory-compatible directory powered by Samba 4 that provides a subset of Microsoft Active Directory features.
        • It's your best choice if you have fewer than 5,000 users and don't need the more advanced Microsoft AD features.
        • It's available in two sizes: Small (500 users and 2,000 objects) and Large (5,000 users and 20,000 objects).
        • It's not compatible with RDS SQL Server.
        • It doesn't support trust relationships with other domains.
        • It creates two directory servers and DNS servers on your behalf.
      • AD Connector: Active Directory Connector is a gateway that redirects directory requests from AWS services to your existing on-premises Microsoft Active Directory.
        • It simply connects your existing on-premises AD to AWS. No directory information is replicated into or cached in AWS.
        • It comes in two sizes Small AD Connector (500 users) and Large AD Connector (5,000 users).
        • It requires that a VPN or AWS Direct Connect connection be established between your on-premises network and AWS.
        • AD Connector relies on IAM roles to provide Security Token Service (STS) credentials to users authenticated by the on-premises MS Active Directory.
        • It's not compatible with RDS SQL Server.
      • Amazon Cloud Directory: It's a cloud-native directory to store your application's hierarchical data.
        • Cloud Directory scales to hundreds of millions of objects and offers support for multiple relationships and application-specific schemas.
        • Examples of application directories include device registries, catalogs, organization structures, and network topologies.
      • Amazon Cognito: With Amazon Cognito User Pools you can easily and securely add user sign-up and sign-in functionality to your mobile and web apps. This fully managed service scales to support hundreds of millions of users.
  3. Cognito
  4. GuardDuty
  5. Inspector
  6. Amazon Macie
  7. AWS Single Sign-On
  8. Certificate Manager
  9. CloudHSM
  10. WAF & Shield
  11. Artifact
  12. Cross Account Access

Amazon Storage Services

  1. Simple Storage Service (S3): Amazon Simple Storage Service is object-based storage for the Internet. It is designed to make web-scale computing easier for developers. Amazon S3 has a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of websites. The service aims to maximize benefits of scale and to pass those benefits on to developers.
    • It's object-based, i.e. it allows you to upload files into S3 buckets; it's not suitable for installing an OS or databases.
    • S3 is region specific but has a universal namespace: bucket names must be globally unique and lowercase.
    • Files can be from 0 Bytes to 5 TB in size but S3 storage is unlimited.
    • Read-after-write (immediate or strong) consistency for PUTS of new objects.
    • Eventual consistency for overwrite PUTS and DELETES (i.e. changes/updates to existing objects can take some time to propagate).
    • You are charged based upon Storage, Requests, Data Transfer, Storage Management Pricing, and Transfer Acceleration.
    • For intensive GET requests, introduce prefix (name) randomness and use CloudFront distributions.
    • You pay a GB/month storage fee, plus for data transfer in/out of S3 to different regions (over the Internet), and for upload/download requests.
    • Buckets: A bucket is a root-level folder; its name is unique across all of AWS S3, and it is managed globally.
      • You can create up to 100 buckets in a single account.
      • Bucket names contain lowercase letters, numbers, and hyphens; they must be 3-63 characters long and can't be changed after the bucket is created.
      • By default a bucket and all objects in it are private.
      • Can be configured to create access logs which log all requests made to S3 bucket. The logs can be stored in the same or to another bucket.
      • You can store unlimited objects in your bucket, but an object can’t exceed 5 TB.
    • Folders: Any subfolder created in a bucket
    • Objects: Files stored or uploaded into a bucket.
      • An object consists of: Key (name), Value (data), VersionID, Metadata, Subresources (ACLs, Torrent)
      • Successfully uploaded files/objects generate an HTTP 200 status code
      • Objects stay in a region and are synced across all availability zones in it
      • Objects are cached for the life of the TTL
      • Objects can be from 0 Bytes to 5 TB in size
      • Objects stored in a S3 bucket in a region will never leave that region unless you specifically move them to another region, or enable Cross Region Replication.
    • Permissions:
      • All buckets and objects are private by default
      • A bucket owner can grant cross-account permissions to another AWS account (or users in that account) to upload objects.
      • The access can be granted either by IAM policy or resource policy:
      • Resource-Based Policies:
        • Bucket Policy: Attached only to an S3 bucket; permissions apply to all objects in the bucket.
        • Access Control Lists (ACLs): Grant access to other AWS accounts or to the public. Both buckets and objects have ACLs. Object ACLs allow us to share an S3 object via a public URL.
      • User Access Policies:
        • You can use AWS IAM to manage access to your S3 resources.
        • Using IAM you can create users, group and roles in your account and attach access policies to them granting access to AWS resources including S3.
      • READ: Allows grantee to list objects in the bucket, and read an object and its metadata
      • WRITE: Allows grantee to create, overwrite, and delete any object in the bucket (not applicable to objects)
      • READ_ACP: Allows grantee to read the bucket or object ACL
      • WRITE_ACP: Allows grantee to write the ACL for the bucket or object.
      • FULL_CONTROL: Allows grantee all above permissions on the bucket or object.
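      • For example, a bucket policy granting public read can be attached from the CLI (a minimal sketch; the bucket name mybucket is illustrative):
        • aws s3api put-bucket-policy --bucket mybucket --policy '{"Version": "2012-10-17", "Statement": [{"Sid": "PublicRead", "Effect": "Allow", "Principal": "*", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::mybucket/*"}]}'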
    • Storage Classes: Divided based upon cost, durability, and availability:
      • Standard (Durable, Immediately Available, Frequently Accessed):
        • It supports Availability of 99.99% and Durability of 99.999999999% (eleven 9s).
        • The default, general/all-purpose storage class, and the most expensive.
        • Data encrypted in-transit and at rest in S3
        • Designed to sustain the concurrent loss of two facilities
      • Infrequent Access / S3-IA (Durable, Immediately Available, Infrequently Accessed):
        • It has object Availability of 99.9% and Durability of 99.999999999%.
        • It has a minimum 30-day retention period and a 128 KB minimum object size.
        • Designed for less frequently accessible objects and backups.
      • Reduced Redundancy Storage/S3-RRS (Not-Durable, Immediately Available, Frequently Accessed):
        • It has object Availability of 99.99% and Durability of 99.99%.
        • For non-critical reproducible objects such as thumbnails.
        • Designed to sustain the data loss in one facility only.
        • If an object is lost, AWS returns a 405 error, and S3 can send a notification when an object is lost.
      • Glacier: Designed for long-term archival storage (not to be used for backups) that's very rarely accessed, and it's the cheapest among all storage classes.
        • Objects are moved by lifecycle policies from the Standard to the Glacier storage class.
        • Retrieving objects may take from several minutes to hours (3-5 hrs).
    • Encryption:
      • Client Side Encryption:
        • The client encrypts the data on the client side, then transfers the encrypted data to the S3 bucket.
      • Server Side Encryption (SSE):
        • Data is encrypted by the S3 service before it's saved to S3 storage disks, and it's decrypted by the S3 service before you download it.
        • SSE-S3 (AES-256): Server-side encryption using S3-managed keys.
        • SSE-KMS: Server-side encryption using AWS KMS (Customer Master Key/CMK) keys.
        • SSE-C: Server side encryption using customer provided keys.
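      • For example, SSE can be requested per object at upload time from the CLI (a minimal sketch; the bucket/file names and KMS key ID are illustrative):
        • aws s3 cp myfile.txt s3://mybucket/ --sse AES256   (SSE-S3)
        • aws s3 cp myfile.txt s3://mybucket/ --sse aws:kms --sse-kms-key-id <key-id>   (SSE-KMS)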
    • Versioning: It’s a feature to manage and store all old/new/deleted versions of objects. Used to protect against accidental object updates or deletion.
      • Its set on the bucket level and applies to all objects.
      • By default it's disabled, but once enabled it can only be suspended (not disabled), and suspension applies to newer objects only.
      • Older object versions can be transitioned to a cheaper storage class to reduce cost.
      • Versioning can be used with life cycle policies to create a great archiving and backup solution in S3.
      • MFA Delete is a versioning capability that adds another layer of security for:
        • Changing bucket’s versioning state
        • Permanently deleting an object version
      • Only S3 bucket owner can permanently delete objects once versioning is enabled.
      • When you delete an object, a DELETE marker is placed on it; when you delete the DELETE marker, the object becomes available again.
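      • For example, versioning can be enabled and inspected from the CLI (a minimal sketch; the bucket name mybucket is illustrative):
        • aws s3api put-bucket-versioning --bucket mybucket --versioning-configuration Status=Enabled
        • aws s3api list-object-versions --bucket mybucket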
    • Lifecycle Policies (Object): Set of rules that automate migrating an object to a different storage class (e.g. from Standard to IA) and deleting it, based on time intervals.
      • By default life cycle policies are disabled on a bucket/object.
      • Can be used with versioning (current and previous versions) to create a great archiving and backup solution in S3.
      • Transition actions:
        • S3-Standard to S3-IA: An object must be in S3-Standard for at least 30 days before it can be transitioned into S3-IA and its size must be 128KB or more.
        • S3-Standard/S3-IA to Glacier: An object must be in S3-Standard for a minimum of 60 days, or in S3-IA for 30 days, before it can be transitioned into Glacier.
      • Expiration actions:
        • Used to permanently delete expired objects, i.e. a minimum of 61 days after creation.
      • You can’t use life cycle policies to move an archived object from Glacier to S3-Standard or S3-IA.
      • You can’t change an object from S3-Standard or S3-IA into RRS.
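      • For example, a lifecycle rule can be set from the CLI (a minimal sketch; the bucket name, prefix, and day counts are illustrative but respect the minimums above):
        • aws s3api put-bucket-lifecycle-configuration --bucket mybucket --lifecycle-configuration '{"Rules": [{"ID": "ArchiveLogs", "Status": "Enabled", "Filter": {"Prefix": "logs/"}, "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}, {"Days": 60, "StorageClass": "GLACIER"}], "Expiration": {"Days": 365}}]}'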
    • Events: S3 event notifications allow you to set up automated communication between S3 and other AWS services when a selected event occurs in an S3 bucket.
      • S3 metrics that can be monitored by CloudWatch include:
        • S3 Requests, Bucket Storage, Bucket Size, All requests, HTTP 4XX messages, 5XX errors.
      • Common event notification triggers include:
        • RRSObjectLost
        • ObjectCreated (Put, Post, Copy)
        • CompleteMultiPartUpload
      • Event notifications can be sent to SNS topics, Lambda functions, or SQS queues.
      • CloudTrail captures all API requests made to the S3 API; by default it logs bucket-level actions, but you can configure it to log object-level actions as well. The logs are stored in an S3 bucket.
    • Static Web Hosting: Amazon S3 provides an option for a low-cost, highly reliable web hosting service for static websites. It's serverless, very cheap, and scales automatically.
      • When enabled, it provides you a unique endpoint URL that you can point to a properly formatted file stored in an S3 bucket. It supports images, videos, HTML, CSS, and JavaScript.
      • Amazon Route 53 can also map human readable domain names to static web hosting buckets, which are ideal for DNS failover solution.
      • http://bucketname.s3-website-us-east-1.amazonaws.com
      • It supports HTTP (NOT HTTPS) connections and publicly readable content only.
      • Enable website hosting on your bucket and specify a default index document (required) and an error document (optional).
      • You can redirect to another object in the same bucket or to an external URL.
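      • For example, website hosting can be enabled from the CLI (a minimal sketch; the bucket and document names are illustrative):
        • aws s3 website s3://mybucket/ --index-document index.html --error-document error.html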
    • Cross Region Replication (CRR): Cross-region replication is a bucket-level configuration that enables automatic, asynchronous copying of objects across buckets in different AWS Regions.
      • Versioning must be enabled on both source and destination buckets. The source and destination buckets must be in different regions and replicating to multiple buckets is not allowed.
      • AWS S3 must have permissions (an IAM role to read objects and ACLs) to replicate objects from the source to the destination bucket. These buckets can be owned by different AWS accounts.
      • You can replicate all or subsets of objects with specific key name prefixes.
      • Existing files are not replicated automatically from the source to the destination bucket, but subsequently uploaded/updated files and delete markers will be replicated automatically.
      • Deleting individual versions or delete markers will not be replicated.
      • Existing objects and objects created with SSE-C or SSE-KMS are not replicated. For existing objects you need to copy the objects yourself using the CLI:
        • pip install awscli --upgrade --user
        • aws configure
        • aws s3 cp --recursive s3://src s3://dst  (Copy from src to dest bucket)
        • aws s3 ls dst
      • AWS will encrypt data in-transit across regions using SSL.
    • Transfer Acceleration: It utilizes the CloudFront Edge Network to accelerate your uploads to S3 bucket. Instead of directly uploading to your S3 bucket, you can use a distinct URL to upload directly to an edge location close to you, which will then transfer that file to S3 bucket.
      • You will get a distinct URL to upload to, of the form bucketname.s3-accelerate.amazonaws.com.
      • Once enabled it can only be suspended but can’t be disabled.
      • Bucket names must be DNS compliant and must not have period (.) in the bucket name.
      • If there's no speed enhancement, there's no charge for using it.
      • You can use multi-part uploads and no data is cached at Cloudfront edge locations.
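      • For example, acceleration can be enabled from the CLI (a minimal sketch; the bucket name is illustrative):
        • aws s3api put-bucket-accelerate-configuration --bucket mybucket --accelerate-configuration Status=Enabled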
    • Cross-Origin Resource Sharing (CORS): It's a method of allowing a web application located in one domain to access and use resources in another domain.
      • This allows web applications running JavaScript and HTML5 to access resources in S3 buckets without using a proxy server.
      • If enabled, then a web app hosted in S3 bucket can access resources in another S3 bucket.
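      • For example, a CORS rule can be set from the CLI (a minimal sketch; the bucket name and origin are illustrative):
        • aws s3api put-bucket-cors --bucket mybucket --cors-configuration '{"CORSRules": [{"AllowedOrigins": ["https://www.example.com"], "AllowedMethods": ["GET"], "AllowedHeaders": ["*"], "MaxAgeSeconds": 3000}]}'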
    • Pre-Signed URLs:
      • It can be used to provide temporary access to a specific object to those who don't have AWS credentials, e.g. a customer who bought a website subscription.
      • An expiration date and time must be configured when generating a pre-signed URL.
      • The URLs can be generated using the SDKs (e.g. for Java and .NET) and can be used for both download and upload.
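      • For example, the CLI can generate a temporary download URL (a minimal sketch; the bucket/object names are illustrative):
        • aws s3 presign s3://mybucket/myobject --expires-in 3600   (URL valid for 1 hour)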
    • File Upload options:
      • Single Operation Upload: It's the traditional method in which you upload a file in one part. It can upload a file up to 5 GB; however, any file over 100 MB should use multipart upload.
      • Multipart Upload: It allows you to upload a single file as a set of parts and all parts can be uploaded concurrently.
        • After all parts are uploaded, S3 assembles these parts and creates the object.
        • It must be used for objects over 5 GB in size (up to 5 TB), but it's recommended for objects over 100 MB.
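        • For example, the CLI performs multipart upload automatically above a configurable threshold (a minimal sketch; the file/bucket names are illustrative):
          • aws configure set default.s3.multipart_threshold 100MB
          • aws s3 cp bigfile.iso s3://mybucket/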
    • AMZ Import/Export Disk: It accelerates moving large amounts of data into and out of the AWS cloud using portable external disks for transport. You mail your on-premises data on a storage device to an AWS data center, and Amazon imports it into S3 within one business day.
    • Snowball: AWS Snowball is a service used to transfer data into the cloud at faster-than-Internet speeds or harness the power of the AWS Cloud locally using AWS-owned appliances.
      • Snowball: Its 80TB data transfer device that uses secure appliances to transfer data into and out of AWS S3.
      • Snowball Edge: Its a 100TB data transfer device with on-board storage and compute capabilities.
      • Snowmobile: Its an Exabyte-scale data transfer service used to move extremely large amounts of data to AWS. You can transfer up to 100PB per Snowmobile, a 45-foot long ruggedized shipping container, pulled by a semi-trailer truck.
  2. Elastic Block Store (EBS): Allows you to create block-based storage volumes and attach them to EC2 instances. Once attached, you can create a file system on top of these volumes or run a database.
    • EBS volumes (virtual hard-disk) are persistent and are block based network attached storage.
    • EBS volume can only be attached to a single EC2 instance and both must be in the same AZ.
    • Delete on Termination is checked by default for the EBS root volume, but it's unchecked for additional/data volumes; so the root volume is deleted automatically when the instance is terminated.
    • EBS root volumes can't be encrypted by default, but additional volumes can be encrypted by KMS. For the root volume you can use a third-party tool (e.g. BitLocker), or use EBS snapshots to encrypt it and then boot a new instance from the encrypted volume.
      • EBS encryption is supported on all EBS volume types, but not on all EC2 instances families (e.g T2 Micro/free tier instance family doesn’t support EBS volume encryption)
      • To encrypt a volume or a snapshot, you need an encryption key, these keys are called Customer Master Key (CMKs) and are managed by AWS Key Management Service (KMS).
    • EBS volume type and size can be changed while the instance is running, but best practice is to stop the instance.
    • A volume can be copied into another AZ by taking a snapshot of it, and then creating a new volume out of that snapshot in the desired AZ.
    • The root device for an instance launched from the AMI is an Amazon EBS volume created from an Amazon EBS snapshot.
    • EBS volume backed instances can be stopped and the volume remains attached to the instance, the data is not erased, but you will be charged for the volume.
    • It has 99.999% availability. 5,000 EBS volumes and 10,000 snapshots can be created per account.
    • Snapshots:
      • EBS Snapshots are point-in-time Images/copies of your EBS volumes.
      • EBS volumes are backed up by snapshots, which are asynchronous and incremental, and are stored in S3 (viewable via the EC2 API only).
      • Snapshot of an encrypted volume is encrypted automatically and the volume restored from an encrypted snapshot is also encrypted automatically.
      • Unencrypted snapshots can be shared with AWS community by setting them as public.
      • Unencrypted private snapshots can be shared with other AWS accounts, but for encrypted private snapshots you can share via Cross-account permissions that uses custom CMK key.
      • You can create an image or a volume (in same or separate AZ) from a snapshot.
      • You can copy a snapshot into a separate region (can also encrypt at the same time), and then create a volume or image from it, and finally launch the instance from it.
      • To create a snapshot of a root EBS volume, the instance should be stopped, whereas for a non-root volume it's recommended to either pause I/O activity or unmount and detach the volume.
      • EBS volumes are AZ specific whereas Snapshots are Region specific.
      • You can create/restore a snapshot to an EBS volume of the same or larger size (but NOT smaller size) than the original volume size from which snapshot was initially created.
      • Volumes created from an EBS snapshot must be initialized. Initializing occurs the first time a storage block on the volume is read, and performance can be impacted by up to 50%. You can avoid this impact in production environments by manually reading all the blocks.
      • To take snapshots of a RAID array, you need to either freeze the filesystem, unmount the RAID array, or shut down the associated instance.
      • You must deregister the AMI before being able to delete the snapshot of the root device.
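      • For example (a minimal sketch; the volume/snapshot IDs, regions, and AZ are illustrative):
        • aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "daily backup"
        • aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 --availability-zone us-east-1a   (restore in any AZ of the region)
        • aws ec2 copy-snapshot --source-region us-east-1 --source-snapshot-id snap-0123456789abcdef0 --region us-west-2 --encrypted   (returns a new snapshot ID in the destination region)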
    • EBS Volume Types:
      • General Purpose SSD (GP2): Default general-purpose type that balances both price and performance; dev/test environments. (Baseline 3 IOPS per GiB, minimum 100 IOPS, burstable to 3,000 IOPS.) Volume size 1 GB-16 TB.
      • Provisioned IOPS SSD (IO1): Designed for I/O-intensive apps such as large relational or NoSQL databases (10,000-20,000 IOPS). Volume size 4-16 TB.
      • Throughput Optimized HDD (ST1): Used for frequently accessed workloads, e.g. big data, data warehousing, streaming. Can't be used as a boot volume. Volume size 500 GB to 16 TB.
      • Cold HDD (SC1): Used for less frequently accessed data. Can't be used as a boot volume. Volume size 500 GB to 16 TB.
      • Magnetic (Standard): Cheap, infrequently accessed storage. Can be used as a boot volume. Volume size 1 GB to 1 TB.
    • Instance Store volume:
      • It's an ephemeral block storage device: a virtual hard disk allocated to the EC2 instance (guest) on the physical host. It exists for the duration of the instance life cycle and is limited to 10 GB in size.
      • You can attach additional instance store volumes during launch only. After the instance is launched, then you can attach EBS volumes only.
      • Instance-Store backed (root volume) EC2 instances can’t be Stopped but can be Rebooted (data is preserved) or Terminated (data will be lost).
      • Instance store-backed EC2 instances boot from an AMI stored in S3.
      • Use an instance store over EBS if a very high IOPS rate is required.
  3. Elastic File Systems (EFS): Amazon EFS provides file storage for use with your EC2 instances. EFS storage capacity is elastic, growing and shrinking automatically as you add and remove files.
    • It allows to be mounted and shared among multiple EC2 instances.
    • It supports Network File System v4 (NFSv4) protocol and data is stored across multiple AZs within a region
    • You only pay for the storage you use (unlike EBS, with EFS no pre-provisioning required).
    • It can scale up to petabytes and can support thousands of concurrent NFS connections, and provide read after write consistency.
    • By contrast, Amazon EBS is designed for application workloads that benefit from fine-tuning for performance, cost, and capacity.
    • Typical EBS use cases include Big Data analytics engines (like the Hadoop/HDFS ecosystem and Amazon EMR clusters), relational and NoSQL databases (like Microsoft SQL Server and MySQL or Cassandra and MongoDB), stream and log processing applications (like Kafka and Splunk), and data warehousing applications (like Vertica and Teradata).
    • Mount target: Instances connect to a file system by using a network interface called a mount target. Each mount target has an IP address, which AWS assigns automatically or you can specify.
      • P.S. You must assign default security group to the instance to successfully connect to EFS mount target from the instance.
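      • For example, mounting from an EC2 instance over NFSv4.1 (a minimal sketch; the file system ID, region, and mount point are illustrative):
        • sudo mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs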
    • File Syncs: EFS File Sync provides a fast and simple way for you to securely sync data from existing on-premises or in-cloud file systems into Amazon EFS file systems.
      • To use EFS File Sync, download and deploy a File Sync agent into your IT environment, configure the source and destination file systems, and start the sync. Monitor progress of the sync in the EFS console or using AWS CloudWatch.
  4. Glacier: Designed for long-term archival storage that's very rarely accessed (not to be used for frequently restored backups). Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup.
    • It is designed to deliver 99.999999999% durability, but there’s no SLA on availability.
    • Amazon Glacier provides query-in-place functionality, allowing you to run powerful analytics directly on your archive data at rest.
    • Object are moved by Life Cycle Polices from standard to glacier storage class.
    • Archived objects are not for real-time access; you need to submit a retrieval request, after which the data is copied into S3-RRS by AWS (which can take from minutes to hours). You can then download it from there within 24 hrs (the time period can be specified in the retrieval request).
      • You can’t use AWS Console for archive jobs retrieval
      • AWS SNS can be used to notify you, when a retrieval job is completed.
      • You pay for Glacier archive itself and the restored copy into S3-RRS for the duration you specify during retrieval request.
    • It's designed to sustain the loss of data in two facilities.
    • You need to keep your data for a minimum of 90 days.
    • It automatically encrypt data at rest using AES-256.
    • It doesn’t archive object metadata, you need to maintain a client-side database to maintain this information.
    • You can upload archives to Glacier from 1 byte to 40 TB. Files from 1 byte to 4 GB can be uploaded in a single operation, whereas for files larger than 100 MB it's recommended to use multipart upload.
    • Uploads are stored synchronously across multiple facilities, but downloads (retrievals) are asynchronous.
    • You can upload files directly from CLI, SDK or through APIs but not from AWS console.
    • It's recommended to group many smaller files into a single TAR or ZIP file to reduce overhead charges (i.e. 32-40 KB for indexing and archive metadata). If you need to access individual files within an archive, make sure you use a compression technique that allows access to individual files.
    • If you delete your data from Glacier before 90 days from when it was archived, then you will be charged a deletion fee.
    • Expedited Retrieval (1-5 mins): More expensive, use for urgent requests only.
    • Standard Retrieval (3-5 hrs):  Less expensive, you get 10 GB data retrieval free per month.
    • Bulk Retrieval (5-12 hrs): Cheapest, use to retrieve large amounts up to Petabytes of data in a day.
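    • For example, a retrieval job can be initiated and monitored from the CLI (a minimal sketch; the vault name is illustrative, '-' means the current account, and the archive/job IDs come from your vault inventory and the initiate-job response):
      • aws glacier initiate-job --account-id - --vault-name myvault --job-parameters '{"Type": "archive-retrieval", "ArchiveId": "<archive-id>", "Tier": "Expedited"}'
      • aws glacier describe-job --account-id - --vault-name myvault --job-id <job-id>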
  5. Storage Gateway: It's a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure. The service enables you to securely store data in the AWS cloud for scalable and cost-effective storage.
    • Connects local data center software appliances to cloud based storage such AWS S3.
    • It's a software appliance available for download as a VM image that you install on a host in your data center. It supports either VMware ESXi or Microsoft Hyper-V.
    • Once it's installed and associated with your AWS account through the activation process, you can use the AWS Management Console to create the storage gateway option that fits your requirements.
    • Types of Storage Gateway are as:
    • File Gateway (NFS): Used for flat files which are stored directly in S3 buckets, and are accessed through a Network File System (NFS) mount point.
    • Volumes Gateway (iSCSI): The volume interface presents your applications with disk volumes using the iSCSI protocol. It's like a virtual hard disk, based on block storage.
      • Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots.
      • Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize storage cost.
      • Stored Volumes: The entire dataset is stored on on-premises storage devices and asynchronously backed up to S3 as incremental snapshots. Volume size 1 GB to 16 TB.
      • Cached Volumes: The entire dataset is stored in S3 and the most frequently accessed data is cached on on-premises storage devices. Volume size 1 GB to 32 TB.
    • Tape Gateway/Virtual Tape Library (VTL): Used for backup, and works with popular backup applications like NetBackup, Backup Exec, Veeam, etc.