Amazon Miscellaneous Services

  1. Amazon Game Development:
    1. Amazon GameLift
  2. Amazon Internet of Things:
    1. AWS IoT
    2. IoT Analytics
    3. IoT Device Management
    4. Amazon FreeRTOS
    5. AWS Greengrass
  3. Amazon Desktop & App Streaming
    1. WorkSpaces
    2. AppStream 2.0
  4. Amazon Business Productivity
    1. Alexa for Business
    2. Amazon Chime
    3. WorkDocs
    4. WorkMail
  5. Amazon Customer Engagement
    1. Amazon Connect
    2. Simple Email Service
    3. Pinpoint
  6. Amazon AR & VR
    1. Amazon Sumerian (Augmented Reality and Virtual Reality)

  7. Amazon Mobile Service
    1. Mobile Hub
    2. AWS AppSync
    3. Device Farm
    4. Mobile Analytics
  8. Amazon Machine Learning
    1. Amazon SageMaker
    2. Amazon Comprehend
    3. AWS DeepLens
    4. Amazon Lex
    5. Machine Learning
    6. Amazon Polly
    7. Rekognition
    8. Amazon Transcribe
    9. Amazon Translate

  9. Amazon Media Services
    1. Elastic Transcoder
    2. Kinesis Video Streams
    3. MediaConvert
    4. MediaLive
    5. MediaPackage
    6. MediaStore
    7. MediaTailor

  10. Amazon Developer Tools:
    1. CodeStar
    2. CodeCommit
    3. CodeBuild
    4. CodeDeploy
    5. CodePipeline
    6. Cloud9
    7. X-Ray

  11. Amazon Migration Services
    1. AWS Migration Hub
    2. Application Discovery Service
    3. Database Migration Service
    4. Server Migration Service
    5. Snowball

Amazon Analytics Services

  1. Kinesis: It allows you to easily collect, process, and analyze video and data streams in real time, so you can get timely insights and react quickly to new information. It's a real-time data processing service that continuously captures and stores large amounts of data that can power real-time streaming dashboards. Its components are as follows:
    • Stream: Streaming Data is data that is generated continuously by thousands of data sources, which typically send in the data records simultaneously, and in small sizes (order of Kilobytes).
    • Producers: Devices or applications that collect data and put it into Kinesis. Producers include IoT sensors and mobile devices.
    • Consumers: Consume the streaming data concurrently. They may include real-time dashboards, S3 (storage), Redshift (big data), EMR (analytics), and Lambda (event-driven actions).
    • Shards (processing power): Each shard supports up to 2 MB/sec of reads and 1 MB/sec of writes.
    • Benefits of Kinesis include:
      • Real-time and parallel processing, fully-managed and scalable
    • Applications of Kinesis include:
      • Gaming, real-time analytics, application alerts, log/event data collection, mobile data capture (see the CLI sketch below).
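    • A minimal AWS CLI sketch of the stream/producer/consumer flow (hedged: the stream name, record data, and shard ID are illustrative assumptions, not values from these notes):
      • aws kinesis create-stream --stream-name click-stream --shard-count 2     (create a stream with 2 shards)
      • aws kinesis put-record --stream-name click-stream --partition-key user-42 --data "page=/home"     (a producer writes a record; newer CLI versions may expect base64-encoded data)
      • aws kinesis get-shard-iterator --stream-name click-stream --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON     (a consumer asks where to start reading)
      • aws kinesis get-records --shard-iterator <iterator-from-previous-call>     (the consumer reads the records)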
  2. Elastic MapReduce (EMR): It's a web service (managed Hadoop framework) that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data.
    • It's a service which deploys EC2 instances running the Hadoop big data framework to analyze and process vast amounts of data.
    • It also supports other distributed frameworks such as: Apache Spark, HBase, Presto, Flink
    • EMR Workflow is divided into following four steps:
      1. Storage: Data stored in S3, DynamoDB, or Redshift is sent to EMR.
      2. Mapped: The data is then mapped to a Hadoop cluster of Master/Slave nodes for processing.
        • The mapping phase splits the large data file into chunks (128 MB each) for processing.
      3. Computations (coded by developers): are used to process the data.
      4. Reduced: The processed data is then reduced to a single output set of return information.
        • The reduce phase aggregates the split data back into one data source. Reduced data needs to be stored (e.g. in S3) because data processed by an EMR cluster is not persistent.
    • Master Node: A single node that coordinates the distribution of data and tasks among the other (slave) nodes for processing. It also tracks the status of tasks and monitors the health of the cluster.
    • Slave Nodes: There are two types of slave nodes:
      • Core node: Runs tasks and stores data in the Hadoop Distributed File System (HDFS) on the cluster.
      • Task node: Optional nodes that only run tasks.
    • You have the ability to access the underlying operating system of the EC2 instances, and can add user data to instances launched into the cluster via bootstrapping. You can also resize a running cluster at any time and deploy multiple clusters. EMR takes advantage of parallel processing for faster processing of data (see the CLI sketch below).
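    • A minimal AWS CLI sketch for launching and terminating a cluster (hedged: the release label, instance type/count, log bucket, and cluster ID are illustrative assumptions):
      • aws emr create-cluster --name demo-cluster --release-label emr-5.20.0 --applications Name=Hadoop Name=Spark --instance-type m4.large --instance-count 3 --use-default-roles --log-uri s3://my-emr-logs/     (1 master + 2 core nodes running Hadoop and Spark)
      • aws emr list-clusters --active
      • aws emr terminate-clusters --cluster-ids j-XXXXXXXXXXXXX     (terminate when done, since cluster data is not persistent)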
  3. Athena
  4. CloudSearch
  5. Elasticsearch Service
  6. QuickSight
  7. Data Pipeline
  8. AWS Glue

Amazon Management Tools

  1. CloudWatch: It's AWS's proprietary, integrated performance monitoring service. It allows for comprehensive and granular monitoring of all provisioned AWS resources, with the added ability to trigger alarms/events based on metric thresholds. It monitors operational and performance metrics for your AWS cloud resources and applications.
    • It’s used to monitor AWS services such as EC2, EBS, ELB and S3.
    • You monitor your environment by configuring and viewing CloudWatch metrics.
    • Alarms can be created to trigger alerts, based on threshold you set on metrics.
    • Auto Scaling heavily utilizes CloudWatch, relying on thresholds and alarms to trigger the addition or removal of instances from an auto scaling group.
    • Metrics are specific to each AWS service or resource, and include metrics such as:
      • EC2 per-instance metrics: CPUUtilization, CPUCreditUsage
      • S3 metrics: NumberOfObjects, BucketSizeBytes
      • ELB metrics: RequestCount, UnhealthyHostCount
    • Detailed vs Basic level monitoring:
      • Basic/Standard: Data is available automatically in 5-minute periods at no charge.
      • Detailed: Data is available in 1-minute periods.
    • CloudWatch EC2 monitoring:
      • System (Hypervisor) status checks: things that are outside of our control
        • Loss of network connectivity or system power.
        • Hardware or software issues on the physical host
        • How to solve: Generally restarting the instance will fix the issue. This causes the instance to launch on a different physical hardware device.
      • Instance status checks: software issues that we control.
        • Failed system status checks.
        • Misconfigured networking or startup configuration
        • Exhausted memory, corrupted filesystem, or incompatible kernel
        • How to solve: Generally a reboot or fixing the file system/configuration issue.
      • Default metrics: CloudWatch automatically monitors metrics that can be viewed at the host level (not the software level), such as CPUUtilization, DiskReadOps, NetworkIn/Out, StatusCheckFailed_Instance/System.
      • Custom metrics: OS-level metrics that require a third-party script (Perl, provided by AWS) to be installed:
        • Memory utilization, memory used and available.
        • Disk swap utilization
        • Disk space utilization, disk space used and available.
    • Alarms: Allow you to set alarms that notify you when particular thresholds are hit (see the CLI sketch at the end of this item).
    • Events: Helps you to respond to state changes in your AWS resources.
    • Logs: Helps you to aggregate, monitor, and store logs.
    • VPC Flow Logs: Allow you to collect information about the IP traffic going to and from network interfaces in your VPC.
      • VPC Flow Log data is stored in a log group in CloudWatch.
      • Flow logs can be created for a VPC, a subnet (i.e. including all network interfaces in it), or a single network interface, and each interface has its own unique log stream.
      • The logs can be set to capture accepted, rejected, or all traffic.
      • Flow logs are not captured in real time; data is captured in an approximately 10-minute window and then published.
      • They can be used to troubleshoot why certain traffic is not reaching an EC2 instance.
      • A VPC Flow Log record consists of the specific 5-tuple of network traffic:
        1. Source IP address
        2. Source port number
        3. Destination IP address
        4. Destination port number
        5. Protocol
      • Following traffic is not captured by VPC Flow logs:
        • Traffic between EC2 instances and Amazon DNS server
        • Traffic generated by requests for instance metadata (169.254.169.254)
        • DHCP traffic
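    • A minimal AWS CLI sketch for an alarm and a flow log (hedged: the instance ID, SNS topic ARN, VPC ID, log group, and IAM role are illustrative assumptions):
      • aws cloudwatch put-metric-alarm --alarm-name high-cpu --namespace AWS/EC2 --metric-name CPUUtilization --dimensions Name=InstanceId,Value=i-0abc1234 --statistic Average --period 300 --evaluation-periods 2 --threshold 80 --comparison-operator GreaterThanOrEqualToThreshold --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts     (alarm when average CPU stays at or above 80% for two 5-minute periods)
      • aws ec2 create-flow-logs --resource-type VPC --resource-ids vpc-0abc1234 --traffic-type REJECT --log-group-name vpc-flow-logs --deliver-logs-permission-arn arn:aws:iam::123456789012:role/flow-logs-role     (capture rejected traffic for a VPC into a CloudWatch Logs group)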
  2. CloudFormation: AWS CloudFormation allows you to quickly and easily deploy your infrastructure resources and applications on AWS. It allows you to turn infrastructure into code. This provides numerous benefits, including quick deployments, infrastructure version control, and disaster recovery solutions.
    • You can convert your architecture into a JSON-formatted template, and that template can be used to deploy updated or replicated copies of that architecture into multiple regions.
    • It automates and saves time by deploying architecture in multiple regions.
    • It can be used to version control your infrastructure, allowing for rollbacks to previous versions of your infrastructure if a new version has issues.
    • It allows for backups of your infrastructure and is a great solution for disaster recovery.
    • Stack: A stack is a group of related resources that you manage as a single unit. You can use one of the templates we provide to get started quickly with applications like WordPress or Drupal, one of the many sample templates or create your own template.
    • StackSet: A StackSet is a container for AWS CloudFormation stacks that lets you provision stacks across AWS accounts and regions by using a single AWS CloudFormation template.
    • Template: Templates tell AWS CloudFormation which AWS resources to provision and how to provision them. When you create a CloudFormation stack, you must submit a template (see the CLI sketch below).
      • If you already have AWS resources running, the CloudFormer tool can create a template from your existing resources. This means you can capture and redeploy applications you already have running.
      • To build and view templates, you can use the drag-and-drop tool called AWS CloudFormation Designer. You drag-and-drop the resources that you want to add to your template and drag lines between resources to create connections.
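    • A minimal sketch of a template plus stack creation (hedged: the stack name and the one-bucket template are illustrative assumptions):
      • template.json contents: {"AWSTemplateFormatVersion": "2010-09-09", "Resources": {"DemoBucket": {"Type": "AWS::S3::Bucket"}}}     (smallest useful template: a single S3 bucket)
      • aws cloudformation create-stack --stack-name demo-stack --template-body file://template.json
      • aws cloudformation describe-stacks --stack-name demo-stack     (check stack status and outputs)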
  3. CloudTrail: It's an auditing service which logs all API calls made to AWS, whether via the AWS CLI, an SDK, or the Console. It provides centralized logging so that we can log each action taken in our environment and store it for later use if needed.
    • With CloudTrail, you can view events for your AWS account. Create a trail to retain a record of these events. With a trail, you can also create event metrics, trigger alerts, and create event workflows.
    • The created logs are placed into a designated S3 bucket, so they are highly available by default.
    • CloudTrail logs help address security concerns by allowing you to view what actions users on your AWS account have performed (see the CLI sketch below).
    • Since AWS is just one big API, CloudTrail can log every single action taken in your account.
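    • A minimal AWS CLI sketch (hedged: the trail name and bucket are illustrative, and the S3 bucket must already exist with a bucket policy that lets CloudTrail write to it):
      • aws cloudtrail create-trail --name account-trail --s3-bucket-name my-cloudtrail-logs
      • aws cloudtrail start-logging --name account-trail
      • aws cloudtrail lookup-events --max-results 5     (query recent API activity recorded for the account)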
  4. Config: AWS Config is a service that enables you to assess, audit, and evaluate the configurations of your AWS resources. Config continuously monitors and records your AWS resource configurations and allows you to automate the evaluation of recorded configurations against desired configurations. With Config, you can review changes in configurations and relationships between AWS resources, dive into detailed resource configuration histories, and determine your overall compliance against the configurations specified in your internal guidelines. This enables you to simplify compliance auditing, security analysis, change management, and operational troubleshooting.
  5. OpsWorks: AWS OpsWorks is a configuration management service that provides managed instances of Chef and Puppet. Chef and Puppet are automation platforms that allow you to use code to automate the configurations of your servers. OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your Amazon EC2 instances or on-premises compute environments. OpsWorks has three offerings, AWS Opsworks for Chef Automate, AWS OpsWorks for Puppet Enterprise, and AWS OpsWorks Stacks.
  6. Service Catalog: AWS Service Catalog allows organizations to create and manage catalogs of IT services that are approved for use on AWS. These IT services can include everything from virtual machine images, servers, software, and databases to complete multi-tier application architectures. AWS Service Catalog allows you to centrally manage commonly deployed IT services, and helps you achieve consistent governance and meet your compliance requirements, while enabling users to quickly deploy only the approved IT services they need.
  7. Systems Manager: AWS Systems Manager gives you visibility and control of your infrastructure on AWS. Systems Manager provides a unified user interface so you can view operational data from multiple AWS services and allows you to automate operational tasks across your AWS resources. With Systems Manager, you can group resources, like Amazon EC2 instances, Amazon S3 buckets, or Amazon RDS instances, by application, view operational data for monitoring and troubleshooting, and take action on your groups of resources. Systems Manager simplifies resource and application management, shortens the time to detect and resolve operational problems, and makes it easy to operate and manage your infrastructure securely at scale.
  8. Trusted Advisor: An online resource to help you reduce cost, increase performance, and improve security by optimizing your AWS environment. Trusted Advisor provides real-time guidance to help you provision your resources following AWS best practices.
  9. Managed Services: AWS Managed Services provides ongoing management of your AWS infrastructure so you can focus on your applications. By implementing best practices to maintain your infrastructure, AWS Managed Services helps to reduce your operational overhead and risk. AWS Managed Services automates common activities such as change requests, monitoring, patch management, security, and backup services, and provides full-lifecycle services to provision, run, and support your infrastructure. Our rigor and controls help to enforce your corporate and security infrastructure policies, and enable you to develop solutions and applications using your preferred development approach. AWS Managed Services improves agility, reduces cost, and unburdens you from infrastructure operations so you can direct resources toward differentiating your business.

Amazon Application Integration

  1. Simple Notification Service (SNS): It's a flexible, fully managed pub/sub messaging and mobile notifications service for coordinating the delivery of messages to subscribing endpoints and clients.
    • It's an integrated notification service that allows for sending messages to various endpoints. Generally these messages are used for alert notifications to sysadmins or to drive automation.
    • It's integrated into many AWS services, so it's very easy to set up notifications based on events that occur in those services.
    • With CloudWatch and SNS, a full-environment monitoring solution can be created that notifies administrators of alerts, capacity issues, downtime, changes in the environment, and more.
    • This service can also be used for publishing iOS/Android app notifications and creating automation based on notifications.
    • Topic: A group of subscriptions that you send messages to.
    • Subscription: An endpoint that a message is sent to. Available endpoints are: HTTP, HTTPS, Email, Email-JSON, SQS, Application, Mobile App notification, Lambda, and SMS.
    • Publisher: The entity that triggers the sending of a message, such as a human, an S3 event, or a CloudWatch alarm (see the CLI sketch below).
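    • A minimal AWS CLI sketch of topic, subscription, and publisher (hedged: the topic name, account ID, and email address are illustrative assumptions):
      • aws sns create-topic --name ops-alerts     (returns the topic ARN)
      • aws sns subscribe --topic-arn arn:aws:sns:us-east-1:123456789012:ops-alerts --protocol email --notification-endpoint admin@example.com
      • aws sns publish --topic-arn arn:aws:sns:us-east-1:123456789012:ops-alerts --subject "Disk alert" --message "Disk usage above 90% on web-01"     (the publisher side: deliver to every confirmed subscription)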
  2. Simple Queue Service (SQS): It's a reliable, scalable, fully-managed message queuing service. It provides hosted, highly available queues that can be used for messages being sent between servers.
    • It allows for highly available, distributed, decoupled application architectures. This is accomplished through the use of messages and queues, with messages retrieved by polling.
    • Each message can contain up to 256 KB of text in any format.
    • Polling Types:
      • Long Polling (1-20 secs): Allows the SQS service to wait until a message is available in a queue before sending a response, and returns messages from all SQS servers. Long polling reduces API requests (compared with short polling).
      • Short Polling: SQS samples a subset of servers and returns messages from just those servers, so it will not return all possible messages in a poll. It increases API requests (compared with long polling), which increases costs.
    • Queue types:
      • Standard Queue: Guarantees delivery of each message at least once but does not guarantee the order (best effort) in which they are delivered.
      • First-In-First-Out (FIFO) Queue: Designed for applications where the order of operations and events is critical, or where duplicates can’t be tolerated.
    • SQS Workflow: Generally a “worker” instance will “poll” a queue to retrieve waiting messages for processing. Auto scaling can be applied based on queue size, so that if a component of your application has an increase in demand, the number of worker instances can increase.
    • SQS Message: A set of instructions that will be relayed to the worker instances via the SQS queue. The message can be up to 256 KB of text in any format. Each message is guaranteed to be delivered at least once, but order is not guaranteed and duplicates can occur.
    • SQS Queue: It stores messages (for up to 14 days) that can be retrieved through polling. Queues allow components of your application to work independently of each other (a decoupled environment); see the CLI sketch below.
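    • A minimal AWS CLI sketch of the send/poll/delete cycle (hedged: the queue name, queue URL, and message body are illustrative assumptions):
      • aws sqs create-queue --queue-name work-queue --attributes ReceiveMessageWaitTimeSeconds=20     (long polling enabled with a 20-second wait)
      • aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/work-queue --message-body '{"job":"resize-image","key":"photo.jpg"}'
      • aws sqs receive-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/work-queue --wait-time-seconds 20     (a worker polls for work)
      • aws sqs delete-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/work-queue --receipt-handle <receipt-handle-from-receive-message>     (delete the message once processing succeeds)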
  3. Simple Workflow Service (SWF): It's a fully managed workflow service that coordinates and manages the execution of activities. It manages a specific job from start to finish, while still allowing for a distributed, decoupled architecture.
    • It allows an architect/developer to implement distributed, asynchronous applications as workflows.
    • It has consistent execution, guarantees the order in which tasks are executed, and there are no duplicate tasks.
    • A workflow execution can last up to 1 year.
    • Workflow: A sequence of steps required to perform a task, also commonly referred to as a decider. It coordinates and manages the execution of activities that can be run asynchronously across multiple computing devices.
    • Activities: A single step (or unit of work) in the workflow.
    • Tasks: What interacts with the workers that are part of a workflow.
      • Activity task: Tells the worker to perform a function.
      • Decision task: Tells the decider the state of the workflow execution by communicating (back to the decider) that a given task has been completed, which allows it to determine the next activity to be performed.
    • Worker: Responsible for receiving a task and taking action on it.
      • Can be any type of component such as an EC2 instance or even a person.
  4. Step Functions
  5. Amazon MQ

Amazon Security, Identity & Compliance Services

Amazon Web Services (AWS): AWS is made up of Regions, which are groupings of independently separated data centers in a specific geographic area, known as Availability Zones.

  • Regions: A grouping of AWS resources located in a specific geographic region, designed to serve AWS customers that are located closest to that region. Regions are comprised of multiple Availability Zones.
  • Availability Zone: Geographically isolated zones within a region that house AWS resources. Availability Zones are where the separate, physical AWS data centers are located. Multiple AZs in each region provide redundancy for AWS resources in that region.
  • Data Centers:
  • Edge Locations: AWS data centers which don’t contain the full set of AWS services; instead they are used to deliver content to users around the world.
  1. Identity and Access Management (IAM): AWS Identity and Access Management (IAM) enables you to manage access to AWS services and resources securely. Using IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources. You use IAM to control who is authenticated (signed in) and authorized (has permissions) to use resources.
    • It's where you manage your AWS users and their access to AWS accounts and services.
    • It provides universal and centralized control of your AWS account and shared access to your AWS account; it's global and does not apply to individual regions.
    • Granular permissions and Multi-factor authentication
    • Identity federation (e.g. Active Directory, Facebook, LinkedIn, etc.)
    • Provides temporary access for users/devices and services
    • Allows you to set up your own password rotation policy
    • Users: Individual user accounts; when created they are assigned an Access Key ID and Secret Access Key, which are used to access AWS resources via the CLI and APIs.
    • Groups: Collections of users
    • Policies: Policies are granted to users, groups, and roles to allow them access to AWS services. Policies are attached to users, groups, and roles. You can create your own password rotation policies (see the CLI sketch at the end of this item).
    • Roles: IAM roles are a secure way to grant permissions to entities that you trust. IAM roles issue keys that are valid for short durations, making them a more secure way to grant access. Roles are created and policies are attached onto them to grant access for AWS Services.
      • Roles are more secure than storing your access key and secret key on EC2 instance.
      • You can apply only one role to an EC2 instance at a time and it can be assigned after the instance is provisioned. You can also update the policies applied onto the role anytime.
      • AWS Service role: Allows AWS services to perform actions on your behalf.
      • Another AWS account role: Allows entities in other accounts to perform actions in this account.
      • Web identity role: Allows users federated by the specified external web identity or OpenID Connect (OIDC) provider to assume this role to perform actions in your account.
      • SAML 2.0 federation role: Allows users that are federated with SAML 2.0 to assume this role to perform actions in your account.
    • MFA (Multi-Factor Authentication)
    • ARN (Amazon Resource Names)
    • STS (Security Token Service): Allows you to create temporary security credentials that grant trusted users access to your AWS resources.
    • IAM API Keys: They are required to make programmatic calls to AWS resources via:
      • AWS Command Line Interface (CLI)
      • Tools for Windows PowerShell
      • AWS SDKs
      • Direct HTTP calls using the APIs for individual AWS services
    • Amazon CLI (awscli):
      • Install awscli:
        • pip install awscli --upgrade --user
        • aws configure (Access key ID & Secret access key from the IAM user)
          • AWS Access Key ID:
          • AWS Secret Access Key:                      (~user/.aws/credentials)
          • Default region name: us-east-1          (~user/.aws/config)
          • P.S.
      • S3: (The user/instance must have appropriate S3 Access policy assigned)
        • aws s3 ls [mybucket]
        • aws s3 cp --recursive  s3://mybucket  /tmp  [--region us-east-1]
      • aws ec2 describe-instances | grep -i instanceid
      • aws ec2 terminate-instances --instance-ids <instance-id>
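    • A minimal IAM CLI sketch for users, groups, and policies (hedged: the group name, user name, and managed policy are illustrative assumptions):
      • aws iam create-group --group-name admins
      • aws iam attach-group-policy --group-name admins --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
      • aws iam create-user --user-name alice
      • aws iam add-user-to-group --group-name admins --user-name alice
      • aws iam create-access-key --user-name alice     (keys for programmatic access via aws configure)
      • aws sts get-caller-identity     (confirms which identity the CLI is using)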
  2. Cognito
  3. GuardDuty
  4. Inspector
  5. Amazon Macie
  6. AWS Single Sign-On
  7. Certificate Manager
  8. CloudHSM
  9. Directory Service
  10. WAF & Shield
  11. Artifact

Amazon Storage Services

  1. Simple Storage Service (S3): Amazon Simple Storage Service is object-based storage for the Internet. It is designed to make web-scale computing easier for developers. Amazon S3 has a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of websites. The service aims to maximize benefits of scale and to pass those benefits on to developers.
    • It's object-based, i.e. it allows you to upload files into S3 buckets; it is not suitable for installing an OS or databases on.
    • S3 is a universal namespace, so bucket names must be globally unique and in lower case.
    • Files can be from 0 Bytes to 5 TB in size but S3 storage is unlimited.
    • Read after write consistency for PUTS of new objects
    • Eventual consistency for overwrite of PUTS and DELETES (Can take some time to propagate)
    • You are charged based upon Storage, Requests, Data Transfer, Storage Management Pricing, and Transfer Acceleration.
    • Buckets: A bucket is a unique, global, root-level folder across all of AWS S3 and is managed globally.
      • You can create 100 buckets in a single account.
      • Bucket names must be between 3-63 chars in length, all in lower case
      • By default bucket and all objects in it are private
      • Can be configured to create access logs which log all requests made to S3 bucket. The logs can be stored in the same or to another bucket.
    • Folders: Any subfolder created in a bucket
    • Objects: Files stored or uploaded into a bucket.
      • An object consists of: Key (name), Value (data), VersionID, Metadata, Subresources (ACLs, Torrent)
      • Successfully uploaded files/objects generate an HTTP 200 status code
      • Objects stay in a region and are synced across all availability zones in it
      • Objects are cached for the life of the TTL
    • Permissions:
      • All buckets and objects are private by default
      • Access can be granted by either an IAM policy or a resource-based policy:
      • Resource-based Policy:
        • Bucket Policy: Attached only to the S3 bucket; the permissions apply to all objects in the bucket.
        • S3 ACL: Grants access to other AWS accounts or to the public. Both buckets and objects have ACLs. Object ACLs allow us to share an S3 object via a public URL.
    • Storage Classes: Divided based upon cost, durability, and availability:
      • Standard (Durable, Immediately Available, Frequently Accessed):  Default general and all purpose storage and is most expensive. It supports Availability of 99.99% and Durability of 99.999999999% (11*9’s).
      • Infrequent Access / S3-IA (Durable, Immediately Available, Infrequently Accessed): It has min 30-day retention period and 128KB min object size. Designed for less frequently accessible objects. It has object Availability of 99.90% and Durability of 99.999999999%.
      • Reduced Redundancy Storage/RRS (Not-Durable, Immediately Available, Frequently Accessed): For non-critical reproducible objects such as thumbnails. It has object Availability of 99.99% and Durability of 99.99%.
      • Glacier: Designed for long-term archival storage (not to be used for backups) that’s very rarely accessed; it's the cheapest of all the storage classes.
        • Objects are moved by lifecycle policies from the Standard to the Glacier storage class.
        • It may take from several minutes to hours (3-5 hrs) to retrieve objects.
    • Encryption:
      • Client Side Encryption
      • Server Side Encryption (SSE):
        • SSE-S3 / AES256  (SSE with Amazon S3-Managed Keys)
        • SSE-KMS / AWS-KMS (SSE with AWS KMS-Managed Keys)
        • SSE-C (Customer Provided Keys).
    • Versioning: It’s a feature to manage and store all old/new/deleted versions of objects.
      • Its set on the bucket level and applies to all objects.
      • By default it's disabled; once enabled it can only be suspended, and suspension applies only to newer objects.
      • Older objects' class is automatically changed to Reduced Redundancy.
      • Can be used with life cycle policies to create a great archiving and backup solution in S3.
      • Supports versioning MFA Delete capability.
    • Lifecycle Policies (Object): A set of rules that automate migrating an object's storage class to a different storage class (e.g. from Standard to IA) and deleting objects based on time intervals (see the CLI sketch at the end of this item).
      • By default life cycle policies are disabled on a bucket/object.
      • Can be used with versioning (current and previous versions) to create a great archiving and backup solution in S3.
      • Transition to Standard-IA class (30 days after creation & 128KB min size)
      • Archive to Glacier class (60 days after creation or 30 days after S3-IA)
      • Can be used to permanently delete objects (61 days after creation)
    • Events: S3 event notifications allow you to set up automated communication between S3 and other AWS services when a selected event occurs in an S3 bucket.
      • Common event notification triggers include:
        • RRSObjectLost
        • ObjectCreated (Put, Post, Copy)
        • CompleteMultiPartUpload
      • Event notifications can be sent to SNS, Lambda, or an SQS queue.
    • Static Web Hosting: Amazon S3 provides an option for a low-cost, highly reliable web hosting service for static websites. It's serverless, very cheap, and scales automatically.
      • When enabled it provides you with a unique endpoint URL that you can point to a properly formatted file stored in an S3 bucket. It supports HTML, CSS, and JavaScript.
      • Amazon Route 53 can also map human readable domain names to static web hosting buckets, which are ideal for DNS failover solution.
    • Cross-Region Replication (CRR): Cross-region replication is a bucket-level configuration that enables automatic, asynchronous copying of objects across buckets in different AWS Regions. We refer to these buckets as source bucket and destination bucket. These buckets can be owned by different AWS accounts.
      • Versioning must be enabled on both source and destination bucket
      • Replicating to multiple buckets is not allowed and Regions must be unique
      • Existing files are not replicated automatically from source to destination bucket, but subsequent uploaded/updated files or deleted markers will be replicated automatically.
      • Deleting individual versions or delete markers will not be replicated.
      • Only new objects would be replicated, for existing object you need to copy the objects yourself using cli:
        • pip install awscli --upgrade --user
        • aws configure
        • aws s3 cp --recursive s3://src  s3://dst  (Copy from src to dest bucket)
        • aws s3 ls dst
    • Transfer Acceleration: It utilizes the CloudFront Edge Network to accelerate your uploads to S3. Instead of uploading directly to your S3 bucket, you use a distinct URL to upload to an edge location, which then transfers the file to S3.
    • Cross-Origin Resource Sharing (CORS): It's a method of allowing a web application located in one domain to access and use resources in another domain.
      • This allows web applications running JavaScript and HTML5 to access resources in S3 buckets without using a proxy server.
      • If enabled, then a web app hosted in S3 bucket can access resources in another S3 bucket.
    • File Upload options:
      • Single Operation Upload: The traditional method in which you upload a file in one part. A single operation upload can handle files up to 5 GB, however any file over 100 MB should use multipart upload.
      • Multipart Upload: It allows you to upload a single file as a set of parts and all parts can be uploaded concurrently.
        • After all parts are uploaded, then AWS S3 assemble these parts and create the object.
        • It must be used to upload any file over 5 GB (up to 5 TB), and it is highly suggested when the object size is over 100 MB.
    • AMZ Import/Export Disk: It accelerates moving large amounts of data into and out of the AWS cloud using portable external disks for transport. You mail an on-premises storage device containing your data to an AWS data center, and Amazon imports it into S3 within one business day.
    • Snowball: AWS Snowball is a service used to transfer data into the cloud at faster-than-Internet speeds or harness the power of the AWS Cloud locally using AWS-owned appliances.
      • Snowball: An 80 TB data transfer device that uses secure appliances to transfer data into and out of AWS S3.
      • Snowball Edge: A 100 TB data transfer device with on-board storage and compute capabilities.
      • Snowmobile: An exabyte-scale data transfer service used to move extremely large amounts of data to AWS. You can transfer up to 100 PB per Snowmobile, a 45-foot long ruggedized shipping container pulled by a semi-trailer truck.
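    • A minimal AWS CLI sketch for versioning plus a lifecycle policy (hedged: the bucket name, rule ID, and day thresholds are illustrative assumptions):
      • aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled
      • lifecycle.json contents: {"Rules": [{"ID": "archive-then-expire", "Filter": {"Prefix": ""}, "Status": "Enabled", "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}, {"Days": 60, "StorageClass": "GLACIER"}], "Expiration": {"Days": 365}}]}
      • aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --lifecycle-configuration file://lifecycle.json
      • aws s3api get-bucket-lifecycle-configuration --bucket my-bucket     (verify the rule was applied)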
  2. Elastic File System (EFS): Amazon EFS provides file-based storage for use with your EC2 instances. EFS storage capacity is elastic, growing and shrinking automatically as you add and remove files.
    • It can be mounted and shared among multiple EC2 instances.
    • It supports Network File System v4 (NFSv4) protocol and data is stored across multiple AZs within a region
    • You only pay for the storage you use (unlike EBS, with EFS no pre-provisioning required).
    • It can scale up to petabytes and can support thousands of concurrent NFS connections, and provide read after write consistency.
    • Amazon EBS is designed for application workloads that benefit from fine tuning for performance, cost and capacity.
    • Typical use cases include Big Data analytics engines (like the Hadoop/HDFS ecosystem and Amazon EMR clusters), relational and NoSQL databases (like Microsoft SQL Server and MySQL or Cassandra and MongoDB), stream and log processing applications (like Kafka and Splunk), and data warehousing applications (like Vertica and Teradata).
    • Mount target: Instances connect to a file system by using a network interface called a mount target. Each mount target has an IP address, which AWS assigns automatically or you can specify (see the mount sketch at the end of this item).
      • P.S. You must assign the default security group to the instance to successfully connect to the EFS mount target from the instance.
    • File Syncs: EFS File Sync provides a fast and simple way for you to securely sync data from existing on-premises or in-cloud file systems into Amazon EFS file systems.
      • To use EFS File Sync, download and deploy a File Sync agent into your IT environment, configure the source and destination file systems, and start the sync. Monitor progress of the sync in the EFS console or using AWS CloudWatch.
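    • A minimal sketch for creating and mounting a file system (hedged: the file-system ID, subnet, security group, region, and mount path are illustrative assumptions):
      • aws efs create-file-system --creation-token demo-efs     (returns the FileSystemId)
      • aws efs create-mount-target --file-system-id fs-12345678 --subnet-id subnet-0abc1234 --security-groups sg-0abc1234
      • sudo mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs     (run on the EC2 instance; NFSv4 mount of the file system)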
  3. Glacier: Designed for long-term archival storage (not to be used for backups) that’s very rarely accessed. Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup. It is designed to deliver 99.999999999% durability, and provides comprehensive security and compliance capabilities that can help meet even the most stringent regulatory requirements. Amazon Glacier provides query-in-place functionality, allowing you to run powerful analytics directly on your archive data at rest. Customers can store data for as little as $0.004 per gigabyte per month, a significant savings compared to on-premises solutions. To keep costs low yet suitable for varying retrieval needs, Amazon Glacier provides three options for access to archives, from a few minutes to several hours.
    • Objects are moved by lifecycle policies from the Standard to the Glacier storage class.
    • It may take from several minutes to hours to retrieve objects, and it is the cheapest among all storage classes.
  4. Storage Gateway: It's a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization’s on-premises IT environment and AWS’s storage infrastructure. The service enables you to securely store data in the AWS cloud for scalable and cost-effective storage.
    • Connects local data center software appliances to cloud based storage such AWS S3.
    • It's a software appliance available for download as a VM image that you install on a host in your data center. It supports either VMware ESXi or Microsoft Hyper-V.
    • Once it's installed and associated with your AWS account through the activation process, you can use the AWS Management Console to create the storage gateway option that fits your requirements.
    • Types of Storage Gateway are as:
    • File Gateway (NFS): Used for flat files which are stored directly in S3 buckets, and are accessed through a Network File System (NFS) mount point.
    • Volume Gateway (iSCSI): The volume interface presents your applications with disk volumes using the iSCSI protocol. It's like a virtual hard disk and is based on block storage.
      • Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots.
      • Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize storage cost.
      • Stored Volumes:  Entire data is stored on-premises storage devices and asynchronously backed up to S3 as incremental snapshots. 1 GB to 16 TB volume size.
      • Cached Volumes: Entire data is stored on S3 and the most frequently accessed data is cached on-premise storage devices. 1 GB to 32 TB in size for Cached Volumes.
    • Tape Gateway/Virtual Tape Library (VTL): Used for backup and uses popular backup applications like NetBackup, Backup Exec, Veeam etc.

Amazon Database Services

  1. Relational Database Service (RDS/SQL Databases): RDS provides you with multiple options for hosting a fully-managed relational database on AWS. RDS provides many advantages over hosting your own database server, including automated backups, Multi-AZ failover, and read replicas. It supports the following database engines:
    • It's an Online Transaction Processing (OLTP), SQL database service that offers six database engines: Amazon Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server.
    • It doesn't allow access to the underlying operating system and it's managed by Amazon (fully-managed).
    • It has the ability to provision/resize hardware on demand for scaling.
    • You can enable Multi-AZ deployments for backups and highly available solutions.
    • Utilize Read Replicas (MySQL/PostgreSQL/Aurora) to help offload hits on primary database.
    • Automated minor updates and backups, and recovery in event of a failover.
    • When an RDS (DB) instance is deleted, all automated backups, system snapshots, and point-in-time recovery are removed.
    • Multi-AZ Failover: Synchronous replication from the production (primary) database to a standby database (in another AZ), used for Disaster Recovery (DR)/failover.
      • In the event of a primary database outage, AWS will automatically fail over to the standby database by switching the CNAME DNS record from the primary to the standby instance.
      • In order for Multi-AZ to work, your primary database instance must be launched into a subnet group.
      • RDS backups are taken against the standby instance to reduce I/O freeze.
      • For Aurora, Multi-AZ is turned-on by default but for other AWS supported database types you have to turn it on explicitly.
    • RDS Read Replicas: Asynchronous replication from the production database (master) to a read replica (in another AZ or region), used to improve read performance (see the CLI sketch at the end of this item).
      • You can have 5 read replicas per production database by default; read replicas are currently not supported for Oracle and SQL Server.
      • Must have automated backup turned on in order to deploy a read replica.
      • Additional read replicas can be created from an existing replica, so you can have read replicas of read replicas.
      • You can promote a read replica to a standalone database.
      • Each read replica has its own DNS endpoint.
      • You can have read replica in another region.
      • You can have read replicas that have Multi-AZ enabled.
      • CloudWatch can be used to monitor replication lag.
      • They should be used for high-volume, non-cached database read traffic (elasticity), running data warehousing workloads, rebuilding indexes, and importing/exporting data into RDS.
    • Backups:
      • Automated Backups: AWS provides automated point-in-time backups of the RDS DB instance. Automated backups are enabled by default and are taken within a defined window of time. The backup data is stored in S3, and you get free storage space equal to the size of the DB.
        • They allows you to recover your database to any point in time within a retention period of 1 to 35 days.
        • When RDS (db) instance is deleted, then all automated backups are removed.
        • They take a full daily snapshot and also store transaction logs throughout the day. When you do a recovery, AWS first restores the most recent daily backup and then applies the transaction logs relevant to that day. This allows point-in-time recovery down to a second.
        • Backups on db engines work correctly when db engine is transactional. MySQL requires InnoDB for reliable backups.
      • Database Snapshots: DB Snapshots are done manually by the user. They are stored even after you delete the original RDS instance, unlike automated db backups.
        • Whenever you restore either Automatic Backup or a manual Snapshot, the restored version of the database will be a new RDS instance with a new DNS endpoint.
        •  I/O operations are suspended for the duration of the snapshot.
    • Encryption: Encryption at rest is done by the AWS Key Management Service (KMS) and is supported by all database types in AWS. Once your RDS instance is encrypted, the data at rest in the underlying storage is encrypted, as are its automated backups, snapshots, and read replicas.
      • At present, encrypting an existing DB instance is not supported, but you can create a snapshot of it, copy that snapshot, and encrypt the copy.
    • Aurora:
      • It's a MySQL- and PostgreSQL-compatible, enterprise-class relational database engine that provides up to 5 times the throughput of MySQL and 3 times the throughput of PostgreSQL at 1/10th the cost of commercial databases, and it offers greater than 99.99% availability.
      • Starts with 10 GB and scales up to 64 TB of auto-scaling SSD storage (in 10 GB increments).
      • Compute resources can scale up to 32 vCPUs and 244 GB of memory (during the maintenance window).
      • 6-way replication across three Availability Zones (2 copies of the data stored in each AZ).
      • Up to 15 Aurora Read Replicas and up to 5 MySQL Read Replicas, with sub-10 ms replica lag.
      • Automatic monitoring and failover in less than 30 seconds; during failover Amazon RDS will promote the replica with the highest priority to primary. Priority tier logic: tier-0 > .. > tier-15.
      • Designed to transparently handle the loss of up to 2 copies of data without affecting write availability and up to 3 copies without affecting read availability.
      • Aurora storage is self-healing. Data blocks and disks are continuously scanned for errors and repaired automatically.
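    • A minimal AWS CLI sketch for a Multi-AZ instance and a read replica (hedged: the identifiers, engine, instance class, and credentials are illustrative assumptions):
      • aws rds create-db-instance --db-instance-identifier mydb --engine mysql --db-instance-class db.t2.micro --allocated-storage 20 --master-username admin --master-user-password 'ChangeMe123!' --multi-az --backup-retention-period 7     (Multi-AZ standby plus automated backups, which a read replica requires)
      • aws rds create-db-instance-read-replica --db-instance-identifier mydb-replica --source-db-instance-identifier mydb     (asynchronous replica with its own DNS endpoint)
      • aws rds describe-db-instances --db-instance-identifier mydb-replica --query 'DBInstances[0].Endpoint'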
  2. DynamoDB (NoSQL databases): Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and many other applications.
    • It's a fully managed NoSQL database service, similar to MongoDB, but is a home-grown AWS solution.
    • DynamoDB is a schema-less database that only requires a table name and primary key. The table’s primary key is made up of one or two attributes that uniquely identify items, partition the data, and sort data within each partition.
    • Stored on SSD storage and spread across 3 geographically distinct data centers.
    • You specify the required throughput capacity and DynamoDB does the rest being fully-managed. Service manages all provisioning and scaling of underlying hardware. Fully distributed and scales automatically with demand and growth.
    • It easily integrates with other AWS services such as Elastic MapReduce, and you can easily move data to a Hadoop cluster in Elastic MapReduce.
    • It supports both document and key value data models. Popular use cases include:
      • IoT (storing meta data)
      • Gaming (storing session information, leaderboards)
      • Mobile (Storing user profiles, personalization)
    • Eventually Consistent Reads: Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data. (Best read performance)
    • Strongly Consistent Reads: A strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.
    • Reserved Capacity: A billing feature that allows you to obtain discounts on your provisioned throughput capacity in exchange for a one-time up-front payment and a commitment to a minimum monthly usage level. It applies to a single AWS region and can be purchased with a 1-year or 3-year term.
    • Default settings when creating a table (see the CLI sketch below):
      • No secondary indexes.
      • Provisioned capacity set to 5 reads and 5 writes.
      • Basic alarms with 80% upper threshold using SNS topic “dynamodb”.
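    • A minimal AWS CLI sketch using the default 5/5 provisioned throughput (hedged: the table name, key attribute, and item values are illustrative assumptions):
      • aws dynamodb create-table --table-name GameSessions --attribute-definitions AttributeName=UserId,AttributeType=S --key-schema AttributeName=UserId,KeyType=HASH --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
      • aws dynamodb put-item --table-name GameSessions --item '{"UserId": {"S": "user-42"}, "Score": {"N": "1200"}}'
      • aws dynamodb get-item --table-name GameSessions --key '{"UserId": {"S": "user-42"}}' --consistent-read     (strongly consistent read; omit the flag for the default eventually consistent read)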
  3. ElastiCache: It's a web service that makes it easier to launch, manage, and scale a distributed in-memory cache in the cloud, and is powered by the Memcached and Redis engines.
    • It's a fully managed in-memory cache engine used to improve performance by caching the results of DB queries.
    • The service improves the performance of web apps by allowing you to retrieve information from fast, managed, in-memory caches instead of relying entirely on slower disk-based databases.
    • It's designed for large, high-performance or taxing queries; it can store the query results to alleviate hits on the database.
    • It allows for managing web sessions and also caching dynamically generated data.
    • Memcached: A widely adopted memory object caching system. ElastiCache is protocol compliant with Memcached, so popular tools that you use today with existing Memcached environments will work seamlessly with the service.
    • Redis: A popular open-source in-memory key-value store that supports data structures such as sorted sets and lists. ElastiCache supports Master/Slave replication and Multi-AZ, which can be used to achieve cross-AZ redundancy (see the CLI sketch below).
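    • A minimal AWS CLI sketch for a single-node Redis cluster (hedged: the cluster ID, node type, and endpoint are illustrative assumptions):
      • aws elasticache create-cache-cluster --cache-cluster-id demo-redis --engine redis --cache-node-type cache.t2.micro --num-cache-nodes 1
      • aws elasticache describe-cache-clusters --cache-cluster-id demo-redis --show-cache-node-info     (returns the cache endpoint)
      • redis-cli -h <cache-endpoint> -p 6379 SET session:42 "cached-query-result"     (an application node caches a result instead of re-querying the database)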
  4. Amazon Redshift: Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse (Online Analytical Processing, OLAP) service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools.
    • Generally used for big-data analytics; it can integrate with most popular business intelligence tools, including Jaspersoft, MicroStrategy, Pentaho, Tableau, Business Objects, and Cognos.
    • Single Node: can have up to 160 GB.
    • Multi-Node:
      • Leader Node: Manages client connections and receives the queries.
      • Compute Node: Up to 128 compute nodes
    • Columnar Data Storage: Instead of storing data as a series of rows, Amazon Redshift organizes the data by column. Unlike row-based systems, which are ideal for transaction processing, column-based systems are ideal for data warehousing and analytics, where queries often involve aggregates performed over large data sets.
    • Advanced Compression: Columnar data stores can be compressed much more than row-based data stores because similar data is stored sequentially on disk. It does not require indexes or materialized views, so uses less space than traditional relational database systems.
    • Massively Parallel Processing (MPP): Amazon Redshift automatically distributes data and query load across all nodes.
    • Security: Encrypted in transit using SSL and at rest using AES-256.
    • Currently only available in one AZ, but snapshots can be restored to a new AZ in the event of an outage (see the CLI sketch below).
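    • A minimal AWS CLI sketch for a two-node cluster (hedged: the cluster identifier, node type, database name, and credentials are illustrative assumptions):
      • aws redshift create-cluster --cluster-identifier demo-dw --node-type dc2.large --number-of-nodes 2 --db-name analytics --master-username admin --master-user-password 'ChangeMe123!'     (multi-node: one leader plus compute nodes)
      • aws redshift describe-clusters --cluster-identifier demo-dw --query 'Clusters[0].Endpoint'     (connect BI tools to this endpoint)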