AWS uncommon services
SES
- fully managed service to send emails securely, globally and at scale
- allows inbound/outbound emails
- use cases: transactional, marketing and bulk email communications
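As a rough illustration of the outbound (transactional) case, the sketch below builds the keyword arguments for the SES `send_email` call in boto3. The helper name `build_ses_email` and the addresses are hypothetical. The actual send is left commented out and assumes boto3 is installed, credentials are configured, and the sender address is verified in SES.

```python
def build_ses_email(sender, recipients, subject, body_text):
    """Build keyword arguments for the boto3 SES client's send_email call."""
    if not recipients:
        raise ValueError("at least one recipient is required")
    return {
        "Source": sender,
        "Destination": {"ToAddresses": list(recipients)},
        "Message": {
            "Subject": {"Data": subject, "Charset": "UTF-8"},
            "Body": {"Text": {"Data": body_text, "Charset": "UTF-8"}},
        },
    }


# Hypothetical addresses, used only for illustration.
params = build_ses_email(
    "noreply@example.com",
    ["user@example.com"],
    "Order confirmed",
    "Thanks for your order.",
)

# Sending is commented out so the sketch runs without AWS access:
# import boto3
# boto3.client("ses").send_email(**params)
```

Keeping the payload construction separate from the send makes it easy to inspect or unit-test the message before anything leaves the account.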
Pinpoint
- scalable 2-way (inbound/outbound) marketing communication service
- supports email, SMS, push, voice and in-app messaging
- use cases: run campaigns by sending marketing, bulk and transactional SMS messages
- SNS/SES vs Pinpoint
- SNS/SES: need to manage each message's audience, content and delivery schedule
- Pinpoint: create message templates, delivery schedule, segments/groups and full campaigns
SSM Session Manager
- provides secure shell access to EC2 instances and on-premises servers
- no SSH access, no bastion hosts, no SSH keys needed
- no inbound port 22 needed (no inbound security group rule is required)
- for an existing EC2 instance, or when starting a new EC2 instance:
- assign SSM permissions to the instance role (for example, the AmazonEC2RoleforSSM policy; AWS now recommends AmazonSSMManagedInstanceCore in its place)
- in the EC2 instance's Connect dialog, you can use Session Manager
- or in Systems Manager => Fleet Manager: you can see the EC2 instance
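The role setup above could look roughly like the sketch below. It builds the trust policy that lets EC2 assume the role and names the managed policy to attach. It uses AmazonSSMManagedInstanceCore, the managed policy AWS currently recommends in place of AmazonEC2RoleforSSM. The role name `demo-ssm-role` is hypothetical, and the IAM calls are commented out because they need boto3 and credentials.

```python
import json

# Managed policy AWS currently recommends for Session Manager access.
SSM_CORE_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"


def ec2_trust_policy():
    """Trust policy that allows EC2 instances to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


trust_policy_json = json.dumps(ec2_trust_policy(), indent=2)

# Creating the role is commented out so the sketch runs offline.
# "demo-ssm-role" is a made-up name.
# import boto3
# iam = boto3.client("iam")
# iam.create_role(RoleName="demo-ssm-role",
#                 AssumeRolePolicyDocument=trust_policy_json)
# iam.attach_role_policy(RoleName="demo-ssm-role",
#                        PolicyArn=SSM_CORE_POLICY_ARN)
```

The role still has to be wrapped in an instance profile and attached to the instance before Session Manager can reach it.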
AWS Batch
- a fully managed batch computing service that plans, schedules, and runs your batch computing workloads at any scale
- jobs are packaged as Docker images and run on ECS, EKS or Fargate
- AWS Batch vs Lambda
- Lambda:
- time limit (15 minutes per invocation)
- limited set of runtimes
- limited temporary disk space
- serverless
- Batch
- no time limit
- any runtime as long as it's packaged as a docker image
- rely on EBS/instance store for disk space
- relies on EC2 (ECS)
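The comparison above can be turned into a small decision helper. This is a toy heuristic, not an AWS API: the function name `suggest_compute` and its parameters are made up, and the limits are Lambda's maximums at the time of writing (15-minute timeout, 10,240 MB of configurable ephemeral storage). Check current quotas before relying on them.

```python
# Lambda limits at the time of writing; verify against current AWS quotas.
LAMBDA_MAX_TIMEOUT_SECONDS = 15 * 60
LAMBDA_MAX_EPHEMERAL_MB = 10_240


def suggest_compute(expected_seconds, scratch_disk_mb=0):
    """Suggest 'lambda' or 'batch' for a job, based on duration and disk needs."""
    if expected_seconds > LAMBDA_MAX_TIMEOUT_SECONDS:
        return "batch"  # job would exceed Lambda's time limit
    if scratch_disk_mb > LAMBDA_MAX_EPHEMERAL_MB:
        return "batch"  # job needs more temporary disk than Lambda offers
    return "lambda"


short_job = suggest_compute(expected_seconds=60)
long_job = suggest_compute(expected_seconds=3 * 3600)
disk_heavy_job = suggest_compute(expected_seconds=60, scratch_disk_mb=50_000)
```

Real decisions usually weigh cost, startup latency and concurrency as well, so treat this only as a first filter.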
Amazon Rekognition
Amazon Rekognition is a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications.
- it can search, verify, and organize millions of images and videos
- you can proactively detect inappropriate, unwanted or offensive content
- you can define an interface VPC endpoint for Amazon Rekognition, which communicates with resources in your VPC without going through the public internet
- related services:
- Amazon SageMaker: a fully managed machine learning service that lets you prepare, build, train, and deploy high-quality ML models efficiently.
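For the content moderation use case mentioned above, a minimal sketch might build the request for Rekognition's `detect_moderation_labels` call in boto3 and filter the labels that come back. The helper names, the bucket and key, and the 80% threshold are all hypothetical. The sample response is hand-written in the documented response shape and is not real service output.

```python
def build_moderation_request(bucket, key, min_confidence=60.0):
    """Build keyword arguments for Rekognition's detect_moderation_labels."""
    return {
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        "MinConfidence": min_confidence,
    }


def flagged_labels(response, threshold=80.0):
    """Return the names of moderation labels at or above the threshold."""
    return sorted(
        label["Name"]
        for label in response.get("ModerationLabels", [])
        if label["Confidence"] >= threshold
    )


request = build_moderation_request("my-example-bucket", "uploads/photo.jpg")

# Hand-written sample in the documented response shape (not real output).
sample_response = {
    "ModerationLabels": [
        {"Name": "Violence", "Confidence": 91.2, "ParentName": ""},
        {"Name": "Alcohol", "Confidence": 62.5, "ParentName": ""},
    ]
}
flagged = flagged_labels(sample_response)

# The real call is commented out so the sketch runs offline:
# import boto3
# response = boto3.client("rekognition").detect_moderation_labels(**request)
```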
Amazon SWF (Simple Workflow Service)
- provides a way to build, run and scale background jobs that have parallel or sequential steps
- you can coordinate work across distributed components, tracking the state of tasks (which can run either on AWS or on-premises)
- Services that can be used to create a decoupled architecture for applications that use both AWS and on-premises resources:
- Amazon SWF (Simple Workflow)
- SQS/SNS
- ELB (in the target group, the IP Address target type supports load balancing to VPC and on-premises resources)
Amazon EMR
- EMR: Elastic Map Reduce
- a managed cluster platform that simplifies running big data frameworks, such as Apache Hadoop and Apache Spark, on AWS to process and analyze vast amounts of data
- It securely and reliably handles a broad set of big data use cases, including log analysis, web indexing, data transformations (ETL), machine learning, financial analysis, scientific simulation and bioinformatics.
Amazon Redshift
- a fully managed cloud data warehouse (AWS describes it as the most widely used one)
- a fast, simple and cost-effective way to analyze data using standard SQL and existing Business Intelligence (BI) tools
- allows you to run complex analytic queries against terabytes to petabytes of structured and semi-structured data, using sophisticated query optimization, columnar storage on high-performance storage and massively parallel query execution.
- Redshift Spectrum: using Amazon Redshift Spectrum, you can efficiently query and retrieve structured and semi-structured data from files in Amazon S3 without having to load the data into Amazon Redshift tables.
Storage Gateway
AWS Storage Gateway gives your on-premises and in-cloud applications access to virtually unlimited cloud storage. You can deploy it as a virtual machine (VM) on VMware or Hyper-V, or as an EC2 instance within a VPC.
Storage Gateway types:
- File Gateway
- Amazon FSx File Gateway
- Tape Gateway
- Volume Gateway
For example, you can use a Tape Gateway to replace physical tapes on-premises with virtual tapes in AWS without changing existing backup workflows.
AWS Glue
- A serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources.
- With AWS Glue, you can discover and connect to more than 70 diverse data sources and manage your data in a centralized data catalog.
- You can visually create, run, and monitor extract, transform, and load (ETL) pipelines to load data into your data lakes.
- You can immediately search and query cataloged data using Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
- Features:
- Discover and organize data
- Transform, prepare, and clean data for analysis
- Build and monitor data pipelines
- Job bookmarks:
- AWS Glue tracks data that has already been processed during a previous run of an ETL job by persisting state information from the job run. This persisted state information is called a job bookmark.
- Job bookmarks help AWS Glue maintain state information and prevent the reprocessing of old data. With job bookmarks, you can process new data when rerunning on a scheduled interval.
- A job bookmark is composed of the states for various elements of jobs, such as sources, transformations, and targets. For example, your ETL job might read new partitions in an Amazon S3 file. AWS Glue tracks which partitions the job has processed successfully to prevent duplicate processing and duplicate data in the job's target data store.
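The job bookmark idea can be illustrated with a toy model in plain Python. This is not Glue's implementation or API: the `ToyBookmark` class, the `run_job` function and the partition names are all made up. It only shows the pattern of persisting which partitions were processed so that a rerun picks up just the new ones.

```python
import json
import tempfile
from pathlib import Path


class ToyBookmark:
    """Persists the set of partitions a job has already processed."""

    def __init__(self, path):
        self.path = Path(path)
        self.processed = set()
        if self.path.exists():
            self.processed = set(json.loads(self.path.read_text()))

    def new_partitions(self, available):
        """Return the partitions not seen in any previous run, in order."""
        return [p for p in available if p not in self.processed]

    def commit(self, partitions):
        """Record partitions as processed and persist the state."""
        self.processed.update(partitions)
        self.path.write_text(json.dumps(sorted(self.processed)))


def run_job(bookmark_path, available_partitions):
    bookmark = ToyBookmark(bookmark_path)
    todo = bookmark.new_partitions(available_partitions)
    # The real transform/load work would happen here.
    bookmark.commit(todo)  # commit only after processing succeeded
    return todo


state_file = Path(tempfile.mkdtemp()) / "bookmark.json"
first_run = run_job(state_file, ["dt=2024-01-01", "dt=2024-01-02"])
second_run = run_job(
    state_file, ["dt=2024-01-01", "dt=2024-01-02", "dt=2024-01-03"]
)
```

The second run returns only the partition added since the first run. In Glue itself, bookmarks are switched on with the `--job-bookmark-option` job parameter (for example `job-bookmark-enable`) rather than managed by hand like this.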
Amazon Transcribe
Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text (audio => text).
You can use Amazon Transcribe as a standalone transcription service or to add speech-to-text capabilities to any application.
You can transcribe media in real time (streaming) or you can transcribe media files located in an Amazon S3 bucket (batch).
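For the batch case, a minimal sketch could build the keyword arguments for the boto3 `start_transcription_job` call. The helper name `build_transcription_job`, the job name and the S3 URI are hypothetical, and restricting input to `s3://` URIs is a simplification made by this helper. The actual call is commented out because it needs boto3, credentials and a real media file.

```python
def build_transcription_job(job_name, media_uri, language_code="en-US"):
    """Build keyword arguments for Transcribe's start_transcription_job."""
    # Simplification in this sketch: only accept s3:// style URIs.
    if not media_uri.startswith("s3://"):
        raise ValueError("batch transcription expects media stored in Amazon S3")
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": language_code,
    }


# Hypothetical job name and bucket, used only for illustration.
job = build_transcription_job(
    "demo-call-2024-01", "s3://my-example-bucket/calls/call-001.mp3"
)

# Starting the job is commented out so the sketch runs offline:
# import boto3
# boto3.client("transcribe").start_transcription_job(**job)
```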