marketplace solution in a new VPC, AWS CloudFormation templates create and connect the following resources in the VPC:
Informatica domain on an EC2 instance, with additional instances to contain nodes in the Data Integration Service grid
Informatica clients on a remote Windows bastion server, on a public subnet
Amazon S3 storage resources and connections for source and target data in existing S3 buckets
Amazon RDS relational databases for Informatica domain repositories
AWS security and account management services
Lambda functions
The following diagram shows the architecture of Data Quality on AWS:
The numbers in the architecture diagram correspond to items in the following list:
A virtual private cloud (VPC) configured across two Availability Zones to contain the Data Quality deployment.
Availability Zones. This deployment provisions two private subnets and one public subnet.
Subnets to contain specific elements of the deployment. The deployment creates two private subnets, plus one public subnet if you want to use a Windows bastion server for Informatica clients. Each subnet is created in a different Availability Zone.
The Informatica domain, including the Model Repository Service and the Data Integration Service.
Oracle database to contain the following Informatica repositories:
Domain configuration repository
Model repository, which stores all the metadata for projects created using Informatica client tools. The Model repository also stores run-time and configuration information for applications that are deployed to a Data Integration Service.
S3 storage, to act as a temporary location for files that the Data Integration Service moves between EC2 instances and the EMR cluster.
AWS Lambda functions.
IAM roles.
Amazon CloudWatch.
Informatica clients on a separate EC2 bastion server in a public subnet.
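The network layout in the list above can be sketched as a small CloudFormation template built in Python. This is only an illustration, not the template that the marketplace solution ships. The function name `build_network_template`, the logical IDs (`Vpc`, `PrivateSubnet1`, `PrivateSubnet2`, `PublicSubnet`), the CIDR ranges, and the choice of Availability Zone for the public subnet are all assumptions made for the example.

```python
import json


def build_network_template(include_public_subnet: bool = True) -> dict:
    """Build a minimal CloudFormation template for the described subnet layout.

    Hypothetical sketch: logical IDs and CIDR ranges are illustrative only.
    """

    def subnet(cidr: str, az_index: int, public: bool) -> dict:
        # Fn::GetAZs with an empty string lists the AZs of the stack's region;
        # Fn::Select picks one of them by index.
        return {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "Vpc"},
                "CidrBlock": cidr,
                "AvailabilityZone": {
                    "Fn::Select": [az_index, {"Fn::GetAZs": ""}]
                },
                "MapPublicIpOnLaunch": public,
            },
        }

    resources = {
        "Vpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": "10.0.0.0/16"},
        },
        # Two private subnets, one per Availability Zone.
        "PrivateSubnet1": subnet("10.0.1.0/24", 0, False),
        "PrivateSubnet2": subnet("10.0.2.0/24", 1, False),
    }

    # The public subnet is only needed when a Windows bastion server
    # hosts the Informatica clients. Its AZ index is an assumption.
    if include_public_subnet:
        resources["PublicSubnet"] = subnet("10.0.0.0/24", 0, True)

    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


if __name__ == "__main__":
    print(json.dumps(build_network_template(), indent=2))
```

Calling `build_network_template(include_public_subnet=False)` omits the public subnet, which mirrors the case where no bastion server is used for the Informatica clients.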