Herd-MDL Advanced Install¶
Use these Advanced Installation instructions if your organization requires that certain AWS resources such as IAM Roles, Security Groups, etc. are created outside Herd-MDL automated install. The Advanced Install allows for optional creation of AWS resources through other mechanisms and provides detailed specifications on what to create and how to reference resources created outside Herd-MDL automated install.
See Basic Install for an easy, turnkey installation of Herd-MDL.
Prerequisites¶
The following prerequisites must be in place before installing MDL components with the Advanced Installation type:
- An AWS account
- A user with power user access, as defined by the policy arn:aws:iam::aws:policy/PowerUserAccess
- The MDL deployment creates various AWS resources, such as CloudFormation, EC2, IAM, Security Groups, and S3, and power user access is needed to create them
- Sample IAM policy for PowerUserAccess
- Domain - if (EnableSSLAndAuth == true)
- The user account needs to own a domain to use with the Route53 record sets for the public endpoints created by MDL
- Refer to the AWS documentation for setting up a new domain
- The domain must be owned by the user when the "EnableSSLAndAuth" parameter is set to true
- Certificate in ACM with Wildcard Domain Name - if (EnableSSLAndAuth == true)
- The Certificate should match any first level subdomain (Wildcard character supports this)
- Format - *.domain
- Example: *.example.com
- MDL prefixes the corresponding first-level subdomain (Example: mdlHerd.example.com, mdlShepherd.example.com, and mdlBdsql.example.com)
- The certificate must exist in ACM in the AWS account when the "EnableSSLAndAuth" parameter is set to true
- Hosted Zone in Route53 - if (EnableSSLAndAuth == true)
- MDL adds record sets to the Hosted Zone in Route53 to associate the DNS information with the domain name
- MDL creates three record sets (Example: mdlHerd.example.com, mdlShepherd.example.com, and mdlBdsql.example.com)
- The Hosted Zone must exist in the AWS account when the "EnableSSLAndAuth" parameter is set to true
- Some conditional parameters let the user use existing resources instead of having MDL create them. In these cases, the corresponding SSM parameters must be created before creating the MDL stack. The conditional resource types in MDL are as follows.
- VPC/Subnets - if (CreateVPC == false)
- Refer to the CreateVPC section and create the SSM parameters if an existing VPC and subnets are to be used
- S3 - if (CreateS3Buckets == false)
- Refer to the CreateS3Buckets section and create the SSM parameters if existing S3 buckets are to be used
- IAM - if (CreateIAMRoles == false)
- Refer to the CreateIAMRoles section and create the SSM parameters if existing IAM roles are to be used
- RDS - if (CreateRDSInstances == false)
- Refer to the CreateRDSInstances section and create the SSM parameters if existing RDS instances are to be used
- Security Groups - if (CreateSecurityGroups == false)
- Refer to the CreateSecurityGroups section and create the SSM parameters if existing Security Groups are to be used
- SQS - if (CreateSQS == false)
- Refer to the CreateSQS section and create the SSM parameters if existing SQS queues are to be used
- OpenLDAP - if (CreateOpenLDAP == false)
- Refer to the CreateOpenLDAP section and create the SSM parameters if an existing OpenLDAP server is to be used
- KeyPair - if (CreateKeypair == false)
- Refer to the CreateKeypair section and create the SSM parameters if an existing KeyPair is to be used
- CloudFront Distribution - if (CreateCloudFrontDistribution == false)
- Refer to the CreateCloudFrontDistribution section and create the SSM parameters if an existing CloudFront distribution is to be used
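As a rough illustration of the checklist above, the following Python sketch maps each conditional parameter to the resource type it governs and reports which resource types need SSM parameters created beforehand. The function name `resources_needing_ssm` and the dictionary are hypothetical helpers, not part of Herd-MDL; the parameter names and their default of true are taken from the CFT Specifications on this page.

```python
# Conditional parameters from this page, mapped to the resource type each governs.
CONDITIONAL_RESOURCES = {
    "CreateVPC": "VPC/Subnets",
    "CreateS3Buckets": "S3",
    "CreateIAMRoles": "IAM",
    "CreateRDSInstances": "RDS",
    "CreateSecurityGroups": "Security Groups",
    "CreateSQS": "SQS",
    "CreateOpenLDAP": "OpenLDAP",
    "CreateKeypair": "KeyPair",
    "CreateCloudFrontDistribution": "CloudFront Distribution",
}


def resources_needing_ssm(stack_params):
    """Return resource types whose SSM parameters must exist before stack creation.

    stack_params maps parameter names to "true"/"false" strings; parameters
    that are not supplied fall back to the documented default of "true".
    """
    return [
        resource
        for name, resource in CONDITIONAL_RESOURCES.items()
        if stack_params.get(name, "true").lower() == "false"
    ]


if __name__ == "__main__":
    chosen = {"CreateVPC": "false", "CreateIAMRoles": "false", "CreateSQS": "true"}
    for resource in resources_needing_ssm(chosen):
        print(f"Create SSM parameters for existing {resource} before installing")
```

The names of the SSM parameters themselves are listed in the corresponding Create* sections and are deliberately not reproduced in this sketch.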
Steps¶
Installation is automated through CloudFormation templates in AWS. The stack creates all the resources required by the MDL application, which takes roughly a couple of hours. A stack can be created using the AWS console, the AWS CLI, or an AWS SDK. Refer to the AWS documentation for creating stacks from CloudFormation templates. This section describes the steps for creating the stack using the AWS console.
- Download the attached installMDL.yml file to the local file system
- Log in to the AWS console and navigate to CloudFormation
- Create the stack using the option "Upload a template to Amazon S3" (refer to the AWS documentation for selecting a local template)
- Choose the installMDL.yml file from the local file system
- On the next page:
- Enter a value for Stack Name
- A stack name can contain only alphanumeric characters (case-sensitive) and hyphens. It must start with an alphabetic character and can't be longer than 128 characters.
- Refer to MDL CFT Specifications and change the required parameters for the chosen installation type
- On the next page, specify the stack options as per the AWS documentation
- Review the parameters, and create the stack as per the AWS documentation
- Wait for "CREATE_COMPLETE" on the stack and all nested stacks.
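The console steps above can also be scripted. The sketch below is one possible approach in Python, assuming the boto3 SDK is installed and AWS credentials are configured. The helper names, the `Capabilities` list, and the waiter timings are assumptions rather than anything prescribed by Herd-MDL. It checks the stack name against the rule quoted above, converts a dictionary into the parameter list shape that CloudFormation expects, and waits for creation to finish.

```python
import re

# Stack name rule quoted in the steps above: alphanumerics and hyphens,
# starting with a letter, at most 128 characters.
STACK_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,127}")


def is_valid_stack_name(name):
    """Return True if name satisfies the documented stack name rule."""
    return STACK_NAME_RE.fullmatch(name) is not None


def to_cfn_parameters(values):
    """Convert {"Key": "value"} into CloudFormation's Parameters list shape."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(values.items())
    ]


def create_mdl_stack(stack_name, template_path, values):
    """Create the stack and block until CloudFormation reports completion."""
    import boto3  # imported lazily so the helpers above work without the SDK

    if not is_valid_stack_name(stack_name):
        raise ValueError(f"invalid stack name: {stack_name!r}")
    with open(template_path) as handle:
        template_body = handle.read()
    client = boto3.client("cloudformation")
    client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=to_cfn_parameters(values),
        # Assumption: the template creates IAM resources and nested stacks.
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND"],
    )
    # Assumption: poll once a minute for up to four hours, since the
    # install is described as taking a couple of hours.
    client.get_waiter("stack_create_complete").wait(
        StackName=stack_name,
        WaiterConfig={"Delay": 60, "MaxAttempts": 240},
    )


if __name__ == "__main__":
    print(to_cfn_parameters({"CreateVPC": "false", "Environment": "prod"}))
```

If the template file is too large to pass inline as `TemplateBody`, it would need to be uploaded to S3 and referenced with `TemplateURL` instead, which is what the console option "Upload a template to Amazon S3" does on the user's behalf.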
CFT Specifications¶
Deployment Parameters¶
These parameters are related to which version and components to deploy.
ReleaseVersion
Name | ReleaseVersion |
Description | Release version of MDL application to install |
Required | Yes |
Default Value | 1.4.0 (latest release) |
Allowed Values | 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0 |
DeployComponents
Name | DeployComponents |
Description | MDL component to deploy |
Required | Yes |
Default Value | All |
Allowed Values |
|
Generic Parameters¶
These parameters define the basic parameters used across various components
ImageId
Name | ImageId |
Description | AMI ID for the EC2 instances. OSS users may use any other AMI that is similar to amzn-ami-hvm-2017.09.1.20180307-x86_64-gp2. However, a different AMI may cause package installation or availability issues, so it is the user's responsibility to make sure the provided AMI has all the packages available in amzn-ami-hvm-2017.09.1.20180307-x86_64-gp2. |
Required | Yes |
Default Value | ami-1853ac65 |
MDLInstanceName
Name | MDLInstanceName |
Description | Name of the Application being installed |
Required | Yes |
Default Value | mdl |
Allowed Pattern | [a-z0-9_]* |
Max Length | 15 |
Environment
Name | Environment |
Description | Environment name for MDL |
Required | Yes |
Default Value | prod |
Allowed Pattern | [a-z0-9_]* |
Max Length | 4 |
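The pattern and length limits for MDLInstanceName and Environment can be checked before submitting the stack. The following is a minimal Python sketch, assuming the allowed pattern must match the whole value (as CloudFormation does for AllowedPattern); `validate_name` is a hypothetical helper name.

```python
import re

# Allowed pattern and maximum lengths as listed in the tables above.
NAME_PATTERN = re.compile(r"[a-z0-9_]*")
MAX_LENGTHS = {"MDLInstanceName": 15, "Environment": 4}


def validate_name(parameter, value):
    """Return a list of problems; an empty list means the value is acceptable."""
    problems = []
    if NAME_PATTERN.fullmatch(value) is None:
        problems.append(f"{parameter} must match [a-z0-9_]*")
    if len(value) > MAX_LENGTHS[parameter]:
        problems.append(
            f"{parameter} must be at most {MAX_LENGTHS[parameter]} characters"
        )
    return problems


if __name__ == "__main__":
    for parameter, value in [("MDLInstanceName", "mdl"), ("Environment", "stage")]:
        print(parameter, value, validate_name(parameter, value) or "ok")
```

With these rules, an Environment value such as `stage` is rejected because it exceeds four characters, while the defaults `mdl` and `prod` pass.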
CloudWatchRetentionDays
Name | CloudWatchRetentionDays |
Description | Retention days for CloudWatch logs |
Required | Yes |
Default Value | 90 |
Conditional Parameters¶
These are conditional parameters to decide whether MDL creates certain resources or MDL uses existing resources. In each case where a parameter is false, SSM parameters must be present that allow MDL to reference the resources that have been created prior to running the Herd-MDL automated install.
CreateS3Buckets
Name | CreateS3Buckets |
Description | Specifies whether to create S3 buckets or to use existing S3 buckets. When using existing S3 buckets, the user must create the SSM parameters listed below. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
SSM Parameters (required if false) |
|
CreateIAMRoles
Name | CreateIAMRoles |
Description | Specifies whether to create IAM roles or to use existing IAM roles. When using existing IAM roles, the user must create the SSM parameters listed below. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
SSM Parameters (required if false) |
|
CreateRDSInstances
Name | CreateRDSInstances |
Description | Specifies whether to create RDS instances or to use existing RDS instances. When using existing RDS instances, the user must create the SSM parameters listed below. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
SSM Parameters (required if false) |
|
CreateSecurityGroups
Name | CreateSecurityGroups |
Description | Specifies whether to create Security Groups or to use existing Security Groups. When using existing Security Groups, the user must create the SSM parameters listed below. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
SSM Parameters (required if false) |
|
CreateSQS
Name | CreateSQS |
Description | Specifies whether to create SQS queues or to use existing SQS queues. When using existing SQS queues, the user must create the SSM parameters listed below. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
SSM Parameters (required if false) |
|
CreateOpenLDAP
Name | CreateOpenLDAP |
Description | Specifies whether to create an OpenLDAP server or to use an existing OpenLDAP server. When using an existing OpenLDAP server for authentication, the user must create the SSM parameters listed below. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
SSM Parameters (required if false) |
|
CreateVPC
Name | CreateVPC |
Description | Specifies whether to create a new VPC and subnets or to use an existing VPC and subnets. When using an existing VPC and subnets, the user must create the SSM parameters listed below. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
SSM Parameters (required if false) |
|
CreateKeypair
Name | CreateKeypair |
Description | Specifies whether to create a new KeyPair or to use an existing KeyPair. When using an existing KeyPair, the user must create the SSM parameters listed below. Note that when MDL creates the keys, the private key is uploaded to Parameter Store. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
Input SSM Parameters (required if false) |
|
Output SSM Parameters (in case of true) |
|
CreateCloudFrontDistribution
Name | CreateCloudFrontDistribution |
Description | Specifies whether to create a new CloudFront distribution or to use an existing CloudFront distribution. When using an existing CloudFront distribution, the user must create the SSM parameters listed below. Here is the template to create one. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
CreateDemoObjects
Name | CreateDemoObjects |
Description | Specifies whether to create demo data in the data lake. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
EnableSSLAndAuth
Name | EnableSSLAndAuth |
Description | Specifies whether to enable Authentication for Herd/BDSQL/Shepherd. If Authentication is enabled, MDL uses OpenLDAP to perform authentication/authorization |
Required | Yes |
Default Value | false |
Allowed Values | true, false |
RefreshDatabase
Name | RefreshDatabase |
Description | Specifies whether to refresh RDS for both Herd and Metastor. This is disabled during stack updates. |
Required | Yes |
Default Value | true |
Allowed Values | true, false |
RDS Parameters¶
These parameters are related to RDS
HerdDBClass
Name | HerdDBClass |
Description | Specifies the required Database Class for Herd RDS |
Required | Only If (CreateRDSInstances == true) |
Default Value | db.m4.large |
Allowed Values | Refer to AWS RDS documentation for valid values |
HerdDBSize
Name | HerdDBSize |
Description | Specifies the required Database Size for Herd RDS (in GB) |
Required | Only If (CreateRDSInstances == true) |
Default Value | 10 |
Allowed Values | Refer to the AWS RDS documentation for more details |
MetastorDBClass
Name | MetastorDBClass |
Description | Specifies the required Database Class for Metastor RDS |
Required | Only If (CreateRDSInstances == true) |
Default Value | db.m4.large |
Allowed Values | Refer to AWS RDS documentation for valid values |
MetastorDBSize
Name | MetastorDBSize |
Description | Specifies the required Database Size for Metastor RDS (in GB) |
Required | Only If (CreateRDSInstances == true) |
Default Value | 10 |
Allowed Values | Refer to the AWS RDS documentation for more details |
Web Domain and Certificate Parameters¶
These parameters are related to certificates and domains. They are required only if EnableSSLAndAuth == true.
CertificateArn
Name | CertificateArn |
Description | Specifies the ARN from ACM of the certificate to be used in MDL. Refer to the AWS documentation to create certificates in ACM. |
Required | Only If (EnableSSLAndAuth == true) |
Default Value | |
Allowed Values | Refer to the AWS documentation for getting the ARN of the certificate. The certificate is used for three endpoints (Herd, Shepherd, and Bdsql), so it must have a wildcard domain name that matches any first-level subdomain. Format - *.domainName (Example: *.example.com). MDL prefixes the corresponding first-level subdomain (Example: mdlHerd.example.com, mdlShepherd.example.com, and mdlBdsql.example.com). |
DomainNameSuffix
Name | DomainNameSuffix |
Description | Specifies the Domain Name Suffix as per the Certificate specified in "CertificateArn" |
Required | Only If (EnableSSLAndAuth == true) |
Default Value | |
Allowed Values | Refer to the AWS documentation for setting up a new domain. When the "EnableSSLAndAuth" option is enabled, MDL uses this DomainNameSuffix for the Route53 configuration, so the user needs to own the specified domain. The domain name must also match the certificate specified in the "CertificateArn" parameter. |
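To make the relationship between the wildcard certificate and the endpoint names concrete, here is a small Python sketch. It assumes the hostnames are formed as the instance name followed by the component name and the domain suffix, which is inferred from the examples mdlHerd.example.com, mdlShepherd.example.com, and mdlBdsql.example.com; both function names are hypothetical.

```python
def endpoint_hostnames(instance_name, domain_suffix):
    """Build the three public endpoint names.

    Assumption: names follow "<instance name><Component>.<suffix>", inferred
    from the examples on this page rather than from the template itself.
    """
    components = ("Herd", "Shepherd", "Bdsql")
    return [f"{instance_name}{component}.{domain_suffix}" for component in components]


def covered_by_wildcard(hostname, wildcard):
    """Return True if hostname is a first-level subdomain covered by '*.domain'."""
    if not wildcard.startswith("*."):
        return False
    base = wildcard[2:].lower()
    # A wildcard covers exactly one label, so split off only the first one.
    label, _, rest = hostname.lower().partition(".")
    return bool(label) and rest == base


if __name__ == "__main__":
    for host in endpoint_hostnames("mdl", "example.com"):
        print(host, covered_by_wildcard(host, "*.example.com"))
```

A name with an additional level, such as `a.b.example.com`, is not covered by `*.example.com`, which is why the certificate and DomainNameSuffix must refer to the same domain.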
HostedZoneName
Name | HostedZoneName |
Description | Specifies the HostedZoneName for Route53 configuration |
Required | Only If (EnableSSLAndAuth == true) |
Default Value | |
Allowed Values | Refer to the AWS documentation for more details about creating a Hosted Zone. When the "EnableSSLAndAuth" option is enabled, MDL uses this HostedZoneName for the Route53 configuration, so the user needs to own the domain associated with the Hosted Zone. The domain name must also match the certificate specified in the "CertificateArn" parameter. |
CertificateInfo
Name | CertificateInfo |
Description | Specifies the Certificate Information for creating self-signed certificates |
Required | Only If (EnableSSLAndAuth == true) |
Default Value | |
Allowed Values | Format of: CN=<>,OU=<>,O=<>,L=<>,ST=<>,C=<> |
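The CertificateInfo format can be assembled and checked programmatically. The sketch below is illustrative only: it assumes the fields must appear in the order shown in the format string and that field values contain no commas. The helper names and the sample values are hypothetical.

```python
# Field order taken from the documented format: CN=<>,OU=<>,O=<>,L=<>,ST=<>,C=<>
CERT_FIELDS = ("CN", "OU", "O", "L", "ST", "C")


def build_certificate_info(values):
    """Join the six fields into the documented comma-separated form."""
    missing = [field for field in CERT_FIELDS if not values.get(field)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return ",".join(f"{field}={values[field]}" for field in CERT_FIELDS)


def parse_certificate_info(text):
    """Split a CertificateInfo string back into a dict, checking field order."""
    parsed = {}
    for item in text.split(","):
        key, sep, value = item.partition("=")
        if not sep or not value:
            raise ValueError(f"malformed component: {item!r}")
        parsed[key.strip()] = value.strip()
    if tuple(parsed) != CERT_FIELDS:
        raise ValueError(f"expected fields in order {CERT_FIELDS}")
    return parsed


if __name__ == "__main__":
    info = build_certificate_info(
        {"CN": "mdl.example.com", "OU": "Data", "O": "Example Corp",
         "L": "New York", "ST": "NY", "C": "US"}
    )
    print(info)
    print(parse_certificate_info(info))
```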
LdapDN
Name | LdapDN |
Description | Specifies the LDAP Domain name used in OpenLDAP configuration |
Required | Only If (EnableSSLAndAuth == true) |
Default Value | |
Allowed Values | ^(dc=[^=]+,)*(dc=[^=]+)$ |
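The allowed pattern for LdapDN can be tested locally before the stack is created. The Python sketch below applies the pattern exactly as listed above. The helper `domain_to_ldap_dn` is a hypothetical convenience that derives a DN from a domain name; it is not something the template is known to do.

```python
import re

# Pattern copied from the Allowed Values row above.
LDAP_DN_PATTERN = re.compile(r"^(dc=[^=]+,)*(dc=[^=]+)$")


def is_valid_ldap_dn(value):
    """Return True if value satisfies the documented LdapDN pattern."""
    return LDAP_DN_PATTERN.match(value) is not None


def domain_to_ldap_dn(domain):
    """Hypothetical helper: turn 'example.com' into 'dc=example,dc=com'."""
    return ",".join(f"dc={label}" for label in domain.split("."))


if __name__ == "__main__":
    for candidate in ("dc=example,dc=com", "example.com", "dc=example,ou=people"):
        print(candidate, is_valid_ldap_dn(candidate))
```

Only values made up entirely of `dc=` components pass, so a DN containing other attribute types such as `ou=` is rejected.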
EC2 Instance Parameters¶
These parameters describe the instance types for the various EC2 instances that are used to run components of the Herd-MDL product.
EsInstanceType
Name | EsInstanceType |
Description | Specifies the instance type for Elastic Search EC2 |
Required | Yes |
Default Value | t2.medium |
Allowed Values | Refer to the AWS documentation for more details |
HerdInstanceType
Name | HerdInstanceType |
Description | Specifies the instance type for Herd EC2 |
Required | Yes |
Default Value | m4.2xlarge |
Allowed Values | Refer to the AWS documentation for more details |
LdapInstanceType
Name | LdapInstanceType |
Description | Specifies the instance type for OpenLDAP EC2 |
Required | Yes |
Default Value | t2.small |
Allowed Values | Refer to the AWS documentation for more details |
MetastorInstanceType
Name | MetastorInstanceType |
Description | Specifies the instance type for Metastor EC2 |
Required | Yes |
Default Value | m4.2xlarge |
Allowed Values | Refer to the AWS documentation for more details |
BdsqlMasterInstanceType
Name | BdsqlMasterInstanceType |
Description | Specifies the instance type for BDSQL Presto EMR Cluster Master Instance |
Required | Yes |
Default Value | m4.4xlarge |
Allowed Values | Refer to the AWS documentation for more details |
BdsqlCoreInstanceType
Name | BdsqlCoreInstanceType |
Description | Specifies the instance type for BDSQL Presto EMR Cluster Core Instance |
Required | Yes |
Default Value | m4.4xlarge |
Allowed Values | Refer to the AWS documentation for more details |
NumberOfBdsqlCoreInstances
Name | NumberOfBdsqlCoreInstances |
Description | Specifies the number of Core Instances for BDSQL Presto EMR Cluster |
Required | Yes |
Default Value | 1 |
Allowed Values | Integer |
Tag Parameters¶
These parameters describe the tag information for the AWS resources created by MDL.
CustomTagName
Name | CustomTagName |
Description | Specifies the Tag Name to be applied to all the AWS resources created by MDL |
Required | No |
Default Value | |
Allowed Values | Refer to the AWS documentation |
CustomTagValue
Name | CustomTagValue |
Description | Specifies the Tag Value to be applied to all the AWS resources created by MDL |
Required | No |
Default Value | |
Allowed Values | Refer to the AWS documentation |