cdk-emrserverless-with-delta-lake

License Release npm downloads pypi downloads NuGet downloads repo languages

npm (JS/TS) PyPI (Python) Maven (Java) Go NuGet
Link Link Link Link Link

high level architecture

This construct builds an EMR Studio, a cluster template for the EMR Studio, and an EMR Serverless application. Two S3 buckets are created: one for the EMR Studio workspace and one for EMR Serverless applications. In addition, the VPC and the subnets for the EMR Studio are tagged {"Key": "for-use-with-amazon-emr-managed-policies", "Value": "true"} via a custom resource, which the service role of EMR Studio requires. This construct is for analysts, data engineers, and anyone who wants to know how to process Delta Lake data with EMR Serverless. They build the construct via CDK v2 and, within a few minutes, start a serverless job via the AWS CLI in the EMR application generated by the construct. After the EMR Serverless job finishes, they can check the processed result on an EMR notebook through the cluster template.
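The tagging step can be pictured as a request payload for the EC2 CreateTags API. The sketch below only builds that payload. That the custom resource calls CreateTags is an assumption, and buildCreateTagsParams and the IDs are hypothetical names used for illustration.

```typescript
// Shape of a single EC2 tag.
interface Ec2Tag {
  Key: string;
  Value: string;
}

// Request payload for the EC2 CreateTags API (Resources + Tags).
interface CreateTagsParams {
  Resources: string[];
  Tags: Ec2Tag[];
}

// The tag required by the EMR managed policies, as quoted above.
const EMR_MANAGED_POLICY_TAG: Ec2Tag = {
  Key: 'for-use-with-amazon-emr-managed-policies',
  Value: 'true',
};

// Hypothetical helper: collects the VPC and subnet IDs into one payload.
function buildCreateTagsParams(vpcId: string, subnetIds: string[]): CreateTagsParams {
  return {
    Resources: [vpcId, ...subnetIds],
    Tags: [EMR_MANAGED_POLICY_TAG],
  };
}

// Placeholder IDs, not real resources.
const tagParams = buildCreateTagsParams('vpc-idididid', ['subnet-eeeee', 'subnet-fffff']);
```

One payload covers the VPC and every subnet, so a single call would be enough under this assumption.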

TOC

Requirements

  1. Your current identity has the AdministratorAccess power.
  2. An IAM user named Administrator with the AdministratorAccess power.
    • This is related to the Portfolio of AWS Service Catalog created by the construct, which is required for EMR cluster templates.
    • You can choose whichever identity you wish to associate with the Product in the Portfolio for creating an EMR cluster via cluster template. Check serviceCatalogProps in the EmrServerless construct for details; otherwise, the IAM user mentioned above will be set up with the Product.

Before deployment

You might want to execute the following command.

PROFILE_NAME="scott.hsieh"
# If you only have one set of credentials on your local machine, just ignore `--profile`, buddy.
cdk bootstrap aws://${AWS_ACCOUNT_ID}/${AWS_REGION} --profile ${PROFILE_NAME}

Minimal content for deployment

#!/usr/bin/env node
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { EmrServerless } from 'cdk-emrserverless-with-delta-lake';

class TypescriptStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    new EmrServerless(this, 'EmrServerless');
  }
}

const app = new cdk.App();
new TypescriptStack(app, 'TypescriptStack', {
  stackName: 'emr-studio',
  env: {
    region: process.env.CDK_DEFAULT_REGION,
    account: process.env.CDK_DEFAULT_ACCOUNT,
  },
});

After deployment

Promise me, darling, take advantage of the CloudFormation outputs. All you need is copy-paste, copy-paste, copy-paste; life should always be that easy.

  1. Define the following environment variables on your current session.
    export PROFILE_NAME="${YOUR_PROFILE_NAME}"
    export JOB_ROLE_ARN="${copy-paste-thank-you}"
    export APPLICATION_ID="${copy-paste-thank-you}"
    export SERVERLESS_BUCKET_NAME="${copy-paste-thank-you}"
    export DELTA_LAKE_SCRIPT_NAME="delta-lake-demo"
    
  2. Copy partial NYC-taxi data into the EMR Serverless bucket.
    aws s3 cp s3://nyc-tlc/trip\ data/ s3://${SERVERLESS_BUCKET_NAME}/nyc-taxi/ --exclude "*" --include "yellow_tripdata_2021-*.parquet" --recursive --profile ${PROFILE_NAME}
  3. Create a Python script for processing Delta Lake
    touch ${DELTA_LAKE_SCRIPT_NAME}.py
    cat << EOF > ${DELTA_LAKE_SCRIPT_NAME}.py
    from pyspark.sql import SparkSession
    import uuid
    
    if __name__ == "__main__":
        """
            Delta Lake with EMR Serverless, take NYC taxi as example.
        """
        spark = SparkSession \\
            .builder \\
            .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \\
            .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \\
            .enableHiveSupport() \\
            .appName("Delta-Lake-OSS") \\
            .getOrCreate()
    
        url = "s3://${SERVERLESS_BUCKET_NAME}/emr-serverless-spark/delta-lake/output/1.2.1/%s/" % str(
            uuid.uuid4())
    
        # creates a Delta table and outputs to target S3 bucket
        spark.range(5).write.format("delta").save(url)
    
        # reads the Delta table back and prints it to stdout
        spark.read.format("delta").load(url).show()
    
        # The source for the second Delta table.
        base = spark.read.parquet(
            "s3://${SERVERLESS_BUCKET_NAME}/nyc-taxi/*.parquet")
    
        # The second Delta table, oh ya.
        base.write.format("delta") \\
            .mode("overwrite") \\
            .save("s3://${SERVERLESS_BUCKET_NAME}/emr-serverless-spark/delta-lake/nyx-tlc-2021")
        spark.stop()
    EOF
  4. Upload the script and required jars into the serverless bucket
    # upload script
    aws s3 cp ${DELTA_LAKE_SCRIPT_NAME}.py s3://${SERVERLESS_BUCKET_NAME}/scripts/${DELTA_LAKE_SCRIPT_NAME}.py --profile ${PROFILE_NAME}
    # download jars and upload them
    DELTA_VERSION="2.2.0"
    DELTA_LAKE_CORE="delta-core_2.13-${DELTA_VERSION}.jar"
    DELTA_LAKE_STORAGE="delta-storage-${DELTA_VERSION}.jar"
    curl https://repo1.maven.org/maven2/io/delta/delta-core_2.13/${DELTA_VERSION}/${DELTA_LAKE_CORE} --output ${DELTA_LAKE_CORE}
    curl https://repo1.maven.org/maven2/io/delta/delta-storage/${DELTA_VERSION}/${DELTA_LAKE_STORAGE} --output ${DELTA_LAKE_STORAGE}
    aws s3 mv ${DELTA_LAKE_CORE} s3://${SERVERLESS_BUCKET_NAME}/jars/${DELTA_LAKE_CORE} --profile ${PROFILE_NAME}
    aws s3 mv ${DELTA_LAKE_STORAGE} s3://${SERVERLESS_BUCKET_NAME}/jars/${DELTA_LAKE_STORAGE} --profile ${PROFILE_NAME}
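The jar names, download URLs, and upload targets in step 4 all derive from two values, the Delta Lake version and the Scala version. A minimal sketch of that derivation follows; deltaJars and the bucket name are hypothetical, and the URL layout simply mirrors the curl commands above.

```typescript
// Coordinates of one Delta Lake jar used in the walk-through.
interface DeltaJar {
  fileName: string; // local file name and S3 object name
  mavenUrl: string; // download location on Maven Central
  s3Uri: string;    // destination inside the serverless bucket
}

// Hypothetical helper mirroring the shell variables in step 4.
function deltaJars(bucket: string, deltaVersion: string, scalaVersion: string): DeltaJar[] {
  const base = 'https://repo1.maven.org/maven2/io/delta';
  const artifacts = [`delta-core_${scalaVersion}`, 'delta-storage'];
  return artifacts.map((artifact) => {
    const fileName = `${artifact}-${deltaVersion}.jar`;
    return {
      fileName,
      mavenUrl: `${base}/${artifact}/${deltaVersion}/${fileName}`,
      s3Uri: `s3://${bucket}/jars/${fileName}`,
    };
  });
}

// 'my-serverless-bucket' stands in for SERVERLESS_BUCKET_NAME.
const walkthroughJars = deltaJars('my-serverless-bucket', '2.2.0', '2.13');
```

Deriving every name from the same two inputs keeps the uploaded jars and the jars referenced by the job in step with each other.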

Create an EMR Serverless app

Remember, you've got so much information to copy and paste from the CloudFormation outputs.

aws emr-serverless start-job-run \
  --application-id ${APPLICATION_ID} \
  --execution-role-arn ${JOB_ROLE_ARN} \
  --name 'shy-shy-first-time' \
  --job-driver '{
        "sparkSubmit": {
            "entryPoint": "s3://'${SERVERLESS_BUCKET_NAME}'/scripts/'${DELTA_LAKE_SCRIPT_NAME}'.py",
            "sparkSubmitParameters": "--conf spark.executor.cores=1 --conf spark.executor.memory=4g --conf spark.driver.cores=1 --conf spark.driver.memory=4g --conf spark.executor.instances=1 --conf spark.jars=s3://'${SERVERLESS_BUCKET_NAME}'/jars/'${DELTA_LAKE_CORE}',s3://'${SERVERLESS_BUCKET_NAME}'/jars/'${DELTA_LAKE_STORAGE}'"
        }
    }' \
  --configuration-overrides '{
        "monitoringConfiguration": {
            "s3MonitoringConfiguration": {
                "logUri": "s3://'${SERVERLESS_BUCKET_NAME}'/serverless-log/"
            }
        }
    }' \
  --profile ${PROFILE_NAME}
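The quote juggling in the command above is easy to get wrong, so it can help to build the two JSON arguments programmatically and paste the result. The following is a minimal sketch. The function names, the JobRunSettings type, and the sample values are hypothetical, while the keys and Spark settings are copied from the command.

```typescript
// Inputs that the walk-through takes from the CloudFormation outputs and step 4.
interface JobRunSettings {
  bucket: string;     // SERVERLESS_BUCKET_NAME
  scriptName: string; // DELTA_LAKE_SCRIPT_NAME
  jarNames: string[]; // jar file names uploaded under jars/
}

// Joins the jars into the comma-separated list that spark.jars expects.
function sparkSubmitParameters(settings: JobRunSettings): string {
  const sparkJars = settings.jarNames
    .map((jar) => `s3://${settings.bucket}/jars/${jar}`)
    .join(',');
  const conf: Array<[string, string]> = [
    ['spark.executor.cores', '1'],
    ['spark.executor.memory', '4g'],
    ['spark.driver.cores', '1'],
    ['spark.driver.memory', '4g'],
    ['spark.executor.instances', '1'],
    ['spark.jars', sparkJars],
  ];
  return conf.map(([key, value]) => `--conf ${key}=${value}`).join(' ');
}

// Value for --job-driver.
function jobDriver(settings: JobRunSettings): string {
  return JSON.stringify({
    sparkSubmit: {
      entryPoint: `s3://${settings.bucket}/scripts/${settings.scriptName}.py`,
      sparkSubmitParameters: sparkSubmitParameters(settings),
    },
  });
}

// Value for --configuration-overrides.
function configurationOverrides(settings: JobRunSettings): string {
  return JSON.stringify({
    monitoringConfiguration: {
      s3MonitoringConfiguration: { logUri: `s3://${settings.bucket}/serverless-log/` },
    },
  });
}

// Sample values only; replace them with your own outputs.
const demoSettings: JobRunSettings = {
  bucket: 'my-serverless-bucket',
  scriptName: 'delta-lake-demo',
  jarNames: ['delta-core_2.13-2.2.0.jar', 'delta-storage-2.2.0.jar'],
};
const demoJobDriver = jobDriver(demoSettings);
const demoOverrides = configurationOverrides(demoSettings);
```

Because JSON.stringify handles the escaping, the resulting strings can be passed to the CLI inside a single pair of single quotes.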

If the command succeeds, you should see a response similar to the following:

{
    "applicationId": "00f1gvklchoqru25",
    "jobRunId": "00f1h0ipd2maem01",
    "arn": "arn:aws:emr-serverless:ap-northeast-1:630778274080:/applications/00f1gvklchoqru25/jobruns/00f1h0ipd2maem01"
}

Once the job finishes, you should also find Delta Lake data under s3://${SERVERLESS_BUCKET_NAME}/emr-serverless-spark/delta-lake/nyx-tlc-2021/.
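The arn field of the response carries the region, account, application, and job run in one string, which is handy when you script follow-up calls. Below is a minimal sketch of pulling those pieces out. parseJobRunArn is a hypothetical helper, and the pattern is inferred from the sample response above rather than from a specification.

```typescript
// Fields returned by start-job-run, as shown in the sample response.
interface StartJobRunResponse {
  applicationId: string;
  jobRunId: string;
  arn: string;
}

// The pieces encoded in a job run ARN.
interface JobRunLocation {
  region: string;
  accountId: string;
  applicationId: string;
  jobRunId: string;
}

// Hypothetical helper; the layout is
// arn:<partition>:emr-serverless:<region>:<account>:/applications/<app>/jobruns/<run>
function parseJobRunArn(arn: string): JobRunLocation {
  const pattern = /^arn:[^:]+:emr-serverless:([^:]+):(\d+):\/applications\/([^/]+)\/jobruns\/([^/]+)$/;
  const match = pattern.exec(arn);
  if (match === null) {
    throw new Error(`Unexpected job run ARN: ${arn}`);
  }
  return {
    region: match[1],
    accountId: match[2],
    applicationId: match[3],
    jobRunId: match[4],
  };
}

// The sample response from above.
const sampleResponse: StartJobRunResponse = {
  applicationId: '00f1gvklchoqru25',
  jobRunId: '00f1h0ipd2maem01',
  arn: 'arn:aws:emr-serverless:ap-northeast-1:630778274080:/applications/00f1gvklchoqru25/jobruns/00f1h0ipd2maem01',
};
const jobLocation = parseJobRunArn(sampleResponse.arn);
```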

Check the executing job

Access the EMR Studio via the URL from the CloudFormation outputs. It should look very similar to the following URL: https://es-pilibalapilibala.emrstudio-prod.ap-northeast-1.amazonaws.com, although the weird string and the region won't be the same as mine.

  1. Enter the application.
  2. Enter the executing job.

Check results from an EMR notebook via cluster template

  1. Create a workspace and an EMR cluster via the cluster template on the AWS Console.
  2. Check the results delivered by the EMR Serverless application via an EMR notebook.

Fun facts

  1. You can assign multiple jars as a comma-separated list to spark.jars for your EMR Serverless job, as the Spark page says. The UI will complain, but you can still start the job. Don't be afraid, just click it like when you were a child, facing authority fearlessly.
  2. To fully delete a stack with the construct, make sure no workspace remains within the EMR Studio. You also need to remove the associated identity from the Service Catalog (a necessary resource for the cluster template).
  3. Version inconsistency on Spark history. It can possibly be ignored, yet it still made me wonder why the versions are different.
  4. So far, I still haven't figured out how to make the s3a URI work. The s3 URI is fine, while the serverless app complains that it can't find a proper credentials provider to read the s3a URI.
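Until the s3a issue in fact 4 is understood, one possible workaround is to normalise URIs to the s3 scheme before handing them to the job. A minimal sketch follows; toS3Scheme is a hypothetical helper and not part of the construct.

```typescript
// Rewrites an s3a:// URI to the s3:// scheme; any other string passes through unchanged.
function toS3Scheme(uri: string): string {
  return uri.replace(/^s3a:\/\//, 's3://');
}

// Sample URIs with a placeholder bucket name.
const rewrittenUri = toS3Scheme('s3a://my-serverless-bucket/nyc-taxi/');
const untouchedUri = toS3Scheme('s3://my-serverless-bucket/nyc-taxi/');
```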

Future work

  1. Custom resource for EMR Serverless
  2. Make the construct more flexible for users
  3. Compare Databricks Runtime and EMR Serverless.

API Reference

Constructs

EmrClusterTemplateStack

Creates a CloudFormation template which will be a Product under a Portfolio of AWS Service Catalog.

This is for creating an EMR cluster via cluster template in the EMR Studio, created by the EmrServerless construct, on the AWS Console.

For now, you cannot control this stack through the EmrServerless construct. The documentation is here to help you grasp the architecture of EmrServerless more easily.

For detail, please refer to Create AWS CloudFormation templates for Amazon EMR Studio.

const product = new servicecatalog.CloudFormationProduct(this, 'MyFirstProduct', {
  productName: 'EMR_6.6.0',
  owner: 'scott.hsieh',
  description: 'EMR cluster with 6.6.0 version',
  productVersions: [
    {
      productVersionName: 'v1',
      validateTemplate: true,
      cloudFormationTemplate: servicecatalog.CloudFormationTemplate.fromProductStack(new EmrClusterTemplateStack(this, 'EmrStudio')),
    },
  ],
});

Initializers

import { EmrClusterTemplateStack } from 'cdk-emrserverless-with-delta-lake'

new EmrClusterTemplateStack(scope: Construct, id: string)
Name Type Description
scope constructs.Construct No description.
id string No description.

scopeRequired
  • Type: constructs.Construct

idRequired
  • Type: string

Methods

Name Description
toString Returns a string representation of this construct.
addDependency Add a dependency between this stack and another stack.
addMetadata Adds an arbitrary key-value pair, with information you want to record about the stack.
addTransform Add a Transform to this stack. A Transform is a macro that AWS CloudFormation uses to process your template.
exportStringListValue Create a CloudFormation Export for a string list value.
exportValue Create a CloudFormation Export for a string value.
formatArn Creates an ARN from components.
getLogicalId Allocates a stack-unique CloudFormation-compatible logical identity for a specific resource.
regionalFact Look up a fact value for the given fact for the region of this stack.
renameLogicalId Rename a generated logical identity.
reportMissingContextKey Indicate that a context key was expected.
resolve Resolve a tokenized value in the context of the current stack.
splitArn Splits the provided ARN into its components.
toJsonString Convert an object, potentially containing tokens, to a JSON string.
toYamlString Convert an object, potentially containing tokens, to a YAML string.

toString
public toString(): string

Returns a string representation of this construct.

addDependency
public addDependency(target: Stack, reason?: string): void

Add a dependency between this stack and another stack.

This can be used to define dependencies between any two stacks within an app, and also supports nested stacks.

targetRequired
  • Type: aws-cdk-lib.Stack

reasonOptional
  • Type: string

addMetadata
public addMetadata(key: string, value: any): void

Adds an arbitrary key-value pair, with information you want to record about the stack.

These get translated to the Metadata section of the generated template.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/metadata-section-structure.html

keyRequired
  • Type: string

valueRequired
  • Type: any

addTransform
public addTransform(transform: string): void

Add a Transform to this stack. A Transform is a macro that AWS CloudFormation uses to process your template.

Duplicate values are removed when stack is synthesized.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/transform-section-structure.html

Example

declare const stack: Stack;

stack.addTransform('AWS::Serverless-2016-10-31')
transformRequired
  • Type: string

The transform to add.
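The de-duplication mentioned above can be pictured as follows. The sketch keeps the first occurrence of each transform name. It only illustrates the documented behavior and is not the CDK implementation, and uniqueTransforms is a hypothetical name.

```typescript
// Keeps the first occurrence of each transform and drops later repeats.
function uniqueTransforms(transforms: string[]): string[] {
  return transforms.filter((transform, index) => transforms.indexOf(transform) === index);
}

// The same transform requested twice, plus a second one.
const requestedTransforms = [
  'AWS::Serverless-2016-10-31',
  'AWS::LanguageExtensions',
  'AWS::Serverless-2016-10-31',
];
const synthesizedTransforms = uniqueTransforms(requestedTransforms);
```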


exportStringListValue
public exportStringListValue(exportedValue: any, options?: ExportValueOptions): string[]

Create a CloudFormation Export for a string list value.

Returns a string list representing the corresponding Fn.importValue() expression for this Export. The export expression is automatically wrapped with an Fn::Join and the import value with an Fn::Split, since CloudFormation can only export strings. You can control the name for the export by passing the name option.

If you don't supply a value for name, the value you're exporting must be a Resource attribute (for example: bucket.bucketName) and it will be given the same name as the automatic cross-stack reference that would be created if you used the attribute in another Stack.

One of the uses for this method is to remove the relationship between two Stacks established by automatic cross-stack references. It will temporarily ensure that the CloudFormation Export still exists while you remove the reference from the consuming stack. After that, you can remove the resource and the manual export.

See exportValue for an example of this process.

exportedValueRequired
  • Type: any

optionsOptional
  • Type: aws-cdk-lib.ExportValueOptions
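Because CloudFormation can only export strings, a list has to be joined on the way out and split on the way in. The sketch below builds the two intrinsic-function shapes as plain objects. The '||' delimiter and the function names are assumptions made for illustration, not values confirmed by this document.

```typescript
// Assumed delimiter; any string that never occurs inside a list item would do.
const LIST_DELIMITER = '||';

// Shape of the exported value: the list joined into one string.
function exportListExpression(values: string[]): object {
  return { 'Fn::Join': [LIST_DELIMITER, values] };
}

// Shape of the imported value: the exported string split back into a list.
function importListExpression(exportName: string): object {
  return { 'Fn::Split': [LIST_DELIMITER, { 'Fn::ImportValue': exportName }] };
}

// Local equivalents of what CloudFormation does at deploy time.
function joinList(values: string[]): string {
  return values.join(LIST_DELIMITER);
}
function splitList(joined: string): string[] {
  return joined.split(LIST_DELIMITER);
}

// Placeholder subnet IDs and export name.
const subnetList = ['subnet-eeeee', 'subnet-fffff'];
const exportedShape = JSON.stringify(exportListExpression(subnetList));
const importedShape = JSON.stringify(importListExpression('MySubnetList'));
const roundTrip = splitList(joinList(subnetList));
```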

exportValue
public exportValue(exportedValue: any, options?: ExportValueOptions): string

Create a CloudFormation Export for a string value.

Returns a string representing the corresponding Fn.importValue() expression for this Export. You can control the name for the export by passing the name option.

If you don't supply a value for name, the value you're exporting must be a Resource attribute (for example: bucket.bucketName) and it will be given the same name as the automatic cross-stack reference that would be created if you used the attribute in another Stack.

One of the uses for this method is to remove the relationship between two Stacks established by automatic cross-stack references. It will temporarily ensure that the CloudFormation Export still exists while you remove the reference from the consuming stack. After that, you can remove the resource and the manual export.

Example

Here is how the process works. Let's say there are two stacks, producerStack and consumerStack, and producerStack has a bucket called bucket, which is referenced by consumerStack (perhaps because an AWS Lambda Function writes into it, or something like that).

It is not safe to remove producerStack.bucket because as the bucket is being deleted, consumerStack might still be using it.

Instead, the process takes two deployments:

Deployment 1: break the relationship

  • Make sure consumerStack no longer references bucket.bucketName (maybe the consumer stack now uses its own bucket, or it writes to an AWS DynamoDB table, or maybe you just remove the Lambda Function altogether).
  • In the ProducerStack class, call this.exportValue(this.bucket.bucketName). This will make sure the CloudFormation Export continues to exist while the relationship between the two stacks is being broken.
  • Deploy (this will effectively only change the consumerStack, but it's safe to deploy both).

Deployment 2: remove the bucket resource

  • You are now free to remove the bucket resource from producerStack.
  • Don't forget to remove the exportValue() call as well.
  • Deploy again (this time only the producerStack will be changed -- the bucket will be deleted).
exportedValueRequired
  • Type: any

optionsOptional
  • Type: aws-cdk-lib.ExportValueOptions

formatArn
public formatArn(components: ArnComponents): string

Creates an ARN from components.

If partition, region or account are not specified, the stack's partition, region and account will be used.

If any component is the empty string, an empty string will be inserted into the generated ARN at the location that component corresponds to.

The ARN will be formatted as follows:

arn:{partition}:{service}:{region}:{account}:{resource}{sep}{resource-name}

The required ARN pieces that are omitted will be taken from the stack that the 'scope' is attached to. If all ARN pieces are supplied, the supplied scope can be 'undefined'.

componentsRequired
  • Type: aws-cdk-lib.ArnComponents
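The formatting rule above can be sketched without the CDK. This is only an illustration of the documented layout and not the library's code. formatArnSketch, ArnPieces, and StackDefaults are hypothetical names, and the default '/' separator is an assumption.

```typescript
// Pieces of an ARN, following the layout quoted above.
interface ArnPieces {
  partition?: string;
  service: string;
  region?: string;
  account?: string;
  resource: string;
  sep?: string;
  resourceName?: string;
}

// Values a stack would contribute when a piece is omitted.
interface StackDefaults {
  partition: string;
  region: string;
  account: string;
}

function formatArnSketch(pieces: ArnPieces, defaults: StackDefaults): string {
  // `??` only falls back on undefined or null, so an explicit '' stays empty.
  const partition = pieces.partition ?? defaults.partition;
  const region = pieces.region ?? defaults.region;
  const account = pieces.account ?? defaults.account;
  // Assumption: '/' separates resource and resource name unless told otherwise.
  const sep = pieces.sep ?? '/';
  const suffix = pieces.resourceName === undefined ? '' : `${sep}${pieces.resourceName}`;
  return `arn:${partition}:${pieces.service}:${region}:${account}:${pieces.resource}${suffix}`;
}

// Placeholder defaults standing in for the stack's own values.
const stackDefaults: StackDefaults = { partition: 'aws', region: 'us-west-2', account: '123456789012' };

// Empty region and account are inserted as empty strings.
const bucketArn = formatArnSketch({ service: 's3', region: '', account: '', resource: 'my-bucket' }, stackDefaults);
// Omitted pieces are taken from the stack defaults.
const stackArn = formatArnSketch({ service: 'cloudformation', resource: 'stack', resourceName: 'teststack' }, stackDefaults);
```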

getLogicalId
public getLogicalId(element: CfnElement): string

Allocates a stack-unique CloudFormation-compatible logical identity for a specific resource.

This method is called when a CfnElement is created and used to render the initial logical identity of resources. Logical ID renames are applied at this stage.

This method uses the protected method allocateLogicalId to render the logical ID for an element. To modify the naming scheme, extend the Stack class and override this method.

elementRequired
  • Type: aws-cdk-lib.CfnElement

The CloudFormation element for which a logical identity is needed.


regionalFact
public regionalFact(factName: string, defaultValue?: string): string

Look up a fact value for the given fact for the region of this stack.

Will return a definite value only if the region of the current stack is resolved. If not, a lookup map will be added to the stack and the lookup will be done at CDK deployment time.

What regions will be included in the lookup map is controlled by the @aws-cdk/core:target-partitions context value: it must be set to a list of partitions, and only regions from the given partitions will be included. If no such context key is set, all regions will be included.

This function is intended to be used by construct library authors. Application builders can rely on the abstractions offered by construct libraries and do not have to worry about regional facts.

If defaultValue is not given, it is an error if the fact is unknown for the given region.

factNameRequired
  • Type: string

defaultValueOptional
  • Type: string

renameLogicalId
public renameLogicalId(oldId: string, newId: string): void

Rename a generated logical identity.

To modify the naming scheme strategy, extend the Stack class and override the allocateLogicalId method.

oldIdRequired
  • Type: string

newIdRequired
  • Type: string

reportMissingContextKey
public reportMissingContextKey(report: MissingContext): void

Indicate that a context key was expected.

Contains instructions which will be emitted into the cloud assembly on how the key should be supplied.

reportRequired
  • Type: aws-cdk-lib.cloud_assembly_schema.MissingContext

The set of parameters needed to obtain the context.


resolve
public resolve(obj: any): any

Resolve a tokenized value in the context of the current stack.

objRequired
  • Type: any

splitArn
public splitArn(arn: string, arnFormat: ArnFormat): ArnComponents

Splits the provided ARN into its components.

Works both if 'arn' is a string like 'arn:aws:s3:::bucket', and a Token representing a dynamic CloudFormation expression (in which case the returned components will also be dynamic CloudFormation expressions, encoded as Tokens).

arnRequired
  • Type: string

the ARN to split into its components.


arnFormatRequired
  • Type: aws-cdk-lib.ArnFormat

the expected format of 'arn' - depends on what format the service 'arn' represents uses.


toJsonString
public toJsonString(obj: any, space?: number): string

Convert an object, potentially containing tokens, to a JSON string.

objRequired
  • Type: any

spaceOptional
  • Type: number

toYamlString
public toYamlString(obj: any): string

Convert an object, potentially containing tokens, to a YAML string.

objRequired
  • Type: any

Static Functions

Name Description
isConstruct Checks if x is a construct.
isStack Return whether the given object is a Stack.
of Looks up the first stack scope in which construct is defined.

isConstruct
import { EmrClusterTemplateStack } from 'cdk-emrserverless-with-delta-lake'

EmrClusterTemplateStack.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid instanceof and to use this type-testing method instead.

xRequired
  • Type: any

Any object.
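The instanceof pitfall described above can be reproduced without installing anything twice. In the sketch below, two identical classes stand in for two copies of the library, and a shared marker property stands in for the type test. The class names and the string marker are hypothetical, and the library's actual marker is not reproduced here.

```typescript
// Marker that every copy of the (pretend) library stamps on its instances.
const SKETCH_MARKER = '__sketch_construct_marker__';

// Two structurally identical classes, as if the library were loaded twice.
class ConstructCopyA {
  constructor() {
    Object.defineProperty(this, SKETCH_MARKER, { value: true });
  }
}
class ConstructCopyB {
  constructor() {
    Object.defineProperty(this, SKETCH_MARKER, { value: true });
  }
}

// Type test that looks for the marker instead of relying on the prototype chain.
function isConstructLike(x: unknown): boolean {
  return typeof x === 'object' && x !== null && SKETCH_MARKER in x;
}

const fromCopyA = new ConstructCopyA();
// An instance from copy A is not an instance of copy B's class.
const instanceofAcrossCopies = fromCopyA instanceof ConstructCopyB;
// The marker test does not care which copy created the object.
const markerAcrossCopies = isConstructLike(fromCopyA);
```

Here instanceofAcrossCopies is false while markerAcrossCopies is true, which is the situation isConstruct is designed to handle.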


isStack
import { EmrClusterTemplateStack } from 'cdk-emrserverless-with-delta-lake'

EmrClusterTemplateStack.isStack(x: any)

Return whether the given object is a Stack.

We do attribute detection since we can't reliably use 'instanceof'.

xRequired
  • Type: any

of
import { EmrClusterTemplateStack } from 'cdk-emrserverless-with-delta-lake'

EmrClusterTemplateStack.of(construct: IConstruct)

Looks up the first stack scope in which construct is defined.

Fails if there is no stack up the tree.

constructRequired
  • Type: constructs.IConstruct

The construct to start the search from.


Properties

Name Type Description
node constructs.Node The tree node.
account string The AWS account into which this stack will be deployed.
artifactId string The ID of the cloud assembly artifact for this stack.
availabilityZones string[] Returns the list of AZs that are available in the AWS environment (account/region) associated with this stack.
bundlingRequired boolean Indicates whether the stack requires bundling or not.
dependencies aws-cdk-lib.Stack[] Return the stacks this stack depends on.
environment string The environment coordinates in which this stack is deployed.
nested boolean Indicates if this is a nested stack, in which case parentStack will include a reference to its parent.
notificationArns string[] Returns the list of notification Amazon Resource Names (ARNs) for the current stack.
partition string The partition in which this stack is defined.
region string The AWS region into which this stack will be deployed (e.g. us-west-2).
stackId string The ID of the stack.
stackName string The concrete CloudFormation physical stack name.
synthesizer aws-cdk-lib.IStackSynthesizer Synthesis method for this stack.
tags aws-cdk-lib.TagManager Tags to be applied to the stack.
templateFile string The name of the CloudFormation template file emitted to the output directory during synthesis.
templateOptions aws-cdk-lib.ITemplateOptions Options for CloudFormation template (like version, transform, description).
urlSuffix string The Amazon domain suffix for the region in which this stack is defined.
nestedStackParent aws-cdk-lib.Stack If this is a nested stack, returns its parent stack.
nestedStackResource aws-cdk-lib.CfnResource If this is a nested stack, this represents its AWS::CloudFormation::Stack resource.
terminationProtection boolean Whether termination protection is enabled for this stack.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


accountRequired
public readonly account: string;
  • Type: string

The AWS account into which this stack will be deployed.

This value is resolved according to the following rules:

  1. The value provided to env.account when the stack is defined. This can either be a concrete account (e.g. 585695031111) or the Aws.ACCOUNT_ID token.
  2. Aws.ACCOUNT_ID, which represents the CloudFormation intrinsic reference { "Ref": "AWS::AccountId" } encoded as a string token.

Preferably, you should use the return value as an opaque string and not attempt to parse it to implement your logic. If you do, you must first check that it is a concrete value and not an unresolved token. If this value is an unresolved token (Token.isUnresolved(stack.account) returns true), this implies that the user wishes that this stack will synthesize into an account-agnostic template. In this case, your code should either fail (throw an error, emit a synth error using Annotations.of(construct).addError()) or implement some other account-agnostic behavior.


artifactIdRequired
public readonly artifactId: string;
  • Type: string

The ID of the cloud assembly artifact for this stack.


availabilityZonesRequired
public readonly availabilityZones: string[];
  • Type: string[]

Returns the list of AZs that are available in the AWS environment (account/region) associated with this stack.

If the stack is environment-agnostic (either account and/or region are tokens), this property will return an array with 2 tokens that will resolve at deploy-time to the first two availability zones returned from CloudFormation's Fn::GetAZs intrinsic function.

If they are not available in the context, returns a set of dummy values and reports them as missing, letting the CLI resolve them by calling EC2 DescribeAvailabilityZones on the target environment.

To specify a different strategy for selecting availability zones override this method.


bundlingRequiredRequired
public readonly bundlingRequired: boolean;
  • Type: boolean

Indicates whether the stack requires bundling or not.


dependenciesRequired
public readonly dependencies: Stack[];
  • Type: aws-cdk-lib.Stack[]

Return the stacks this stack depends on.


environmentRequired
public readonly environment: string;
  • Type: string

The environment coordinates in which this stack is deployed.

In the form aws://account/region. Use stack.account and stack.region to obtain the specific values, no need to parse.

You can use this value to determine if two stacks are targeting the same environment.

If either stack.account or stack.region are not concrete values (e.g. Aws.ACCOUNT_ID or Aws.REGION) the special strings unknown-account and/or unknown-region will be used respectively to indicate this stack is region/account-agnostic.
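The environment string and its placeholders can be sketched as follows. Here undefined stands in for an unresolved token, which is a simplification, and environmentString is a hypothetical helper rather than a CDK function.

```typescript
// Builds aws://account/region, using the special strings for missing values.
// `undefined` plays the role of an unresolved token in this sketch.
function environmentString(account?: string, region?: string): string {
  return `aws://${account ?? 'unknown-account'}/${region ?? 'unknown-region'}`;
}

// Two stacks target the same environment when their strings are equal.
function sameEnvironment(a: string, b: string): boolean {
  return a === b;
}

// Placeholder account ID and regions.
const concreteEnv = environmentString('123456789012', 'ap-northeast-1');
const agnosticEnv = environmentString();
const regionOnlyEnv = environmentString(undefined, 'us-west-2');
```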


nestedRequired
public readonly nested: boolean;
  • Type: boolean

Indicates if this is a nested stack, in which case parentStack will include a reference to its parent.


notificationArnsRequired
public readonly notificationArns: string[];
  • Type: string[]

Returns the list of notification Amazon Resource Names (ARNs) for the current stack.


partitionRequired
public readonly partition: string;
  • Type: string

The partition in which this stack is defined.


regionRequired
public readonly region: string;
  • Type: string

The AWS region into which this stack will be deployed (e.g. us-west-2).

This value is resolved according to the following rules:

  1. The value provided to env.region when the stack is defined. This can either be a concrete region (e.g. us-west-2) or the Aws.REGION token.
  2. Aws.REGION, which represents the CloudFormation intrinsic reference { "Ref": "AWS::Region" } encoded as a string token.

Preferably, you should use the return value as an opaque string and not attempt to parse it to implement your logic. If you do, you must first check that it is a concrete value and not an unresolved token. If this value is an unresolved token (Token.isUnresolved(stack.region) returns true), this implies that the user wishes that this stack will synthesize into a region-agnostic template. In this case, your code should either fail (throw an error, emit a synth error using Annotations.of(construct).addError()) or implement some other region-agnostic behavior.


stackIdRequired
public readonly stackId: string;
  • Type: string

The ID of the stack.


Example

// After resolving, looks like
'arn:aws:cloudformation:us-west-2:123456789012:stack/teststack/51af3dc0-da77-11e4-872e-1234567db123'
stackNameRequired
public readonly stackName: string;
  • Type: string

The concrete CloudFormation physical stack name.

This is either the name defined explicitly in the stackName prop or allocated based on the stack's location in the construct tree. Stacks that are directly defined under the app use their construct id as their stack name. Stacks that are defined deeper within the tree will use a hashed naming scheme based on the construct path to ensure uniqueness.

If you wish to obtain the deploy-time AWS::StackName intrinsic, you can use Aws.STACK_NAME directly.


synthesizerRequired
public readonly synthesizer: IStackSynthesizer;
  • Type: aws-cdk-lib.IStackSynthesizer

Synthesis method for this stack.


tagsRequired
public readonly tags: TagManager;
  • Type: aws-cdk-lib.TagManager

Tags to be applied to the stack.


templateFileRequired
public readonly templateFile: string;
  • Type: string

The name of the CloudFormation template file emitted to the output directory during synthesis.

Example value: MyStack.template.json


templateOptionsRequired
public readonly templateOptions: ITemplateOptions;
  • Type: aws-cdk-lib.ITemplateOptions

Options for CloudFormation template (like version, transform, description).


urlSuffixRequired
public readonly urlSuffix: string;
  • Type: string

The Amazon domain suffix for the region in which this stack is defined.


nestedStackParentOptional
public readonly nestedStackParent: Stack;
  • Type: aws-cdk-lib.Stack

If this is a nested stack, returns its parent stack.


nestedStackResourceOptional
public readonly nestedStackResource: CfnResource;
  • Type: aws-cdk-lib.CfnResource

If this is a nested stack, this represents its AWS::CloudFormation::Stack resource.

undefined for top-level (non-nested) stacks.


terminationProtectionRequired
public readonly terminationProtection: boolean;
  • Type: boolean

Whether termination protection is enabled for this stack.


EmrServerless

Creates an EMR Studio, an EMR cluster template for the studio, and an EMR Serverless application.

// the quickest deployment
new EmrServerless(this, 'EmrServerless');

// custom deployment references
new EmrServerless(this, 'EmrServerless', {
   vpcId: 'vpc-idididid',
});

new EmrServerless(this, 'EmrServerless', {
   vpcId: 'vpc-idididid',
   subnetIds: ['subnet-eeeee', 'subnet-fffff']
});

const myRole = iam.Role.fromRoleName(this, 'MyRole', 'MyRole');
new EmrServerless(this, 'EmrServerless', {
   serviceCatalogProps: {
       role: myRole
   }
});

const myUser = iam.User.fromUserName(this, 'MyUser', 'MyUser');
new EmrServerless(this, 'EmrServerless', {
   vpcId: 'vpc-idididid',
   subnetIds: ['subnet-eeeee', 'subnet-fffff'],
   serviceCatalogProps: {
       user: myUser
   }
});

const myGroup = iam.Group.fromGroupName(this, 'MyGroup', 'MyGroup');
new EmrServerless(this, 'EmrServerless', {
   serviceCatalogProps: {
       group: myGroup
   }
});

Initializers

import { EmrServerless } from 'cdk-emrserverless-with-delta-lake'

new EmrServerless(scope: Construct, name: string, props?: EmrServerlessProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props EmrServerlessProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsOptional
  • Type: EmrServerlessProps

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { EmrServerless } from 'cdk-emrserverless-with-delta-lake'

EmrServerless.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.
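The explanation above can be made concrete with a small, dependency-free sketch. It does not use the real constructs library: `loadLibraryCopy` is a hypothetical stand-in for loading one installed copy, and the symbol key `demo.Construct` is made up for illustration. The point it shows is only that `instanceof` depends on class identity, while a brand stored under a registry-wide symbol does not.

```typescript
// Simulates loading one copy of a construct library from disk.
// Every call returns a brand-new class, just as two installed copies would.
function loadLibraryCopy() {
  // Symbol.for() reads from the global symbol registry, so every copy
  // obtains the very same symbol. The key 'demo.Construct' is made up.
  const marker = Symbol.for('demo.Construct');

  return class Construct {
    readonly id: string;

    constructor(id: string) {
      this.id = id;
      // Brand the instance so any copy of the library can recognise it.
      Object.defineProperty(this, marker, { value: true });
    }

    // Type test based on the brand instead of on class identity.
    static isConstruct(x: any): boolean {
      return x !== null && typeof x === 'object' && marker in x;
    }
  };
}

const CopyA = loadLibraryCopy();
const CopyB = loadLibraryCopy();

const fromA = new CopyA('MyBucket');

// Same copy: instanceof behaves as expected.
const sameCopy = fromA instanceof CopyA;
// Different copy: instanceof fails even though the classes look identical.
const crossCopyInstanceof = fromA instanceof CopyB;
// The brand check still recognises the instance across copies.
const crossCopyIsConstruct = CopyB.isConstruct(fromA);
// Ordinary objects are rejected.
const plainObject = CopyB.isConstruct({ id: 'not-a-construct' });

console.log({ sameCopy, crossCopyInstanceof, crossCopyIsConstruct, plainObject });
```

In application code the practical takeaway is the one the documentation states: prefer `EmrServerless.isConstruct(x)` over `x instanceof EmrServerless` when the value may come from another copy of the library.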

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


EmrServerlessBucket

Creates a bucket for EMR Serverless applications.

const emrServerlessBucket = new EmrServerlessBucket(this, 'EmrServerless');

Initializers

import { EmrServerlessBucket } from 'cdk-emrserverless-with-delta-lake'

new EmrServerlessBucket(scope: Construct, name: string, props?: EmrServerlessBucketProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props EmrServerlessBucketProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsOptional

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { EmrServerlessBucket } from 'cdk-emrserverless-with-delta-lake'

EmrServerlessBucket.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
bucketEntity aws-cdk-lib.aws_s3.Bucket No description.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


bucketEntityRequired
public readonly bucketEntity: Bucket;
  • Type: aws-cdk-lib.aws_s3.Bucket

EmrStudio

Creates an EMR Studio for EMR Serverless applications.

The Studio is not only for EMR Serverless applications. It can also launch an EMR cluster via the cluster template created in this construct, so that you can check the results produced by EMR Serverless applications.

To learn more about what the Studio can do, please refer to Amazon EMR Studio.

const workspaceBucket = new WorkSpaceBucket(this, 'EmrStudio');
const emrStudio = new EmrStudio(this, 'Studio', {
   workSpaceBucket: workspaceBucket,
   subnetIds: ['subnet1', 'subnet2', 'subnet3']
});

Initializers

import { EmrStudio } from 'cdk-emrserverless-with-delta-lake'

new EmrStudio(scope: Construct, name: string, props: EmrStudioProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props EmrStudioProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsRequired

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { EmrStudio } from 'cdk-emrserverless-with-delta-lake'

EmrStudio.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
entity aws-cdk-lib.aws_emr.CfnStudio No description.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


entityRequired
public readonly entity: CfnStudio;
  • Type: aws-cdk-lib.aws_emr.CfnStudio

EmrStudioDeveloperStack

Creates a Service Catalog for EMR cluster templates.

For detail, please refer to Create AWS CloudFormation templates for Amazon EMR Studio.

const emrClusterTemplatePortfolio = new EmrStudioDeveloperStack(this, 'ClusterTemplate');

Initializers

import { EmrStudioDeveloperStack } from 'cdk-emrserverless-with-delta-lake'

new EmrStudioDeveloperStack(scope: Construct, name: string, props?: EmrStudioDeveloperStackProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props EmrStudioDeveloperStackProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsOptional

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { EmrStudioDeveloperStack } from 'cdk-emrserverless-with-delta-lake'

EmrStudioDeveloperStack.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
portfolio aws-cdk-lib.aws_servicecatalog.Portfolio The representative of the service catalog for EMR cluster templates.
product aws-cdk-lib.aws_servicecatalog.Product The representative of the product for demo purposes.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


portfolioRequired
public readonly portfolio: Portfolio;
  • Type: aws-cdk-lib.aws_servicecatalog.Portfolio

The representative of the service catalog for EMR cluster templates.


productRequired
public readonly product: Product;
  • Type: aws-cdk-lib.aws_servicecatalog.Product

The representative of the product for demo purposes.


EmrStudioEngineSecurityGroup

Creates an engine security group for EMR notebooks.

For detail, please refer to Engine security group.

const workSpaceSecurityGroup = new EmrStudioWorkspaceSecurityGroup(this, 'Workspace', { vpc: baseVpc });
const engineSecurityGroup = new EmrStudioEngineSecurityGroup(this, 'Engine', { vpc: baseVpc });
workSpaceSecurityGroup.entity.connections.allowTo(engineSecurityGroup.entity, ec2.Port.tcp(18888), 'Allow traffic to any resources in the Engine security group for EMR Studio.');
workSpaceSecurityGroup.entity.addEgressRule(ec2.Peer.anyIpv4(), ec2.Port.tcp(443), 'Allow traffic to the internet to link publicly hosted Git repositories to Workspaces.');

Initializers

import { EmrStudioEngineSecurityGroup } from 'cdk-emrserverless-with-delta-lake'

new EmrStudioEngineSecurityGroup(scope: Construct, name: string, props: EmrStudioEngineSecurityGroupProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props EmrStudioEngineSecurityGroupProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsRequired

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { EmrStudioEngineSecurityGroup } from 'cdk-emrserverless-with-delta-lake'

EmrStudioEngineSecurityGroup.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
entity aws-cdk-lib.aws_ec2.SecurityGroup The representative of the security group as the EMR Studio engine security group.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


entityRequired
public readonly entity: SecurityGroup;
  • Type: aws-cdk-lib.aws_ec2.SecurityGroup

The representative of the security group as the EMR Studio engine security group.


EmrStudioServiceRole

Creates a default service role for an EMR Studio.

For detail, please refer to Create an EMR Studio service role.

const workSpaceBucket = new WorkSpaceBucket(this, 'WorkSpace');
const emrStudioServiceRole = new EmrStudioServiceRole(this, 'Service', {
     workSpaceBucket: workSpaceBucket
});

Initializers

import { EmrStudioServiceRole } from 'cdk-emrserverless-with-delta-lake'

new EmrStudioServiceRole(scope: Construct, name: string, props: EmrStudioServiceRoleProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props EmrStudioServiceRoleProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsRequired

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { EmrStudioServiceRole } from 'cdk-emrserverless-with-delta-lake'

EmrStudioServiceRole.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
roleEntity aws-cdk-lib.aws_iam.Role The representative of the default service role for EMR Studio.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


roleEntityRequired
public readonly roleEntity: Role;
  • Type: aws-cdk-lib.aws_iam.Role

The representative of the default service role for EMR Studio.


EmrStudioTaggingExpert

Creates a Lambda function for the custom resource that adds the necessary tag to the VPC and subnets for the EMR Studio during deployment.

For detail on the tag, please refer to How to create a service role for EMR Studio.
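As stated in the introduction of this README, the tag in question is `{"Key": "for-use-with-amazon-emr-managed-policies", "Value": "true"}`. The sketch below shows what a tagging request for the VPC and its subnets could look like. It only builds a plain object in the `Resources` / `Tags` shape used by the EC2 `CreateTags` API and does not call AWS. The function name `buildEmrStudioTagRequest` and the IDs are hypothetical, and the actual code of the Lambda function in this construct may differ.

```typescript
// Shape of a tag and of an EC2 CreateTags request, declared locally so the
// sketch stays dependency-free.
interface Tag {
  Key: string;
  Value: string;
}

interface CreateTagsInput {
  Resources: string[];
  Tags: Tag[];
}

// The tag required by the EMR managed policies, as quoted in this README.
const EMR_MANAGED_POLICY_TAG: Tag = {
  Key: 'for-use-with-amazon-emr-managed-policies',
  Value: 'true',
};

// Hypothetical helper: one request that covers the VPC and all its subnets.
function buildEmrStudioTagRequest(vpcId: string, subnetIds: string[]): CreateTagsInput {
  return {
    Resources: [vpcId, ...subnetIds],
    Tags: [EMR_MANAGED_POLICY_TAG],
  };
}

// Placeholder IDs, reused from the examples in this document.
const tagRequest = buildEmrStudioTagRequest('vpc-idididid', ['subnet-eeeee', 'subnet-fffff']);
console.log(JSON.stringify(tagRequest, null, 2));
```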

Initializers

import { EmrStudioTaggingExpert } from 'cdk-emrserverless-with-delta-lake'

new EmrStudioTaggingExpert(scope: Construct, name: string)
Name Type Description
scope constructs.Construct No description.
name string No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { EmrStudioTaggingExpert } from 'cdk-emrserverless-with-delta-lake'

EmrStudioTaggingExpert.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
functionEntity aws-cdk-lib.aws_lambda.Function The representative of the Lambda function for the custom resource that adds the necessary tag to the VPC and subnets for the EMR Studio during deployment.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


functionEntityRequired
public readonly functionEntity: Function;
  • Type: aws-cdk-lib.aws_lambda.Function

The representative of the Lambda function for the custom resource that adds the necessary tag to the VPC and subnets for the EMR Studio during deployment.


EmrStudioWorkspaceSecurityGroup

Creates a workspace security group for EMR Studio.

For detail, please refer to Workspace security group.

const workSpaceSecurityGroup = new EmrStudioWorkspaceSecurityGroup(this, 'Workspace', { vpc: baseVpc });
const engineSecurityGroup = new EmrStudioEngineSecurityGroup(this, 'Engine', { vpc: baseVpc });
workSpaceSecurityGroup.entity.connections.allowTo(engineSecurityGroup.entity, ec2.Port.tcp(18888), 'Allow traffic to any resources in the Engine security group for EMR Studio.');
workSpaceSecurityGroup.entity.addEgressRule(ec2.Peer.anyIpv4(), ec2.Port.tcp(443), 'Allow traffic to the internet to link publicly hosted Git repositories to Workspaces.');

Initializers

import { EmrStudioWorkspaceSecurityGroup } from 'cdk-emrserverless-with-delta-lake'

new EmrStudioWorkspaceSecurityGroup(scope: Construct, name: string, props: EmrStudioWorkspaceSecurityGroupProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props EmrStudioWorkspaceSecurityGroupProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsRequired

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { EmrStudioWorkspaceSecurityGroup } from 'cdk-emrserverless-with-delta-lake'

EmrStudioWorkspaceSecurityGroup.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
entity aws-cdk-lib.aws_ec2.SecurityGroup The representative of the security group as the EMR Studio workspace security group.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


entityRequired
public readonly entity: SecurityGroup;
  • Type: aws-cdk-lib.aws_ec2.SecurityGroup

The representative of the security group as the EMR Studio workspace security group.


ServerlessJobRole

Creates an execution job role for EMR Serverless.

For detail, please refer to Create a job runtime role.

const emrServerlessBucket = new EmrServerlessBucket(this, 'EmrServerlessStorage');
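// ServerlessJobRoleProps.emrServerlessBucket is typed as aws_s3.Bucket, so pass the underlying bucket.
const emrServerlessJobRole = new ServerlessJobRole(this, 'EmrServerlessJob', { emrServerlessBucket: emrServerlessBucket.bucketEntity });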

Initializers

import { ServerlessJobRole } from 'cdk-emrserverless-with-delta-lake'

new ServerlessJobRole(scope: Construct, name: string, props: ServerlessJobRoleProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props ServerlessJobRoleProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsRequired

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { ServerlessJobRole } from 'cdk-emrserverless-with-delta-lake'

ServerlessJobRole.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
entity aws-cdk-lib.aws_iam.Role The representative of the execution role for EMR Serverless.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


entityRequired
public readonly entity: Role;
  • Type: aws-cdk-lib.aws_iam.Role

The representative of the execution role for EMR Serverless.


WorkSpaceBucket

Initializers

import { WorkSpaceBucket } from 'cdk-emrserverless-with-delta-lake'

new WorkSpaceBucket(scope: Construct, name: string, props?: WorkSpaceBucketProps)
Name Type Description
scope constructs.Construct No description.
name string No description.
props WorkSpaceBucketProps No description.

scopeRequired
  • Type: constructs.Construct

nameRequired
  • Type: string

propsOptional

Methods

Name Description
toString Returns a string representation of this construct.

toString
public toString(): string

Returns a string representation of this construct.

Static Functions

Name Description
isConstruct Checks if x is a construct.

isConstruct
import { WorkSpaceBucket } from 'cdk-emrserverless-with-delta-lake'

WorkSpaceBucket.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and using this type-testing method instead.

xRequired
  • Type: any

Any object.


Properties

Name Type Description
node constructs.Node The tree node.
bucketEntity aws-cdk-lib.aws_s3.Bucket No description.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


bucketEntityRequired
public readonly bucketEntity: Bucket;
  • Type: aws-cdk-lib.aws_s3.Bucket

Structs

EmrServerlessBucketProps

Properties for the EMR Serverless bucket.

Initializer

import { EmrServerlessBucketProps } from 'cdk-emrserverless-with-delta-lake'

const emrServerlessBucketProps: EmrServerlessBucketProps = { ... }

Properties

Name Type Description
bucketName string The bucket name for EMR Serverless applications.
removalPolicy aws-cdk-lib.RemovalPolicy Policy to apply when the bucket is removed from this stack.

bucketNameOptional
public readonly bucketName: string;
  • Type: string
  • Default: 'emr-serverless-AWS::AccountId'

The bucket name for EMR Serverless applications.
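Reading the default above as the prefix `emr-serverless-` followed by the AWS account ID, the following sketch shows how such a name could be derived and how a custom `bucketName` could be sanity-checked before deployment. Both function names are hypothetical and not part of this library, and the check covers only the basic S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; a letter or digit at both ends).

```typescript
// Prefix taken from the documented default 'emr-serverless-AWS::AccountId'.
const DEFAULT_BUCKET_PREFIX = 'emr-serverless-';

// Hypothetical helper: composes the default name from an account ID.
function defaultBucketName(accountId: string): string {
  return `${DEFAULT_BUCKET_PREFIX}${accountId}`;
}

// Hypothetical helper: checks only the basic S3 bucket naming rules.
function looksLikeValidBucketName(name: string): boolean {
  return /^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$/.test(name);
}

// 123456789012 is a placeholder account ID.
const derivedName = defaultBucketName('123456789012');
const derivedIsValid = looksLikeValidBucketName(derivedName);
// Uppercase letters and underscores are not allowed in bucket names.
const customIsValid = looksLikeValidBucketName('My_Custom_Bucket');

console.log({ derivedName, derivedIsValid, customIsValid });
```

Because bucket names are globally unique, including the account ID in the default keeps the name from colliding with buckets owned by other accounts.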


removalPolicyOptional
public readonly removalPolicy: RemovalPolicy;
  • Type: aws-cdk-lib.RemovalPolicy
  • Default: The bucket will be deleted.

Policy to apply when the bucket is removed from this stack.


EmrServerlessProps

Initializer

import { EmrServerlessProps } from 'cdk-emrserverless-with-delta-lake'

const emrServerlessProps: EmrServerlessProps = { ... }

Properties

Name Type Description
serviceCatalogProps EmrStudioDeveloperStackProps Options for which kind of identity will be associated with the Product of the Portfolio in AWS Service Catalog for EMR cluster templates.
subnetIds string[] The subnet IDs for the EMR Studio.
vpcId string Used by the EMR Studio.

serviceCatalogPropsOptional
public readonly serviceCatalogProps: EmrStudioDeveloperStackProps;

Options for which kind of identity will be associated with the Product of the Portfolio in AWS Service Catalog for EMR cluster templates.

You can choose an IAM group, an IAM role, or an IAM user. If you leave it empty, the IAM user named Administrator is used, so that user must already exist with the AdministratorAccess power.
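The selection rule described above can be sketched as a small resolver. This is an illustration only, not the construct's code: identities are represented by plain names instead of IAM objects, `resolvePrincipal` is a hypothetical function, and rejecting more than one identity is an assumption, since the documentation does not say how the construct behaves when several are given.

```typescript
// Plain-name stand-ins for the group, role, and user properties.
interface IdentityChoice {
  group?: string;
  role?: string;
  user?: string;
}

interface Principal {
  kind: 'group' | 'role' | 'user';
  name: string;
}

// Hypothetical resolver: picks the single identity that was supplied, or
// falls back to the IAM user named Administrator.
function resolvePrincipal(choice: IdentityChoice = {}): Principal {
  const given: Principal[] = [];
  if (choice.group) given.push({ kind: 'group', name: choice.group });
  if (choice.role) given.push({ kind: 'role', name: choice.role });
  if (choice.user) given.push({ kind: 'user', name: choice.user });

  // Assumption: supplying more than one identity is treated as a mistake.
  if (given.length > 1) {
    throw new Error('Specify only one of group, role, or user.');
  }

  const [first] = given;
  if (first !== undefined) {
    return first;
  }
  // Documented fallback: the pre-existing Administrator user.
  return { kind: 'user', name: 'Administrator' };
}

const fallbackPrincipal = resolvePrincipal();
const rolePrincipal = resolvePrincipal({ role: 'MyRole' });
console.log(fallbackPrincipal, rolePrincipal);
```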


subnetIdsOptional
public readonly subnetIds: string[];
  • Type: string[]

The subnet IDs for the EMR Studio.

You can select the subnets from the default VPC in your AWS account.


vpcIdOptional
public readonly vpcId: string;
  • Type: string
  • Default: 'The default VPC will be used.'

Used by the EMR Studio.


EmrStudioDeveloperStackProps

Interface for Service Catalog of EMR cluster templates.

Initializer

import { EmrStudioDeveloperStackProps } from 'cdk-emrserverless-with-delta-lake'

const emrStudioDeveloperStackProps: EmrStudioDeveloperStackProps = { ... }

Properties

Name Type Description
group aws-cdk-lib.aws_iam.IGroup An IAM group you wish to associate with the Portfolio for EMR cluster templates.
providerName string The provider name in a Service Catalog for EMR cluster templates.
role aws-cdk-lib.aws_iam.IRole An IAM role you wish to associate with the Portfolio for EMR cluster templates.
user aws-cdk-lib.aws_iam.IUser An IAM user you wish to associate with the Portfolio for EMR cluster templates.

groupOptional
public readonly group: IGroup;
  • Type: aws-cdk-lib.aws_iam.IGroup

An IAM group you wish to associate with the Portfolio for EMR cluster templates.


providerNameOptional
public readonly providerName: string;
  • Type: string
  • Default: 'scott.hsieh'

The provider name in a Service Catalog for EMR cluster templates.


roleOptional
public readonly role: IRole;
  • Type: aws-cdk-lib.aws_iam.IRole

An IAM role you wish to associate with the Portfolio for EMR cluster templates.


userOptional
public readonly user: IUser;
  • Type: aws-cdk-lib.aws_iam.IUser

An IAM user you wish to associate with the Portfolio for EMR cluster templates.


EmrStudioEngineSecurityGroupProps

Interface for engine security group of EMR Studio.

Initializer

import { EmrStudioEngineSecurityGroupProps } from 'cdk-emrserverless-with-delta-lake'

const emrStudioEngineSecurityGroupProps: EmrStudioEngineSecurityGroupProps = { ... }

Properties

Name Type Description
vpc aws-cdk-lib.aws_ec2.IVpc The VPC in which to create the engine security group for EMR Studio.

vpcRequired
public readonly vpc: IVpc;
  • Type: aws-cdk-lib.aws_ec2.IVpc
  • Default: default VPC in an AWS account.

The VPC in which to create the engine security group for EMR Studio.


EmrStudioProps

Options for the EMR Studio, mainly for EMR Serverless applications.

Initializer

import { EmrStudioProps } from 'cdk-emrserverless-with-delta-lake'

const emrStudioProps: EmrStudioProps = { ... }

Properties

Name Type Description
workSpaceBucket WorkSpaceBucket The custom construct as the workspace S3 bucket.
authMode StudioAuthMode Specifies whether the Studio authenticates users using AWS SSO or IAM.
description string A detailed description of the Amazon EMR Studio.
engineSecurityGroupId string The ID of the Amazon EMR Studio Engine security group.
serviceCatalogProps EmrStudioDeveloperStackProps Options for which kind of identity will be associated with the Product of the Portfolio in AWS Service Catalog for EMR cluster templates.
serviceRoleArn string No description.
serviceRoleName string A name for the service role of an EMR Studio.
studioName string A descriptive name for the Amazon EMR Studio.
subnetIds string[] The subnet IDs for the EMR Studio.
userRoleArn string The custom user role for the EMR Studio when authentication is AWS SSO.
vpcId string Used by the EMR Studio.
workSpaceSecurityGroupId string The ID of the security group used by the workspace.

workSpaceBucketRequired
public readonly workSpaceBucket: WorkSpaceBucket;

The custom construct as the workspace S3 bucket.


authModeOptional
public readonly authMode: StudioAuthMode;

Specifies whether the Studio authenticates users using AWS SSO or IAM.


descriptionOptional
public readonly description: string;
  • Type: string
  • Default: 'EMR Studio Quick Launch - by scott.hsieh'

A detailed description of the Amazon EMR Studio.


engineSecurityGroupIdOptional
public readonly engineSecurityGroupId: string;
  • Type: string
  • Default: a security group created by EmrStudioEngineSecurityGroup.

The ID of the Amazon EMR Studio Engine security group.

The Engine security group allows inbound network traffic from the Workspace security group, and it must be in the same VPC specified by VpcId.


serviceCatalogPropsOptional
public readonly serviceCatalogProps: EmrStudioDeveloperStackProps;

Options for which kind of identity will be associated with the Product of the Portfolio in AWS Service Catalog for EMR cluster templates.

You can choose an IAM group, an IAM role, or an IAM user. If you leave it empty, the IAM user named Administrator is used, so that user must already exist with the AdministratorAccess power.


serviceRoleArnOptional
public readonly serviceRoleArn: string;
  • Type: string

serviceRoleNameOptional
public readonly serviceRoleName: string;
  • Type: string
  • Default: 'emr-studio-service-role'

A name for the service role of an EMR Studio.

For valid values, see the RoleName parameter for the CreateRole action in the IAM API Reference.

IMPORTANT: If you specify a name, you cannot perform updates that require replacement of this resource. You can perform updates that require no or some interruption. If you must replace the resource, specify a new name.

If you specify a name, you must specify the CAPABILITY_NAMED_IAM value to acknowledge your template's capabilities. For more information, see Acknowledging IAM Resources in AWS CloudFormation Templates.


studioNameOptional
public readonly studioName: string;
  • Type: string
  • Default: 'emr-sutdio-quicklaunch'

A descriptive name for the Amazon EMR Studio.


subnetIdsOptional
public readonly subnetIds: string[];
  • Type: string[]

The subnet IDs for the EMR Studio.

You can select the subnets from the default VPC in your AWS account.


userRoleArnOptional
public readonly userRoleArn: string;
  • Type: string

The custom user role for the EMR Studio when authentication is AWS SSO.

Currently, if the EMR Studio uses AWS SSO as its authentication mechanism, you need to create a user role yourself and assign its ARN to this argument. For details, see:

https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-studio-user-permissions.html
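The dependency between `authMode` and `userRoleArn` described above can be checked before deployment with a few lines. The sketch below is an assumption-laden illustration: `AuthMode` is a local stand-in for the library's `StudioAuthMode` (whose member names are not shown in this document), `checkStudioAuth` is a hypothetical helper, and the role ARN is a placeholder.

```typescript
// Local stand-in for StudioAuthMode; 'SSO' and 'IAM' are the values accepted
// by the AuthMode property of the AWS::EMR::Studio resource.
type AuthMode = 'SSO' | 'IAM';

interface StudioAuthOptions {
  authMode: AuthMode;
  userRoleArn?: string;
}

// Hypothetical helper: returns a list of problems, empty when the options
// are consistent with the rule described above.
function checkStudioAuth(options: StudioAuthOptions): string[] {
  const problems: string[] = [];
  if (options.authMode === 'SSO' && !options.userRoleArn) {
    problems.push('authMode is SSO, so userRoleArn must point to a user role you created.');
  }
  return problems;
}

const ssoWithoutRole = checkStudioAuth({ authMode: 'SSO' });
const ssoWithRole = checkStudioAuth({
  authMode: 'SSO',
  // Placeholder ARN for illustration only.
  userRoleArn: 'arn:aws:iam::123456789012:role/MyStudioUserRole',
});
const iamOnly = checkStudioAuth({ authMode: 'IAM' });

console.log({ ssoWithoutRole, ssoWithRole, iamOnly });
```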


vpcIdOptional
public readonly vpcId: string;
  • Type: string
  • Default: 'The default VPC will be used.'

Used by the EMR Studio.


workSpaceSecurityGroupIdOptional
public readonly workSpaceSecurityGroupId: string;
  • Type: string
  • Default: a security group created by EmrStudioWorkspaceSecurityGroup.

The ID of the security group used by the workspace.


EmrStudioServiceRoleProps

Properties for defining the service role of an EMR Studio.

Initializer

import { EmrStudioServiceRoleProps } from 'cdk-emrserverless-with-delta-lake'

const emrStudioServiceRoleProps: EmrStudioServiceRoleProps = { ... }

Properties

Name Type Description
workSpaceBucket WorkSpaceBucket The custom construct as the workspace S3 bucket.
roleName string A name for the service role of an EMR Studio.

workSpaceBucketRequired
public readonly workSpaceBucket: WorkSpaceBucket;

The custom construct as the workspace S3 bucket.


roleNameOptional
public readonly roleName: string;
  • Type: string
  • Default: 'emr-studio-service-role'

A name for the service role of an EMR Studio.

For valid values, see the RoleName parameter for the CreateRole action in the IAM API Reference.

IMPORTANT: If you specify a name, you cannot perform updates that require replacement of this resource. You can perform updates that require no or some interruption. If you must replace the resource, specify a new name.

If you specify a name, you must specify the CAPABILITY_NAMED_IAM value to acknowledge your template's capabilities. For more information, see Acknowledging IAM Resources in AWS CloudFormation Templates.
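Because a custom role name is passed through to IAM, it can help to validate it before synthesizing the stack. The sketch below encodes the constraints of the RoleName parameter of the CreateRole action (1 to 64 characters from letters, digits, and `_+=,.@-`); `isValidRoleName` is a hypothetical helper and not part of this library.

```typescript
// Constraints of the RoleName parameter of the IAM CreateRole action:
// 1 to 64 characters drawn from letters, digits, underscore, and + = , . @ -
const ROLE_NAME_PATTERN = /^[\w+=,.@-]{1,64}$/;

// Hypothetical helper: returns true when the name satisfies the constraints above.
function isValidRoleName(roleName: string): boolean {
  return ROLE_NAME_PATTERN.test(roleName);
}

console.log(isValidRoleName('emr-studio-service-role')); // true
console.log(isValidRoleName('emr studio service role')); // false (spaces are not allowed)
```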


EmrStudioWorkspaceSecurityGroupProps

Interface for the workspace security group of an EMR Studio.

Initializer

import { EmrStudioWorkspaceSecurityGroupProps } from 'cdk-emrserverless-with-delta-lake'

const emrStudioWorkspaceSecurityGroupProps: EmrStudioWorkspaceSecurityGroupProps = { ... }

Properties

| Name | Type | Description |
| --- | --- | --- |
| vpc | aws-cdk-lib.aws_ec2.IVpc | The VPC in which to create the workspace security group for the EMR Studio. |

vpc (Required)
public readonly vpc: IVpc;
  • Type: aws-cdk-lib.aws_ec2.IVpc
  • Default: default VPC in an AWS account.

The VPC in which to create the workspace security group for the EMR Studio.


ServerlessJobRoleProps

Options for the execution job role of EMR Serverless.

Initializer

import { ServerlessJobRoleProps } from 'cdk-emrserverless-with-delta-lake'

const serverlessJobRoleProps: ServerlessJobRoleProps = { ... }

Properties

| Name | Type | Description |
| --- | --- | --- |
| emrServerlessBucket | aws-cdk-lib.aws_s3.Bucket | The EMR Serverless bucket. |

emrServerlessBucket (Required)
public readonly emrServerlessBucket: Bucket;
  • Type: aws-cdk-lib.aws_s3.Bucket

The EMR Serverless bucket.


WorkSpaceBucketProps

Initializer

import { WorkSpaceBucketProps } from 'cdk-emrserverless-with-delta-lake'

const workSpaceBucketProps: WorkSpaceBucketProps = { ... }

Properties

| Name | Type | Description |
| --- | --- | --- |
| bucketName | string | The bucket name for the workspace of an EMR Studio. |
| removalPolicy | aws-cdk-lib.RemovalPolicy | Policy to apply when the bucket is removed from this stack. |

bucketName (Optional)
public readonly bucketName: string;
  • Type: string
  • Default: 'emr-studio-workspace-bucket-AWS::AccountId'

The bucket name for the workspace of an EMR Studio.
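To make the default concrete, the following sketch shows how such a name could be derived, assuming the `AWS::AccountId` placeholder in the default resolves to the 12-digit account ID at deployment time. `defaultWorkspaceBucketName` is a hypothetical helper for illustration; the construct itself may build the name differently.

```typescript
// Hypothetical helper: builds the default workspace bucket name for an account,
// assuming the AWS::AccountId placeholder resolves to the 12-digit account ID.
function defaultWorkspaceBucketName(accountId: string): string {
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error(`Expected a 12-digit AWS account ID, got "${accountId}".`);
  }
  return `emr-studio-workspace-bucket-${accountId}`;
}

console.log(defaultWorkspaceBucketName('123456789012'));
// emr-studio-workspace-bucket-123456789012
```

Since S3 bucket names are globally unique, suffixing the account ID reduces the chance of a name collision with another account.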


removalPolicy (Optional)
public readonly removalPolicy: RemovalPolicy;
  • Type: aws-cdk-lib.RemovalPolicy
  • Default: The bucket will be deleted.

Policy to apply when the bucket is removed from this stack.


Enums

StudioAuthMode

What kind of authentication the Studio uses.

Members

| Name | Description |
| --- | --- |
| AWS_SSO | The Studio authenticates users using AWS SSO. |
| AWS_IAM | The Studio authenticates users using AWS IAM. |

AWS_SSO

The Studio authenticates users using AWS SSO.


AWS_IAM

The Studio authenticates users using AWS IAM.