Simplify data lake access management for your enterprise users with trusted identity propagation in AWS IAM Identity Center, AWS Lake Formation, and Amazon S3 Access Grants


Many organizations use external identity providers (IdPs) such as Okta or Microsoft Azure Active Directory to manage their enterprise user identities. These users interact with and run analytical queries across AWS analytics services. To enable them to use the AWS services, their identities from the external IdP are mapped to AWS Identity and Access Management (IAM) roles within AWS, and access policies are applied to these IAM roles by data administrators.

Given the diverse range of services involved, different IAM roles may be required for accessing the data. Consequently, administrators need to manage permissions across multiple roles, a task that can become cumbersome at scale.

To address this challenge, you need a unified solution to simplify data access management using your corporate user identities instead of relying solely on IAM roles. AWS IAM Identity Center offers a solution through its trusted identity propagation feature, which is built upon the OAuth 2.0 authorization framework.

With trusted identity propagation, data access management is anchored to a user's identity, which can be synchronized to IAM Identity Center from external IdPs using the System for Cross-domain Identity Management (SCIM) protocol. Integrated applications exchange OAuth tokens, and these tokens are propagated across services. This approach empowers administrators to grant access directly based on existing user and group memberships federated from external IdPs, rather than relying on IAM users or roles.

In this post, we showcase the seamless integration of AWS analytics services with trusted identity propagation by presenting an end-to-end architecture for data access flows.

Solution overview

Let's consider a fictional company, OkTank. OkTank has multiple user personas that use a variety of AWS analytics services. The user identities are managed externally in an external IdP: Okta. User1 is a Data Analyst and uses the Amazon Athena query editor to query AWS Glue Data Catalog tables with data stored in Amazon Simple Storage Service (Amazon S3). User2 is a Data Engineer and uses Amazon EMR Studio notebooks to query Data Catalog tables and also query raw data stored in Amazon S3 that is not yet cataloged to the Data Catalog. User3 is a Business Analyst who needs to query data stored in Amazon Redshift tables using the Amazon Redshift Query Editor v2. Additionally, this user builds Amazon QuickSight visualizations for the data in Redshift tables.

OkTank wants to simplify governance by centralizing data access control for their variety of data sources, user identities, and tools. They also want to define permissions directly on their corporate user or group identities from Okta instead of creating IAM roles for each user and group and managing access at the IAM role level. In addition, for their audit requirements, they need the capability to map data access to the corporate identity of users within Okta for enhanced monitoring and accountability.

To achieve these goals, we use trusted identity propagation with the aforementioned services and use AWS Lake Formation and Amazon S3 Access Grants for access controls. We use Lake Formation to centrally manage permissions to the Data Catalog tables and to Redshift tables shared with Redshift datashares. In our scenario, we use S3 Access Grants to grant permission for the Athena query result location. Additionally, we show how to access a raw data bucket governed by S3 Access Grants with an EMR notebook.

Data access is audited with AWS CloudTrail and can be queried with AWS CloudTrail Lake. This architecture showcases the versatility and effectiveness of AWS analytics services in enabling efficient and secure data analysis workflows across different use cases and user personas.

We use Okta as the external IdP, but you can also use other IdPs like Microsoft Azure Active Directory. Users and groups from Okta are synced to IAM Identity Center. In this post, we have three groups, as shown in the following diagram.

User1 needs to query a Data Catalog table with data stored in Amazon S3. The S3 location is secured and managed by Lake Formation. The user connects to an IAM Identity Center enabled Athena workgroup using the Athena query editor in EMR Studio. IAM Identity Center enabled Athena workgroups need to be secured with S3 Access Grants permissions for the Athena query results location. With this feature, you can also enable the creation of identity-based query result locations that are governed by S3 Access Grants. These user identity-based S3 prefixes let users in an Athena workgroup keep their query results isolated from other users in the same workgroup. The following diagram illustrates this architecture.

User2 needs to query the same Data Catalog table as User1. This table is governed using Lake Formation permissions. Additionally, the user needs to access raw data in another S3 bucket that is not cataloged to the Data Catalog and is managed using S3 Access Grants; in the following diagram, this is shown as S3 Data Location-2.

The user uses an EMR Studio notebook to run Spark queries on an EMR cluster. The EMR cluster uses a security configuration that integrates with IAM Identity Center for authentication and uses Lake Formation for authorization. The EMR cluster is also enabled for S3 Access Grants. With this kind of hybrid access management, you can use Lake Formation to centrally manage permissions for your datasets cataloged to the Data Catalog, and S3 Access Grants to centrally manage access to your raw data that is not yet cataloged to the Data Catalog. This gives you the flexibility to access data managed by either access control mechanism from the same notebook.

User3 uses the Redshift Query Editor v2 to query a Redshift table. The user also accesses the same table with QuickSight. For our demo, we use a single user persona for simplicity, but in reality, these could be completely different user personas. To enable access control with Lake Formation for Redshift tables, we use data sharing in Lake Formation.

Data access requests by the specific users are logged to CloudTrail. Later in this post, we also briefly touch upon using CloudTrail Lake to query the data access events.

In the following sections, we demonstrate how to build this architecture. We use AWS CloudFormation to provision the resources. AWS CloudFormation lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code. We also use the AWS Command Line Interface (AWS CLI) and the AWS Management Console to complete some steps.

The following diagram shows the end-to-end architecture.

Prerequisites

Complete the following prerequisite steps:

  1. Have an AWS account. If you don't have an account, you can create one.
  2. Have IAM Identity Center set up in a specific AWS Region.
  3. Make sure you use the same Region where you have IAM Identity Center set up throughout the setup and verification steps. In this post, we use the us-east-1 Region.
  4. Have Okta set up with three different groups and users, and enable sync to IAM Identity Center. Refer to Configure SAML and SCIM with Okta and IAM Identity Center for instructions.

After the Okta groups are pushed to IAM Identity Center, you can see the users and groups on the IAM Identity Center console, as shown in the following screenshot. You need the group IDs of the three groups to pass to the CloudFormation template.
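
If you prefer the AWS CLI, the following sketch shows one way to look up those group IDs from your terminal; the group names Group1, Group2, and Group3 are the Okta group names assumed in this post.

# Find the identity store ID for your IAM Identity Center instance
aws sso-admin list-instances --query 'Instances[0].IdentityStoreId' --output text

# Look up the group ID of a synced group by its display name (repeat for Group2 and Group3)
aws identitystore get-group-id \
  --identity-store-id <identity-store-id-from-previous-command> \
  --alternate-identifier '{"UniqueAttribute":{"AttributePath":"displayName","AttributeValue":"Group1"}}'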

  5. For enabling User2 access using the EMR cluster, you need an SSL certificates .zip file available in your S3 bucket. You can download the following sample certificates to use in this post. In production use cases, you should create and use your own certificates. You need to reference the bucket name and the certificates bundle .zip file in AWS CloudFormation. The CloudFormation template lets you choose the components you want to provision. If you don't intend to deploy the EMR cluster, you can ignore this step.
  6. Have an administrator user or role to run the CloudFormation stack. The user or role should also be a Lake Formation administrator to grant permissions.

Deploy the CloudFormation stack

The CloudFormation template provided in the post lets you choose the components you want to provision from the solution architecture. In this post, we enable all components, as shown in the following screenshot.

Run the provided CloudFormation stack to create the solution resources. Refer to the following table for a list of important parameters; a sample CLI deployment follows the table.

Choose components to provision: choose the components you want to be provisioned.
  • DeployAthenaFlow: Yes/No. If you choose No, you can ignore the parameters in the "Athena Configuration" group.
  • DeployEMRFlow: Yes/No. If you choose No, you can ignore the parameters in the "EMR Configuration" group.
  • DeployRedshiftQEV2Flow: Yes/No. If you choose No, you can ignore the parameters in the "Redshift Configuration" group.
  • CreateS3AGInstance: Yes/No. If you already have an S3 Access Grants instance, choose No. Otherwise, choose Yes to let the stack create a new S3 Access Grants instance. The S3 Access Grants instance is required for User1 and User2.

Identity Center Configuration: IAM Identity Center parameters.
  • IDCGroup1Id: Group ID corresponding to Group1 from IAM Identity Center.
  • IDCGroup2Id: Group ID corresponding to Group2 from IAM Identity Center.
  • IDCGroup3Id: Group ID corresponding to Group3 from IAM Identity Center.
  • IAMIDCInstanceArn: IAM Identity Center instance ARN. You can get this from the Settings section of IAM Identity Center.

Redshift Configuration: Redshift parameters. Ignore if you chose DeployRedshiftQEV2Flow as No.
  • RedshiftServerlessAdminUserName: Redshift admin user name.
  • RedshiftServerlessAdminPassword: Redshift admin password.
  • RedshiftServerlessDatabase: Redshift database in which to create the tables.

EMR Configuration: EMR parameters. Ignore if you chose DeployEMRFlow as No.
  • SSlCertsS3BucketName: Bucket name where you copied the SSL certificates.
  • SSlCertsZip: Name of the SSL certificates file (my-certs.zip) to use the sample certificates provided in the post.

Athena Configuration: Athena parameters. Ignore if you chose DeployAthenaFlow as No.
  • IDCUser1Id: User ID corresponding to User1 from IAM Identity Center.
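
If you'd rather launch the stack from the AWS CLI than the console, the following sketch shows the shape of the call; the local template file name (tip-blog-tip.yaml) and the stack name are assumptions for illustration, and the parameter values come from the table above.

aws cloudformation deploy \
  --stack-name tip-blog-trusted-identity-propagation \
  --template-file tip-blog-tip.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides \
    DeployAthenaFlow=Yes DeployEMRFlow=Yes DeployRedshiftQEV2Flow=Yes CreateS3AGInstance=Yes \
    IDCGroup1Id=<Group1-id> IDCGroup2Id=<Group2-id> IDCGroup3Id=<Group3-id> \
    IAMIDCInstanceArn=<identity-center-instance-arn> \
    RedshiftServerlessAdminUserName=<admin-user> RedshiftServerlessAdminPassword=<admin-password> \
    RedshiftServerlessDatabase=dev \
    SSlCertsS3BucketName=<bucket-with-certs> SSlCertsZip=my-certs.zip \
    IDCUser1Id=<User1-id>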

The CloudFormation stack provisions the following resources:

  • A VPC with a public and private subnet.
  • If you chose the Redshift components, it also creates three additional subnets.
  • S3 buckets for data and for the Athena query results location. It also copies some sample data to the buckets.
  • EMR Studio with IAM Identity Center integration.
  • An Amazon EMR security configuration with IAM Identity Center integration.
  • An EMR cluster that uses the EMR security configuration.
  • Registration of the source S3 bucket with Lake Formation.
  • An AWS Glue database named oktank_tipblog_temp and a table named customer under the database. The table points to the Amazon S3 location governed by Lake Formation.
  • Enablement of external engines to access data in Amazon S3 locations with full table access. This is required for Amazon EMR integration with Lake Formation for trusted identity propagation. As of this writing, Amazon EMR supports table-level access with IAM Identity Center enabled clusters. (A CLI sketch of this setting follows the list.)
  • An S3 Access Grants instance.
  • S3 Access Grants for Group1 to the User1 prefix under the Athena query results location bucket.
  • S3 Access Grants for Group2 to the S3 bucket input and output prefixes. The user has read access to the input prefix and write access to the output prefix under the bucket.
  • An Amazon Redshift Serverless namespace and workgroup. This workgroup is not integrated with IAM Identity Center; we complete subsequent steps to enable IAM Identity Center for the workgroup.
  • An AWS Cloud9 integrated development environment (IDE), which we use to run AWS CLI commands during the setup.
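
The stack applies the external engine setting for you. If you ever need to enable it manually (for example, in an account whose data lake settings are managed outside this stack), the following is a minimal sketch; note that put-data-lake-settings replaces the whole settings document, so you first fetch the current settings, edit them, and write them back.

# Fetch the current data lake settings (administrators, external data filtering, and so on)
aws lakeformation get-data-lake-settings --catalog-id <account-id> --query DataLakeSettings > settings.json

# After adding "AllowFullTableExternalDataAccess": true to settings.json, write the merged document back
aws lakeformation put-data-lake-settings --catalog-id <account-id> --data-lake-settings file://settings.json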

Note the stack outputs on the AWS CloudFormation console. You use these values in later steps.

Choose the link for Cloud9URL in the stack output to open the AWS Cloud9 IDE. In AWS Cloud9, go to the Window tab and choose New Terminal to start a new bash terminal.

Set up Lake Formation

You need to enable Lake Formation with IAM Identity Center and enable the EMR application for Lake Formation integration. Complete the following steps:

  1. In the AWS Cloud9 bash terminal, enter the following command to get the Amazon EMR security configuration created by the stack:
aws emr describe-security-configuration --name TIP-EMRSecurityConfig | jq -r '.SecurityConfiguration | fromjson | .AuthenticationConfiguration.IdentityCenterConfiguration.IdCApplicationARN'

  2. Note the value for IdcApplicationARN from the output.
  3. Enter the following command in AWS Cloud9 to enable the Lake Formation integration with IAM Identity Center and add the Amazon EMR security configuration application as a trusted application in Lake Formation. If you already have the IAM Identity Center integration with Lake Formation, sign in to Lake Formation and add the preceding value to the list of applications instead of running the following command, then proceed to the next step.
aws lakeformation create-lake-formation-identity-center-configuration --catalog-id <Replace with CatalogId value from CloudFormation output> --instance-arn <Replace with IDCInstanceARN value from CloudFormation stack output> --external-filtering Status=ENABLED,AuthorizedTargets=<Replace with IdcApplicationARN value copied in the previous step>

After this step, you should see the application on the Lake Formation console.
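
You can also confirm the integration from the AWS Cloud9 terminal; this check is optional.

aws lakeformation describe-lake-formation-identity-center-configuration --catalog-id <Replace with CatalogId value from CloudFormation output>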

This completes the initial setup. In subsequent steps, we apply some additional configurations for specific user personas.

Validate user personas

To review the S3 Access Grants created by AWS CloudFormation, open the Amazon S3 console and choose Access Grants in the navigation pane. Choose the access grant you created to view its details.

The CloudFormation stack created the S3 Access Grants for Group1 for the User1 prefix under the Athena query results location bucket. This allows User1 to access the prefix under the query results bucket. The stack also created the grants for Group2 for User2 to access the raw data bucket input and output prefixes.
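
You can also list the grants from the AWS Cloud9 terminal instead of the console; this optional check assumes the grants instance lives in the same account you deployed the stack to.

# List all S3 Access Grants in the account (grantee, permission, and grant scope for each)
aws s3control list-access-grants --account-id <Replace with CatalogId value from CloudFormation output>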

Set up User1 access

Complete the steps in this section to set up User1 access.

Create an IAM Identity Center enabled Athena workgroup

Let's create the Athena workgroup that will be used by User1.

Enter the following command in the AWS Cloud9 terminal. The command creates an IAM Identity Center integrated Athena workgroup and enables S3 Access Grants for the user-level prefix. These user identity-based S3 prefixes let users in an Athena workgroup keep their query results isolated from other users in the same workgroup. The prefix is automatically created by Athena when the CreateUserLevelPrefix option is enabled. Access to the prefix was granted by the CloudFormation stack.

aws athena create-work-group --cli-input-json '{
    "Name": "AthenaIDCWG",
    "Configuration": {
        "ResultConfiguration": {
            "OutputLocation": "<Replace with AthenaResultLocation from CloudFormation stack>"
        },
        "ExecutionRole": "<Replace with TIPStudioRoleArn from CloudFormation stack>",
        "IdentityCenterConfiguration": {
            "EnableIdentityCenter": true,
            "IdentityCenterInstanceArn": "<Replace with IDCInstanceARN from CloudFormation stack>"
        },
        "QueryResultsS3AccessGrantsConfiguration": {
            "EnableS3AccessGrants": true,
            "CreateUserLevelPrefix": true,
            "AuthenticationType": "DIRECTORY_IDENTITY"
        },
        "EnforceWorkGroupConfiguration": true
    },
    "Description": "Athena Workgroup with IDC integration"
}'
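
Optionally, you can confirm that the workgroup was created with the Identity Center and S3 Access Grants settings you expect:

aws athena get-work-group --work-group AthenaIDCWG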

Grant access to User1 on the Athena workgroup

Sign in to the Athena console and grant access to Group1 on the workgroup, as shown in the following screenshot. You can grant access to the user (User1) or to the group (Group1). In this post, we grant access to Group1.

Grant access to User1 in Lake Formation

Sign in to the Lake Formation console, choose Data lake permissions in the navigation pane, and grant access to the user group on the database oktank_tipblog_temp and table customer.

With Athena, you can grant access to specific columns and to specific rows with row-level filtering. For this post, we grant column-level access and restrict access to only selected columns of the table.
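
The console steps above are all you need; if you prefer scripting the grant, the following sketch shows an equivalent CLI call. The column names (c_name, c_nationkey, c_acctbal) are placeholders for whichever columns of the customer table you choose to expose, and the Identity Center group is addressed by its identitystore ARN.

aws lakeformation grant-permissions \
  --catalog-id <Replace with CatalogId value from CloudFormation output> \
  --principal DataLakePrincipalIdentifier=arn:aws:identitystore:::group/<Group1-id> \
  --permissions SELECT \
  --resource '{"TableWithColumns":{"DatabaseName":"oktank_tipblog_temp","Name":"customer","ColumnNames":["c_name","c_nationkey","c_acctbal"]}}'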

This completes the access permission setup for User1.

Verify access

Let's see how User1 uses Athena to analyze the data.

  1. Copy the URL for EMRStudioURL from the CloudFormation stack output.
  2. Open a new browser window and connect to the URL.

You will be redirected to the Okta login page.

  3. Log in with User1.
  4. In the EMR Studio query editor, change the workgroup to AthenaIDCWG and choose Acknowledge.
  5. Run the following query in the query editor:
SELECT * FROM "oktank_tipblog_temp"."customer" limit 10;


You can see that the user is only able to access the columns for which permissions were previously granted in Lake Formation. This completes the access flow verification for User1.

Set up User2 access

User2 accesses the table using an EMR Studio notebook. Note the current considerations for EMR with IAM Identity Center integrations.

Complete the steps in this section to set up User2 access.

Grant Lake Formation permissions to User2

Sign in to the Lake Formation console and grant access to Group2 on the table, similar to the steps you followed earlier for User1. Also grant Describe permission on the default database to Group2, as shown in the following screenshot.

Create an EMR Studio Workspace

Next, User2 creates an EMR Studio Workspace.

  1. Copy the URL for EMR Studio from the EMRStudioURL value in the CloudFormation stack output.
  2. Log in to EMR Studio as User2 on the Okta login page.
  3. Create a Workspace, giving it a name and leaving all other options as default.

This opens a JupyterLab notebook in a new window.

Connect to the EMR Studio notebook

In the Compute pane of the notebook, select the EMR cluster (named EMRWithTIP) created by the CloudFormation stack to attach to it. After the notebook is attached to the cluster, choose the PySpark kernel to run Spark queries.

Verify access

Enter the following query in the notebook to read from the customer table:

spark.sql("select * from oktank_tipblog_temp.customer").show()


The user access works as expected based on the Lake Formation grants you provided earlier.

Run the following Spark query in the notebook to read data from the raw bucket. Access to this bucket is managed by S3 Access Grants.

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").show()

Let's try to write this data back to the same bucket and input prefix. This should fail because you only granted read access to the input prefix with S3 Access Grants.

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").write.mode("overwrite").parquet("s3://tip-blog-s3-s3ag/input/")

The user has access to the output prefix under the bucket. Change the query to write to the output prefix:

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").write.mode("overwrite").parquet("s3://tip-blog-s3-s3ag/output/test.part")

The write should now succeed.
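
As a quick sanity check, you can read back what you just wrote; the path below mirrors the output prefix used in the preceding cell.

spark.read.parquet("s3://tip-blog-s3-s3ag/output/test.part").show()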

We have now seen the data access controls and access flows for User1 and User2.

Set up User3 access

Following the target architecture in our post, Group3 users use the Redshift Query Editor v2 to query the Redshift tables.

Complete the steps in this section to set up access for User3.

Enable Redshift Query Editor v2 console access for User3

Complete the following steps (a CLI sketch of the same steps follows the list):

  1. On the IAM Identity Center console, create a custom permission set and attach the following policies:
    1. AWS managed policy AmazonRedshiftQueryEditorV2ReadSharing.
    2. Customer managed policy redshift-idc-policy-tip. This policy is already created by the CloudFormation stack, so you don't have to create it.
  2. Provide a name (tip-blog-qe-v2-permission-set) for the permission set.
  3. Set the relay state as https://<region-id>.console.aws.amazon.com/sqlworkbench/home (for example, https://us-east-1.console.aws.amazon.com/sqlworkbench/home).
  4. Choose Create.
  5. Assign Group3 to the account in IAM Identity Center, select the permission set you created, and choose Submit.
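
For reference, the following is a minimal CLI sketch of the same permission set and assignment setup. The permission set ARN returned by the first call is reused in the later calls, and the account ID and group ID placeholders are yours to fill in.

# Create the permission set with the Query Editor v2 relay state
aws sso-admin create-permission-set \
  --instance-arn <identity-center-instance-arn> \
  --name tip-blog-qe-v2-permission-set \
  --relay-state https://us-east-1.console.aws.amazon.com/sqlworkbench/home

# Attach the AWS managed and customer managed policies
aws sso-admin attach-managed-policy-to-permission-set \
  --instance-arn <identity-center-instance-arn> \
  --permission-set-arn <permission-set-arn-from-first-call> \
  --managed-policy-arn arn:aws:iam::aws:policy/AmazonRedshiftQueryEditorV2ReadSharing

aws sso-admin attach-customer-managed-policy-reference-to-permission-set \
  --instance-arn <identity-center-instance-arn> \
  --permission-set-arn <permission-set-arn-from-first-call> \
  --customer-managed-policy-reference Name=redshift-idc-policy-tip

# Assign Group3 to the account with this permission set
aws sso-admin create-account-assignment \
  --instance-arn <identity-center-instance-arn> \
  --target-id <aws-account-id> --target-type AWS_ACCOUNT \
  --permission-set-arn <permission-set-arn-from-first-call> \
  --principal-type GROUP --principal-id <Group3-id>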

Create the Redshift IAM Identity Center application

Enter the following in the AWS Cloud9 terminal:

aws redshift create-redshift-idc-application \
--idc-instance-arn '<Replace with IDCInstanceARN value from CloudFormation output>' \
--redshift-idc-application-name 'redshift-iad-<Replace with CatalogId value from CloudFormation output>-tip-blog-1' \
--identity-namespace 'tipblogawsidc' \
--idc-display-name 'TIPBlog_AWSIDC' \
--iam-role-arn '<Replace with TIPRedshiftRoleArn value from CloudFormation output>' \
--service-integrations '[
  {
    "LakeFormation": [
      {
        "LakeFormationQuery": {
          "Authorization": "Enabled"
        }
      }
    ]
  }
]'

Enter the following command to get the application details:

aws redshift describe-redshift-idc-applications --output json

Make a note of the IdcManagedApplicationArn, IdcDisplayName, and IdentityNamespace values in the output for the application with IdcDisplayName TIPBlog_AWSIDC. You need these values in the next steps.
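
The following optional jq filter pulls out just those three fields; it assumes the response lists applications under a RedshiftIdcApplications key, so adjust the path if your CLI version returns a different shape.

aws redshift describe-redshift-idc-applications --output json \
  | jq '.RedshiftIdcApplications[] | select(.IdcDisplayName=="TIPBlog_AWSIDC") | {IdcManagedApplicationArn, IdcDisplayName, IdentityNamespace}'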

Enable the Redshift Query Editor v2 for the Redshift IAM Identity Center application

Complete the following steps:

  1. On the Amazon Redshift console, choose IAM Identity Center connections in the navigation pane.
  2. Choose the application you created.
  3. Choose Edit.
  4. Select Enable Query Editor v2 application and choose Save changes.
  5. On the Groups tab, choose Add or assign groups.
  6. Assign Group3 to the application.

The Redshift IAM Identity Center connection is now set up.

Enable the Redshift Serverless namespace and workgroup with IAM Identity Center

The CloudFormation stack you deployed created a serverless namespace and workgroup. However, they are not yet enabled with IAM Identity Center. To enable them with IAM Identity Center, complete the following steps. You can get the namespace name from the RedshiftNamespace value of the CloudFormation stack output.

  1. On the Amazon Redshift Serverless dashboard console, navigate to the namespace you created.
  2. Choose Query Data to open Query Editor v2.
  3. Choose the options menu (three dots) and choose Create connections for the workgroup redshift-idc-wg-tipblog.
  4. Choose Other ways to connect and then Database user name and password.
  5. Use the credentials you provided for the Redshift admin user name and password parameters when deploying the CloudFormation stack and create the connection.

Create resources using the Redshift Query Editor v2

You now enter a series of commands in the query editor with the database admin user.

  1. Create an IdP for the Redshift IAM Identity Center application:
CREATE IDENTITY PROVIDER "TIPBlog_AWSIDC" TYPE AWSIDC
NAMESPACE 'tipblogawsidc'
APPLICATION_ARN '<Replace with IdcManagedApplicationArn value you copied earlier in Cloud9>'
IAM_ROLE '<Replace with TIPRedshiftRoleArn value from CloudFormation output>';

  2. Enter the following command to check the IdP you added previously:
SELECT * FROM svv_identity_providers;

Next, you grant permissions to the IAM Identity Center user.

  3. Create a role in Redshift. This role should correspond to the group in IAM Identity Center to which you plan to provide the permissions (Group3 in this post). The role should follow the format <namespace>:<GroupNameinIDC>.
CREATE ROLE "tipblogawsidc:Group3";

  4. Run the following command to see the role you created. The external_id corresponds to the group ID value for Group3 in IAM Identity Center.
SELECT * FROM svv_roles WHERE role_name = 'tipblogawsidc:Group3';

  5. Create a sample table to use to verify access for the Group3 user:
CREATE TABLE IF NOT EXISTS revenue
(
account INTEGER ENCODE az64
,customer VARCHAR(20) ENCODE lzo
,salesamt NUMERIC(18,0) ENCODE az64
)
DISTSTYLE AUTO
;

insert into revenue values (10001, 'ABC Company', 12000);
insert into revenue values (10002, 'Tech Logistics', 175400);

  6. Grant access to the user on the schema:
-- Grant usage on schema
grant usage on schema public to role "tipblogawsidc:Group3";

  7. To create a datashare and add the preceding table to the datashare, enter the following statements:
CREATE DATASHARE demo_datashare;
ALTER DATASHARE demo_datashare ADD SCHEMA public;
ALTER DATASHARE demo_datashare ADD TABLE revenue;

  8. Grant usage on the datashare to the account using the Data Catalog:
GRANT USAGE ON DATASHARE demo_datashare TO ACCOUNT '<Replace with CatalogId from CloudFormation output>' via DATA CATALOG;

Authorize the datashare

For this post, we use the AWS CLI to authorize the datashare. You can also do it from the Amazon Redshift console.

Enter the following command in the AWS Cloud9 IDE to describe the datashare you created and note the values of DataShareArn and ConsumerIdentifier to use in subsequent steps:

aws redshift describe-data-shares

Enter the following command in the AWS Cloud9 IDE to authorize the datashare:

aws redshift authorize-data-share --data-share-arn <Replace with DataShareArn value copied from the previous command's output> --consumer-identifier <Replace with ConsumerIdentifier value copied from the previous command's output>

Accept the datashare in Lake Formation

Next, accept the datashare in Lake Formation.

  1. On the Lake Formation console, choose Data sharing in the navigation pane.
  2. In the Invitations section, select the datashare invitation that is pending acceptance.
  3. Choose Review invitation and accept the datashare.
  4. Provide a database name (tip-blog-redshift-ds-db), which will be created in the Data Catalog by Lake Formation.
  5. Choose Skip to Review and Create and create the database.

Grant permissions in Lake Formation

Complete the following steps:

  1. On the Lake Formation console, choose Data lake permissions in the navigation pane.
  2. Choose Grant and, in the Principals section, choose User3 to grant permissions with the IAM Identity Center-new option. Refer to the Lake Formation access grant steps performed for User1 and User2 if needed.
  3. Choose the database (tip-blog-redshift-ds-db) you created earlier and the table public.revenue, which you created in the Redshift Query Editor v2.
  4. For Table permissions, select Select.
  5. For Data permissions, select Column-based access and select the account and salesamt columns.
  6. Choose Grant.

Mount the AWS Glue database to Amazon Redshift

As the last step in the setup, mount the AWS Glue database to Amazon Redshift. In the Query Editor v2, enter the following statements:

create external schema if not exists tipblog_datashare_idc_schema from DATA CATALOG DATABASE 'tip-blog-redshift-ds-db' catalog_id '<Replace with CatalogId from CloudFormation output>';

grant usage on schema tipblog_datashare_idc_schema to role "tipblogawsidc:Group3";

grant select on all tables in schema tipblog_datashare_idc_schema to role "tipblogawsidc:Group3";

You are now done with the required setup and permissions for User3 on the Redshift table.

Verify access

To verify access, complete the following steps:

  1. Get the AWS access portal URL from the IAM Identity Center Settings section.
  2. Open a different browser and enter the access portal URL.

This redirects you to your Okta login page.

  3. Sign in, select the account, and choose the tip-blog-qe-v2-permission-set link to open the Query Editor v2.

If you're using private or incognito mode for testing this, you may need to allow third-party cookies.

  4. Choose the options menu (three dots) and choose Edit connection for the redshift-idc-wg-tipblog workgroup.
  5. Select IAM Identity Center in the pop-up window and choose Continue.

If you get an error with the message "Redshift serverless cluster is auto paused," switch to the other browser with admin credentials and run any sample queries to un-pause the cluster. Then switch back to this browser and continue with the next steps.

  6. Run the following query to access the table:
SELECT * FROM "dev"."tipblog_datashare_idc_schema"."public.revenue";

You can only see the two columns because of the access grants you provided in Lake Formation earlier.

This completes configuring User3 access to the Redshift table.

Set up QuickSight for User3

Let's now set up QuickSight and verify access for User3. We already granted User3 access to the Redshift table in earlier steps.

  1. Create a new IAM Identity Center enabled QuickSight account. Refer to Simplify business intelligence identity management with Amazon QuickSight and AWS IAM Identity Center for guidance.
  2. Choose Group3 for the author and reader for this post.
  3. For IAM Role, choose the IAM role matching the RoleQuickSight value from the CloudFormation stack output.

Next, you add a VPC connection to QuickSight to access the Redshift Serverless namespace you created earlier.

  4. On the QuickSight console, manage your VPC connections.
  5. Choose Add VPC connection.
  6. For VPC connection name, enter a name.
  7. For VPC ID, enter the value for VPCId from the CloudFormation stack output.
  8. For Execution role, choose the value for RoleQuickSight from the CloudFormation stack output.
  9. For Security Group IDs, choose the security group for QSSecurityGroup from the CloudFormation stack output.

  10. Wait for the VPC connection to be AVAILABLE.
  11. Enter the following command in AWS Cloud9 to enable QuickSight with Amazon Redshift for trusted identity propagation:
aws quicksight update-identity-propagation-config --aws-account-id "<Replace with CatalogId from CloudFormation output>" --service "REDSHIFT" --authorized-targets "<Replace with the IdcManagedApplicationArn value from the output of aws redshift describe-redshift-idc-applications --output json, which you copied earlier>"

Verify User3 access with QuickSight

Complete the following steps:

  1. Sign in to the QuickSight console as User3 in a different browser.
  2. On the Okta sign-in page, sign in as User3.
  3. Create a new dataset with Amazon Redshift as the data source.
  4. Choose the VPC connection you created earlier for Connection type.
  5. Provide the Redshift server (the RedshiftServerlessWorkgroup value from the CloudFormation stack output), port (5439 in this post), and database name (dev in this post).
  6. Under Authentication method, select Single sign-on.
  7. Choose Validate, then choose Create data source.

If you encounter an issue validating with single sign-on, switch Authentication method to Database username and password, validate with any dummy user and password, and then switch back to single sign-on, validate, and proceed to the next step. Also check that the Redshift Serverless cluster is not auto-paused, as mentioned earlier in the Redshift access verification.

  8. Choose the schema you created earlier (tipblog_datashare_idc_schema) and the table public.revenue.
  9. Choose Select to create your dataset.

You should now be able to visualize the data in QuickSight. You are only able to see the account and salesamt columns from the table because of the access permissions you granted earlier with Lake Formation.

This finishes all the steps for setting up trusted identity propagation.

Audit data access

Let's see how we can audit the data access by the different users.

Access requests are logged to CloudTrail. The IAM Identity Center user ID is logged under the onBehalfOf tag in the CloudTrail event. The following screenshot shows the GetDataAccess event generated by Lake Formation. You can view the CloudTrail event history and filter by the event name GetDataAccess to view similar events in your account.

You can see the userId corresponds to User2.

You can run the following commands in AWS Cloud9 to confirm this.

Get the identity store ID:

aws sso-admin describe-instance --instance-arn <Replace with your instance ARN value> | jq -r '.IdentityStoreId'

Describe the user in the identity store:

aws identitystore describe-user --identity-store-id <Replace with the output of the preceding command> --user-id <User ID from the preceding screenshot>

One way to query the CloudTrail log events is by using CloudTrail Lake. Set up the event data store (refer to the following instructions) and rerun the queries for User1, User2, and User3. You can query the access events using CloudTrail Lake with the following sample query:

SELECT eventTime, userIdentity.onBehalfOf.userid AS idcUserId, requestParameters AS accessInfo, serviceEventDetails
FROM 04d81d04-753f-42e0-a31f-2810659d9c27
WHERE userIdentity.arn IS NOT NULL AND (eventName = 'BatchGetTable' OR eventName = 'GetDataAccess' OR eventName = 'CreateDataSet')
ORDER BY eventTime DESC

The following screenshot shows an example of the detailed results with audit explanations.

Clean up

To avoid incurring further charges, delete the CloudFormation stack. Before you delete the CloudFormation stack, delete all the resources you created using the console or AWS CLI (a partial CLI sketch follows the list):

  1. Manually delete any EMR Studio Workspaces you created with User2.
  2. Delete the Athena workgroup created as part of the User1 setup.
  3. Delete the QuickSight VPC connection you created.
  4. Delete the Redshift IAM Identity Center connection.
  5. Deregister IAM Identity Center from S3 Access Grants.
  6. Delete the CloudFormation stack.
  7. Manually delete the VPC created by AWS CloudFormation.
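
For the items that have simple CLI equivalents, here is a partial sketch; the Redshift IAM Identity Center application ARN comes from the describe-redshift-idc-applications output you noted earlier, and the VPC connection ID is the one you created in QuickSight.

# Delete the Athena workgroup and its settings
aws athena delete-work-group --work-group AthenaIDCWG --recursive-delete-option

# Delete the Redshift IAM Identity Center application
aws redshift delete-redshift-idc-application \
  --redshift-idc-application-arn <Replace with IdcManagedApplicationArn value noted earlier>

# Delete the QuickSight VPC connection
aws quicksight delete-vpc-connection \
  --aws-account-id <Replace with CatalogId from CloudFormation output> \
  --vpc-connection-id <your-vpc-connection-id>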

Conclusion

In this post, we delved into the trusted identity propagation feature of IAM Identity Center alongside various AWS analytics services, demonstrating its utility in managing permissions using corporate user or group identities rather than IAM roles. We examined diverse user personas using interactive tools like Athena, EMR Studio notebooks, Redshift Query Editor v2, and QuickSight, all centralized under Lake Formation for streamlined permission management. Additionally, we explored S3 Access Grants for S3 bucket access management, and concluded with insights into auditing through CloudTrail events and CloudTrail Lake for a comprehensive overview of user data access.



About the Author

Shoukat Ghouse is a Senior Big Data Specialist Solutions Architect at AWS. He helps customers around the globe build robust, efficient, and scalable data platforms on AWS, leveraging AWS analytics services like AWS Glue, AWS Lake Formation, Amazon Athena, and Amazon EMR.
