Draft
45 commits
c1bb584
Set target schema as a parameter
jlongland Sep 4, 2024
24539f0
Reverted previous change. Schema is effectively the db username
jlongland Sep 4, 2024
f588be3
Use parameter instead of hard-coded value
jlongland Sep 4, 2024
333b27f
Make stack outputs unique in the event there are multiple copies of t…
jlongland Sep 4, 2024
6769b17
Updates to setup script to handle multiple copies of the stack in an …
jlongland Sep 4, 2024
0a3d6dc
Add conditions to use an existing database
jlongland Sep 4, 2024
4afb8ef
Fix copy/paste error
jlongland Sep 4, 2024
ecb6909
Fix copy/paste error
jlongland Sep 4, 2024
b864f23
More corrections
jlongland Sep 4, 2024
53c51d8
Improve clarity of conditions
jlongland Sep 4, 2024
464dd76
Merge branch 'main' into byo_db
PyMedic Jun 26, 2025
ed45421
Added AWS::StackName to the AWS resource names and stack outputs to av…
PyMedic Jul 2, 2025
e2557cd
Removed AWS::StackName from some database related AWS resources becau…
PyMedic Jul 4, 2025
fdd880b
Code formatting related change
PyMedic Jul 7, 2025
4eb81a8
Changed the parameter name for the schema name in the create_schema_s…
PyMedic Jul 8, 2025
f8f82a0
Replaced the schema name from canvas to username because that is the …
PyMedic Jul 8, 2025
e0dea98
Added the necessary information when trying to deploy the CD2 stack w…
PyMedic Jul 8, 2025
d314dcf
Resolved the merge conflicts in template.yaml
PyMedic Jul 29, 2025
47738e6
Merge branch 'main' into byo_db
PyMedic Jul 30, 2025
cfe16f1
Removed the stackname reference from AthenaConnector related AWS reso…
PyMedic Jul 30, 2025
e9351db
Reverted the DAP parameter store name changes. For other canvas insta…
PyMedic Jul 31, 2025
b1a0226
Added the condition CreateDatabaseSecurityGroup for the RDS security …
PyMedic Jul 31, 2025
5cc6aa7
Added DatabaseClientSecurityGroupParameter and the condition associat…
PyMedic Jul 31, 2025
c6323ac
Added an option to get security group ID of DatabaseClientSecurityGro…
PyMedic Aug 1, 2025
0df8ba0
Wrapped all Ref calls to DatabaseClientSecurityGroup with an Fn::If t…
PyMedic Aug 1, 2025
9a8befe
Added a new condition called CreateDatabaseClientSecurityGroup and ap…
PyMedic Aug 1, 2025
8b4c669
Changed the SecurityGroupIds expression for AthenaPostgreSQLConnector…
PyMedic Aug 5, 2025
be79d5e
Added the conditions for the resource creation for DatabaseSubnetGrou…
PyMedic Aug 5, 2025
f66be14
Removed the resource DatabaseClientEgressToExistingDatabase because i…
PyMedic Aug 5, 2025
87cec25
Revert "Removed the resource DatabaseClientEgressToExistingDatabase b…
PyMedic Aug 5, 2025
2f9b644
Created new parameters for SecretsKmsKey and DataKmsKey. Added if-con…
PyMedic Aug 6, 2025
1b35d9c
Removed SecretsKmsKeyArnParameter and DataKmsKeyArnParameter because …
PyMedic Aug 7, 2025
a09c26b
Added the hardcoded ECS fargate cluster name for the canvas-data-2-ca…
PyMedic Aug 7, 2025
03e74a4
Assigned the dynamic name for the CD2 Data Refresh event schedule.
PyMedic Aug 7, 2025
6bee27c
Added the Systems Manager parameter name in the init_table, list_tabl…
PyMedic Aug 12, 2025
e151501
Added a new environment variable called DB_CD2_USER to the list_table…
PyMedic Aug 12, 2025
6f9eb39
Applied the new environment variable to the list_tables code.
PyMedic Aug 12, 2025
026932a
Added MainCD2StackNameParameter. This parameter is used in the ECS Ta…
PyMedic Aug 12, 2025
8ff9728
Added the steps to grant the necessary database permissions for the n…
PyMedic Aug 14, 2025
f9e590e
Included the ARN of the secrets for the catalog cd2 db user under the…
PyMedic Aug 14, 2025
8f5a8a3
Renamed the secrets name for CD2 canvas/catalog user. Changed the ARN…
PyMedic Aug 14, 2025
e635e5f
Reverted the secret name for AthenaPostgreSQLConnectorExecutionPolicy.
PyMedic Aug 14, 2025
28ec3cc
Removed the stackname from the DefaultConnectionString from AthenaPos…
PyMedic Aug 15, 2025
b393ae9
Removed the debug statement for a SQL query because it is considered …
PyMedic Aug 25, 2025
6372fe1
Added the instruction on how to deploy the CD2 cloudformation sta…
PyMedic Aug 25, 2025
37 changes: 36 additions & 1 deletion README.md
@@ -31,6 +31,25 @@ It will be helpful to have a working knowledge of AWS services and the AWS Conso

By default the database will not have a public IP address and will not be accessible outside of your VPC. You will need to configure network access to the database as appropriate for your situation.

### Using an existing RDS cluster

To deploy against an existing RDS cluster, first create the following AWS resources:
1. A parameter in the AWS Systems Manager Parameter Store for the Canvas Data 2 DAP Client ID
2. A parameter in the AWS Systems Manager Parameter Store for the Canvas Data 2 DAP Client Secret

You also need the following information:
1. The database subnets where your existing RDS cluster is located
   - Parameter Name: `DatabaseSubnetListParameter`
2. The database admin user name for the existing RDS cluster
   - Parameter Name: `DatabaseAdminUserParameter`
3. The ARN of the RDS cluster
   - Parameter Name: `DatabaseClusterArnParameter`
4. The ARN of the database admin secret
   - Parameter Name: `DatabaseAdminUserParameter`
5. The ID of the database security group (e.g. `sg-.....`)
   - Parameter Name: `DatabaseSecurityGroupParameter`
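
The two Parameter Store entries could be created with a short script along these lines. This is a minimal sketch, not part of the repository: the base path follows the `/{env}/{ssm_parameter_name}` pattern used by the Lambda functions in this PR, but the leaf names `dap_client_id` and `dap_client_secret`, the `SecureString` type, and both helper functions are assumptions. Check the parameter names the stack actually reads before relying on it.

```python
def dap_parameter_names(env="dev", ssm_parameter_name="canvas_data_2"):
    """Return hypothetical Parameter Store names for the DAP credentials."""
    base = f"/{env}/{ssm_parameter_name}"
    return {
        # Leaf names are assumptions, not taken from the repository.
        "client_id": f"{base}/dap_client_id",
        "client_secret": f"{base}/dap_client_secret",
    }


def put_dap_parameters(ssm_client, client_id, client_secret,
                       env="dev", ssm_parameter_name="canvas_data_2"):
    """Store both values as SecureString parameters using the given SSM client."""
    names = dap_parameter_names(env, ssm_parameter_name)
    values = ((names["client_id"], client_id),
              (names["client_secret"], client_secret))
    for name, value in values:
        ssm_client.put_parameter(
            Name=name, Value=value, Type="SecureString", Overwrite=True
        )
    return names


# Usage (requires boto3 and valid AWS credentials):
#   import boto3
#   put_dap_parameters(boto3.client("ssm"), "<client id>", "<client secret>")
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real account.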


## Deploying the application

The Serverless Application Model Command Line Interface (SAM CLI) is an extension of the AWS CLI that adds functionality for building and testing Lambda applications. It uses Docker to run your functions in an Amazon Linux environment that matches Lambda. It can also emulate your application's build environment and API.
@@ -62,7 +81,7 @@ Deploying this application will create:
- An Aurora Postgres cluster
- A database user credential in AWS Secrets Manager.

In order for the application to use that credential to connect to the database, a database user must be created and granted appropriate privileges. A helper script is included that will take care of this setup.
In order for the application to use that credential to connect to the database, a database user must be created and granted appropriate privileges. A helper script is included that will take care of this setup.

After deploying the SAM app, run this script. You must have valid AWS credentials before running the script.

@@ -71,6 +90,22 @@ pip install setup/requirements.txt -r
./setup/prepare_aurora_db.py --stack-name <stack name returned by the SAM deployment>
```

#### (Optional) Creating a new database user and schema for an additional Canvas instance

1. If you create a new CloudFormation stack for an additional Canvas instance, modify `secret_name_prefix` in `prepare_aurora_db.py` so that the script targets the secrets holding the RDS database credentials for the new DB user.

2. When you run the `prepare_aurora_db.py` script, add the `--is-additional-stack` argument.

```
pip install -r setup/requirements.txt
./setup/prepare_aurora_db.py --stack-name <stack name returned by the SAM deployment> --is-additional-stack
```

3. Grant the database user `athena` access to the new schema. Run the following queries (the examples use a schema named `catalog`) so that the Athena connector can read all tables in the new schema:
   - `GRANT SELECT ON ALL TABLES IN SCHEMA catalog TO athena;`
   - `GRANT USAGE ON SCHEMA catalog TO athena;`
   - `ALTER DEFAULT PRIVILEGES IN SCHEMA catalog GRANT SELECT ON TABLES TO athena;` This automatically grants SELECT on tables created in `catalog` later to the DB user `athena`. Default privileges apply only to tables created by the role that runs this statement, so run it as the role that creates the schema's tables.
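
The three statements can be generated for any schema name rather than edited by hand. The sketch below is illustrative only: `quote_ident` and `athena_grant_statements` are hypothetical helpers, not functions from this repository. They quote identifiers so that an unusual schema or role name cannot break the SQL.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def athena_grant_statements(schema, role="athena"):
    """Build the read-only grants the Athena connector needs on one schema."""
    s, r = quote_ident(schema), quote_ident(role)
    return [
        f"GRANT USAGE ON SCHEMA {s} TO {r};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {s} TO {r};",
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {s} GRANT SELECT ON TABLES TO {r};",
    ]


for statement in athena_grant_statements("catalog"):
    print(statement)
```

Quoted identifiers are case-sensitive in PostgreSQL, so this assumes the schema and role were created with lower-case names, as in the queries above.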

Occasionally the schema for a CD2 table will change. The DAP library will take care of applying these changes to the database, but they will not succeed if you have created views that depend on the table. To handle this situation, the `sync_table` Lambda function will attempt to drop and recreate any views that depend on the table being synced. The pgsql functions necessary to do this can be found in this repository: https://github.com/rvkulikov/pg-deps-management. You will need to run the `ddl.sql` script in your database to create the necessary functions. (details tbd)

## Configuration
5 changes: 3 additions & 2 deletions init_table/app.py
@@ -22,9 +22,10 @@
logger = Logger()

env = os.environ.get('ENV', 'dev')
ssm_parameter_name = os.environ.get('SSM_PARAMETER_NAME', 'canvas_data_2')
db_user_secret_name = os.environ.get('DB_USER_SECRET_NAME')

param_path = f'/{env}/canvas_data_2'
param_path = f'/{env}/{ssm_parameter_name}'

api_base_url = os.environ.get('API_BASE_URL', 'https://api-gateway.instructure.com')

@@ -95,7 +96,7 @@ async def init_table(credentials, api_base_url, db_connection, namespace, table_
if token:
stepfunctions.send_task_success(
taskToken=token,
output=json.dumps(payload))
output=json.dumps(payload))

"""
if token and result['state'] == 'complete':
6 changes: 4 additions & 2 deletions list_tables/app.py
@@ -17,12 +17,14 @@
logger = Logger()

env = os.environ.get('ENV', 'dev')
ssm_parameter_name = os.environ.get('SSM_PARAMETER_NAME', 'canvas_data_2')
db_user = os.environ.get('DB_CD2_USER', 'canvas')

param_path = f'/{env}/canvas_data_2'
param_path = f'/{env}/{ssm_parameter_name}'

api_base_url = os.environ.get('API_BASE_URL', 'https://api-gateway.instructure.com')

namespace = 'canvas'
namespace = db_user

REGION = os.environ["AWS_REGION"]
SLACK_WEBHOOK_URL_SECRET_NAME = os.getenv("SLACK_WEBHOOK_SECRET_NAME")
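
The effect of the new environment variables can be seen in isolation. The sketch below mirrors the lookups added in this diff (`ENV`, `SSM_PARAMETER_NAME`, `DB_CD2_USER` and their defaults); the `resolve_settings` helper and the `canvas_data_2_catalog` and `catalog` values are hypothetical and used only for illustration.

```python
import os


def resolve_settings(environ=None):
    """Resolve the parameter path and DAP namespace the way the Lambdas do."""
    environ = os.environ if environ is None else environ
    env = environ.get("ENV", "dev")
    ssm_parameter_name = environ.get("SSM_PARAMETER_NAME", "canvas_data_2")
    return {
        "param_path": f"/{env}/{ssm_parameter_name}",
        # list_tables uses the DB user name as the namespace (schema).
        "namespace": environ.get("DB_CD2_USER", "canvas"),
    }


# With nothing set, the original hard-coded values are preserved.
print(resolve_settings({}))

# A second stack can point at its own parameters and schema.
print(resolve_settings({
    "ENV": "prod",
    "SSM_PARAMETER_NAME": "canvas_data_2_catalog",
    "DB_CD2_USER": "catalog",
}))
```

Because each variable falls back to its previous hard-coded value, an existing single-stack deployment should behave as before when none of them is set.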
56 changes: 46 additions & 10 deletions setup/prepare_aurora_db.py
@@ -12,6 +12,12 @@
    help="The name of the Canvas Data 2 CloudFormation stack containing the Aurora database",
    required=True,
)
parser.add_argument(
    "--is-additional-stack",
    help="Whether this is for the DB changes for the additional CD2 stack",
    action="store_true",  # If specified, sets the value as True
    default=False  # Default is False when not passed
)
args = parser.parse_args()

console = Console()
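
For reference, a `store_true` flag behaves as follows. This is a standalone sketch of the two arguments involved; the stack names `cd2-main` and `cd2-second` are made up. Note that `store_true` already defaults to `False`, so the explicit `default=False` in the diff is redundant but harmless.

```python
import argparse


def build_parser():
    """Minimal parser with the same two arguments as prepare_aurora_db.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--stack-name", required=True)
    parser.add_argument("--is-additional-stack", action="store_true")
    return parser


first = build_parser().parse_args(["--stack-name", "cd2-main"])
second = build_parser().parse_args(
    ["--stack-name", "cd2-second", "--is-additional-stack"]
)

print(first.is_additional_stack)   # False
print(second.is_additional_stack)  # True
```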
@@ -22,6 +28,8 @@
cf_resource = boto3.resource("cloudformation")
stack = cf_resource.Stack(args.stack_name)

is_additional_stack = args.is_additional_stack

console.print("Starting database preparation", style="bold green")

# Fetch stack outputs and parameters
@@ -91,7 +99,7 @@ def create_user(username, password, database_name):
def create_schema(schema_name, username, database_name):
"""Create a schema with user as owner"""
try:
create_schema_sql = f"CREATE SCHEMA IF NOT EXISTS {username} AUTHORIZATION {username}"
create_schema_sql = f"CREATE SCHEMA IF NOT EXISTS {schema_name} AUTHORIZATION {username}"
execute_statement(create_schema_sql, database_name)
console.print(f" - Created schema {schema_name} with owner {username}", style="bold green")
except ClientError as e:
@@ -127,6 +135,27 @@ def grant_user_to_admin(username, admin_username, database_name):
except ClientError as e:
console.print(f" ! Error granting user {username} to user {admin_username}: {e}", style="bold red")

def grant_create_permission_on_db_to_db_user(username, database_name):
    """Grant CREATE permission on the database to the DB user"""
    try:
        grant_create_permission_sql = f"GRANT CREATE ON DATABASE {database_name} TO {username}"
        execute_statement(grant_create_permission_sql, database_name)
        console.print(f" - Granted CREATE permission on the database {database_name} to user {username}", style="bold green")
    except ClientError as e:
        console.print(f" ! Error granting CREATE permission on the database {database_name} to user {username}: {e}", style="bold red")

def grant_access_permission_on_instructure_dap_schema_to_db_user(username, database_name):
    """Grant SELECT, INSERT, UPDATE, DELETE permissions on the instructure_dap schema to the DB user"""
    try:
        tables = ["database_version", "table_sync"]

        for tablename in tables:
            grant_access_permission_sql = f"GRANT INSERT, SELECT, UPDATE, DELETE ON instructure_dap.{tablename} TO {username}"
            execute_statement(grant_access_permission_sql, database_name)
        console.print(f" - Granted SELECT, INSERT, UPDATE, and DELETE permission on the schema instructure_dap to user {username}", style="bold green")
    except ClientError as e:
        console.print(f" ! Error granting SELECT, INSERT, UPDATE, and DELETE permission on the schema instructure_dap to user {username}: {e}", style="bold red")

# Get all database user secrets
secret_name_prefix = f"{prefix}-cd2-db-user-{env}-"
user_secrets = secrets_client.list_secrets(
@@ -140,27 +169,34 @@ def grant_user_to_admin(username, admin_username, database_name):
secret_value = json.loads(secrets_client.get_secret_value(SecretId=secret_arn)["SecretString"])
username = secret_value["username"]
database_name = secret_value["dbname"]

# Create or update the user
create_user(username, secret_value["password"], database_name)

# Grant user to admin user
grant_user_to_admin(username, admin_username, database_name)

# Create schema for user (with them as owner) if they need a schema
if username in users_to_create_schema:
create_schema(username, username, database_name)

# Create instructure_dap schema for the CD2 database user with them as owner
if username == db_user_username:
create_schema("instructure_dap", username, database_name)

# Assign privileges to canvas and instructure_dap schemas
# Defaults to read-only if user is not set in user_roles dict
user_role = get_user_role(username)
grant_usage_to_schema(username, "canvas", database_name)
assign_privileges(username, "canvas", user_role, database_name)

grant_usage_to_schema(username, username, database_name)
assign_privileges(username, username, user_role, database_name)

grant_usage_to_schema(username, "instructure_dap", database_name)
assign_privileges(username, "instructure_dap", user_role, database_name)
assign_privileges(username, "instructure_dap", user_role, database_name)

# Grant the CREATE privilege on the cd2 database.
grant_create_permission_on_db_to_db_user(username, database_name)

# If this is for the new additional stack, grant the access permission to the instructure_dap schema.
if is_additional_stack:
grant_access_permission_on_instructure_dap_schema_to_db_user(username, database_name)
3 changes: 2 additions & 1 deletion sync_table/app.py
@@ -29,7 +29,8 @@
db_cluster_arn = os.environ.get("DB_CLUSTER_ARN")
db_user_secret_name = os.environ.get("DB_USER_SECRET_NAME")
admin_secret_arn = os.environ.get("ADMIN_SECRET_ARN")
param_path = f"/{env}/canvas_data_2"
ssm_parameter_name = os.environ.get('SSM_PARAMETER_NAME', 'canvas_data_2')
param_path = f"/{env}/{ssm_parameter_name}"
api_base_url = os.environ.get("API_BASE_URL", "https://api-gateway.instructure.com")

FUNCTION_NAME = 'sync_table'