SEAB-5604: submit github delivery event to s3 #162

Merged: 18 commits, Apr 18, 2024

Conversation

hyunnaye
Contributor

@hyunnaye hyunnaye commented Mar 27, 2024

Description
This PR adds code to the existing GitHub lambda to send each delivery event to an S3 bucket. The lambda uploads the event under the key date/deliveryid.

Issue
https://ucsc-cgl.atlassian.net/browse/SEAB-5604

Security
If there are any concerns that require extra attention from the security team, highlight them here.

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.

@hyunnaye hyunnaye self-assigned this Mar 27, 2024
@hyunnaye hyunnaye marked this pull request as ready for review April 12, 2024 20:07
upsertGitHubTag/deployment/index.js (outdated)
const client = new S3Client({});
const command = new PutObjectCommand({
  Bucket: process.env.BUCKET_NAME,
  Key: deliveryId,
Contributor

This will put all objects in the "root" of the bucket. Is that what we want? If we know the deliveryId, then it works. If we don't know it, then it's going to be hard to find.

Is it worth putting them in keys by date/org or org/date?

I don't know the answer, just raising the question. It depends on how we expect to use this.

Contributor Author

The key is now date/repository/deliveryid.

Contributor Author

I had to change it to date/deliveryid as there can be multiple repos in a single delivery.
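
For reference, a minimal sketch of how a date/deliveryid key could be built. The helper name buildDeliveryKey and the YYYY-MM-DD (UTC) date format are assumptions for illustration; the PR only states that the key is date/deliveryid.

```javascript
// Hypothetical helper, not the PR's code: builds an object key of the form
// <date>/<deliveryId>. The YYYY-MM-DD (UTC) date format is an assumption.
function buildDeliveryKey(deliveryId, now = new Date()) {
  // toISOString() always returns UTC, e.g. "2024-04-16T12:00:00.000Z".
  const date = now.toISOString().slice(0, 10);
  return `${date}/${deliveryId}`;
}
```

With keys shaped like this, all deliveries for one day share a common prefix, so they can be listed by that prefix even when the deliveryId is not known.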

upsertGitHubTag/deployment/index.js
@@ -253,6 +254,23 @@ function processEvent(event, callback) {
        " from GitHub.",
    });
  }
  // If bucket name is not null (had to put this for the integration test)
  if (process.env.BUCKET_NAME) {
Contributor

This is called after we invoke callback above. Should it be before? Doesn't the lambda use callback to return a result? (General question; callbacks confuse me.)

Contributor

It looks like it will still execute after the callback: https://stackoverflow.com/questions/49688927/how-do-i-stop-execution-of-a-aws-lambda-after-a-callback. You may still want to log earlier to avoid confusion, though. It doesn't look like it will affect performance either way.
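
A small sketch of the behavior discussed here, using a plain Node.js function rather than the lambda itself. The names handler and demo are hypothetical. Invoking a callback is an ordinary function call, so statements after it still run:

```javascript
// Stand-in for a callback-style handler; this is not the PR's processEvent.
function handler(event, callback, steps) {
  callback(null, { statusCode: 200 });
  // The callback has returned, and control continues on the next statement.
  steps.push("after callback");
}

// Runs the handler once and reports what the callback received and
// which statements executed after it.
function demo() {
  const results = [];
  const steps = [];
  handler({}, (err, res) => results.push(res.statusCode), steps);
  return { results, steps };
}
```

In Lambda, whether asynchronous work started after the callback gets to finish depends on runtime settings such as context.callbackWaitsForEmptyEventLoop, so this sketch only covers the synchronous part.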

Contributor

In looking at Kathy's question, I noticed the method is getting pretty big (we don't have a linter in place), so I'd optionally suggest creating a method out of this if block, e.g., logPayloadToS3().

Contributor Author

@hyunnaye hyunnaye Apr 16, 2024

How strongly do we feel about moving my S3 code before the callback? I kept the S3 code after the callback to avoid adding too many if-blocks, because we don't want to submit the event if the event type is not supported. So I put a return in the else condition above (line 257) so that the new S3 code wouldn't be run.
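
The control flow described above can be sketched roughly as follows. This is an illustration only: the function name, the event type names, and the upload parameter are assumptions, not the lambda's actual code.

```javascript
// Control-flow sketch only. The event type names and the `upload`
// parameter are assumptions made for illustration.
function processEventSketch(eventType, callback, upload) {
  if (eventType === "push") {
    callback(null, { message: "handled push" });
  } else if (eventType === "installation_repositories") {
    callback(null, { message: "handled installation change" });
  } else {
    callback(null, { message: "unsupported event type" });
    return; // unsupported events never reach the upload below
  }
  // Reached only for supported event types, after the callback was invoked.
  upload();
}
```

The early return keeps the upload in a single place at the end of the function instead of repeating it inside every supported branch.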

Member

@denis-yuen denis-yuen left a comment

a couple comments

upsertGitHubTag/deployment/index.js (outdated)
Contributor

@coverbeck coverbeck left a comment

Agreed with Denis' comments. Since today is my last day of the sprint and proposed changes seem pretty minor, approving.

    return;
  }
  // If bucket name is not null (had to put this for the integration test)
  if (process.env.BUCKET_NAME) {
Contributor

I'd make this part of the logPayloadToS3 method, e.g., logPayloadToS3(body, bucketPath, deliveryId).

How strongly do we feel about moving my S3 code before the callback? I kept the S3 code after the callback to avoid adding too many if-blocks, because we don't want to submit the event if the event type is not supported. So I put a return in the else condition above (line 257) so that the new S3 code wouldn't be run.

Then if you want to do it before the callback, or in each of the conditions, you're only adding one line.

I don't feel too strongly about going before. Maybe add a comment before the method invocation that it will execute even if the callback has been invoked, since that caused some confusion in the PR review.
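
A possible shape for the suggested helper, as a sketch under stated assumptions. The fourth parameter, putObject, is hypothetical and is injected here so the example runs without the AWS SDK; in the lambda it could wrap client.send(new PutObjectCommand(params)). Swallowing upload errors after logging them is also an assumption.

```javascript
// Sketch of the suggested helper, not the merged code. `putObject` is a
// hypothetical injected function: (params) => Promise.
async function logPayloadToS3(body, bucketPath, deliveryId, putObject) {
  const bucket = process.env.BUCKET_NAME;
  // No bucket configured (e.g. in the integration test): skip the upload.
  if (!bucket) {
    return false;
  }
  try {
    await putObject({
      Bucket: bucket,
      Key: `${bucketPath}/${deliveryId}`,
      Body: body,
    });
    return true;
  } catch (err) {
    // Assumption: a failed upload is logged and reported, not rethrown.
    console.error(`Failed to upload delivery ${deliveryId} to S3`, err);
    return false;
  }
}
```

With the bucket check inside the helper, the call site is a single line, which leaves room for the comment suggested above: that the call still executes even if the callback has already been invoked.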

Member

@denis-yuen denis-yuen left a comment

With @coverbeck's and @kathy-t's comments (my feedback has been addressed).

Contributor

@svonworl svonworl left a comment

Thanks for persevering; the existing code is super-duper confusing. If I remember correctly, there are actually two varieties of callbacks, both referenced as callback and invoked at various points in the code. And then, to muddy things further, there's the handleCallback function, a better name for which would be handleResponse.

@hyunnaye hyunnaye merged commit 1385866 into develop Apr 18, 2024
12 checks passed
5 participants