WIP: Add mirror creation script #2008

Draft: JimMadge wants to merge 46 commits into base: develop-v5.1.0

JimMadge
Member

@JimMadge JimMadge commented Jul 10, 2024

✅ Checklist

  • You have given your pull request a meaningful title (e.g. Enable foobar integration rather than 515 foobar).
  • You are targeting the appropriate branch. If you're not certain which one this is, it should be develop.
  • Your branch is up-to-date with the target branch (it probably was when you started, but it may have changed since then).

🚦 Depends on

⤴️ Summary

Add Azure function apps to create read only mirrors in Gitea.

🌂 Related issues

Closes #1996
Related to #1997

🔬 Tests

github-actions bot commented Jul 10, 2024

Coverage report

File | Statements | Missing | Coverage | Coverage (new stmts) | Lines missing
  data_safe_haven/administration/users
  entra_users.py 35-43
  research_user.py 19
  user_handler.py 67
  data_safe_haven/commands
  users.py 55
  data_safe_haven/external/api
  graph_api.py 423-459, 704, 724-730, 754, 787-790, 809-814, 833-841
  data_safe_haven/infrastructure/components/composite
  postgresql_database.py 30, 83-84
  data_safe_haven/infrastructure/programs
  declarative_sre.py 335
  data_safe_haven/infrastructure/programs/sre
  apps.py 26-27, 41-121, 152-173, 186-192
  data_safe_haven/resources/gitea_mirror/functions
  function_app.py 22-23
Project Total  

This report was generated by python-coverage-comment-action

@JimMadge
Member Author

Running this locally

❯ http localhost:7071/api/create-mirror address=https://github.com/JimMadge/dotfiles name=dotfiles username=jim password=password
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Date: Wed, 10 Jul 2024 14:24:09 GMT
Server: Kestrel
Transfer-Encoding: chunked

Mirror successfully created

And the function environment output

[2024-07-10T14:24:07.542Z] Executing 'Functions.create_mirror' (Reason='This function was programmatically called via the host APIs.', Id=20e523e8-9ad1-4368-889a-0b463d0c39ed)
[2024-07-10T14:24:07.549Z] Request received.
[2024-07-10T14:24:07.549Z] parameters: address=https://github.com/JimMadge/dotfiles, name=dotfiles, password=password, username=jim
[2024-07-10T14:24:07.550Z] Sending request to create mirror.
[2024-07-10T14:24:10.173Z] Response status code: 201.
[2024-07-10T14:24:10.173Z] Sending request to configure mirror repo.
[2024-07-10T14:24:10.293Z] Response status code: 200.
[2024-07-10T14:24:10.297Z] Executed 'Functions.create_mirror' (Succeeded, Id=20e523e8-9ad1-4368-889a-0b463d0c39ed, Duration=2755ms)
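
The log above shows two calls to Gitea: one that creates the mirror (201) and one that configures the resulting repository (200). A minimal sketch of how those two requests could be assembled, assuming the function uses Gitea's `POST /api/v1/repos/migrate` and `PATCH /api/v1/repos/{owner}/{repo}` endpoints. The helper names and the specific options switched off in the second request are illustrative assumptions, not taken from the PR.

```python
def migrate_request(gitea_url: str, address: str, name: str, username: str):
    """Return (url, json_body) for creating a mirror via Gitea's migrate endpoint."""
    url = f"{gitea_url.rstrip('/')}/api/v1/repos/migrate"
    body = {
        "clone_addr": address,   # upstream repository to mirror
        "repo_name": name,       # name of the repository created in Gitea
        "repo_owner": username,  # Gitea user that will own the mirror
        "mirror": True,          # keep pulling from upstream instead of a one-off import
        "private": False,
    }
    return url, body


def configure_request(gitea_url: str, name: str, username: str):
    """Return (url, json_body) for the follow-up PATCH on the new repository.

    Which options the PR actually changes is not shown in the log; turning off
    issues and pull requests here is an assumption made for illustration.
    """
    url = f"{gitea_url.rstrip('/')}/api/v1/repos/{username}/{name}"
    body = {"has_issues": False, "has_pull_requests": False}
    return url, body


if __name__ == "__main__":
    # The Gitea address is a placeholder; the other values mirror the local run above.
    print(migrate_request("http://localhost:3000",
                          "https://github.com/JimMadge/dotfiles", "dotfiles", "jim"))
    print(configure_request("http://localhost:3000", "dotfiles", "jim"))
```

Each pair could then be sent with an HTTP client using the supplied username and password for basic authentication, for example `requests.post(url, json=body, auth=(username, password))`.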

@JimMadge
Member Author

This looks like a good reference for setting up a function app using Pulumi.
Note that the app deployment needs some information that you may have to construct from the Outputs of the storage account.
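
As a rough illustration of the kind of value that has to be built from storage account outputs, here is a sketch that composes a blob URL with a SAS token for use as `WEBSITE_RUN_FROM_PACKAGE`. The function name and argument names are hypothetical, and the public-cloud `blob.core.windows.net` suffix is an assumption.

```python
def blob_sas_url(account_name: str, container_name: str, blob_name: str, sas_token: str) -> str:
    """Compose an HTTPS URL for a blob, with a SAS token as the query string."""
    # SAS tokens are sometimes returned with a leading "?"; strip it so the
    # URL does not end up with "??".
    token = sas_token.lstrip("?")
    return (
        f"https://{account_name}.blob.core.windows.net/"
        f"{container_name}/{blob_name}?{token}"
    )


if __name__ == "__main__":
    # Placeholder values only.
    print(blob_sas_url("examplestorage", "functions", "app.zip", "?sv=2022-11-02&sig=abc"))
```

In a Pulumi program the inputs would be `Output` values rather than plain strings, so the call would presumably be wrapped along the lines of `pulumi.Output.all(name, container, blob, sas).apply(lambda args: blob_sas_url(*args))`, where the four arguments stand for whichever outputs the program exposes.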

Comment on lines 120 to 138
# Deploy app
web.WebApp(
    f"{self._name}_web_app",
    enabled=True,
    https_only=True,
    kind="FunctionApp",
    location=props.location,
    name="giteamirror",
    resource_group_name=props.resource_group_name,
    server_farm_id=app_service_plan.id,
    site_config=web.SiteConfig(
        app_settings=[
            {"name": "runtime", "value": "python"},
            {"name": "FUNCTIONS_WORKER_RUNTIME", "value": "python"},
            {"name": "WEBSITE_RUN_FROM_PACKAGE", "value": blob_url},
            {"name": "FUNCTIONS_EXTENSION_VERSION", "value": "~4"},
        ],
    )
)
Member

Is there a reason we prefer the gitea mirror to run as a webapp rather than an ACI? Is it cheaper/easier/better in some other way to do this?

Member Author

This part isn't the Gitea instance, it is a webapp that uses the Gitea API to create/delete mirrors.

I am hoping that, by using the Azure Functions framework, we can avoid the complication of managing flask/fastapi/etc. and use Entra ID authentication.

Member Author

Unless there is a reason not to, I think we should deploy all Gitea instances the same way.

Member Author

Looking at that now, there might be a fair bit of work to either generalise the Gitea configuration or make a new Pulumi ComponentResource.

I think we will want the mirror instance to be fairly different:

  • No login
  • Browseable without login
  • No registration
  • No LDAP authentication
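
Part of the list above maps onto Gitea's `app.ini`. A minimal sketch that renders the two `[service]` settings that appear relevant (`DISABLE_REGISTRATION` and `REQUIRE_SIGNIN_VIEW`); how the PR would template or mount this file is not known, so the helper name and the use of `configparser` are assumptions made for illustration.

```python
import configparser
import io


def mirror_app_ini() -> str:
    """Render a [service] section for a read-only, anonymously browseable Gitea."""
    parser = configparser.ConfigParser()
    # configparser lower-cases keys by default; Gitea settings are upper-case.
    parser.optionxform = str
    parser["service"] = {
        "DISABLE_REGISTRATION": "true",   # no self-registration
        "REQUIRE_SIGNIN_VIEW": "false",   # repositories can be browsed without login
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(mirror_app_ini())
```

LDAP is configured in Gitea as a separate authentication source rather than in this section, so "No LDAP authentication" would presumably amount to not adding one. Disabling login entirely is not covered by this sketch.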

@jemrobinson jemrobinson changed the base branch from develop to develop-v5.1.0 July 30, 2024 14:04
@JimMadge
Member Author

JimMadge commented Jul 31, 2024

Resources deploy. The webapp service is not able to fetch the app from storage though.

Using the connection URL + SAS token gives:

error: ContentDecodingError: ('Received response with content-encoding: deflate, but failed to decode it.', error('Error -3 while decompressing data: invalid stored block lengths'))

Possibly a problem with the construction of that string, or the way data is uploaded to storage.

@JimMadge
Member Author

Although wget is able to fetch the zip from the SAS URL 😕.

@JimMadge
Copy link
Member Author

JimMadge commented Aug 1, 2024

The connection string was needed to fetch and unpack the zip from blob storage.
The Pulumi TypeScript examples showed how to do this.
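
For reference, an Azure Storage connection string has a fixed `key=value;` shape. A small sketch of assembling one from an account name and key follows; the helper name is hypothetical, and passing the result as the `AzureWebJobsStorage` app setting is an assumption about which setting the app needed.

```python
def storage_connection_string(
    account_name: str,
    account_key: str,
    endpoint_suffix: str = "core.windows.net",
) -> str:
    """Assemble an Azure Storage connection string from its parts."""
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key};"
        f"EndpointSuffix={endpoint_suffix}"
    )


if __name__ == "__main__":
    # Placeholder credentials only.
    setting = {
        "name": "AzureWebJobsStorage",  # assumed setting name, not confirmed by the PR
        "value": storage_connection_string("examplestorage", "base64key=="),
    }
    print(setting)
```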

Now there seems to be some trouble identifying the endpoints, so I'm taking a step back and looking at examples to see what needs to be done.

@JimMadge
Member Author

JimMadge commented Aug 1, 2024

The problem was likely using the v2 Python programming model, whereas the examples use the v1 model.

I couldn't find clear documentation on how to use the v2 model outside of deploying functions using Azure CLI, or examples of using the v2 model with Pulumi. Using the v1 model might be the most sensible thing for now.

@JimMadge
Member Author

JimMadge commented Aug 1, 2024

Following the v1 model directory structure means the functions get recognised and registered.
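
For context, the v1 model expects one folder per function, each containing a `function.json` and an entry script, next to a `host.json` at the app root. The sketch below writes such a layout to disk. The function name `create_mirror` and the `create-mirror` route come from the local run earlier in this thread; the binding details and handler body are assumptions, not the PR's actual files.

```python
import json
from pathlib import Path

HOST_JSON = {
    "version": "2.0",
    "extensionBundle": {
        "id": "Microsoft.Azure.Functions.ExtensionBundle",
        "version": "[4.*, 5.0.0)",
    },
}

# HTTP trigger in, HTTP response out; "route" gives /api/create-mirror.
FUNCTION_JSON = {
    "scriptFile": "__init__.py",
    "bindings": [
        {
            "authLevel": "function",
            "type": "httpTrigger",
            "direction": "in",
            "name": "req",
            "methods": ["post"],
            "route": "create-mirror",
        },
        {"type": "http", "direction": "out", "name": "$return"},
    ],
}

# Placeholder handler; the real function would call the Gitea API here.
HANDLER = '''import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    return func.HttpResponse("Mirror successfully created", status_code=200)
'''


def scaffold_v1_app(root: Path, function_name: str) -> Path:
    """Write a v1-model function app layout under root and return the function folder."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "host.json").write_text(json.dumps(HOST_JSON, indent=2))
    (root / "requirements.txt").write_text("azure-functions\n")
    func_dir = root / function_name
    func_dir.mkdir(exist_ok=True)
    (func_dir / "function.json").write_text(json.dumps(FUNCTION_JSON, indent=2))
    (func_dir / "__init__.py").write_text(HANDLER)
    return func_dir


if __name__ == "__main__":
    created = scaffold_v1_app(Path("gitea_mirror_app"), "create_mirror")
    print(sorted(p.name for p in created.iterdir()))
```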

@JimMadge JimMadge self-assigned this Nov 18, 2024
Development

Successfully merging this pull request may close these issues.

Automate Gitea mirror creation
2 participants