Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert MNIST PyTorch and MNIST Ray test cases to go #103

Merged
merged 1 commit into from
Sep 15, 2023

Conversation

sutaakar
Copy link
Contributor

@sutaakar sutaakar commented Sep 1, 2023

Converting

function test_mcad_torchx_functionality() {
header "Testing MCAD TorchX Functionality"
########### Clean Cluster should be free of these resources ############
# Get appwrapper name
AW=$(oc get appwrapper -n ${ODHPROJECT} | grep mnistjob | cut -d ' ' -f 1) || true
# Clean up resources
if [[ -n $AW ]]; then
os::cmd::expect_success "oc delete appwrapper $AW -n ${ODHPROJECT} || true"
fi
os::cmd::expect_success "oc delete notebook jupyter-nb-kube-3aadmin -n ${ODHPROJECT} || true"
os::cmd::expect_success "oc delete cm notebooks-mcad -n ${ODHPROJECT} || true"
os::cmd::expect_success "oc delete pvc jupyterhub-nb-kube-3aadmin-pvc -n ${ODHPROJECT} || true"
##############################################################################
# Wait for the notebook controller ready
os::cmd::try_until_text "oc get deployment odh-notebook-controller-manager -n ${ODHPROJECT} --no-headers=true | awk '{print \$2}'" "1/1" $odhdefaulttimeout $odhdefaultinterval
# Create a mnist_ray_mini.ipynb as a configMap
os::cmd::expect_success "oc create configmap notebooks-mcad -n ${ODHPROJECT} --from-file=${RESOURCEDIR}/mnist_mcad_mini.ipynb"
# Get Token
local TESTUSER_BEARER_TOKEN="$(curl -skiL -u $TEST_USER:$TEST_PASS -H 'X-CSRF-Token: xxx' "$OPENSHIFT_OAUTH_ENDPOINT/oauth/authorize?response_type=token&client_id=openshift-challenging-client" | grep -oE 'access_token=[^&]*'| sed 's/access_token=//')"
os::cmd::expect_success "cat ${RESOURCEDIR}/custom-nb-small.yaml \
| sed s/%INGRESS%/$(oc get ingresses.config/cluster -o jsonpath={.spec.domain})/g \
| sed s/%OCPSERVER%/$(oc whoami --show-server=true|cut -f3 -d "/")/g \
| sed s/%OCPTOKEN%/${TESTUSER_BEARER_TOKEN}/g \
| sed s/%NAMESPACE%/${ODHPROJECT}/g \
| sed s/%JOBTYPE%/mcad/g | oc apply -n ${ODHPROJECT} -f -"
# Wait for the notebook-server to be ready
os::cmd::try_until_text "oc get pod -n ${ODHPROJECT} | grep "jupyter-nb-kube-3aadmin" | awk '{print \$2}'" "2/2" $odhdefaulttimeout $odhdefaultinterval
# Wait for appwrapper to exist
os::cmd::try_until_text "oc get appwrapper -n ${ODHPROJECT} | grep mnistjob" "mnistjob-*" $odhdefaulttimeout $odhdefaultinterval
# Get appwrapper name
AW=$(oc get appwrapper -n ${ODHPROJECT} | grep mnistjob | cut -d ' ' -f 1)
# Wait for the mnisttest appwrapper state to become running
os::cmd::try_until_text "oc get appwrapper $AW -n ${ODHPROJECT} -ojsonpath='{.status.state}'" "Running" $odhdefaulttimeout $odhdefaultinterval
# Wait for workload to succeed and clean up
os::cmd::try_until_text "oc get appwrapper $AW -n ${ODHPROJECT}" ".*NotFound.*" $odhdefaulttimeout $odhdefaultinterval
# Test clean up resources
os::cmd::expect_success "oc delete notebook jupyter-nb-kube-3aadmin -n ${ODHPROJECT}"
os::cmd::expect_failure "oc get notebook jupyter-nb-kube-3aadmin -n ${ODHPROJECT}"
os::cmd::expect_success "oc delete cm notebooks-mcad -n ${ODHPROJECT} || true"
os::cmd::expect_failure "oc get cm notebooks-mcad -n ${ODHPROJECT}"
os::cmd::expect_success "oc delete appwrapper $AW -n ${ODHPROJECT} || true"
os::cmd::expect_failure "oc get appwrapper $AW -n ${ODHPROJECT}"
os::cmd::expect_success "oc delete pvc jupyterhub-nb-kube-3aadmin-pvc -n ${ODHPROJECT} || true"
os::cmd::expect_failure "oc get pvc jupyterhub-nb-kube-3aadmin-pvc -n ${ODHPROJECT}"
}
into go lang integration tests.

Description

As part of #63 this PR converts MNIST PyTorch test case into go lang.

How Has This Been Tested?

Tested by executing locally against OCP.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

tests/go.mod Outdated
@@ -54,4 +54,6 @@ require (
sigs.k8s.io/yaml v1.3.0 // indirect
)

replace github.com/project-codeflare/codeflare-operator => /home/ksuta/src/github.com/project-codeflare/codeflare-operator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, i see the pr is marked wip. Sorry!

tests/integration/mcad_pytorch_test.go Outdated Show resolved Hide resolved
tests/integration/mcad_pytorch_test.go Outdated Show resolved Hide resolved
tests/integration/mcad_pytorch_test.go Outdated Show resolved Hide resolved
tests/integration/mcad_pytorch_test.go Outdated Show resolved Hide resolved
tests/integration/mcad_pytorch_test.go Outdated Show resolved Hide resolved
@sutaakar sutaakar force-pushed the mcad-tests branch 8 times, most recently from a31a8f6 to dde4d7c Compare September 8, 2023 12:02
@sutaakar
Copy link
Contributor Author

sutaakar commented Sep 8, 2023

@astefanutti adjusted PR:

  • added a test for MCAD Ray
  • replaced oc CLI commands with dynamic client
  • minor adjustments here and there

TODO:

  • change AppWrapper group - waiting for a new SDK release

There may be potential issue with MCAD Ray test for OpenShift CI as Ray head requires a lot of resources. Maybe there is a need for another node (or for a possibility to modify head node requirements). Will see once TODO is addressed.

@sutaakar sutaakar changed the title Convert MNIST PyTorch test case to go Convert MNIST PyTorch and MNIST Ray test cases to go Sep 8, 2023
Copy link
Contributor

@astefanutti astefanutti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. I've left a couple of small comments. Otherwise I'll LTGM it once ready.

tests/integration/mcad_test.go Outdated Show resolved Hide resolved
tests/integration/mcad_test.go Outdated Show resolved Hide resolved
tests/integration/mcad_test.go Outdated Show resolved Hide resolved
tests/integration/mcad_test.go Outdated Show resolved Hide resolved
tests/integration/mcad_test.go Outdated Show resolved Hide resolved
tests/integration/support/net.go Outdated Show resolved Hide resolved
tests/integration/support/net.go Outdated Show resolved Hide resolved
tests/integration/support/net.go Show resolved Hide resolved
tests/integration/support/net.go Show resolved Hide resolved
tests/integration/support/odh.go Show resolved Hide resolved
tests/integration/support/rbac.go Show resolved Hide resolved
@sutaakar
Copy link
Contributor Author

Changes applied.

@astefanutti
Copy link
Contributor

/lgtm

@astefanutti
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Sep 12, 2023
@sutaakar sutaakar marked this pull request as ready for review September 12, 2023 10:35
@sutaakar
Copy link
Contributor Author

Failing because the ODH core KfDef/DSC is not installed in OpenShift CI.
Should be fixed either by openshift/release#43247 or once I migrate the ODH DW to use DSC.

@sutaakar
Copy link
Contributor Author

/retest

3 similar comments
@sutaakar
Copy link
Contributor Author

/retest

@sutaakar
Copy link
Contributor Author

/retest

@sutaakar
Copy link
Contributor Author

/retest

@sutaakar
Copy link
Contributor Author

/retest

@sutaakar
Copy link
Contributor Author

TestMnistPyTorchMCAD passed, TestMCADRay failed, supposedly on lack of resources to spin up the Ray head node

@sutaakar
Copy link
Contributor Author

Skipping unstable/currently failing tests.

@sutaakar
Copy link
Contributor Author

/retest

@sutaakar
Copy link
Contributor Author

@astefanutti Can you please take a final look/provide approval?

@astefanutti
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Sep 15, 2023
@astefanutti
Copy link
Contributor

/approve

@openshift-ci
Copy link

openshift-ci bot commented Sep 15, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: astefanutti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sutaakar
Copy link
Contributor Author

/retest

@openshift-merge-robot openshift-merge-robot merged commit 1bdbacf into opendatahub-io:main Sep 15, 2023
1 check passed
@sutaakar sutaakar deleted the mcad-tests branch September 15, 2023 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants