Pod status rule is misleading #65

Open
jdharmon opened this issue Jan 21, 2021 · 3 comments

@jdharmon (Contributor) commented Jan 21, 2021

Kubectl shows the pod phase as the status, e.g. Running, Succeeded, Failed. The pod status rule instead checks the optional reason field, e.g. Evicted. This is confusing, and it also means you cannot create a POD_STATUS=Running rule. Can we change the pod status rule to use the phase? Should the current reason-based rule be renamed and/or use a different environment variable, e.g. POD_STATUS_REASON?

Relevant documentation:

	// The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle.
	// The conditions array, the reason and message fields, and the individual container status
	// arrays contain more detail about the pod's status.
	// There are five possible phase values:
	//
	// Pending: The pod has been accepted by the Kubernetes system, but one or more of the
	// container images has not been created. This includes time before being scheduled as
	// well as time spent downloading images over the network, which could take a while.
	// Running: The pod has been bound to a node, and all of the containers have been created.
	// At least one container is still running, or is in the process of starting or restarting.
	// Succeeded: All containers in the pod have terminated in success, and will not be restarted.
	// Failed: All containers in the pod have terminated, and at least one container has
	// terminated in failure. The container either exited with non-zero status or was terminated
	// by the system.
	// Unknown: For some reason the state of the pod could not be obtained, typically due to an
	// error in communicating with the host of the pod.
	//
	// More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#pod-phase
	// +optional
	Phase PodPhase `json:"phase,omitempty" protobuf:"bytes,1,opt,name=phase,casttype=PodPhase"`
	// A brief CamelCase message indicating details about why the pod is in this state.
	// e.g. 'Evicted'
	// +optional
	Reason string `json:"reason,omitempty" protobuf:"bytes,4,opt,name=reason"`
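For illustration, here's a minimal sketch of what matching on the phase instead of the reason could look like (the function below is hypothetical, not pod-reaper's actual rule code):

import (
	corev1 "k8s.io/api/core/v1"
)

// shouldReapByPhase reports whether the pod's phase (e.g. "Running",
// "Succeeded", "Failed") matches one of the configured values.
func shouldReapByPhase(pod corev1.Pod, phases []string) bool {
	for _, p := range phases {
		if string(pod.Status.Phase) == p {
			return true
		}
	}
	return false
}

The current rule effectively compares against pod.Status.Reason, which is only set in cases like "Evicted" and is empty for a healthy Running pod, so a POD_STATUS=Running rule can never match.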
@brianberzins (Collaborator) commented

I was wondering if/when this would ever come back to bite me... #37
Long story short, the "status" listed by kubectl isn't exactly pod phase or reason... it's actually something different, something not addressed directly in the API from what I saw.

I suspect that Running isn't really a "pod status" but rather a "container status". I 100% agree, it's confusing as hell!

// ContainerState holds a possible state of container.
// Only one of its members may be specified.
// If none of them is specified, the default one is ContainerStateWaiting.
type ContainerState struct {
	// Details about a waiting container
	// +optional
	Waiting *ContainerStateWaiting `json:"waiting,omitempty" protobuf:"bytes,1,opt,name=waiting"`
	// Details about a running container
	// +optional
	Running *ContainerStateRunning `json:"running,omitempty" protobuf:"bytes,2,opt,name=running"`
	// Details about a terminated container
	// +optional
	Terminated *ContainerStateTerminated `json:"terminated,omitempty" protobuf:"bytes,3,opt,name=terminated"`
}

There isn't anything that detects a "Running" container state right now, and I think changing the existing "container status" rule would result in behavior that isn't intended: the pod would be flagged for reaping if ANY of the containers in it were "Running", the same way it currently matches if ANY container in the pod is "terminated" or "waiting"... ugh.
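For what it's worth, the STATUS column kubectl prints is computed by its printer from the phase, the reason, and the individual container states combined. A heavily simplified sketch of that logic (not the actual kubectl source) looks roughly like this:

// Start from the phase, let a non-empty reason override it, then let
// individual container states override both -- which is why the printed
// STATUS matches neither PodStatus.Phase nor PodStatus.Reason exactly.
status := string(pod.Status.Phase)
if pod.Status.Reason != "" {
	status = pod.Status.Reason // e.g. "Evicted"
}
for _, cs := range pod.Status.ContainerStatuses {
	switch {
	case cs.State.Waiting != nil && cs.State.Waiting.Reason != "":
		status = cs.State.Waiting.Reason // e.g. "CrashLoopBackOff"
	case cs.State.Terminated != nil && cs.State.Terminated.Reason != "":
		status = cs.State.Terminated.Reason // e.g. "Completed", "Error"
	}
}
if pod.DeletionTimestamp != nil {
	status = "Terminating"
}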

@jdharmon (Contributor, Author) commented

Perhaps the way forward is:

  1. Rename current pod_status rule to pod_status_reason, and change the environment variable to POD_STATUS_REASON
  2. Add pod_status_phase rule with environment variable POD_STATUS_PHASE
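If it helps, here's a rough sketch of what the new phase rule could look like (the rule shape and names below are illustrative and may not match pod-reaper's actual rule interface):

import (
	"fmt"
	"os"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

const envPodStatusPhase = "POD_STATUS_PHASE"

// podStatusPhase reaps pods whose status.phase matches one of the
// comma-separated values in POD_STATUS_PHASE, e.g. "Succeeded,Failed".
type podStatusPhase struct {
	phases []string
}

// load enables the rule only when POD_STATUS_PHASE is set.
func (r *podStatusPhase) load() (bool, error) {
	value, ok := os.LookupEnv(envPodStatusPhase)
	if !ok {
		return false, nil
	}
	r.phases = strings.Split(value, ",")
	return true, nil
}

// shouldReap matches the pod's phase against the configured values.
func (r *podStatusPhase) shouldReap(pod corev1.Pod) (bool, string) {
	phase := string(pod.Status.Phase)
	for _, p := range r.phases {
		if phase == p {
			return true, fmt.Sprintf("has pod status phase %s", phase)
		}
	}
	return false, ""
}

The existing reason-based rule would keep its current behavior and just move to the POD_STATUS_REASON name.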

@gburiola commented

+1 to the suggestion from @jdharmon above.

We spent quite some time trying to understand why pod-reaper was not working with POD_STATUSES=Succeeded and POD_STATUSES=Completed, only to realize it was using pod.reason and not pod.phase.
