
Troubleshooting Guide
Cluster Connectivity Issues

Gateway Pod Connections

Verify nodes in the cluster(s) are labeled:

kubectl get nodes --show-labels | grep "avesha/node-type=gateway"
Note
At least one node must be labeled avesha/node-type=gateway for slice gateway pods to be scheduled. Label one or more nodes before proceeding.
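If no nodes carry the label yet, it can be added with kubectl (the node name below is a placeholder for a node in your cluster):

```shell
# Label a node so slice gateway pods can be scheduled on it.
kubectl label node <node name> avesha/node-type=gateway
```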

If nodes are properly labeled:

Verify the presence of a slice gateway object on the cluster:

kubectl get slicegw -n avesha-system
Note
If the object is not present, make sure the API key and the access token that were used to register the cluster are correct.

If the API key and access token are valid and correct:

Verify that the slice router pod, whose name begins with the prefix vl3-nse-<slice_name>-, is present and in the RUNNING state.

kubectl get po -n avesha-system | grep vl3-nse-<slice_name>
vl3-nse-slice-iperf-7f9c696776-58jlf     2/2     Running     0     2d2h

If it is not present or in any state other than RUNNING:

Verify that all pods with the prefix nsm- are in the RUNNING state.

kubectl get po -n avesha-system | grep nsm
nsm-admission-webhook-6f7d8897c9-p78sv    1/1   Running   0   2d3h
nsm-kernel-forwarder-bh6xd                1/1   Running   0   2d3h
nsm-kernel-forwarder-zvg6p                1/1   Running   0   2d3h
nsmgr-mj6g6                               3/3   Running   0   2d3h
nsmgr-nndd8                               3/3   Running   0   2d3h

If the vl3 and nsm pods are missing or not in a RUNNING state:

Note
Collect the following logs and contact support: the avesha-mesh-controller-manager-* pod logs and the logs of all nsm-* pods.
kubectl get pods -n avesha-system | grep controller
kubectl logs <avesha-mesh-controller-manager-xxxx> -c manager -n avesha-system
kubectl get pods -n avesha-system | grep nsm
kubectl logs nsm-* --all-containers -n avesha-system

If the vl3 and nsm pods are in a RUNNING state:

Check if the REMOTE SUBNET and the REMOTE CLUSTER fields are populated in the slice gateway object.

kubectl get slicegw -n avesha-system
NAME                      SUBNET        REMOTE SUBNET  REMOTE CLUSTER  GW STATUS
slice-iperf-s1-6da494a6   10.6.2.0/24                                  SLICE_GATEWAY_STATUS_CERTS_PENDING
Note
If one or both columns are empty, make sure the slice is installed on the remote cluster. The slice must be installed on at least two clusters for the gateway pods to show up.

If the REMOTE SUBNET and the REMOTE CLUSTER columns are populated:

Check the GW STATUS field. If it says SLICE_GATEWAY_STATUS_CERTS_PENDING, wait a few minutes for the connection to establish and the status to change to SLICE_GATEWAY_STATUS_REGISTERED.

kubectl get slicegw -n avesha-system
NAME                      SUBNET       REMOTE SUBNET   REMOTE CLUSTER                         GW STATUS
slice-iperf-s1-6da494a6   10.6.2.0/24  10.6.1.0/24     6ace0341-7e80-4dcb-be83-aeb3dd783e46   SLICE_GATEWAY_STATUS_CERTS_PENDING
kubectl get slicegw -n avesha-system
NAME                      SUBNET        REMOTE SUBNET   REMOTE CLUSTER                         GW STATUS
slice-iperf-s1-6da494a6   10.6.2.0/24   10.6.1.0/24     6ace0341-7e80-4dcb-be83-aeb3dd783e46   SLICE_GATEWAY_STATUS_REGISTERED
Note
If it takes more than 5 minutes to change status to SLICE_GATEWAY_STATUS_REGISTERED, please contact support.
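Rather than running the command repeatedly, the status transition can be observed with kubectl's watch flag; the command below is a sketch of that approach:

```shell
# Watch the slice gateway object until GW STATUS changes to
# SLICE_GATEWAY_STATUS_REGISTERED (press Ctrl-C to stop watching).
kubectl get slicegw -n avesha-system -w
```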

If the GW STATUS is SLICE_GATEWAY_STATUS_REGISTERED:

Verify gateway pods are present:

There are both Server and Client gateway pods.

Server gateway pods are prefixed with <slice_name>-s1-*

Client gateway pods are prefixed with <slice_name>-c1-s1-*

kubectl get pods -n avesha-system | grep c1
kubectl get pods -n avesha-system | grep s1
Note
If the gateway pods are not present, collect the following information and logs and contact support:
kubectl get pods -n avesha-system | grep controller
kubectl logs <avesha-mesh-controller-manager-xxxx> -c manager -n avesha-system
kubectl describe slice -n avesha-system
kubectl describe slicegw -n avesha-system

Identify the cluster that is running the gateway vpn server:

kubectl get pods -n avesha-system | grep s1
kubectl describe slicegw <slice name>-s1* -n avesha-system | grep "Slice Gateway Host Type:"
Note
The cluster hosting the VPN server reports: Slice Gateway Host Type: SLICE_GATEWAY_HOST_TYPE_SERVER

Check if the node port service to reach the VPN server is created on the cluster:

kubectl get svc -n avesha-system | grep svc-<name_of_slice>
Note
If the node port service is not present, collect the following logs and contact support:
kubectl logs <slice gateway pod name> --all-containers -n avesha-system
kubectl logs vl3-nse-<name_of_slice>-<> --all-containers -n avesha-system
Note
If the service is present, check if the UDP port number assigned to the service is open to accept new connection requests.
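One way to check the assigned UDP port, assuming the gateway service is a standard Kubernetes NodePort service, is to read its nodePort field and probe it from a host that can reach the cluster nodes. The jsonpath expression and nc probe below are illustrative sketches, not part of the product:

```shell
# Read the UDP node port assigned to the gateway service.
kubectl get svc svc-<name_of_slice> -n avesha-system \
  -o jsonpath='{.spec.ports[0].nodePort}'

# From a machine with network access to the cluster nodes, probe the port.
# UDP probes are best-effort: nc reports success unless an ICMP
# port-unreachable message comes back.
nc -z -u -w 3 <node external IP> <node port>
```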

If the above validations pass, change the context to the VPN client cluster and follow these steps:

Check if the headless service to reach the remote vpn server is configured:

kubectl get svc -n avesha-system | grep <slice name>
Note
Make a note of the service name; it is used in the next step.

Check if an endpoint has been populated for the service:

kubectl exec -it <slice gateway pod> -c avesha-sidecar -n avesha-system -- nslookup <name of the service obtained in previous step>
Note
If the service is not present or the endpoint is not resolved, collect the following logs and contact support:
kubectl get pods -n avesha-system
kubectl logs <avesha-mesh-controller-manager-xxxx> -c manager -n avesha-system
kubectl logs <slice gateway pod name> --all-containers -n avesha-system
kubectl logs vl3-nse-<name_of_slice>-<> --all-containers -n avesha-system

If the service is present:

Check if a tun interface is created in the gateway pod:

kubectl get pods -n avesha-system | grep s1
kubectl exec -it <slice name>-c1-s1-* -c avesha-sidecar -n avesha-system -- ip a | grep tun0
Note
If the interface is not present, check the underlying network connectivity between the clusters.
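A quick reachability check, assuming ICMP is permitted between the clusters, is to ping the remote VPN server's node from the client gateway pod. The pod and container names follow the conventions used above; the remote node IP is a placeholder:

```shell
# From the client gateway pod, check basic reachability of the remote
# cluster's VPN server node.
kubectl exec -it <slice name>-c1-s1-* -c avesha-sidecar -n avesha-system \
  -- ping -c 3 <remote node IP>
```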

If all the validations on the status of the slice gateway and the vpn tunnel pass:

Note
Try restarting the application pods on all registered clusters.
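Assuming the application pods are managed by Deployments, a restart can be triggered without editing the manifests:

```shell
# Restart the application pods by rolling the Deployment; run this on
# every registered cluster where the application is deployed.
kubectl rollout restart deployment <deployment name> -n <namespace>
```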
Note
If the network connectivity between the clusters is good, collect the following logs and contact support:
kubectl get pods -n avesha-system
kubectl logs <avesha-mesh-controller-manager-xxxx> -c manager -n avesha-system
kubectl logs <slice gateway pod name> -c avesha-sidecar -n avesha-system
kubectl logs <slice gateway pod name> -c avesha-openvpn-client -n avesha-system
Post Application Onboarding

Application Unreachable

Slice Configuration

Annotations

Check if the slice is installed on the cluster:

kubectl get slice <slice name> -n avesha-system

Verify slice annotation is present in the application deployment:

kubectl describe deployment <deployment name> -n <namespace> | grep Annotations -A 5
Note
If the avesha.io/slice annotation is not present in the deployment, update the application deployment YAML and redeploy.
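If editing the YAML is inconvenient, the annotation can also be added to the pod template with a patch, which triggers a redeploy. This is a sketch that assumes the annotation value is the slice name; the deployment, namespace, and slice names are placeholders:

```shell
# Add the slice annotation to the Deployment's pod template.
# Patching the template causes the pods to be re-created.
kubectl patch deployment <deployment name> -n <namespace> --type merge \
  -p '{"spec":{"template":{"metadata":{"annotations":{"avesha.io/slice":"<slice name>"}}}}}'
```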

If the annotation is properly configured:

Review slice configuration for any Error/Warning events:

kubectl describe slice <slice name> -n avesha-system
Note
If you see errors or warnings that indicate a slice installation failure and you are unable to resolve them based on the error messages, collect the following information and logs and contact support:
kubectl get pods -n avesha-system | grep controller
kubectl logs <avesha-mesh-controller-manager-xxxx> -c manager -n avesha-system
kubectl describe slice <name_of_the_slice> -n avesha-system

Webhooks

If there are no errors in the slice configuration:

Check if the admission webhooks are present:

kubectl get mutatingwebhookconfigurations

The following webhooks must be present:

avesha-admission-webhook-cfg, nsm-admission-webhook-cfg

Check if the webhook services are installed:

kubectl get svc -n avesha-system

The following services should be present:

avesha-admission-webhook-svc, nsm-admission-webhook-svc
Note
If the webhooks are not installed, there are two possible explanations: an issue occurred during installation, or the webhooks were inadvertently deleted.

To remediate:

Delete the application deployment
Uninstall the slice
Uninstall avesha-mesh
Re-install avesha-mesh
Verify the webhooks are installed correctly
Re-install the slice
Redeploy the application with the slice annotation

Namespace

Check the namespaceIsolationProfile block in the slice configuration.

kubectl get slice <slice name> -n avesha-system -o=jsonpath="{..status.config.namespaceIsolationProfile.isolationEnabled}"
true
Note
If the isolationEnabled field is set to true, check if the namespace in which the app is being deployed is listed in the applicationNamespaces field.
kubectl get slice <slice name> -n avesha-system -o=jsonpath="{..status.config.namespaceIsolationProfile.applicationNamespaces}"
["<cluster name>:<namespace>"]
Note
Configuration block of the slice yaml file:
namespaceIsolationProfile:
  isolationEnabled: true
  applicationNamespaces:
    - "<cluster name>:<namespace>"
Note
If the application namespace is not listed, please follow these steps:
Delete the application deployment
Uninstall the slice on all the clusters
Re-install the slice with the application namespace listed in the namespaceIsolationProfile config block of the slice yaml file 
Redeploy the application with the slice annotation.
Note
If no discrepancy is found in the above steps, collect the following logs and contact support:
kubectl get pods -n avesha-system | grep controller
kubectl logs <avesha-mesh-controller-manager-xxxx> -c manager -n avesha-system
kubectl get pods -n avesha-system | grep nsm
kubectl logs <nsm-admission-webhook-xxxx> -n avesha-system
