
Minikube cheat sheet

This article records the common commands for Minikube, based on https://kubernetes.io/docs/tutorials/hello-minikube/

1. Start the Minikube cluster (on macOS with Docker for Mac, use the hyperkit driver).

minikube start --vm-driver=hyperkit

2. Set the context which determines which cluster kubectl is interacting with.

kubectl config use-context minikube

3. Verify that kubectl is configured to communicate with your cluster.

kubectl cluster-info

4. Open Kubernetes dashboard

minikube dashboard

5. Create a Node.js application in a file named "server.js".

var http = require('http');

var handleRequest = function(request, response) {
  console.log('Received request for URL: ' + request.url);
  response.writeHead(200);
  response.end('Hello World!');
};
var www = http.createServer(handleRequest);
www.listen(8080);

6. Create a Docker container image based on the above application.

Create a file named "Dockerfile":
FROM node:6.14.2
EXPOSE 8080
COPY server.js .
CMD node server.js
Point the 'docker' command to your Minikube’s Docker daemon:
eval $(minikube docker-env)
Build your Docker image named "hello-node:v1":
docker build -t hello-node:v1 .

7. List the images in Minikube’s Docker registry

minikube ssh docker images

8. Create a Deployment named "hello-node" that manages a Pod based on above Docker (local) image named "hello-node:v1"

kubectl run hello-node --image=hello-node:v1 --port=8080 --image-pull-policy=Never

9. List the Deployment

kubectl get deployments

NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
hello-node 1 1 1 1 1m

10. List the Pod

kubectl get pods

NAME READY STATUS RESTARTS AGE
hello-node-57c6b66f9c-7j2k6 1/1 Running 0 1m

11. View cluster events

kubectl get events

LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
22m 22m 1 minikube.15619424ad09a7a1 Node Normal Starting kube-proxy, minikube Starting kube-proxy.
2m 2m 1 hello-node.1561953f3a1d817f Deployment Normal ScalingReplicaSet deployment-controller Scaled up replica set hello-node-57c6b66f9c to 1
2m 2m 1 hello-node-57c6b66f9c.1561953f3c3d1c9e ReplicaSet Normal SuccessfulCreate replicaset-controller Created pod: hello-node-57c6b66f9c-7j2k6
2m 2m 1 hello-node-57c6b66f9c-7j2k6.1561953f4f86ec39 Pod Normal SuccessfulMountVolume kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-rs6s2"
2m 2m 1 hello-node-57c6b66f9c-7j2k6.1561953f3e24dbb8 Pod Normal Scheduled default-scheduler Successfully assigned hello-node-57c6b66f9c-7j2k6 to minikube
2m 2m 1 hello-node-57c6b66f9c-7j2k6.1561953f7764cf06 Pod spec.containers{hello-node} Normal Started kubelet, minikube Started container
2m 2m 1 hello-node-57c6b66f9c-7j2k6.1561953f6c184d6f Pod spec.containers{hello-node} Normal Created kubelet, minikube Created container
2m 2m 1 hello-node-57c6b66f9c-7j2k6.1561953f6866f7d5 Pod spec.containers{hello-node} Normal Pulled

12. Expose the Pod to the public internet as a service

kubectl expose deployment hello-node --type=LoadBalancer

13. List the Service

kubectl get services

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-node LoadBalancer 10.xxx.xx.xxx <pending> 8080:32354/TCP 8s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 1d

Open Browser using a local IP address that serves your app:
minikube service hello-node

14. View Pod logs

kubectl logs <POD-NAME>
kubectl logs hello-node-57c6b66f9c-7j2k6

15. Update the server.js

response.end('Hello World Again!');

16. Build a new version of the Docker image named "hello-node:v2"

docker build -t hello-node:v2 .

17. Update the image of your Deployment

kubectl set image deployment/hello-node hello-node=hello-node:v2

18. Run your app again to view the new message

minikube service hello-node

19. List Kubernetes Addons

minikube addons list
To enable heapster addon:
minikube addons enable heapster
View the pod and service created:
kubectl get pod,service -n kube-system
Interacting with heapster in a browser:
minikube addons open heapster

20. Cleanup and Shutdown Cluster

Delete the service:
kubectl delete service hello-node
Delete the deployment:
kubectl delete deployment hello-node
Optionally, force removal of the Docker images created:
docker rmi hello-node:v1 hello-node:v2 -f
Optionally, stop the Minikube VM:
minikube stop
eval $(minikube docker-env -u)
Optionally, delete the Minikube VM:
minikube delete




Kubernetes Tutorial cheat sheet

This article records the commands for the Kubernetes Tutorial online sessions: https://kubernetes.io/docs/tutorials/kubernetes-basics/

1. Create a Cluster

https://kubernetes.io/docs/tutorials/kubernetes-basics/create-cluster/cluster-interactive/

1.1 Show minikube version

$ minikube version
minikube version: v0.28.2

1.2 Start minikube cluster

$ minikube start
Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Loading cached images from config file.

1.3 Show client and server side version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-04-10T12:46:31Z", GoVersion:"go1.9.4", Compiler:"gc", Platform:"linux/amd64"}

1.4 View cluster information

$ kubectl cluster-info
Kubernetes master is running at https://172.17.0.109:8443

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

1.5 List all Nodes in the cluster

$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
minikube Ready <none> 2m v1.10.0
(Currently only 1 node named "minikube" is in the cluster, and its STATUS is ready to accept applications for deployment.)

2. Deploy an App

https://kubernetes.io/docs/tutorials/kubernetes-basics/deploy-app/deploy-interactive/

2.1 Create a deployment for application

$ kubectl run kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080
deployment.apps/kubernetes-bootcamp created
This command creates a deployment named "kubernetes-bootcamp" based on Docker image from "gcr.io/google-samples/kubernetes-bootcamp" with tag "v1". And this application will run on port 8080.

2.2 List deployment

$ kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kubernetes-bootcamp 1 1 1 1 2m
Above shows there is 1 deployment running a single instance of the application.

2.3 Create a proxy

$ kubectl proxy
Starting to serve on 127.0.0.1:8001
This proxy creates the connection between our host (the online terminal) and the Kubernetes cluster.

2.4 Query the API through proxy

$ curl http://localhost:8001/version
{
"major": "1",
"minor": "10",
"gitVersion": "v1.10.0",
"gitCommit": "fc32d2f3698e36b93322a3465f63a14e9f0eaead",
"gitTreeState": "clean",
"buildDate": "2018-04-10T12:46:31Z",
"goVersion": "go1.9.4",
"compiler": "gc",
"platform": "linux/amd64"
}

2.5 Save the POD name into environment variable $POD_NAME

$ export POD_NAME=$(kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}')
$ echo Name of the Pod: $POD_NAME
Name of the Pod: kubernetes-bootcamp-5c69669756-4qdmk

2.6 Query the API through proxy to specific POD

$ curl http://localhost:8001/api/v1/namespaces/default/pods/$POD_NAME/proxy/
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-5c69669756-4qdmk | v=1

3. Explore App

https://kubernetes.io/docs/tutorials/kubernetes-basics/explore/explore-interactive/

3.1 List PODs

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
kubernetes-bootcamp-5c69669756-5mxvh 0/1 Pending 0 5s
Currently only 1 POD named "kubernetes-bootcamp-5c69669756-5mxvh" exists; it may show "Pending" briefly before its STATUS becomes "Running".

3.2 View Container information inside the POD

$ kubectl describe pods
Name: kubernetes-bootcamp-5c69669756-5mxvh
Namespace: default
Node: minikube/172.17.0.17
Start Time: Thu, 01 Nov 2018 21:20:21 +0000
Labels: pod-template-hash=1725225312
run=kubernetes-bootcamp
Annotations: <none>
Status: Running
IP: 172.18.0.2
Controlled By: ReplicaSet/kubernetes-bootcamp-5c69669756
Containers:
kubernetes-bootcamp:
Container ID: docker://f9d6331b1c7dcc24e6c18a920fa65f2344c2421f5f394665668fd406c5641e22
Image: gcr.io/google-samples/kubernetes-bootcamp:v1
Image ID: docker-pullable://gcr.io/google-samples/kubernetes-bootcamp@sha256:0d6b8ee63bb57c5f5b6156f446b3bc3b3c143d233037f3a2f00e279c8fcc64af
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Thu, 01 Nov 2018 21:20:22 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-b8t9g (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-b8t9g:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-b8t9g
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 1m (x4 over 1m) default-scheduler 0/1 nodes are available: 1 node(s) were not ready.
Normal Scheduled 1m default-scheduler Successfully assigned kubernetes-bootcamp-5c69669756-5mxvh to minikube
Normal SuccessfulMountVolume 1m kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-b8t9g"
Normal Pulled 1m kubelet, minikube Container image "gcr.io/google-samples/kubernetes-bootcamp:v1" already present on machine
Normal Created 1m kubelet, minikube Created container
Normal Started 1m kubelet, minikube Started container

3.3 View Container logs

$ kubectl logs $POD_NAME
Kubernetes Bootcamp App Started At: 2018-11-01T21:20:22.891Z | Running On: kubernetes-bootcamp-5c69669756-5mxvh

Running On: kubernetes-bootcamp-5c69669756-5mxvh | Total Requests: 1 | App Uptime: 309.865 seconds | Log Time: 2018-11-01T21:25:32.756Z
Here we only have 1 Container inside the POD, so no need to specify the Container name.
The usage for "kubectl logs" is:
kubectl logs [-f] [-p] (POD | TYPE/NAME) [-c CONTAINER] [options]
So we can specify both POD and Container name as well:
kubectl logs $POD_NAME -c kubernetes-bootcamp

3.4 Execute command inside the Container

$ kubectl exec $POD_NAME env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=kubernetes-bootcamp-5c69669756-5mxvh
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PROTO=tcp
NPM_CONFIG_LOGLEVEL=info
NODE_VERSION=6.3.1
HOME=/root

3.5 Start an open console inside the Container

$ kubectl exec -ti $POD_NAME bash
root@kubernetes-bootcamp-5c69669756-5mxvh:/#

4. Expose App

https://kubernetes.io/docs/tutorials/kubernetes-basics/expose/expose-interactive/

4.1 List Services

$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 17m
The default service(created by minikube) named "kubernetes" is based on TYPE "ClusterIP".

4.2 Expose a new Service

$ kubectl expose deployment/kubernetes-bootcamp --type="NodePort" --port 8080
service/kubernetes-bootcamp exposed
Above command creates a new service named "kubernetes-bootcamp" based on TYPE "NodePort" on internal port 8080. The external port is 30336.
$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 15s
kubernetes-bootcamp NodePort 10.100.77.17 <none> 8080:30336/TCP 1s

4.3 View Service information

$ kubectl describe services/kubernetes-bootcamp
Name: kubernetes-bootcamp
Namespace: default
Labels: run=kubernetes-bootcamp
Annotations: <none>
Selector: run=kubernetes-bootcamp
Type: NodePort
IP: 10.100.77.17
Port: <unset> 8080/TCP
TargetPort: 8080/TCP
NodePort: <unset> 30336/TCP
Endpoints: 172.18.0.2:8080
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>

4.4 Fetch the external port of the Service

$ export NODE_PORT=$(kubectl get services/kubernetes-bootcamp -o go-template='{{(index .spec.ports 0).nodePort}}')
$ echo NODE_PORT=$NODE_PORT
NODE_PORT=30336
Test from IP of the Node and the external port:
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-5c69669756-f7d5c | v=1

4.5 Fetch the label from the deployment

$ kubectl describe deployment|grep -i label
Labels: run=kubernetes-bootcamp
Labels: run=kubernetes-bootcamp

4.6 List PODs and Services based on label

$ kubectl get pods -l run=kubernetes-bootcamp
NAME READY STATUS RESTARTS AGE
kubernetes-bootcamp-5c69669756-f7d5c 1/1 Running 0 17m
$ kubectl get services -l run=kubernetes-bootcamp
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-bootcamp NodePort 10.100.77.17 <none> 8080:30336/TCP 17m

4.7 Apply a new label to a POD

$ kubectl label pod $POD_NAME app=v1
pod/kubernetes-bootcamp-5c69669756-f7d5c labeled
Check the new label of the POD:
$ kubectl describe pods $POD_NAME
Name: kubernetes-bootcamp-5c69669756-f7d5c
Namespace: default
Node: minikube/172.17.0.18
Start Time: Thu, 01 Nov 2018 22:22:22 +0000
Labels: app=v1
pod-template-hash=1725225312
run=kubernetes-bootcamp
...

4.8 Delete a Service based on label

$ kubectl delete service -l run=kubernetes-bootcamp
service "kubernetes-bootcamp" deleted

4.9 Confirm a Service is removed

$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 29m
$ curl $(minikube ip):$NODE_PORT
curl: (7) Failed to connect to 172.17.0.18 port 30336: Connection refused
Confirm the App is still running inside POD:
$ kubectl exec -ti $POD_NAME curl localhost:8080
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-5c69669756-f7d5c | v=1

5. Scale App

https://kubernetes.io/docs/tutorials/kubernetes-basics/scale/scale-interactive/

5.1 Scale up the deployment to 4 replicas

$ kubectl scale deployments/kubernetes-bootcamp --replicas=4
deployment.extensions/kubernetes-bootcamp scaled
$ kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kubernetes-bootcamp 4 4 4 4 1m
Confirm that more PODs are running now:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kubernetes-bootcamp-5c69669756-5dcc6 1/1 Running 0 1m 172.18.0.7 minikube
kubernetes-bootcamp-5c69669756-8bbrg 1/1 Running 0 2m 172.18.0.2 minikube
kubernetes-bootcamp-5c69669756-9jcgl 1/1 Running 0 1m 172.18.0.5 minikube
kubernetes-bootcamp-5c69669756-s7vlm 1/1 Running 0 1m 172.18.0.6 minikube

5.2 Check event log

$ kubectl describe deployments/kubernetes-bootcamp
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 3m deployment-controller Scaled up replica set kubernetes-bootcamp-5c69669756 to 1
Normal ScalingReplicaSet 2m deployment-controller Scaled up replica set kubernetes-bootcamp-5c69669756 to 4

5.3 Find the external port for the Service

$ kubectl describe services/kubernetes-bootcamp
Name: kubernetes-bootcamp
Namespace: default
Labels: run=kubernetes-bootcamp
Annotations: <none>
Selector: run=kubernetes-bootcamp
Type: NodePort
IP: 10.107.230.53
Port: <unset> 8080/TCP
TargetPort: 8080/TCP
NodePort: <unset> 30798/TCP
Endpoints: 172.18.0.2:8080,172.18.0.5:8080,172.18.0.6:8080 + 1 more...
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
$ export NODE_PORT=$(kubectl get services/kubernetes-bootcamp -o go-template='{{(index .spec.ports 0).nodePort}}')
$ echo NODE_PORT=$NODE_PORT
NODE_PORT=30798
$ curl $(minikube ip):$NODE_PORT
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-5c69669756-9jcgl | v=1

5.4 Scale down the deployment to 2 replicas

$ kubectl scale deployments/kubernetes-bootcamp --replicas=2
deployment.extensions/kubernetes-bootcamp scaled
$ kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kubernetes-bootcamp 2 2 2 2 8m
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kubernetes-bootcamp-5c69669756-5dcc6 1/1 Terminating 0 7m 172.18.0.7 minikube
kubernetes-bootcamp-5c69669756-8bbrg 1/1 Running 0 8m 172.18.0.2 minikube
kubernetes-bootcamp-5c69669756-9jcgl 1/1 Terminating 0 7m 172.18.0.5 minikube
kubernetes-bootcamp-5c69669756-s7vlm 1/1 Running 0 7m 172.18.0.6 minikube
Eventually the 2 "Terminating" PODs will be gone:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kubernetes-bootcamp-5c69669756-8bbrg 1/1 Running 0 12m 172.18.0.2 minikube
kubernetes-bootcamp-5c69669756-s7vlm 1/1 Running 0 11m 172.18.0.6 minikube

6. Update App

https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-interactive/

6.1 List current Deployment and PODs

$ kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kubernetes-bootcamp 4 4 4 4 15s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
kubernetes-bootcamp-5c69669756-4gvjd 1/1 Running 0 18s
kubernetes-bootcamp-5c69669756-j8wf8 1/1 Running 0 18s
kubernetes-bootcamp-5c69669756-m5fnv 1/1 Running 0 18s
kubernetes-bootcamp-5c69669756-qlqsj 1/1 Running 0 18s

6.2 Update the image for the Deployment

$ kubectl set image deployments/kubernetes-bootcamp kubernetes-bootcamp=jocatalin/kubernetes-bootcamp:v2
deployment.extensions/kubernetes-bootcamp image updated

6.3 Monitor the PODs status

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
kubernetes-bootcamp-5c69669756-4gvjd 1/1 Running 0 3m
kubernetes-bootcamp-5c69669756-j8wf8 1/1 Terminating 0 3m
kubernetes-bootcamp-5c69669756-m5fnv 1/1 Terminating 0 3m
kubernetes-bootcamp-5c69669756-qlqsj 1/1 Running 0 3m
kubernetes-bootcamp-7799cbcb86-7c6h7 1/1 Running 0 2s
kubernetes-bootcamp-7799cbcb86-xwglv 1/1 Running 0 2s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
kubernetes-bootcamp-5c69669756-4gvjd 1/1 Terminating 0 3m
kubernetes-bootcamp-5c69669756-j8wf8 1/1 Terminating 0 3m
kubernetes-bootcamp-5c69669756-m5fnv 1/1 Terminating 0 3m
kubernetes-bootcamp-5c69669756-qlqsj 1/1 Terminating 0 3m
kubernetes-bootcamp-7799cbcb86-6rk6l 1/1 Running 0 11s
kubernetes-bootcamp-7799cbcb86-7c6h7 1/1 Running 0 13s
kubernetes-bootcamp-7799cbcb86-x4cgg 1/1 Running 0 11s
kubernetes-bootcamp-7799cbcb86-xwglv 1/1 Running 0 13s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
kubernetes-bootcamp-7799cbcb86-6rk6l 1/1 Running 0 2m
kubernetes-bootcamp-7799cbcb86-7c6h7 1/1 Running 0 2m
kubernetes-bootcamp-7799cbcb86-x4cgg 1/1 Running 0 2m
kubernetes-bootcamp-7799cbcb86-xwglv 1/1 Running 0 2m

6.4 Check the rollout status

$ kubectl rollout status deployments/kubernetes-bootcamp
deployment "kubernetes-bootcamp" successfully rolled out
And check the image for each container:
$ kubectl describe pods|grep Image:
Image: jocatalin/kubernetes-bootcamp:v2
Image: jocatalin/kubernetes-bootcamp:v2
Image: jocatalin/kubernetes-bootcamp:v2
Image: jocatalin/kubernetes-bootcamp:v2

6.5 Another WRONG image update

$ kubectl set image deployments/kubernetes-bootcamp kubernetes-bootcamp=gcr.io/google-samples/kubernetes-bootcamp:v10
deployment.extensions/kubernetes-bootcamp image updated
$ kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kubernetes-bootcamp 4 5 2 3 12m
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
kubernetes-bootcamp-5f76cd7b94-6smkb 0/1 ErrImagePull 0 34s
kubernetes-bootcamp-5f76cd7b94-sbk7t 0/1 ErrImagePull 0 34s
kubernetes-bootcamp-7799cbcb86-7c6h7 1/1 Running 0 9m
kubernetes-bootcamp-7799cbcb86-x4cgg 1/1 Running 0 9m
kubernetes-bootcamp-7799cbcb86-xwglv 1/1 Running 0 9m

6.6 Find out the root cause

$ kubectl describe pods|grep Failed
Warning Failed 5m (x4 over 6m) kubelet, minikube Failed to pull image "gcr.io/google-samples/kubernetes-bootcamp:v10": rpc error: code = Unknown desc =unauthorized: authentication required
Warning Failed 5m (x4 over 6m) kubelet, minikube Error: ErrImagePull
Warning Failed 4m (x6 over 6m) kubelet, minikube Error: ImagePullBackOff
Warning Failed 5m (x4 over 6m) kubelet, minikube Failed to pull image "gcr.io/google-samples/kubernetes-bootcamp:v10": rpc error: code = Unknown desc =unauthorized: authentication required
Warning Failed 5m (x4 over 6m) kubelet, minikube Error: ErrImagePull
Warning Failed 4m (x6 over 6m) kubelet, minikube Error: ImagePullBackOff
There is no image called v10 in the repository.

6.7 Rollback to previous working version

$ kubectl rollout undo deployments/kubernetes-bootcamp
deployment.extensions/kubernetes-bootcamp
Confirm the rollback is done:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
kubernetes-bootcamp-7799cbcb86-7c6h7 1/1 Running 0 19m
kubernetes-bootcamp-7799cbcb86-8d5h6 1/1 Running 0 1m
kubernetes-bootcamp-7799cbcb86-x4cgg 1/1 Running 0 19m
kubernetes-bootcamp-7799cbcb86-xwglv 1/1 Running 0 19m

$ kubectl describe pods |grep Image:
Image: jocatalin/kubernetes-bootcamp:v2
Image: jocatalin/kubernetes-bootcamp:v2
Image: jocatalin/kubernetes-bootcamp:v2
Image: jocatalin/kubernetes-bootcamp:v2


How to install a Kubernetes Cluster on CentOS 7


Goal:

How to install a Kubernetes Cluster on CentOS 7

Env:

CentOS 7.4
4 Nodes(v1 to v4, and v1 will be the master node for Kubernetes Cluster):
  • xx.xx.xx.41 v1.poc.com v1
  • xx.xx.xx.42 v2.poc.com v2
  • xx.xx.xx.43 v3.poc.com v3
  • xx.xx.xx.44 v4.poc.com v4
Kubernetes v1.12.2
Docker 18.06.1-ce

Solution:

1. Node preparation on all nodes

1.1 Disable SELinux

Change it at current runtime session level:
setenforce 0
Change it at system level by modifying /etc/selinux/config:
SELINUX=disabled
After rebooting, use the command below to confirm SELinux is disabled:
sestatus

1.2 Disable Swap

swapoff -a
Then edit /etc/fstab to comment out the swap, for example:
#/dev/mapper/centos-swap swap                    swap    defaults        0 0

1.3 Enable br_netfilter

br_netfilter module is required to enable transparent masquerading and to facilitate VxLAN traffic for communication between Kubernetes pods across the cluster.
modprobe br_netfilter
echo '1'> /proc/sys/net/bridge/bridge-nf-call-iptables
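Note that both of the settings above last only until the next reboot. Below is a minimal sketch to persist them, assuming the standard CentOS 7 locations for modules-load.d and sysctl.d:
# Load br_netfilter automatically at boot
echo 'br_netfilter' > /etc/modules-load.d/br_netfilter.conf
# Keep bridged traffic visible to iptables after reboot
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system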

2. Install Docker

Follow below documentation:
https://docs.docker.com/install/linux/docker-ce/centos/
Quick commands are:
yum install -y yum-utils device-mapper-persistent-data lvm2
yum install docker-ce
systemctl start docker
systemctl enable docker
After that, Docker should be started and enabled.
$ ps -ef|grep -i docker
root 2468 1 0 16:41 ? 00:00:01 /usr/bin/dockerd
root 2476 2468 0 16:41 ? 00:00:00 docker-containerd --config /var/run/docker/containerd/containerd.toml
And verify the installation by running:
docker run hello-world

3. Install Kubernetes tools: kubectl, kubelet and kubeadm

Follow below documentation:
https://kubernetes.io/docs/tasks/tools/install-kubectl/
https://kubernetes.io/docs/setup/independent/install-kubeadm/
Quick commands are:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubectl kubelet kubeadm
Enable kubelet:
systemctl enable kubelet
Then reboot all the nodes.
Note: When using Docker, kubeadm will automatically detect the cgroup driver for the kubelet and set it in the /var/lib/kubelet/kubeadm-flags.env file during runtime. Eg:
$ cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS=--cgroup-driver=cgroupfs --network-plugin=cni

4. Initialize Kubernetes Cluster on master node

Follow below documentation:
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/ 
Quick commands are:
kubeadm init --pod-network-cidr 10.244.0.0/16
Here we choose Flannel as the Pod network. Flannel runs a small, single-binary agent called flanneld on each host, which is responsible for allocating a subnet lease to each host out of a larger, preconfigured address space.
That is why we specify the network range as "--pod-network-cidr 10.244.0.0/16".
The steps to deploy Flannel are in step #5.3.

If it completes successfully, save the "kubeadm join" command from its output, which will be used to join the other worker nodes into this Kubernetes Cluster.
For example:
kubeadm join xx.xx.xx.41:6443 --token 65l31r.cc43l28kcyx4xefp --discovery-token-ca-cert-hash sha256:9c6ec245668161a61203776a0621911463df72b80a32590d9fa2bb16da2a46ac

5. Configure Kubernetes Cluster on master node

5.1 Create a user named "testuser" who has sudo privilege
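There is no single required command for this step. A minimal sketch on CentOS 7, assuming the "wheel" group is allowed to run sudo (the default in /etc/sudoers):
# Create the user and set a password
useradd testuser
passwd testuser
# Members of the "wheel" group can run sudo commands on CentOS 7
usermod -aG wheel testuser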

5.2 Create config for Kubernetes Cluster for the testuser.

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

After that, run simple commands to verify:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
v1.poc.com NotReady master 21m v1.12.2

5.3 Deploy the flannel network

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

5.4 Master Isolation

Here I want to be able to schedule PODs on master as well:
kubectl taint nodes --all node-role.kubernetes.io/master-

5.5 Join other nodes into Kubernetes Cluster

As root user on all other nodes:
kubeadm join xx.xx.xx.41:6443 --token 65l31r.cc43l28kcyx4xefp --discovery-token-ca-cert-hash sha256:9c6ec245668161a61203776a0621911463df72b80a32590d9fa2bb16da2a46ac
Verify on master node as testuser:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
v1.poc.com Ready master 14m v1.12.2
v2.poc.com Ready <none> 28s v1.12.2
v3.poc.com Ready <none> 25s v1.12.2
v4.poc.com Ready <none> 23s v1.12.2

6. Test by creating a nginx POD and service

6.1 Create a nginx deployment

kubectl create deployment nginx --image=nginx
Verify nginx pod is running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-55bd7c9fd-7v9kz 1/1 Running 0 14s

6.2 Expose nginx service

kubectl create service nodeport nginx --tcp=80:80
Verify Service:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 18m
nginx NodePort 10.103.4.141 <none> 80:31707/TCP 24s

6.3 Test nginx service

From the above output, we know nginx is exposed on port 31707 of each node.
Test by running the commands below or by opening the addresses in a browser:
curl v1.poc.com:31707
curl v2.poc.com:31707
curl v3.poc.com:31707
curl v4.poc.com:31707

Common Issues

1. "kubeadm init" fails if swap is not turned off.
$ kubeadm init
[init] using Kubernetes version: v1.12.2
[preflight] running pre-flight checks
[preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
Solution is above step #1.2 to disable swap.
2. "kubectl" fails if there is no config for Kubernetes Cluster
$ kubectl get pods
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The reason is that there is no Kubernetes Cluster config for this specific user.
That is why kubectl assumed the Kubernetes API server is listening on port 8080.
The solution is step #5.2 above.
After that, the commands will connect to the API server mentioned in the config file:
[testuser@v1 .kube]$ pwd
/home/testuser/.kube
[testuser@v1 .kube]$ cat config |grep server
server: https://xx.xx.xx.41:6443
And the API server should be listening on this port 6443:
# netstat -anp|grep 6443|grep LISTEN
tcp6 0 0 :::6443 :::* LISTEN 9658/kube-apiserver



How to start Dashboard in a test Kubernetes Cluster


Goal:

How to start Dashboard in a test Kubernetes Cluster.
Dashboard is a web-based Kubernetes user interface. You can use Dashboard to deploy containerized applications to a Kubernetes cluster, troubleshoot your containerized application, and manage the cluster itself along with its attendant resources.

Env:

CentOS 7.4
4 Nodes(v1 to v4, and v1 will be the master node for Kubernetes Cluster):

  • xx.xx.xx.41 v1.poc.com v1
  • xx.xx.xx.42 v2.poc.com v2
  • xx.xx.xx.43 v3.poc.com v3
  • xx.xx.xx.44 v4.poc.com v4
Kubernetes v1.12.2
Docker 18.06.1-ce
Dashboard 1.10
[Please follow How to install a Kubernetes Cluster on CentOS 7 to create this test cluster first]

Solution:

Refer to below documentation:
https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/

1. Deploy the Dashboard UI

kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

2. Start the proxy on master node

Here v1.poc.com is the master node.
kubectl proxy --address="0.0.0.0" -p 8001 --accept-hosts='^*$'
This proxy will listen on port 8001 of the master node and will accept connections from ANY host.
Note: This is not for a production cluster; it is for test purposes only.

3. Authentication and Authorization for Dashboard(Option A)

Please refer to the documentation below to understand authentication and authorization.
Since this is a test cluster, we will grant admin privilege to the Dashboard's Service Account so that you can just click the "Skip" button when you open the UI to skip "login".

3.1 Fetch the name of the Dashboard's Service Account

$ kubectl get serviceaccount -n kube-system |grep -i dashboard
kubernetes-dashboard 1 2d21h
Here the name of the Dashboard's Service Account is "kubernetes-dashboard".

3.2 Grant admin privilege

This is done by creating a "ClusterRoleBinding" object to grant the role named "cluster-admin" to the Service Account named "kubernetes-dashboard":
cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system
EOF

3.3 Confirm the role is granted properly

$ kubectl describe clusterrolebinding kubernetes-dashboard
Name: kubernetes-dashboard
Labels: k8s-app=kubernetes-dashboard
Annotations: <none>
Role:
Kind: ClusterRole
Name: cluster-admin
Subjects:
Kind Name Namespace
---- ---- ---------
ServiceAccount kubernetes-dashboard kube-system
Note: here the "ClusterRoleBinding" object has the same name as "kubernetes-dashboard".

3.4 Open Dashboard UI from client

Open below Dashboard UI from client, for example, your own Mac which has access to the master node -- v1.poc.com.
http://v1.poc.com:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
Click "SKIP" button to skip "login".

Again, the above steps are only for a test cluster, since they grant admin privilege to the Dashboard Service Account.

4. Authentication and Authorization for Dashboard(Option B)

If you do not want to grant the admin privilege to the Dashboard's Service Account, you can create a new Service Account with admin privilege as well, and then use its token to login.

4.1 Create a new Service Account named "my-account-for-dashboard"

kubectl create serviceaccount my-account-for-dashboard

4.2 Grant admin privilege

kubectl create clusterrolebinding my-account-for-dashboard-rolebinding --clusterrole=cluster-admin --serviceaccount=default:my-account-for-dashboard

4.3 Get the token of this new Service Account

$ kubectl describe secret my-account-for-dashboard
Name: my-account-for-dashboard-token-j6rzh
Namespace: default
Labels: <none>
Annotations: kubernetes.io/service-account.name: my-account-for-dashboard
kubernetes.io/service-account.uid: a2f918a5-e46d-11e8-a6d7-000c29562394

Type: kubernetes.io/service-account-token

Data
====
ca.crt: 1025 bytes
namespace: 7 bytes
token: xxx
Here "xxx" in "token:" field is what we need.

4.4 Port forwarding on client

On the client machine, for example, on your Mac, do port forwarding for the "8001" port on the master node -- v1.poc.com:
ssh -L 8001:localhost:8001 root@v1.poc.com

4.5 Open dashboard UI from client

Open below Dashboard UI from client, for example, your own Mac which has access to the master node -- v1.poc.com.
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
Choose "Token" and paste the token fetched from step 4.3 to login.
After login, if you click the profile icon on the top right, you should see : "LOGGED IN WITH TOKEN".

Please refer to access control page for Dashboard for more options.

How to install MapR Data Fabric for Kubernetes(KDF) on Kubernetes Cluster


Goal:

This article explains the steps in details on how to install MapR Data Fabric for Kubernetes(KDF) on Kubernetes Cluster.

Env:

CentOS 7.4
4 Nodes(v1 to v4, and v1 will be the master node for Kubernetes Cluster):
  • xx.xx.xx.41 v1.poc.com v1
  • xx.xx.xx.42 v2.poc.com v2
  • xx.xx.xx.43 v3.poc.com v3
  • xx.xx.xx.44 v4.poc.com v4
Kubernetes v1.12.2
Docker 18.06.1-ce
MapR Data Fabric for Kubernetes(KDF) v1.0.2

Solution:

Before following the steps below, please follow How to install a Kubernetes Cluster on CentOS 7 to install a Kubernetes Cluster first.
Below are several reference links to read first as well:
  • Concept and Architecture of KDF:
https://mapr.com/docs/home/PersistentStorage/kdf_overview.html
  • KDF YAML file Download location:
https://package.mapr.com/tools/KubernetesDataFabric
  • KDF Installation Documentation:
https://mapr.com/docs/home/PersistentStorage/kdf_installation.html

1. Download the KDF YAML files

wget https://package.mapr.com/tools/KubernetesDataFabric/v1.0.2/kdf-namespace.yaml
wget https://package.mapr.com/tools/KubernetesDataFabric/v1.0.2/kdf-rbac.yaml
wget https://package.mapr.com/tools/KubernetesDataFabric/v1.0.2/kdf-plugin-centos.yaml
wget https://package.mapr.com/tools/KubernetesDataFabric/v1.0.2/kdf-provisioner.yaml

2. Create mapr-system namespace

kubectl create -f kdf-namespace.yaml

Confirm:
$ kubectl get namespace|grep -i mapr
mapr-system Active 6h21m

3. Install RBAC file

kubectl create -f kdf-rbac.yaml
This YAML file completes the tasks below in namespace "mapr-system":
  • Create a Service Account named "maprkdf".
  • Create a Cluster Role named "mapr:kdf" with several privileges.
  • Create a ClusterRoleBinding named "kdf" to assign Cluster Role "mapr:kdf" to Service Account "maprkdf".
Confirm:
$ kubectl get serviceaccount -n mapr-system |grep -i mapr
maprkdf 1 66s

$ kubectl get clusterroles |grep -i mapr
mapr:kdf 4m45s

$ kubectl describe clusterrolebinding mapr:kdf
Name: mapr:kdf
Labels: <none>
Annotations: <none>
Role:
Kind: ClusterRole
Name: mapr:kdf
Subjects:
Kind Name Namespace
---- ---- ---------
ServiceAccount maprkdf mapr-system

4. Create MapR KDF plugin

We need to modify the "kdf-plugin-centos.yaml" to change the IP:PORT of the kubernetes API server.
You can find the IP:PORT of the kubernetes API server by running below command:
$ kubectl config view|grep -i server
server: https://xx.xx.xx.41:6443
Then change below part of the "kdf-plugin-centos.yaml":
        - name : KUBERNETES_SERVICE_LOCATION
          value: "changeme!:6443"
To:
        - name : KUBERNETES_SERVICE_LOCATION
          value: "xx.xx.xx.41:6443"
And then create the MapR KDF plugin:
kubectl create -f kdf-plugin-centos.yaml
In this YAML file, basically it creates a DaemonSet named "mapr-kdfplugin" running a container named "mapr-kdfplugin" on all nodes of the Kubernetes Cluster.
A DaemonSet ensures that all (or some) Nodes run a copy of a Pod.
Confirm:
$ kubectl get pods -n mapr-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
mapr-kdfplugin-hbv4f 1/1 Running 0 93s 10.244.0.10 v1.poc.com <none>
mapr-kdfplugin-hkwll 1/1 Running 0 93s 10.244.3.8 v4.poc.com <none>
mapr-kdfplugin-lvg2z 1/1 Running 0 93s 10.244.1.6 v2.poc.com <none>
mapr-kdfplugin-zkqbt 1/1 Running 0 93s 10.244.2.7 v3.poc.com <none>

5. Create provisioner

kubectl create -f kdf-provisioner.yaml
In this YAML file, it creates a Deployment named "mapr-kdfprovisioner" in namespace "mapr-system".
This Deployment will run a container named "mapr-kdfprovisioner".
Confirm:
$ kubectl get pods -n mapr-system -o wide|grep -i provisioner
mapr-kdfprovisioner-c49954679-vgbjx 1/1 Running 0 70s 10.244.3.9 v4.poc.com <none>

The next steps are to follow the documentation on how to use KDF:
https://mapr.com/docs/home/PersistentStorage/kdf_configuration.html

How to write an example MapR Drill JDBC code which connects to a MapR-Drill Cluster with MapRSASL authentication


Goal:

How to write an example MapR Drill JDBC code which connects to a MapR-Drill Cluster with MapRSASL authentication.
The reason why we are using Simba Drill JDBC Driver instead of open source JDBC Driver is:
The open-source JDBC driver is not tested on the MapR Converged Data Platform. The driver supports Kerberos and Plain authentication mechanisms, but does not support the MapR-SASL authentication mechanism. 

Env:

Drill 1.14 using MapRSASL authentication
MapR 6.1 secure Cluster
Simba Drill JDBC 1.6.0

Solution:

1. Download example JAVA code

https://github.com/viadea/MapRDrillJDBCExample
git clone git@github.com:viadea/MapRDrillJDBCExample.git

2. Compile

mvn clean package
The compiled jar is here:
./target/MapRDrillJDBCExample-1.0.jar

3. Copy the jar dependencies and remove possible jar conflict

mkdir drilljars
cp /opt/mapr/drill/drill-1.14.0/jars/*.jar ./drilljars/
cp /opt/mapr/drill/drill-1.14.0/jars/3rdparty/*.jar ./drilljars/
cp /opt/mapr/drill/drill-1.14.0/jars/classb/reflections-0.9.10.jar ./drilljars/
cp /opt/mapr/drill/drill-1.14.0/jars/classb/javax.servlet-api-3.1.0.jar ./drilljars/
rm ./drilljars/drill-jdbc-1.14.0-mapr-SNAPSHOT.jar

mkdir maprjars
cp /opt/mapr/lib/*.jar ./maprjars/
rm ./maprjars/slf4j-log4j12-1.7.12.jar

4. Download Simba Drill JDBC Driver

Please always follow below documentation for the correct version of the driver:
https://mapr.com/docs/home/Drill/drill_jdbc_connector.html
wget https://package.mapr.com/tools/MapR-JDBC/MapR_Drill/MapRDrill_jdbc_v1.6.0.1001/MapRDrillJDBC-1.6.0.1001.zip
unzip MapRDrillJDBC-1.6.0.1001.zip
cd MapRDrillJDBC41-1.6.0.1001
unzip MapRDrillJDBC41-1.6.0.1001.zip

5. Run the program

java -cp ./MapRDrillJDBC41-1.6.0.1001/DrillJDBC41.jar:./drilljars/*:./target/MapRDrillJDBCExample-1.0.jar:./maprjars/* openkb.drill.MapRDrillJDBCExample
This program will run "select hostname from sys.drillbits" and display the results. For example:
hostname: v1.poc.com
hostname: v4.poc.com
hostname: v3.poc.com
hostname: v2.poc.com


How to configure Drill to use cgroups to hard limit CPU resource in Redhat/CentOS 7


Goal:

This article explains how to configure Drill to use cgroups to hard limit CPU resource in Redhat/CentOS 7.

Env:

Drill 1.14
MapR 6.1
CentOS 7.4 with linux kernel 3.10.0

Solution:

As per the current Drill documentation on configuring cgroups, the documented steps use cgroups with libcgroup.
However, libcgroup is deprecated in RedHat/CentOS 7.
This article shares some key steps to configure Drill to use cgroups with systemd in RedHat/CentOS 7.
One major difference is you do NOT need to install libcgroup.

1. Confirm cgroups are mounted by default.

# mount -v |grep -i cgroup
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)

2. Understand the logic how Drill configures cgroups when drillbit starts.

This code logic exists starting from Drill 1.14, and it is inside bin/drillbit.sh:
SYS_CGROUP_DIR=${SYS_CGROUP_DIR:-"/sys/fs/cgroup"}
if [ -f $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs ]; then
  echo $dbitPid > $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs
Inside drill-env.sh, there are 2 environment variables:
  • SYS_CGROUP_DIR -- by default "/sys/fs/cgroup"
  • DRILLBIT_CGROUP -- by default "drillcpu"
What drillbit does when starting is: Put the pid of drillbit into the file "$SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs" which is by default "/sys/fs/cgroup/cpu/drillcpu/cgroup.procs".
That is it.

After understanding what drillbit does when starting, we can work out what to do next.

3. Uncomment the cgroups-related environment variables inside drill-env.sh

export DRILL_PID_DIR=${DRILL_PID_DIR:-$DRILL_HOME}
export SYS_CGROUP_DIR=${SYS_CGROUP_DIR:-"/sys/fs/cgroup"}
export DRILLBIT_CGROUP=${DRILLBIT_CGROUP:-"drillcpu"}
Of course, you can change the directory of cgroup or the cgroup name as you like.

4. Create the cgroup directory based on $DRILLBIT_CGROUP

mkdir -p /sys/fs/cgroup/cpu/drillcpu
echo 100000 > /sys/fs/cgroup/cpu/drillcpu/cpu.cfs_quota_us
echo 100000 > /sys/fs/cgroup/cpu/drillcpu/cpu.cfs_period_us
Here I am hard limiting the CPU resource to only 1 CPU core using the 2 parameters:
  • cpu.cfs_period_us
    The cpu.cfs_period_us parameter specifies a segment of time (in microseconds, "us" for µs) for how often access to CPU resources should be reallocated.
  • cpu.cfs_quota_us
    The cpu.cfs_quota_us parameter specifies the total amount of runtime (in microseconds, "us" for µs) for which all tasks in the Drill cgroup can run during one period (as defined by cpu.cfs_period_us). As soon as tasks in the Drill cgroup use up the time specified by the quota, they are throttled for the remainder of the period and not allowed to run until the next period.
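In other words, the number of CPU cores allowed is roughly cpu.cfs_quota_us divided by cpu.cfs_period_us. For example, a sketch that hard limits drillbit to 2 CPU cores using the same cgroup path as above:
# quota / period = 200000 / 100000 = 2 CPU cores
echo 200000 > /sys/fs/cgroup/cpu/drillcpu/cpu.cfs_quota_us
echo 100000 > /sys/fs/cgroup/cpu/drillcpu/cpu.cfs_period_us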
If you want to soft limit instead of hard limit CPU resources, you can use the parameter "cpu.shares" instead. Of course, you may also use any other cgroups parameters.
For the definitions of these parameters, please check the Linux kernel documentation.

5. Change the ownership/permission of this cgroup directory

As per our understanding from #3, drillbit will put its pid into file "/sys/fs/cgroup/cpu/drillcpu/cgroup.procs". That is why the user("mapr" by default) who starts drillbit should have the permission to write to that file.
chown -R mapr:mapr /sys/fs/cgroup/cpu/drillcpu

6. Restart drillbit

After restarting drillbit, please double check that the pid of each drillbit is put into "/sys/fs/cgroup/cpu/drillcpu/cgroup.procs".
# cat /sys/fs/cgroup/cpu/drillcpu/cgroup.procs
29589
# jps -m|grep -i drillbit
29589 Drillbit
This means the drillbit process will be controlled by this cgroup.

7. Test

Run some complex query and check the "top -p <pid of drillbit>" to confirm that only 1 CPU core can be used.
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
29589 mapr 20 0 4215740 1.139g 53400 S 99.0 7.3 1:29.58
Then reduce parameter "cpu.cfs_quota_us" from current 100000 to 50000.
echo 50000 > /sys/fs/cgroup/cpu/drillcpu/cpu.cfs_quota_us
Run the complex query again, and you will find that only 0.5 CPU core can be used by drillbit now.
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
29589 mapr 20 0 4167360 1.111g 53512 S 49.8 7.2


gcloud cheat sheet

This article records the common gcloud commands.
Please refer to the gcloud doc.

1. Quick Start

1.1 Install google cloud SDK on Mac

Follow https://cloud.google.com/sdk/docs/quickstart-macos 

1.2 Initialize SDK

gcloud init --console-only

1.3 List accounts with credentials

$ gcloud auth list
Credentialed Accounts
ACTIVE ACCOUNT
* xxx@xxx.com

To set the active account, run:
$ gcloud config set account `ACCOUNT`

1.4 View information of SDK installation and configuration

gcloud info

1.5 Help manual for gcloud

# Help for command "gcloud compute instances create":
gcloud help compute instances create

2. Configurations and Properties

2.1 List all configurations

$ gcloud config configurations list
NAME IS_ACTIVE ACCOUNT PROJECT DEFAULT_ZONE DEFAULT_REGION
default True xxx@xxx.com myprojectname us-west1-a

2.2 Create a new configuration named "myconfig"

$ gcloud config configurations create myconfig
Created [myconfig].
Activated [myconfig].

2.3 Activate configuration named "myconfig"

$ gcloud config configurations activate myconfig
Activated [myconfig].

2.4 Switch configuration for a single command, just use --configuration flag.

gcloud auth list --configuration=[CONFIGURATION_NAME]
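For example, using the "myconfig" configuration created in step 2.2:
gcloud auth list --configuration=myconfig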

2.5 Delete a configuration named "myconfig"

$ gcloud config configurations delete myconfig
The following configurations will be deleted:
- myconfig
Do you want to continue (Y/n)? Y

Deleted [myconfig].
Note: you can not delete an active configuration.

2.6 View properties in the active configuration

$ gcloud config list
[core]
account = xxx@xxx.com
disable_usage_reporting = True
project = myprojectname

Your active configuration is: [default]

2.7 View ALL properties in the active configuration

gcloud config list --all

2.8 View properties in some other configuration

gcloud config configurations describe [CONFIGURATION_NAME]
Note: the properties are documented here.

2.9 Change default project

gcloud config set project [PROJECT_ID]

2.10 Change default compute zone

gcloud compute zones list
gcloud config set compute/zone us-west1-a
Note: here "compute" is the section name while "zone" is the property name.

2.11 Unset a property

gcloud config unset disable_usage_reporting

2.12 List all available properties

gcloud topic configurations

3 Components

3.1 List all components

gcloud components list

3.2 Install a component

gcloud components install [COMPONENT-ID]

3.3 Update all installed components to latest version

gcloud components update
Or to revert SDK to the previously installed version, say 228.0.0:
gcloud components update --version 228.0.0

3.4 Remove a component

gcloud components remove [COMPONENT-ID]

4. Account

4.1 List accounts whose credentials are stored on the local system

gcloud auth list

4.2 Switch the active account

gcloud config set account [ACCOUNT]

4.3 Revoke credentials for the account

gcloud auth revoke [ACCOUNT]

5. Compute Engine

5.1 List all VM instances

gcloud compute instances list
Or to list the instances with some pattern matched:
gcloud compute instances list --filter="name~'my-.*'"

 5.2 Create a VM instance

gcloud compute instances create my-instance

5.3 Show information about one VM instance

gcloud compute instances describe my-instance --zone us-central1-a

5.4 ssh to a VM instance

gcloud compute ssh my-instance --zone us-central1-a

5.5 scp a file to a VM instance

gcloud compute scp ~/file-1 my-instance:~/remote-destination --zone us-central1-a

5.6 scp a file from a VM instance

gcloud compute scp my-instance:~/file-1 ~/local-destination --zone us-central1-a

5.7 Generate a ssh configuration for ssh/scp to use directly

gcloud compute config-ssh
ssh my-instance.us-central1-a.myproject
Note, the ssh configuration is stored here : ~/.ssh/config

5.8 Add or Remove instance metadata 

gcloud compute instances add-metadata my-instance \
--zone us-central1-a \
--metadata role=worker

gcloud compute instances remove-metadata my-instance \
--zone us-central1-a \
--keys role

5.9 View, Add or Remove project level metadata

gcloud compute project-info describe

gcloud compute project-info add-metadata \
--metadata-from-file startup-script=/local/path/to/script \
--metadata startup-id=1234

gcloud compute project-info remove-metadata --keys startup-script startup-id

5.10 Delete a VM instance

gcloud compute instances delete my-instance --zone us-central1-a

5.11 List operations

gcloud compute operations list
gcloud compute operations list --zones us-central1-a,us-central1-b

6. Container

6.1 Create a GKE cluster

gcloud container clusters create [CLUSTER_NAME]
It will create a 3-node (Compute Engine) cluster by default.
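To override the defaults, the flags below can be used; this is a hedged example and the cluster name, node count and zone are only illustrative:
# 1 node instead of the default 3, in an explicit zone
gcloud container clusters create my-cluster --num-nodes=1 --zone us-central1-a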

6.2 Delete a GKE cluster

gcloud container clusters delete [CLUSTER_NAME]


"msck repair" of a partition table fails with error "Expecting only one partition but more than one partitions are found.".


Symptom:

In spark-shell, "msck repair" of a partition table named "database_name.table_name" fails with error "Expecting only one partition but more than one partitions are found.".

Env:

Hive 2.1(with MySQL as the backend database for Hive Metastore)
Spark 2.2.1

Troubleshooting:

The source code for this error message is :
metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
query =
    pm.newQuery(MPartition.class,
        "table.tableName == t1 && table.database.name == t2 && partitionName == t3");
query.declareParameters("java.lang.String t1, java.lang.String t2, java.lang.String t3");
mparts = (List<MPartition>) query.execute(tableName, dbName, name);
pm.retrieveAll(mparts);
commited = commitTransaction();
// We need to compare partition name with requested name since some DBs
// (like MySQL, Derby) considers 'a' = 'a ' whereas others like (Postgres,
// Oracle) doesn't exhibit this problem.
if (mparts != null && mparts.size() > 0) {
  if (mparts.size() > 1) {
    throw new MetaException(
        "Expecting only one partition but more than one partitions are found.");
  }

Basically it will generate a query in MySQL(Hive Metastore backend database) to check if there are any duplicate entries based on Table Name, Database Name and Partition Name.
Then we can run below query in MySQL to find out the duplicate entries from PARTITIONS table for that specific Hive partition table -- database_name.table_name:
select p.PART_NAME,count(*)
from TBLS t, PARTITIONS p, DBS d
where d.DB_ID=t.DB_ID
and p.TBL_ID=t.TBL_ID
and t.TBL_NAME='table_name' and d.NAME='database_name'
group by p.PART_NAME
having count(*)>1;

If above query returns one partition name as "key=0", then it means there are 2 or more entries inside PARTITIONS table for that specific partition "key=0".
Then we need to list all the data for that partition:
select p.*
from TBLS t, PARTITIONS p, DBS d
where d.DB_ID=t.DB_ID
and p.TBL_ID=t.TBL_ID
and t.TBL_NAME='table_name' and d.NAME='database_name'
and p.PART_NAME='key=0';
For example, if above query returns two entries as below:
+---------+-------------+------------------+-----------+-------+--------+
| PART_ID | CREATE_TIME | LAST_ACCESS_TIME | PART_NAME | SD_ID | TBL_ID |
+---------+-------------+------------------+-----------+-------+--------+
| 255 | 1536697318 | 0 | key=0 | 262 | 7 |
| 256 | 1536697319 | 0 | key=0 | 263 | 7 |
+---------+-------------+------------------+-----------+-------+--------+
In all, the above results show that there is already an inconsistency, namely duplicate entries, in the Hive Metastore backend database.

Root Cause:

For this specific case, we found that there were no constraints created for those Hive Metastore backend MySQL tables.
Take "PARTITIONS" table for example, we can run "show create table PARTITIONS;" in MySQL and check if below 2 FOREIGN KEY CONSTRAINT exists or not:
  CONSTRAINT `PARTITIONS_FK1` FOREIGN KEY (`TBL_ID`) REFERENCES `TBLS` (`TBL_ID`),
CONSTRAINT `PARTITIONS_FK2` FOREIGN KEY (`SD_ID`) REFERENCES `SDS` (`SD_ID`)

Solution:

1. Short term fix

The safest way to clean the duplicate entries is to use Hive command instead of manually deleting entries in MySQL.
We just need to drop and re-create that problematic Hive partition, and after that, those duplicate entries are cleaned.
For example:
ALTER TABLE database_name.table_name DROP PARTITION(key=0);
After that, double confirm all the duplicate entries are gone in MySQL:
select p.PART_NAME,count(*)
from TBLS t, PARTITIONS p, DBS d
where d.DB_ID=t.DB_ID
and p.TBL_ID=t.TBL_ID
and t.TBL_NAME='table_name' and d.NAME='database_name'
group by p.PART_NAME
having count(*)>1;

2. Long term fix

Since this issue is due to wrong DDLs in the Hive Metastore, we would suggest backing up the MySQL database and re-creating the Hive Metastore backend database from scratch using the Hive schema tool.
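A hedged sketch of the schema tool invocation; the exact path to schematool depends on your Hive installation:
# Initialize a fresh Metastore schema in MySQL (after backing up and re-creating the database)
/opt/mapr/hive/hive-2.1/bin/schematool -dbType mysql -initSchema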


How to use snapshot feature of MapR-FS to isolate Hive reads and writes


Goal:

As we know, Hive locks can be used to isolate reads and writes.
Please refer to my previous blogs for details:
http://www.openkb.info/2014/11/hive-locks-tablepartition-level.html
http://www.openkb.info/2018/07/hive-different-lock-behaviors-between.html

That lock mechanism is maintained by Hive and it could have performance overhead or unexpected lock behavior based on your application logic.
This article shows another way to use snapshot feature of MapR-FS to isolate Hive reads and writes.

Env:

MapR 6.1
Hive 2.3

Solution:

The idea of this solution is straightforward and simple:
  • When Hive writes such as data loading finishes, take the MapR volume snapshot.
  • Create an external Hive table based on that snapshot Hive data as a read-only copy for Hive reads.
  • In the meantime, the Hive writes can still happen on the original Hive table.
The Hive application logic should handle the life cycle of the volume snapshot and Hive external table.
Below is a simple example:
1. Create a MapR Volume named "testhive" mounted at /testhive
maprcli volume create -name testhive -path /testhive
2. Create a Hive table in that volume
CREATE TABLE `testsnapshot`(
`id` int
)
LOCATION
'maprfs:/testhive/testsnapshot';
And Load 500 rows into this table:
insert into testsnapshot select 1 from src;

select count(*) from testsnapshot;
500
3. Create a snapshot on the volume when no writes are happening
maprcli volume snapshot create -snapshotname mysnapshot -volume testhive
The snapshot data for that Hive table named "testsnapshot" is:
$ hadoop fs -ls /testhive/.snapshot/mysnapshot/testsnapshot
Found 1 items
-rwxr-xr-x 3 mapr mapr 1000 2019-03-11 10:34 /testhive/.snapshot/mysnapshot/testsnapshot/000000_0
4. Create an external table on that snapshot data
CREATE EXTERNAL TABLE `testsnapshot_ext`(
`id` int
)
LOCATION
'maprfs:/testhive/.snapshot/mysnapshot/testsnapshot';
5. In the meantime, write into the original Hive table
insert into testsnapshot select 1 from src;
select count(*) from testsnapshot;
1000

select count(*) from testsnapshot_ext;
500

As you can see, the Hive external table "testsnapshot_ext" based on the snapshot data can be a read-only copy for Hive reads.
The Hive writes can still happen in the meantime on the original Hive table "testsnapshot".
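When the read-only copy is no longer needed, the application logic can clean up both objects. A hedged sketch, reusing the object names from the example above:
# Drop the read-only external table (the snapshot data itself is not touched)
hive -e "DROP TABLE testsnapshot_ext;"
# Remove the volume snapshot
maprcli volume snapshot remove -snapshotname mysnapshot -volume testhive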

This use case works well if you plan to load data into the Hive table periodically and want to avoid inconsistent Hive reads, without relying on the Hive locks feature.

How to customize FileOutputCommitter for MapReduce job by overwriting Output Format Class


Goal:

This article explains how to customize FileOutputCommitter for MapReduce job by overwriting Output Format Class.
This can be used to change the output directory, customize the file name etc.

Env:

MapR 6.1
Hadoop 2.7.0

Solution:

Here is sample code created by modifying the WordCount sample MapReduce job.
In this example, we simply change the job output directory to add a sub-directory named "mysubdir".
This sample code only has 2 Java files:
  • WordCount.java -- Job driver class.
  • myOutputFormat.java -- Output Format Class defined by us
1. WordCount.java
Most of the code is the same as a sample WordCount job, and we only overwrite the Output Format Class:
job.setOutputFormatClass(myOutputFormat.class);

2. myOutputFormat.java
We customized the method "getOutputCommitter" as below:
public synchronized OutputCommitter getOutputCommitter(TaskAttemptContext context)
    throws IOException
{
  if (this.myCommitter == null)
  {
    Path output = new Path(getOutputDir(context));
    this.myCommitter = new FileOutputCommitter(output, context);
  }
  return this.myCommitter;
}

protected static String getOutputDir(TaskAttemptContext context)
{
  int taskID = context.getTaskAttemptID().getTaskID().getId();
  String taskType = context.getTaskAttemptID().getTaskID().getTaskType().toString();
  System.err.println("MyDebug: taskattempt id is: " + taskID + " and tasktype is: " + taskType);
  String outputBaseDir = getOutputPath(context).toString() + "/mysubdir";
  return outputBaseDir;
}

This is a simple demo of changing the job output directory.
You can also customize the file names or compression types by overriding the getRecordWriter method.
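To build and run the modified job, something like the following works from a MapR client node. The jar name, the driver class name "WordCount", and the input directory /hao/wordinput are assumptions for illustration; /hao/wordfinal matches the output shown below.
# Compile the two classes against the Hadoop classpath and package them (names/paths are assumptions)
mkdir -p classes
javac -cp "$(hadoop classpath)" -d classes WordCount.java myOutputFormat.java
jar cf wordcount-custom.jar -C classes .

# Submit the job (input/output paths are placeholders)
hadoop jar wordcount-custom.jar WordCount /hao/wordinput /hao/wordfinal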

After running this MapReduce job, the output files will be put in:
# hadoop fs -ls -R /hao/wordfinal
drwxr-xr-x - mapr mapr 4 2019-04-23 13:29 /hao/wordfinal/mysubdir
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 13:29 /hao/wordfinal/mysubdir/_SUCCESS
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 13:29 /hao/wordfinal/mysubdir/part-r-00000
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 13:29 /hao/wordfinal/mysubdir/part-r-00001
-rwxr-xr-x 3 mapr mapr 51 2019-04-23 13:29 /hao/wordfinal/mysubdir/part-r-00002


What is the difference between mapreduce.fileoutputcommitter.algorithm.version=1 and 2


Goal:

This article explains the difference between mapreduce.fileoutputcommitter.algorithm.version=1 and 2 using a sample wordcount job.

Env:

MapR 6.1
Hadoop 2.7

Solution:

The definition of mapreduce.fileoutputcommitter.algorithm.version is documented in Hadoop's mapred-default.xml:
"The file output committer algorithm version. Valid algorithm version number: 1 or 2. Default to 1, which is the original algorithm."
In algorithm version 1:

1. commitTask will rename directory
$joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/ to
$joboutput/_temporary/$appAttemptID/$taskID/

2. recoverTask will also do a rename
$joboutput/_temporary/$appAttemptID/$taskID/ to
$joboutput/_temporary/($appAttemptID + 1)/$taskID/

3. commitJob will merge every task output file in
$joboutput/_temporary/$appAttemptID/$taskID/ to
$joboutput/,
then it will delete $joboutput/_temporary/ and write $joboutput/_SUCCESS

It has a performance regression, which is discussed in MAPREDUCE-4815:
if a job generates many files to commit, the commitJob method call at the end of the job can take minutes.
The commit is single-threaded and waits until all tasks have completed before commencing.

Algorithm version 2 changes the behavior of commitTask, recoverTask, and commitJob:

1. commitTask will rename all files in
$joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/ to
$joboutput/

2. recoverTask actually doesn't need to do anything, but for the upgrade from version 1 to version 2 case, it will check if there are any files in $joboutput/_temporary/($appAttemptID - 1)/$taskID/ and rename them to
$joboutput/

3. commitJob can simply delete $joboutput/_temporary and write $joboutput/_SUCCESS
This algorithm reduces the output commit time for large jobs by having the tasks commit directly to the final output directory as they complete, so commitJob has very little to do.

This parameter was introduced by MAPREDUCE-4815.
The reason algorithm.version=2 was introduced is to speed up the "commitJob" method inside FileOutputCommitter.java.

In short, the major difference is who does the mergePaths() work -- the Reducers or the ApplicationMaster (AM).
Now let's use a sample wordcount job to explain the details.

1. mapreduce.fileoutputcommitter.algorithm.version=1

The AM does mergePaths() at the end, after all reducers complete.
If the MR job has many reducers, the AM first waits for all reducers to finish and then uses a single thread to merge the output files.
So this algorithm has performance concerns for large jobs.

Take the above wordcount job for example: the job output directory is set to /hao/wordfinal/mysubdir.
First, the reducers (3 of them here) write their output into a temporary directory:
/hao/wordfinal/mysubdir/_temporary/1/_temporary/attempt_1554837836642_0136_r_000000_0/part-r-00000
/hao/wordfinal/mysubdir/_temporary/1/_temporary/attempt_1554837836642_0136_r_000001_0/part-r-00001
/hao/wordfinal/mysubdir/_temporary/1/_temporary/attempt_1554837836642_0136_r_000002_0/part-r-00002
This can be confirmed from the reducers' container logs:
# grep -R "Saved output of task" *
container_e07_1554837836642_0136_01_000003/syslog:2019-04-23 15:44:10,158 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1554837836642_0136_r_000000_0' to maprfs:/hao/wordfinal/mysubdir/_temporary/1/task_1554837836642_0136_r_000000
container_e07_1554837836642_0136_01_000004/syslog:2019-04-23 15:44:15,161 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1554837836642_0136_r_000001_0' to maprfs:/hao/wordfinal/mysubdir/_temporary/1/task_1554837836642_0136_r_000001
container_e07_1554837836642_0136_01_000005/syslog:2019-04-23 15:44:20,177 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1554837836642_0136_r_000002_0' to maprfs:/hao/wordfinal/mysubdir/_temporary/1/task_1554837836642_0136_r_000002

After all reducers complete, the AM does mergePaths() to move those files to the final output directory:
$ hadoop fs -ls -R /hao/wordfinal
drwxr-xr-x - mapr mapr 4 2019-04-23 15:44 /hao/wordfinal/mysubdir
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 15:44 /hao/wordfinal/mysubdir/_SUCCESS
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 15:44 /hao/wordfinal/mysubdir/part-r-00000
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 15:44 /hao/wordfinal/mysubdir/part-r-00001
-rwxr-xr-x 3 mapr mapr 51 2019-04-23 15:44 /hao/wordfinal/mysubdir/part-r-00002
This can be confirmed from the AM's DEBUG log:
# grep " Merging data from" syslog
2019-04-23 15:44:20,263 DEBUG [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:///hao/wordfinal/mysubdir
2019-04-23 15:44:20,265 DEBUG [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; length=0; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:/hao/wordfinal/mysubdir/part-r-00002
2019-04-23 15:44:20,268 DEBUG [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:///hao/wordfinal/mysubdir
2019-04-23 15:44:20,270 DEBUG [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; length=0; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:/hao/wordfinal/mysubdir/part-r-00001
2019-04-23 15:44:20,272 DEBUG [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:///hao/wordfinal/mysubdir
2019-04-23 15:44:20,274 DEBUG [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; length=0; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:/hao/wordfinal/mysubdir/part-r-00000

2. mapreduce.fileoutputcommitter.algorithm.version=2

Each reducer does mergePaths() to move its own output files into the final output directory, concurrently.
So this algorithm saves the AM a lot of time when the job is committing.

Take the above wordcount job for example: the job output directory is set to /hao/wordfinal/mysubdir.
First, the reducers (3 of them here) write their output into a temporary directory.
This is the same as in #1, nothing different.
Then the reducers themselves do mergePaths() to move those files to the final output directory.

This can be confirmed from the DEBUG log of one reducer:
# grep hao *
syslog:2019-04-23 15:57:23,101 DEBUG [main] com.mapr.fs.jni.MapRClient: Create: /hao/wordfinal/mysubdir/_temporary/1/_temporary/attempt_1554837836642_0137_r_000002_0/part-r-00002 mode = 493 replication = 3 chunkSize = default overwrite = false
syslog:2019-04-23 15:57:23,103 DEBUG [main] com.mapr.fs.Inode: >Inode GetAttr: file: /hao/wordfinal/mysubdir/_temporary/1/_temporary/attempt_1554837836642_0137_r_000002_0/part-r-00002, size: 0, chunksize: 268435456, fid: 2049.112100.1092966
syslog:2019-04-23 15:57:23,188 DEBUG [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:///hao/wordfinal/mysubdir
syslog:2019-04-23 15:57:23,192 DEBUG [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; length=0; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:/hao/wordfinal/mysubdir/part-r-00002
syslog:2019-04-23 15:57:23,195 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1554837836642_0137_r_000002_0' to maprfs:///hao/wordfinal/mysubdir

So for a normal MR job, the major difference between mapreduce.fileoutputcommitter.algorithm.version=1 and 2 is whether the AM or the reducers do the mergePaths().

Note: the mergePaths() method is defined inside:
src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java
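To compare the two algorithms yourself, the property can be set per job on the command line. A sketch using the stock Hadoop wordcount example is below; the examples-jar path and the input/output directories are assumptions for your environment.
# Run the same wordcount job once with each committer algorithm (jar path and I/O dirs are assumptions)
EXAMPLES_JAR=/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0-mapr-1808.jar

hadoop jar "$EXAMPLES_JAR" wordcount \
  -Dmapreduce.fileoutputcommitter.algorithm.version=1 /hao/wordinput /hao/out_v1

hadoop jar "$EXAMPLES_JAR" wordcount \
  -Dmapreduce.fileoutputcommitter.algorithm.version=2 /hao/wordinput /hao/out_v2

# With version 1 the merge activity shows up in the AM log; with version 2 it shows up in the reducer logs.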

Some reducer output files are not moved into final output directory after a MapReduce Job with customized FileOutputCommitter is migrated from MRv1 to MRv2.


Symptom:

After a MapReduce job with a customized FileOutputCommitter is migrated from MRv1 to MRv2, some reducer output files are not moved into the final output directory.

Take a sample WordCount job with a customized FileOutputCommitter for example.
In this specific case, only the 1st reducer's output files are moved into the final output directory.
All other reducers' output files are still sitting in the temporary location:
/hao/wordfinal/part_00000/part-r-00000
/hao/wordfinal/part_00001/_temporary/1/task_1554837836642_0091_r_000001/part-r-00001
/hao/wordfinal/part_00002/_temporary/1/task_1554837836642_0091_r_000001/part-r-00002
The expected final output files should be:
/hao/wordfinal/part_00000/part-r-00000
/hao/wordfinal/part_00001/part-r-00001
/hao/wordfinal/part_00002/part-r-00002
Note: The same job worked fine in MRv1 before being migrated to MRv2.

Env:

The MapReduce job is migrated from:
MapR 5.x with Hadoop 0.20.2 (MRv1)
to
MapR 6.1 with Hadoop 2.7.0 (MRv2)

Root Cause:

This WordCount job has a customized FileOutputCommitter, set by overriding the Output Format class:
job.setOutputFormatClass(myOutputFormat.class);
Inside this "myOutputFormat.java",  the logic of the customized FileOutputCommitter is to set different output directory for different reducers:
  public synchronized OutputCommitter getOutputCommitter(TaskAttemptContext context)
      throws IOException
  {
    if (this.myCommitter == null)
    {
      Path output = new Path(getOutputDir(context));
      this.myCommitter = new FileOutputCommitter(output, context);
    }
    return this.myCommitter;
  }

  protected static String getOutputDir(TaskAttemptContext context)
  {
    int taskID = context.getTaskAttemptID().getTaskID().getId();
    String outputBaseDir = getOutputPath(context) + "/part_" + NUMBER_FORMAT.format(taskID);
    // String outputBaseDir = getOutputPath(context) + "/part_static";
    return outputBaseDir;
  }

In MRv1's world:
This job works as expected.
$ hadoop fs -ls -R /hao/wordfinal
drwxr-xr-x - mapr mapr 1 2019-04-24 14:15 /hao/wordfinal/_logs
drwxr-xr-x - mapr mapr 1 2019-04-24 14:15 /hao/wordfinal/_logs/history
-rwxr-xr-x 3 mapr mapr 18461 2019-04-24 14:15 /hao/wordfinal/_logs/history/s1.poc.com_1556138521585_job_201904241341_0004_mapr_wordcount.jar
drwxr-xr-x - mapr mapr 2 2019-04-24 14:15 /hao/wordfinal/part_00000
drwxr-xr-x - mapr mapr 0 2019-04-24 14:15 /hao/wordfinal/part_00000/_temporary
-rwxr-xr-x 3 mapr mapr 26 2019-04-24 14:15 /hao/wordfinal/part_00000/part-r-00000
drwxr-xr-x - mapr mapr 2 2019-04-24 14:15 /hao/wordfinal/part_00001
-rwxr-xr-x 3 mapr mapr 0 2019-04-24 14:15 /hao/wordfinal/part_00001/_SUCCESS
-rwxr-xr-x 3 mapr mapr 18 2019-04-24 14:15 /hao/wordfinal/part_00001/part-r-00001
drwxr-xr-x - mapr mapr 2 2019-04-24 14:15 /hao/wordfinal/part_00002
drwxr-xr-x - mapr mapr 0 2019-04-24 14:15 /hao/wordfinal/part_00002/_temporary
-rwxr-xr-x 3 mapr mapr 8 2019-04-24 14:15 /hao/wordfinal/part_00002/part-r-00002
This is because all 3 reducers' outputs are actually moved by the reducers themselves, as shown in the reducer logs:
# grep -R hao *
attempt_201904241341_0004_r_000000_0/syslog:2019-04-24 14:15:13,124 INFO output.FileOutputCommitter [main]: Saved output of task 'attempt_201904241341_0004_r_000000_0' to /hao/wordfinal/part_00000
attempt_201904241341_0004_r_000000_0/syslog:2019-04-24 14:15:18,257 INFO output.FileOutputCommitter [main]: Saved output of task 'attempt_201904241341_0004_r_000002_0' to /hao/wordfinal/part_00002
attempt_201904241341_0004_r_000001_0/syslog:2019-04-24 14:15:16,376 INFO output.FileOutputCommitter [main]: Saved output of task 'attempt_201904241341_0004_r_000001_0' to /hao/wordfinal/part_00001

In MRv2's world:
"The functionality of JobTracker in 1.x i.e resource management and job scheduling/monitoring are divided into separate daemons. - global ResourceManager (RM) and per-application ApplicationMaster (AM)."

So basically, not just the reducers but also the AM will call some methods (such as mergePaths() in this case) inside FileOutputCommitter.
As described in my previous article <What is the difference between mapreduce.fileoutputcommitter.algorithm.version=1 and 2>:
When mapreduce.fileoutputcommitter.algorithm.version=1 (the default value):
The AM does mergePaths() at the end, after all reducers complete.
If the MR job has many reducers, the AM first waits for all reducers to finish and then uses a single thread to merge the output files.

Here is where the issue lies.
In the customized code above, the output directory for each reducer is defined as "part_$taskID".
So the developer expects the 3 reducers to write to 3 different sub-directories:
part_00000
part_00001
part_00002
All 3 reducers first write their output into a temporary location as below:
part_00000/_temporary/1/task_1554837836642_0091_r_000001/part-r-00000
part_00001/_temporary/1/task_1554837836642_0091_r_000001/part-r-00001
part_00002/_temporary/1/task_1554837836642_0091_r_000001/part-r-00002
After all reducers complete, the AM calls mergePaths() inside FileOutputCommitter to move the above files into the final output directory.
However, the AM's task attempt ID is also 0, so the AM thinks the output directory should be part_00000.
That is why only the output file under part_00000 is moved by the AM into the final output directory.
After this job completes, the problematic layout is:
part_00000/part-r-00000
part_00001/_temporary/1/task_1554837836642_0091_r_000001/part-r-00001
part_00002/_temporary/1/task_1554837836642_0091_r_000001/part-r-00002

Troubleshooting:

When this kind of issue happens in MRv2, we need to simplify the original job down to a minimal reproduction, like the WordCount job in this case.
To learn "who is doing what", we can enable DEBUG logging for the Mappers, Reducers and AM:
-Dyarn.app.mapreduce.am.log.level=DEBUG -Dmapreduce.map.log.level=DEBUG -Dmapreduce.reduce.log.level=DEBUG
From the AM DEBUG log, we will know which reducers' temporary output is moved by the AM, and which reducers' output is not.
Eg:
2019-04-18 11:36:28,234 DEBUG [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Merging data from MapRFileStatus{path=null; isDirectory=false; length=0; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to maprfs:/hao/wordfinal/part-r-00000
From the above DEBUG log, we know only the 1st reducer's output is merged by the AM.
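For completeness, here is a sketch of submitting the job with those DEBUG options and collecting the AM/container logs afterwards. The jar, driver class, input path, and application ID are placeholders, and the driver is assumed to parse generic -D options via GenericOptionsParser, as the sample WordCount does.
# Submit the job with DEBUG logging for the AM, mappers and reducers (jar/class/paths are placeholders)
hadoop jar wordcount-custom.jar WordCount \
  -Dyarn.app.mapreduce.am.log.level=DEBUG \
  -Dmapreduce.map.log.level=DEBUG \
  -Dmapreduce.reduce.log.level=DEBUG \
  /hao/wordinput /hao/wordfinal

# Pull all container logs (AM log included) and look for the merge activity
yarn logs -applicationId application_1554837836642_0091 | grep "Merging data from"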

Solution:

In MRv2's world, the parameter "mapreduce.fileoutputcommitter.algorithm.version" was introduced per MAPREDUCE-4815.
Setting "mapreduce.fileoutputcommitter.algorithm.version=2" in MRv2 achieves the same behavior this specific job had in MRv1, which is:
the reducers call mergePaths() inside FileOutputCommitter instead of the AM.

So for this specific customized job in MRv2, if we set "-Dmapreduce.fileoutputcommitter.algorithm.version=2", the job output will be moved by the reducers.
As a result, below are the expected output file locations:
# hadoop fs -ls -R /hao/wordfinal
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 11:55 /hao/wordfinal/_SUCCESS
drwxr-xr-x - mapr mapr 2 2019-04-23 11:55 /hao/wordfinal/part_00000
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00000/_temporary
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00000/_temporary/1
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00000/_temporary/1/_temporary
drwxr-xr-x - mapr mapr 0 2019-04-23 11:55 /hao/wordfinal/part_00000/_temporary/1/_temporary/attempt_1554837836642_0127_r_000000_0
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 11:55 /hao/wordfinal/part_00000/part-r-00000
drwxr-xr-x - mapr mapr 2 2019-04-23 11:55 /hao/wordfinal/part_00001
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00001/_temporary
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00001/_temporary/1
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00001/_temporary/1/_temporary
drwxr-xr-x - mapr mapr 0 2019-04-23 11:55 /hao/wordfinal/part_00001/_temporary/1/_temporary/attempt_1554837836642_0127_r_000001_0
-rwxr-xr-x 3 mapr mapr 0 2019-04-23 11:55 /hao/wordfinal/part_00001/part-r-00001
drwxr-xr-x - mapr mapr 2 2019-04-23 11:55 /hao/wordfinal/part_00002
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00002/_temporary
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00002/_temporary/1
drwxr-xr-x - mapr mapr 1 2019-04-23 11:55 /hao/wordfinal/part_00002/_temporary/1/_temporary
drwxr-xr-x - mapr mapr 0 2019-04-23 11:55 /hao/wordfinal/part_00002/_temporary/1/_temporary/attempt_1554837836642_0127_r_000002_0
-rwxr-xr-x 3 mapr mapr 51 2019-04-23 11:55 /hao/wordfinal/part_00002/part-r-00002

Reference:

How to customize FileOutputCommitter for MapReduce job by overwriting Output Format Class
What is the difference between mapreduce.fileoutputcommitter.algorithm.version=1 and 2


Step by step on how to install a MapR Client on Windows 7 to access multiple secured clusters


Goal:

This article explains the detailed steps to install and configure a MapR Client on Windows 7 to access multiple secured MapR Clusters.

Env:

MapR 6.1 client on Windows 7
2 secured MapR 6.1 clusters running: ClusterA and ClusterB

Solution:

Note: The Windows commands mentioned below should be typed in the "cmd" tool.

1. MapR Client Installation

1.1 Install JDK 8 and then set environment variable JAVA_HOME

setx JAVA_HOME "C:\Program Files\Java\jdk1.8.0_221"
Note: here we install JDK 8 because it is the supported JDK version based on the JDK support matrix.
https://mapr.com/docs/61/InteropMatrix/r_jdk_matrix.html

1.2 Create the directory and set environment variable MAPR_HOME

mkdir c:\opt\mapr
setx MAPR_HOME "c:\opt\mapr"

1.3 Download the MapR Client and install into MAPR_HOME

For example:
https://package.mapr.com/releases/v6.1.0/windows/mapr-client-6.1.0.20180926230239.GA-1.amd64.zip

1.4 Extract the archive by right-clicking on the file and selecting "Extract All"

Make sure the content is inside MAPR_HOME which is C:\opt\mapr

1.5 Configure the environment variable MAPR_TICKETFILE_LOCATION

mkdir c:\maprticket
setx MAPR_TICKETFILE_LOCATION "c:\maprticket\abc"

2. Configure MapR Cluster settings

2.1 Add etc/hosts entries

Right click "Notepad" and click "Run as Administrator".
Add both MapR Cluster A and B's hostname and IP into "C:\Windows\system32\drivers\etc\hosts".
For example:
10.xx.xx.1 v1.poc.com v1
10.xx.xx.2 v2.poc.com v2
10.xx.xx.3 v3.poc.com v3
10.xx.xx.4 v4.poc.com v4

10.xx.xx.5 s1.poc.com s1
10.xx.xx.6 s2.poc.com s2
10.xx.xx.7 s3.poc.com s3
10.xx.xx.8 s4.poc.com s4

2.2 Add mapr-clusters.conf entries

Put both MapR Cluster A and B's /opt/mapr/conf/mapr-clusters.conf entries into %MAPR_HOME%\conf\mapr-clusters.conf
For example:
ClusterA secure=true v1.poc.com:7222,v2.poc.com:7222,v3.poc.com:7222
ClusterB secure=true s1.poc.com:7222,s2.poc.com:7222,s3.poc.com:7222

2.3 Merge the ssl_truststore from Cluster A and Cluster B

Copy the /opt/mapr/conf/ssl_truststore from Cluster B to Cluster A as "/tmp/ssl_truststore.b".
On Cluster A:
Copy the /opt/mapr/conf/ssl_truststore from Cluster A to "/tmp/ssl_truststore.a".
Run the command below to merge /tmp/ssl_truststore.b into /tmp/ssl_truststore.a:
/opt/mapr/server/manageSSLKeys.sh merge /tmp/ssl_truststore.b /tmp/ssl_truststore.a

2.4 Copy the merged /tmp/ssl_truststore.a to the client under directory c:\opt\mapr\conf

2.5 Generate MapR tickets for both MapR Clusters

cd %MAPR_HOME%\bin
maprlogin password -cluster ClusterA -user mapr
maprlogin password -cluster ClusterB -user mapr
Note: the ticket file should be located in c:\maprticket\abc as we set MAPR_TICKETFILE_LOCATION.
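To verify that tickets for both clusters were generated, you can print the ticket file (maprlogin print reads the file pointed to by MAPR_TICKETFILE_LOCATION); entries for both ClusterA and ClusterB should be listed:
cd %MAPR_HOME%\bin
maprlogin print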

3. Test

Try "hadoop fs -ls" to access both MapR Clusters.
hadoop fs -ls maprfs://ClusterA/tmp
hadoop fs -ls maprfs://ClusterB/tmp

Reference:

https://mapr.com/docs/61/AdvancedInstallation/SettingUptheClient-windows.html
https://mapr.com/docs/61/SecurityGuide/RunningCommandsOnRemoteSecureClusters.html


Failed to log in to a secured Drill-on-YARN cluster with MapRSASL authentication


Symptom:

Logging in to a secured Drill-on-YARN cluster with MapRSASL authentication fails.
A sample stack trace from drillbit.log when trying to connect with sqlline:
2019-09-16 14:23:13,331 [UserServer-1] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  Connection: /10.10.72.41:31010 <--> /10.10.72.41:48032 (user server).  Closing connection.
io.netty.handler.codec.DecoderException: org.apache.drill.exec.rpc.RpcException: javax.security.sasl.SaslException: Bad server key [Caused by javax.security.sasl.SaslException: Error while trying to decrypt ticket: 2]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:98) [netty-codec-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) [netty-handler-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) [netty-codec-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) [netty-codec-4.0.48.Final.jar:4.0.48.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) [netty-codec-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) [netty-common-4.0.48.Final.jar:4.0.48.Final]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]
Caused by: org.apache.drill.exec.rpc.RpcException: javax.security.sasl.SaslException: Bad server key [Caused by javax.security.sasl.SaslException: Error while trying to decrypt ticket: 2]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler.handleAuthFailure(ServerAuthenticationHandler.java:324) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler.handle(ServerAuthenticationHandler.java:109) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.BasicServer.handle(BasicServer.java:182) ~[drill-rpc-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.BasicServer.handle(BasicServer.java:54) ~[drill-rpc-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) ~[drill-rpc-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) ~[drill-rpc-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) [netty-codec-4.0.48.Final.jar:4.0.48.Final]
... 31 common frames omitted
Caused by: javax.security.sasl.SaslException: Bad server key
at com.mapr.security.maprsasl.MaprSaslServer.evaluateResponse(MaprSaslServer.java:190) ~[maprfs-6.1.0-mapr.jar:na]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler$1.run(ServerAuthenticationHandler.java:239) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler$1.run(ServerAuthenticationHandler.java:236) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_212]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_212]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669) ~[hadoop-common-2.7.0-mapr-1808.jar:na]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler.evaluateResponse(ServerAuthenticationHandler.java:236) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler.access$500(ServerAuthenticationHandler.java:53) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler$SaslInProgressProcessor.process(ServerAuthenticationHandler.java:176) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler$SaslStartProcessor.process(ServerAuthenticationHandler.java:164) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.rpc.security.ServerAuthenticationHandler.handle(ServerAuthenticationHandler.java:107) ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
... 36 common frames omitted
Caused by: javax.security.sasl.SaslException: Error while trying to decrypt ticket: 2
at com.mapr.security.maprsasl.MaprSaslServer.evaluateResponse(MaprSaslServer.java:143) ~[maprfs-6.1.0-mapr.jar:na]
... 46 common frames omitted
The corresponding sqlline error message is:
Error: Failure in connecting to Drill: org.apache.drill.exec.rpc.NonTransientRpcException: javax.security.sasl.SaslException: Authentication failed. Incorrect credentials? [Details: Encryption: enabled , MaxWrappedSize: 65536 , WrapSizeLimit: 0] (state=,code=0)
java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: org.apache.drill.exec.rpc.NonTransientRpcException: javax.security.sasl.SaslException: Authentication failed. Incorrect credentials? [Details: Encryption: enabled , MaxWrappedSize: 65536 , WrapSizeLimit: 0]
at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:174)
at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130)
at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179)
at sqlline.Commands.connect(Commands.java:1247)
at sqlline.Commands.connect(Commands.java:1139)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
at sqlline.SqlLine.dispatch(SqlLine.java:722)
at sqlline.SqlLine.initArgs(SqlLine.java:416)
at sqlline.SqlLine.begin(SqlLine.java:514)
at sqlline.SqlLine.start(SqlLine.java:264)
at sqlline.SqlLine.main(SqlLine.java:195)
Caused by: org.apache.drill.exec.rpc.NonTransientRpcException: javax.security.sasl.SaslException: Authentication failed. Incorrect credentials? [Details: Encryption: enabled , MaxWrappedSize: 65536 , WrapSizeLimit: 0]
at org.apache.drill.exec.rpc.user.UserClient.connect(UserClient.java:210)
at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:458)
at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:402)
at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:165)
... 18 more
Caused by: javax.security.sasl.SaslException: Authentication failed. Incorrect credentials? [Details: Encryption: enabled , MaxWrappedSize: 65536 , WrapSizeLimit: 0]
at org.apache.drill.exec.rpc.security.AuthenticationOutcomeListener$SaslFailedProcessor.process(AuthenticationOutcomeListener.java:230)
at org.apache.drill.exec.rpc.security.AuthenticationOutcomeListener.success(AuthenticationOutcomeListener.java:128)
at org.apache.drill.exec.rpc.security.AuthenticationOutcomeListener.success(AuthenticationOutcomeListener.java:53)
at org.apache.drill.exec.rpc.RequestIdMap$RpcListener.set(RequestIdMap.java:134)
at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:293)
at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at java.lang.Thread.run(Thread.java:748)

Env:

MapR 6.1
Drill 1.15

Root Cause:

configure.sh does not configure Drill on YARN.
So for a Drill-on-YARN cluster with MapRSASL authentication, we need to configure it manually in distrib-env.sh under $DRILL_SITE.

Solution:

In $DRILL_SITE, locate distrib-env.sh and check the current settings for MapRSASL.
If the current setting is:
export DRILL_JAVA_OPTS="${DRILL_JAVA_OPTS} -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dzookeeper.sasl.client=false"
Then it should be changed to:
export DRILL_JAVA_OPTS="${DRILL_JAVA_OPTS} -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dhadoop.login=hybrid_keytab -Dzookeeper.sasl.client=true"

After that, restart the Drill on YARN cluster:
$DRILL_HOME/bin/drill-on-yarn.sh --site $DRILL_SITE start
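After the restart, a quick way to confirm that MapRSASL login works again is to connect with sqlline from a node that holds a valid MapR ticket. The ZooKeeper host and the Drill cluster ID ("drillbits1") below are assumptions; substitute the values from your drill-override.conf.
# Assumes a valid MapR ticket (maprlogin password) on this node; host and cluster ID are placeholders
$DRILL_HOME/bin/sqlline -u "jdbc:drill:zk=v1.poc.com:5181/drill/drillbits1;auth=maprsasl"
# At the sqlline prompt, a simple query such as:  select * from sys.version;  confirms the connection.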


Cannot drop a partition or drop a Hive partition table due to UTF8 characters in the partition value


Symptom:

You cannot drop a partition, or drop the Hive partition table itself, due to UTF8 characters in the partition value.
The partition key value with UTF8 characters shows as "???" in Hive and also in MySQL (the Hive Metastore backend database).
For example,
a user may create a Hive partition table named "hao3" and do a dynamic partition insert into it from a table named "t3" that contains some Chinese characters in the partition key column.
hive> desc t3;
OK
name string
partition_col string
age int

hive> select * from t3;
OK
abc 小明 20
def part2 15
ghi part3 36
ijk part4 50

hive> CREATE TABLE hao3(
name string,
age int)
PARTITIONED BY (
partition_col1 string);

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
insert into table hao3 partition(partition_col1) select name,age,partition_col from t3;
After the dynamic partition insert completes, the problematic partition will show as "??" in Hive:
hive> show partitions hao3;
OK
partition_col1=??
partition_col1=part2
partition_col1=part3
partition_col1=part4
Time taken: 0.092 seconds, Fetched: 4 row(s)
hive> select * from hao3 ;
OK
def 15 part2
ghi 36 part3
ijk 50 part4
Time taken: 0.229 seconds, Fetched: 3 row(s)
It also shows as "??" in backend mysql database as well:
select p.PART_NAME,s.LOCATION,pkv.PART_KEY_VAL
from
TBLS t, DBS d, PARTITIONS p, SDS s, PARTITION_KEY_VALS pkv
where t.DB_ID=d.DB_ID
and t.TBL_NAME='hao3'
and d.NAME='default'
and p.TBL_ID=t.TBL_ID
and p.SD_ID=s.SD_ID
and p.PART_ID=pkv.PART_ID
;

+----------------------+-------------------------------------------------------+--------------+
| PART_NAME | LOCATION | PART_KEY_VAL |
+----------------------+-------------------------------------------------------+--------------+
| partition_col1=part3 | maprfs:/user/hive/warehouse/hao3/partition_col1=part3 | part3 |
| partition_col1=part2 | maprfs:/user/hive/warehouse/hao3/partition_col1=part2 | part2 |
| partition_col1=part4 | maprfs:/user/hive/warehouse/hao3/partition_col1=part4 | part4 |
| partition_col1=?? | maprfs:/user/hive/warehouse/hao3/partition_col1=?? | ?? |
+----------------------+-------------------------------------------------------+--------------+
4 rows in set (0.00 sec)
Furthermore, you cannot drop the partition or even rename/drop this table:
hive> alter table hao3 drop partition(partition_col1='小明');
Dropped the partition partition_col1=%3F%3F
OK
Time taken: 0.256 seconds
hive> show partitions hao3;
OK
partition_col1=??
partition_col1=part2
partition_col1=part3
partition_col1=part4
hive> drop table hao3;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out

Root Cause:

The character set encoding in the backend MySQL may not support UTF8.
Below are the reasons:
1. The "character_set_server" is set to latin1 instead of utf8:
> show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.01 sec)
2. Hive-related metadata tables are created with the latin1 character set by default.
This is the current behavior in Hive (including version 2.3 at least).
It is described in below Hive JIRAs:
https://issues.apache.org/jira/browse/HIVE-14156
https://issues.apache.org/jira/browse/HIVE-18083
The reason is:
[mapr@v4 mysql]$ pwd
/opt/mapr/hive/hive-2.3/scripts/metastore/upgrade/mysql
[mapr@v4 mysql]$ grep CHARSET hive-schema-2.3.0.mysql.sql
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
...

Solution:

1. Make sure the MySQL character sets are all set to UTF8.

In this case, we need to change "character_set_server" to utf8 by adding the following in my.cnf under the [mysqld] section:
character-set-server=utf8
Then restart mysql.

2. Change all Hive metadata tables in MySQL to the UTF8 character set.

For example, for each table run the 2 SQL statements below:
alter table SDS default character set utf8;
alter table SDS convert to character set utf8;

For this case, the 3 most important metadata table columns are:
SDS.LOCATION
PARTITIONS.PART_NAME
PARTITION_KEY_VALS.PART_KEY_VAL

Note that for table PARTITIONS, you need to reduce the length of column PART_NAME from varchar(767) to varchar(255) before running the "convert to" command:
alter table PARTITIONS modify column PART_NAME varchar(255);
Otherwise the "convert to" command will fail with below error:
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
This is a trade-off if you want to get UTF8 support.
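Putting the above together, a minimal conversion sketch for the three key tables could look like the following. The metastore database name ("hive") and the MySQL user are assumptions for your environment, and you should back up the metastore database before running it.
# Assumption: the Hive Metastore database is named "hive"; back it up first (enter the password when prompted)
mysql -u root -p hive <<'EOF'
-- Shorten PART_NAME first so the utf8 index stays within the 767-byte key limit
ALTER TABLE PARTITIONS MODIFY COLUMN PART_NAME varchar(255);

ALTER TABLE SDS DEFAULT CHARACTER SET utf8;
ALTER TABLE SDS CONVERT TO CHARACTER SET utf8;
ALTER TABLE PARTITIONS DEFAULT CHARACTER SET utf8;
ALTER TABLE PARTITIONS CONVERT TO CHARACTER SET utf8;
ALTER TABLE PARTITION_KEY_VALS DEFAULT CHARACTER SET utf8;
ALTER TABLE PARTITION_KEY_VALS CONVERT TO CHARACTER SET utf8;
EOF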

3. Retry with the same UTF8 Chinese characters; everything works fine now.

hive> show partitions hao2;
OK
partition_col1=part2
partition_col1=part3
partition_col1=part4
partition_col1=小明
Time taken: 0.092 seconds, Fetched: 4 row(s)
hive> select * from hao2 ;
OK
def 15 part2
ghi 36 part3
ijk 50 part4
abc 20 小明
Time taken: 0.211 seconds, Fetched: 4 row(s)
hive> alter table hao2 drop partition(partition_col1='小明');
Dropped the partition partition_col1=小明
OK
Time taken: 0.531 seconds
hive> show partitions hao2;
OK
partition_col1=part2
partition_col1=part3
partition_col1=part4
Time taken: 0.091 seconds, Fetched: 3 row(s)
hive> select * from hao2;
OK
def 15 part2
ghi 36 part3
ijk 50 part4
Time taken: 0.23 seconds, Fetched: 3 row(s)
Note: Please set the encoding to UTF8 when you create the Hive Metastore in the first place.
Modifying existing metadata afterwards is risky.

Quick tip on how to install python3


Goal:

This is a quick tip on how to install python3 on CentOS 7 and macOS.

Solution:

1. CentOS 7

a. Lazy way to install python 3.6

yum install https://centos7.iuscommunity.org/ius-release.rpm 
yum install python36u

b. Install python 3.7.4

Refer to https://tecadmin.net/install-python-3-7-on-centos/

2. macOS

brew install python3


How to test MapR Data Access Gateway


Goal:

This article explains how to test the MapR Data Access Gateway.

Env:

MapR 6.1 (secured)

Solution:
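Before running the tests below, the MapR-DB JSON table "/tmp/test" that they query needs to exist and contain at least one document. A minimal preparation sketch (the sample document content is made up):
# Create the JSON table
maprcli table create -path /tmp/test -tabletype json
# Insert one sample document from the OJAI shell: run "mapr dbshell", then at its prompt type
#   insert /tmp/test --value '{"_id":"0002","name":"sample"}'
mapr dbshell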

1. Python OJAI client

Follow the documentation https://mapr.com/docs/61/MapR-DB/JSON_DB/UsingPythonOJAIClient.html to install the Python library "maprdb-python-client".
Below is a sample Python program named "test.py" that scans the "_id" column of the MapR-DB JSON table "/tmp/test":
from mapr.ojai.storage.ConnectionFactory import ConnectionFactory

connection_str = "v1.poc.com:5678?auth=basic;user=mapr;password=mapr;" \
                 "ssl=true;" \
                 "sslCA=./ssl_truststore.pem;" \
                 "sslTargetNameOverride=v1.poc.com"
connection = ConnectionFactory.get_connection(connection_str=connection_str)
store = connection.get_store('/tmp/test')
query = {"$select": ["_id"]}
options = {
    'ojai.mapr.query.result-as-document': True
}
query_result = store.find(query, options=options)
for doc in query_result:
    print(doc.as_dictionary())
connection.close()

Note: make sure a valid ssl_truststore.pem is located in the current directory (it is referenced by the sslCA parameter above).
Run it:
$ python test.py
{'_id': '0002'}

2. REST OJAI API

Refer to the documentation https://mapr.com/docs/61/MapR-DB/JSON_DB/UsingMapRDBJSONRESTAPI.html
The examples below use basic authentication.

2.a Insecure mode (-k)

curl -X GET -k \
'https://v1.poc.com:8243/api/v2/table/%2Ftmp%2Ftest%2F' \
-u mapr:mapr

2.b Using SSL certificates

curl --cacert ./ssl_truststore.pem -X GET \
'https://v1.poc.com:8243/api/v2/table/%2Ftmp%2Ftest%2F' \
-u mapr:mapr


Understanding different modes in kafka-connect using an example


Goal:

This article helps you understand the different modes in kafka-connect using an example.
The example streams data from a MySQL table to MapR Event Store for Apache Kafka (aka "MapR Streams") using the different kafka-connect modes -- incrementing, bulk, timestamp and timestamp+incrementing.

Env:

MapR 6.1 (secured)
mapr-kafka-1.1.1
mapr-kafka-connect-jdbc-4.1.0

Solution:

Please read the documentation https://mapr.com/docs/61/Kafka/kafkaConnect.html first to understand the architecture of mapr-kafka-connect.

Use case:

In the Hive Metastore backend database -- MySQL, there is a table named "TBLS" which tracks the Hive table information.
I chose "TBLS" as the data source because it has a strictly incrementing column named "TBL_ID" and also a timestamp-related column named "CREATE_TIME" (int(11) data type).
We plan to keep monitoring this table and stream the data into a MapR stream named "/tmp/hivemeta".
The standalone mode of kafka-connect is used to demonstrate this use case easily.

To monitor what queries are running on the source -- MySQL, we will enable the MySQL general log.
To monitor what data is written on the target -- MapR Streams, we will use Drill to query MapR Streams with the kafka storage plugin.
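For reference, the MySQL general log can be switched on at runtime; a small sketch is below (the log file path is an assumption, and the change requires SUPER privilege).
# Enable the MySQL general query log at runtime (log path is an assumption)
mysql -u root -p -e "SET GLOBAL general_log_file = '/var/lib/mysql/general.log'; SET GLOBAL general_log = 'ON';"

# Watch the queries issued by the JDBC source connector
tail -f /var/lib/mysql/general.log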

1. Put the MySQL JDBC driver to kafka-connect JDBC path.

cp /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/lib/mysql-connector-java-5.1.25.jar /opt/mapr/kafka-connect-jdbc/kafka-connect-jdbc-4.1.0/share/java/kafka-connect-jdbc/

2. Recreate the MapR stream "/tmp/hivemeta" before each test.

maprcli stream delete -path /tmp/hivemeta
maprcli stream create -path /tmp/hivemeta

3. mode=incrementing

incrementing: use a strictly incrementing column on each table to detect only new rows. Note that this will not detect modifications or deletions of existing rows.
Sample Connector file "test.conf":
name=mysql-whitelist-incre-source-tbls
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive
table.whitelist=TBLS
mode=incrementing
incrementing.column.name=TBL_ID
topic.prefix=/tmp/hivemeta:tbls
Start the standalone mode for this Connector:
/opt/mapr/kafka/kafka-1.1.1/bin/connect-standalone.sh /opt/mapr/kafka/kafka-1.1.1/config/connect-standalone.properties ~/test.conf
On the source side, MySQL receives the queries below:
   2534 Query SELECT * FROM `TBLS` WHERE `TBL_ID` > -1 ORDER BY `TBL_ID` ASC
191127 14:50:49 2534 Query SELECT * FROM `TBLS` WHERE `TBL_ID` > 77 ORDER BY `TBL_ID` ASC
191127 14:50:54 2534 Query SELECT * FROM `TBLS` WHERE `TBL_ID` > 77 ORDER BY `TBL_ID` ASC
191127 14:50:59 2534 Query SELECT * FROM `TBLS` WHERE `TBL_ID` > 77 ORDER BY `TBL_ID` ASC
The logic: the first query does a full table scan with the where condition "WHERE `TBL_ID` > -1" plus an order-by.
Then, for new incoming data, it keeps scanning with "WHERE `TBL_ID` > 77", where 77 is the currently committed offset.
On the target side, MapR Streams receives the data up to that offset:
> select t.payload.TBL_ID, t.payload.TBL_NAME from kafka.`/tmp/hivemeta:tblsTBLS` as t;
+---------+-----------------------+
| EXPR$0 | EXPR$1 |
...
| 77 | t1 |
+---------+-----------------------+

4. mode=bulk

bulk: perform a bulk load of the entire table each time it is polled.
Sample Connector file "test_bulk.conf":
name=mysql-whitelist-bulk-source-tbls
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive
table.whitelist=TBLS
mode=bulk
topic.prefix=/tmp/hivemeta:tbls
On the source side, MySQL receives the queries below:
191127 14:57:54  2537 Query SELECT * FROM `TBLS`
191127 14:57:59 2537 Query SELECT * FROM `TBLS`
The logic: each poll does a full table scan.
It is like taking a snapshot of the whole data source periodically.
If the source table is huge, this takes a lot of resources (CPU/memory/disk/network) on both the source and target sides.

5. mode=timestamp

timestamp: use a timestamp (or timestamp-like) column to detect new and modified rows. This assumes the column is updated with each write, and that values are monotonically incrementing, but not necessarily unique.
Sample Connector file "test_timestamp.conf":
name=mysql-whitelist-timestamp-source-tbls
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive
#table.whitelist=TBLS
mode=timestamp
query=SELECT t.* from (SELECT *,FROM_UNIXTIME(CREATE_TIME) as custom_timestamp FROM TBLS) t
timestamp.column.name=custom_timestamp
topic.prefix=/tmp/hivemeta:tbls
In this example, "CREATE_TIME" column is not "timestamp" data type in MySQL.
The data for "CREATE_TIME" is actually unix timestamp in "int" data type in MySQL.
To workaround it so that we can use "mode=timestamp", I use "query" instead of "table.whitelist" to use a MySQL function "FROM_UNIXTIME" to convert that column to a "timestamp" data type with a new column name "custom_timestamp".
Then set timestamp.column.name=custom_timestamp instead.
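To sanity-check the wrapped query before wiring it into the connector, you can run it directly against MySQL; the database name ("hive") and the hive/hive credentials match the connection.url used above.
mysql -u hive -phive hive -e \
  "SELECT t.TBL_ID, t.TBL_NAME, t.custom_timestamp
   FROM (SELECT *, FROM_UNIXTIME(CREATE_TIME) AS custom_timestamp FROM TBLS) t
   LIMIT 5;"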
On the source side, MySQL receives the queries below:
191127 15:34:30  2551 Query select CURRENT_TIMESTAMP
2551 Query SELECT t.* from (SELECT *,FROM_UNIXTIME(CREATE_TIME) as custom_timestamp FROM TBLS) t WHERE `custom_timestamp` > '2019-11-27 15:34:24' AND `custom_timestamp` < '2019-11-27 15:34:30' ORDER BY `custom_timestamp` ASC
191127 15:34:35 2551 Query select CURRENT_TIMESTAMP
2551 Query SELECT t.* from (SELECT *,FROM_UNIXTIME(CREATE_TIME) as custom_timestamp FROM TBLS) t WHERE `custom_timestamp` > '2019-11-27 15:34:24' AND `custom_timestamp` < '2019-11-27 15:34:35' ORDER BY `custom_timestamp` ASC
The logic: it first checks the current timestamp and uses it as the upper boundary of the where condition; the lower boundary is the committed offset.

6. mode=timestamp+incrementing

timestamp+incrementing: use two columns, a timestamp column that detects new and modified rows and a strictly incrementing column which provides a globally unique ID for updates so each row can be assigned a unique stream offset.
Sample Connector file "test_incre+timestamp.conf":
name=mysql-whitelist-timestampincre-source-tbls
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive
#table.whitelist=TBLS
mode=timestamp+incrementing
query=SELECT t.* from (SELECT *,FROM_UNIXTIME(CREATE_TIME) as custom_timestamp FROM TBLS) t
timestamp.column.name=custom_timestamp
incrementing.column.name=TBL_ID
topic.prefix=/tmp/hivemeta:tbls
On the source side, MySQL receives the queries below:
191127 15:38:02  2554 Query select CURRENT_TIMESTAMP
2554 Query SELECT t.* from (SELECT *,FROM_UNIXTIME(CREATE_TIME) as custom_timestamp FROM TBLS) t WHERE `custom_timestamp` < '2019-11-27 15:38:02' AND ((`custom_timestamp` = '2019-11-27 15:34:24' AND `TBL_ID` > 81) OR `custom_timestamp` > '2019-11-27 15:34:24') ORDER BY `custom_timestamp`,`TBL_ID` ASC
191127 15:38:07 2554 Query select CURRENT_TIMESTAMP
2554 Query SELECT t.* from (SELECT *,FROM_UNIXTIME(CREATE_TIME) as custom_timestamp FROM TBLS) t WHERE `custom_timestamp` < '2019-11-27 15:38:07' AND ((`custom_timestamp` = '2019-11-27 15:37:57' AND `TBL_ID` > 82) OR `custom_timestamp` > '2019-11-27 15:37:57') ORDER BY `custom_timestamp`,`TBL_ID` ASC
The logic is a little more complex now:

timestamp is earlier than current timestamp
AND
( timestamp is the same but incrementing is larger
  OR
  timestamp is newer
)

How to submit REST requests to a distributed Kafka Connect cluster


Goal:

This article shares example curl commands for submitting REST requests to a distributed Kafka Connect cluster.

Env:

MapR 6.1 (secured)
mapr-kafka-1.1.1
mapr-kafka-connect-jdbc-4.1.0

Prerequisite:

1. This article uses the same use case documented in the previous article about standalone Kafka Connect. Please understand that use case first.
2. To better format the JSON output, install the tool "jq" on this CentOS 7 env using the commands below.
yum install epel-release
yum install jq
3. Create a MapR stream named "/tmp/hivemeta".
maprcli stream delete -path /tmp/hivemeta
maprcli stream create -path /tmp/hivemeta
4. Put the MySQL JDBC driver to kafka-connect JDBC path.
cp /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/lib/mysql-connector-java-5.1.25.jar /opt/mapr/kafka-connect-jdbc/kafka-connect-jdbc-4.1.0/share/java/kafka-connect-jdbc/
5. Restart the Kafka Connect service on all nodes.
In this example, this is a 4-node Kafka Connect cluster -- v1.poc.com, v2.poc.com, v3.poc.com and v4.poc.com.
maprcli node services -name kafka-connect -action restart -nodes `hostname -f`

Solution:

1. Create a Connector named "mysql-source-dist"

curl -v -X POST https://v1.poc.com:8083/connectors \
--cacert /opt/mapr/conf/ssl_truststore.pem -u mapr:mapr \
-H "Content-Type: application/json" \
--data-binary @- << EOF
{
"name": "mysql-source-dist",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"tasks.max": "1",
"connection.url": "jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive",
"table.whitelist": "TBLS",
"mode": "incrementing",
"incrementing.column.name": "TBL_ID",
"topic.prefix": "/tmp/hivemeta:tbls"
}
}
EOF

2. Check Kafka Connect version

curl https://v1.poc.com:8083 --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

{
"version": "1.1.1-mapr-1901",
"commit": "8185d9b9630a495d",
"kafka_cluster_id": "7038424823489051793"
}

3. List connector plugins available on this worker

curl https://v1.poc.com:8083/connector-plugins --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

[
{
"class": "io.confluent.connect.hdfs.HdfsSinkConnector",
"type": "sink",
"version": "4.1.0"
},
{
"class": "io.confluent.connect.hdfs.tools.SchemaSourceConnector",
"type": "source",
"version": "1.1.1-mapr-1901"
},
{
"class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"type": "sink",
"version": "4.1.0"
},
{
"class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"type": "source",
"version": "4.1.0"
},
{
"class": "io.confluent.connect.storage.tools.SchemaSourceConnector",
"type": "source",
"version": "1.1.1-mapr-1901"
},
{
"class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
"type": "sink",
"version": "1.1.1-mapr-1901"
},
{
"class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
"type": "source",
"version": "1.1.1-mapr-1901"
}
]

4. List active connectors on a worker

curl https://v1.poc.com:8083/connectors --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

[
"mysql-source-dist"
]

5. Get Connector information

curl https://v1.poc.com:8083/connectors/mysql-source-dist --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

{
"name": "mysql-source-dist",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"mode": "incrementing",
"incrementing.column.name": "TBL_ID",
"topic.prefix": "/tmp/hivemeta:tbls",
"task.user": "mapr",
"tasks.max": "1",
"name": "mysql-source-dist",
"connection.url": "jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive",
"table.whitelist": "TBLS"
},
"tasks": [],
"type": "source"
}

6. Get Connector configuration

curl https://v1.poc.com:8083/connectors/mysql-source-dist/config --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

{
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"mode": "incrementing",
"incrementing.column.name": "TBL_ID",
"topic.prefix": "/tmp/hivemeta:tbls",
"task.user": "mapr",
"tasks.max": "1",
"name": "mysql-source-dist",
"connection.url": "jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive",
"table.whitelist": "TBLS"
}

7. Get Connector status

curl https://v1.poc.com:8083/connectors/mysql-source-dist/status --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

{
"name": "mysql-source-dist",
"connector": {
"state": "RUNNING",
"worker_id": "v2.poc.com:8083"
},
"tasks": [
{
"state": "RUNNING",
"id": 0,
"worker_id": "v4.poc.com:8083"
}
],
"type": "source"
}

8. Pause a Connector

curl -X PUT https://v1.poc.com:8083/connectors/mysql-source-dist/pause --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr
After a while, if we check the Connector status again, both the Connector and all tasks should be PAUSED.
curl https://v1.poc.com:8083/connectors/mysql-source-dist/status --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

{
"name": "mysql-source-dist",
"connector": {
"state": "PAUSED",
"worker_id": "v2.poc.com:8083"
},
"tasks": [
{
"state": "PAUSED",
"id": 0,
"worker_id": "v4.poc.com:8083"
}
],
"type": "source"
}

9. Resume a Connector

curl -X PUT https://v1.poc.com:8083/connectors/mysql-source-dist/resume --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr
After a while, if we check the Connector status again, both the Connector and all tasks should be RUNNING.
curl https://v1.poc.com:8083/connectors/mysql-source-dist/status --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

{
"name": "mysql-source-dist",
"connector": {
"state": "RUNNING",
"worker_id": "v2.poc.com:8083"
},
"tasks": [
{
"state": "RUNNING",
"id": 0,
"worker_id": "v4.poc.com:8083"
}
],
"type": "source"
}

10. Update an existing Connector configuration

curl -v -X PUT https://v1.poc.com:8083/connectors/mysql-source-dist/config \
--cacert /opt/mapr/conf/ssl_truststore.pem -u mapr:mapr \
-H "Content-Type: application/json" \
--data-binary @- << EOF
{
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"tasks.max": "2",
"connection.url": "jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive",
"table.whitelist": "TBLS",
"mode": "incrementing",
"incrementing.column.name": "TBL_ID",
"topic.prefix": "/tmp/hivemeta:tbls"
}
EOF

11. List the Connector's tasks and their configurations

curl https://v1.poc.com:8083/connectors/mysql-source-dist/tasks --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

[
{
"id": {
"connector": "mysql-source-dist",
"task": 0
},
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"mode": "incrementing",
"incrementing.column.name": "TBL_ID",
"topic.prefix": "/tmp/hivemeta:tbls",
"task.user": "mapr",
"tables": "TBLS",
"task.class": "io.confluent.connect.jdbc.source.JdbcSourceTask",
"tasks.max": "2",
"name": "mysql-source-dist",
"connection.url": "jdbc:mysql://v4.poc.com:3306/hive?user=hive&password=hive",
"table.whitelist": "TBLS"
}
}
]

12. Get an individual task status

curl https://v1.poc.com:8083/connectors/mysql-source-dist/tasks/0/status --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr | jq

{
"state": "RUNNING",
"id": 0,
"worker_id": "v4.poc.com:8083"
}

13. Restart an individual task

curl -X POST https://v4.poc.com:8083/connectors/mysql-source-dist/tasks/0/restart --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr

14. Delete the Connector

curl -X DELETE https://v1.poc.com:8083/connectors/mysql-source-dist --cacert /opt/mapr/conf/ssl_truststore.pem  -u mapr:mapr
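As a convenience, the status of every connector on a worker can be checked in one small loop built from the same endpoints used above; the worker URL, truststore path, and credentials are the ones assumed throughout this article.
# List every connector on the worker and print its state plus the state of each task
WORKER=https://v1.poc.com:8083
AUTH="--cacert /opt/mapr/conf/ssl_truststore.pem -u mapr:mapr"
for c in $(curl -s $AUTH $WORKER/connectors | jq -r '.[]'); do
  echo "== $c =="
  curl -s $AUTH $WORKER/connectors/$c/status | jq '{connector: .connector.state, tasks: [.tasks[].state]}'
done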

Reference:

https://mapr.com/docs/61/Kafka/Connect-rest-api.html
