Running a Container for Data Science in a Kubernetes Cluster
This guide shows how to run a container with pre-installed machine learning tools in your Kubernetes cluster and start the Jupyter Notebook service in it.
The container includes the following packages:
- pip
- PyTorch
- TensorFlow
- Keras
- Anaconda
- Jupyter Notebook
- scikit-learn
- NumPy
- SciPy
- Pandas
- NLTK
- OpenCV
- CatBoost
- XGBoost
- LightGBM
The container can be used for model training and inference when developing applications and working with data.
Creating a Cluster and Getting Started
Follow these steps to create a cluster and get started with it in the Control panel:
- Go to the Kubernetes section.
- Create a cluster. We recommend a configuration of at least 4 vCPUs, 8 GB of RAM, and a 20 GB SSD for comfortable work with the container. Note that system requirements may vary depending on the scripts you run.
- Wait for the cluster status to switch to Active.
- Select the created cluster and go to the Settings tab.
- Click Download kubeconfig and save the YAML configuration file.
- Give the file a name, for example, my-kube.yaml.
Managing Clusters through Console Client
Install kubectl, the Kubernetes console client.
In the console, export the path to the downloaded my-kube.yaml file to an environment variable:
export KUBECONFIG=my-kube.yaml
To verify that the configuration works, run the following:
kubectl get nodes
If the configuration was successful, the output will be similar to the following:
NAME        STATUS   ROLES    AGE     VERSION
your-node   Ready    <none>   2m01s   v1.17.6
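The check above can also be scripted. The following is a minimal sketch, not part of the original guide: the helper name count_ready_nodes is hypothetical, and it assumes the column layout shown in the sample output (STATUS in the second column).

```shell
# Hypothetical helper: count nodes whose STATUS column is exactly "Ready".
# Expects rows in the format printed by `kubectl get nodes --no-headers`
# on standard input (NAME STATUS ROLES AGE VERSION).
count_ready_nodes() {
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Intended use against a live cluster (not executed here):
#   export KUBECONFIG=my-kube.yaml
#   kubectl get nodes --no-headers | count_ready_nodes
```

A result of 0 suggests that either the cluster is not yet active or KUBECONFIG does not point to the downloaded file.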
Running a Container
Follow these steps to run a container:
- Download the YAML deployment configuration file.
- Run the following:
kubectl apply -f selectel-ml.yaml
- If the command is accepted, the output is:
deployment.apps/selectel-ml created
- Check the container status by running the following:
kubectl get pod -w
- The ContainerCreating status is displayed first:
NAME READY STATUS RESTARTS AGE
selectel-ml 0/1 ContainerCreating 0 10s
- Wait for the Running status to be displayed. It means that the container is created and running. This may take several minutes:
selectel-ml 1/1 Running
Press Ctrl+C to exit watch mode.
Running the Jupyter Notebook
Follow these steps to run the Jupyter Notebook:
- Open the port to access the service:
kubectl expose deployment selectel-ml --type=LoadBalancer --name=my-service
- If the command is accepted, the output is:
service/my-service exposed
- To get the address and port for connecting to the Jupyter server, run the following:
kubectl get services
- The output should be similar to the following:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-service LoadBalancer 10.100.90.86 203.0.113.1 8888:31779/TCP 30s
- Enter the IP address from EXTERNAL-IP and the port number from PORT(S) (the value before the colon) in the browser address bar, for example: 203.0.113.1:8888. If the EXTERNAL-IP field shows <pending>, run kubectl get services again after a few minutes.
- Enter the default password 9lG0eXCevt in the Jupyter Notebook web interface that opens.
Learn more about changing your password in the Jupyter Notebook documentation.
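The address for the browser can also be derived from the service listing with a script. The sketch below is illustrative only: the helper name jupyter_address is hypothetical, and it assumes the six-column layout from the sample kubectl get services output in this section.

```shell
# Hypothetical helper: read one service row in the format printed by
# `kubectl get services <name> --no-headers`
# (NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE) from standard input
# and print "EXTERNAL-IP:PORT". Returns 1 if no usable row is found,
# for example while EXTERNAL-IP is still <pending>.
jupyter_address() {
    awk '
        NF >= 5 && $4 != "<pending>" && $4 != "<none>" {
            split($5, port, "[:/]")   # "8888:31779/TCP" -> port[1] = "8888"
            print $4 ":" port[1]
            found = 1
            exit
        }
        END { exit !found }
    '
}

# Intended use against a live cluster (not executed here):
#   kubectl get services my-service --no-headers | jupyter_address
```

With the sample row from this section, the helper prints 203.0.113.1:8888. While the load balancer is still being provisioned, it prints nothing and returns a non-zero exit code, so it can be retried in a loop.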
Managing Containers through the Console
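To manage a container through the console, open a shell in its pod by running the following, replacing [pod name] with a name from the kubectl get pod output:
kubectl exec -it [pod name] -- /bin/bash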
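Looking up the pod name can be scripted as well. This is a minimal sketch with assumptions: the helper name running_pod_name is hypothetical, it relies on the column layout of kubectl get pod shown earlier (STATUS in the third column), and it matches pods by a name prefix because the exact pod name in your cluster may carry a generated suffix.

```shell
# Hypothetical helper: print the name of the first pod whose name starts
# with the prefix in $1 and whose STATUS column is "Running".
# Expects rows in the format printed by `kubectl get pod --no-headers`
# on standard input (NAME READY STATUS RESTARTS AGE).
running_pod_name() {
    awk -v prefix="$1" 'index($1, prefix) == 1 && $3 == "Running" { print $1; exit }'
}

# Intended use against a live cluster (not executed here):
#   pod=$(kubectl get pod --no-headers | running_pod_name selectel-ml)
#   kubectl exec -it "$pod" -- /bin/bash
```

If no matching pod is running, the helper prints nothing, so check that the variable is non-empty before passing it to kubectl exec.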