Running a Container for Data Science in a Kubernetes Cluster
Using this guide, you can run a container with pre-installed machine learning tools in your Kubernetes cluster and run the Jupyter Notebook service in it.
The container includes the following packages:
- Jupyter Notebook
The container can be used for model training and inference when developing applications and working with data.
Creating a Cluster and Getting Started
Follow these steps to create a cluster and get started with it in the Control panel:
Go to the Kubernetes section.
We recommend choosing a configuration with at least 4 vCPUs, 8 GB of RAM, and a 20 GB SSD to work comfortably with the container.
Note that system requirements may vary depending on the scripts you run.
Wait for the cluster status to switch to Active.
Select the created cluster and go to the Settings tab.
Click Download kubeconfig and save the YAML configuration file.
Give the file a name, for example, my-kube.yaml.
Managing Clusters through Console Client
Install kubectl, the Kubernetes console client.
In the console, export the path to the my-kube.yaml file you downloaded earlier to the KUBECONFIG environment variable.
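A minimal sketch of this step, assuming the file was saved as my-kube.yaml in your home directory. The location is an assumption, so adjust the path to wherever you saved the file. kubectl reads the KUBECONFIG environment variable to locate its configuration.

```shell
# Assumption: the kubeconfig file was saved as my-kube.yaml in the home directory.
# Replace the path with the actual location of the downloaded file.
export KUBECONFIG="$HOME/my-kube.yaml"

# Print the value to confirm that it is set for the current shell session.
echo "$KUBECONFIG"
```

The variable only applies to the current shell session. Run the command again in each new session, or add it to your shell profile.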
To check that the configuration was successful, run the following:
kubectl get nodes
If the configuration was successful, the output will be similar to the following:
NAME        STATUS   ROLES    AGE     VERSION
your-node   Ready    <none>   2m01s   v1.17.6
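If the cluster has several nodes, reading the table by eye can be tedious. The sketch below is one possible way to list nodes that are not ready yet. The function name not_ready_nodes is hypothetical, and the function only parses the text that kubectl get nodes prints, so it assumes the default column layout shown above.

```shell
# Hypothetical helper: reads `kubectl get nodes` output on standard input and
# prints the name of every node whose STATUS column is not exactly "Ready".
# A node reported as "Ready,SchedulingDisabled" is also printed.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage against a live cluster (not executed here):
#   kubectl get nodes | not_ready_nodes
```

An empty result means that every listed node reports the Ready status.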
Running a Container
Follow these steps to run a container:
- Download the YAML deployment configuration file.
- Run the following:
kubectl apply -f selectel-ml.yaml
- Check the output to make sure that the command has been accepted.
- Check the container status by running the following:
kubectl get pod -w
The ContainerCreating status will be displayed first:
NAME          READY   STATUS              RESTARTS   AGE
selectel-ml   0/1     ContainerCreating   0          10s
Press Ctrl+C to exit watch mode.
- Wait for the Running status to be displayed. It means that the container has been created and is running. This may take several minutes:
selectel-ml 1/1 Running
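Instead of watching the output manually, the wait can be scripted. The following is a rough sketch, not part of the original procedure: the function name wait_for_running, the default of 60 attempts, and the 5-second delay are all assumptions chosen for illustration. It polls kubectl get pods and checks the STATUS column of the first pod whose name starts with the given prefix.

```shell
# Hypothetical helper: poll `kubectl get pods` until a pod whose name starts
# with the given prefix reports the Running status, or give up.
#   $1  pod name prefix, for example selectel-ml
#   $2  maximum number of attempts (assumed default: 60)
#   $3  delay between attempts in seconds (assumed default: 5)
wait_for_running() {
  prefix="$1"
  attempts="${2:-60}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Column 3 of the default output is STATUS.
    status=$(kubectl get pods --no-headers 2>/dev/null |
      awk -v p="$prefix" 'index($1, p) == 1 { print $3; exit }')
    if [ "$status" = "Running" ]; then
      echo "Pod is running"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Timed out waiting for the pod" >&2
  return 1
}

# Usage against a live cluster (not executed here):
#   wait_for_running selectel-ml
```

The function returns a non-zero exit code on timeout, so it can be chained with the next command using &&.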
Running the Jupyter Notebook
Follow these steps to run the Jupyter Notebook:
Open the port to access the service:
kubectl expose deployment selectel-ml --type=LoadBalancer --name=my-service
Check the output to make sure that the command has been accepted.
To get the port to connect to the Jupyter server, run the following:
kubectl get services
The output should be similar to the following:
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
my-service   LoadBalancer   10.100.90.86   203.0.113.1   8888:31779/TCP   30s
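Enter the IP address from EXTERNAL-IP and the port number from PORT(S) in the browser address bar, for example, 203.0.113.1:8888 for the output above.
If the external IP address has not been assigned yet, run kubectl get services again after a few minutes.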
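The address can also be assembled from the command output. The sketch below is illustrative only: the function name service_url is hypothetical, it assumes the default column layout of kubectl get services, and it assumes that the Jupyter server is reachable over plain http on the first port listed in PORT(S).

```shell
# Hypothetical helper: reads `kubectl get services` output on standard input
# and prints the address of the named service.
#   $1  service name, for example my-service
# Prints nothing while EXTERNAL-IP is still shown as <pending>.
service_url() {
  awk -v name="$1" '
    $1 == name && $4 !~ /pending/ {
      split($5, port, ":")              # "8888:31779/TCP" -> port[1] = "8888"
      print "http://" $4 ":" port[1]    # assumption: plain http
    }'
}

# Usage against a live cluster (not executed here):
#   kubectl get services | service_url my-service
```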
Enter the default password 9lG0eXCevt in the Jupyter Notebook web interface that opens.
Learn more about changing your password in the Jupyter Notebook documentation.
Managing Containers through the Console
To manage containers through the console, run the following:
kubectl exec -it [pod name] -- /bin/bash
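The pod name can be looked up with kubectl get pods. As a small sketch, the hypothetical helper below picks the first pod whose name starts with a given prefix. It assumes the default output layout with a header line, and the prefix selectel-ml is taken from the deployment name used earlier in this guide.

```shell
# Hypothetical helper: reads `kubectl get pods` output (including the header
# line) on standard input and prints the first pod name that starts with the
# given prefix.
first_pod_with_prefix() {
  awk -v p="$1" 'NR > 1 && index($1, p) == 1 { print $1; exit }'
}

# Usage against a live cluster (not executed here):
#   POD=$(kubectl get pods | first_pod_with_prefix selectel-ml)
#   kubectl exec -it "$POD" -- /bin/bash
```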