How to create private Kubernetes clusters on GKE
Some workloads shouldn't be exposed publicly to the Internet. Nevertheless, you usually still want to make them available to the rest of your organization, whether to show how the development of a new feature is going or simply to share a service with your teammates that is only used internally.
At Bitnami, we've decided to be really strict about this. We've shared in the past how we enforce service privacy in our Kubernetes clusters, so that no service deployed in one of our development or internal clusters is ever exposed publicly.
Some time ago, we started evaluating Google Container Engine (GKE) to understand if we could use it to host our Kubernetes clusters, and at that point, the first question we had was: Can we also enforce service privacy using GKE?
GKE: The default configuration
Let's begin by describing how GKE currently works when you create a cluster with the default configuration.
Every worker node in a GKE cluster has both a public IP and an internal IP allocated from your project's default network (or the network you specify). The master node, where the Kubernetes API is exposed, also has a public IP. That means everyone on the Internet can reach those nodes and the Kubernetes API. While the default GCP firewall rules only allow SSH, RDP, and ICMP traffic to reach the worker nodes, you can (and probably should) restrict access further to really lock down your cluster.
Say you have deployed some pods on your cluster and now you want to expose that service so your clients can consume it. How do you do it?
Well, you have basically two options:
- Create a Kubernetes LoadBalancer Service, which will create a GCP Load Balancer with a public IP and point it to your service.
- Use the GKE Ingress controller to expose the service. You would usually create a ClusterIP Service that points to your pods, and then an Ingress resource that points to that ClusterIP Service. That creates an HTTP(S) Load Balancer in GCP, which also has a public IP (see the sketch below).
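For reference, here is a minimal sketch of what those two options look like (the names and ports are hypothetical); both end up behind a public IP by default:
# Option 1: a LoadBalancer Service. GKE provisions a GCP load balancer
# with a public IP that forwards traffic to the pods selected below.
apiVersion: v1
kind: Service
metadata:
  name: echo                  # hypothetical name
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
  selector:
    app: echo
---
# Option 2: an Ingress resource pointing at an existing Service. The GKE
# Ingress controller provisions a public HTTP(S) Load Balancer for it.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echo                  # hypothetical name
spec:
  backend:
    serviceName: echo
    servicePort: 80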

We use Kubernetes to test, develop and deploy internal services, and we don't want any of those services to be publicly exposed, so all of this obviously doesn't match our needs.
Can't we just whitelist by IP?
When someone thinks of restricting access to services in the cloud, the obvious and simple solution is to place a firewall that filters by IP address, so that only some specific IPs (for instance, your offices, or your VPN server) can contact those services.
In fact, that's something you can do, and it is well documented in the official Kubernetes documentation.
There are two ways of implementing this approach:
- Whitelist your client IPs. Create and maintain a list of IPs that can contact each service, and put that list in the Service or Ingress resource using the property specified in the documentation above (see the sketch after this list).
- Whitelist your VPN IP. Make everyone connect to a VPN server, push routes for the services' IPs to the clients, and configure the VPN server to do SNAT. Then add the VPN server IP to every Service or Ingress firewall, so the VPN server can contact the services.
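As a rough sketch of the first approach (the CIDRs below are hypothetical), the allowed ranges go directly into the Service spec; with the second approach, the list shrinks to just the VPN server's IP:
apiVersion: v1
kind: Service
metadata:
  name: echo                        # hypothetical name
spec:
  type: LoadBalancer
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: echo
  # Only these source IPs are allowed to reach the load balancer.
  loadBalancerSourceRanges:
  - "203.0.113.10/32"               # e.g. an office or a remote employee
  - "198.51.100.25/32"              # e.g. the VPN server doing SNAT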
Unfortunately, neither of these approaches would work for us, for two reasons:
- Half our employees are remote. Maintaining a list of remote IPs that can contact each service and updating these lists every time a remote employee's IP changes is a huge maintenance burden we would like to avoid.
- When developing, people tend to create and delete Services and Ingresses frequently. When those resources are recreated, they will likely get a new IP, which would mean reconfiguring our VPN server to push the new routes, too. We would like to avoid regenerating configuration every time one of these resources changes.
Looking for a private topology
Ideally, for our use case, what we would want is to create a network with a private IP range (like 10.128.0.0/16), and then create a cluster that is self-contained inside that network.
That would mean that every node in the cluster would have an IP in that subnet, every load balancer would get an IP in that range, and ideally the Kubernetes API would only be exposed inside that network as well. That's usually called a private topology.
By doing that, we are guaranteeing that every Service or Ingress resource will get a private IP, and therefore won't be routable from the Internet. We can add firewall rules to filter traffic coming from outside 10.0.0.0/8, too, just to be sure those services and nodes are not being exposed to the Internet.
That way, if we create a VPN connection to that network and set up the right network routes, all Bitnami employees (both in the offices and remote) can contact all the services there directly.
We wouldn't need to apply any special property to our Services or Ingresses, since everything is already restricted at the network level, and we wouldn't need to create and maintain a list of the remote IPs of the people who should be able to access each service.
Unfortunately, that option is not supported by GKE directly, so we need to work around some issues manually in order to make it work.
Implementing a private topology in GKE
In order to implement something similar to what was described above, we created a network in GCP, and deployed a GKE cluster in that network. Then we created firewall rules to filter out traffic coming from outside of that network CIDR.
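As an illustration, the network side could look roughly like this in Terraform (the names, region and CIDRs are hypothetical; GCP denies incoming traffic by default, so the idea is to only add allow rules for the private range):
# Hypothetical network and subnetwork for the private clusters.
resource "google_compute_network" "private" {
  name                    = "private-clusters"
  auto_create_subnetworks = "false"
}

resource "google_compute_subnetwork" "private" {
  name          = "private-clusters-subnet"
  network       = "${google_compute_network.private.self_link}"
  ip_cidr_range = "10.128.0.0/16"
  region        = "us-east1"
}

# Only allow traffic originating from the private 10.0.0.0/8 range,
# so nothing on the Internet can reach the nodes or the load balancers.
resource "google_compute_firewall" "allow-internal-only" {
  name          = "allow-internal-only"
  network       = "${google_compute_network.private.name}"
  source_ranges = ["10.0.0.0/8"]

  allow {
    protocol = "tcp"
  }

  allow {
    protocol = "udp"
  }

  allow {
    protocol = "icmp"
  }
}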
We disabled the GKE Ingress Controller, since that would create public HTTP(S) load balancers for us when creating Ingress resources.
Here's an extract of the terraform module we use internally to create a GKE cluster in a given GCP network. As mentioned, it disables the GKE Ingress Controller so we can deploy our own private ingress controller instead.
data "google_compute_network" "selected" {
name = "${var.network}"
}
data "google_compute_subnetwork" "selected" {
name = "${var.subnetwork}"
}
resource "google_container_cluster" "cluster" {
name = "${var.name}"
zone = "${var.zone}"
initial_node_count = "3"
node_version = "1.7.3"
network = "${data.google_compute_network.selected.name}"
subnetwork = "${data.google_compute_subnetwork.selected.name}"
# We need a CIDR block not routable for us for the internal containers
# so there is no collision on routes
cluster_ipv4_cidr = "10.30.0.0/16"
node_config {
oauth_scopes = [
"https://www.googleapis.com/auth/compute",
"https://www.googleapis.com/auth/devstorage.read_only",
"https://www.googleapis.com/auth/logging.write",
"https://www.googleapis.com/auth/monitoring",
]
}
# We disable the HTTP Load Balancing addon since we are going
# to install our own ingress controller in the cluster, so we can
# create private ingress resources.
addons_config {
http_load_balancing {
disabled = true
}
}
}
That will create the basic cluster, but we still need to figure out how to actually connect to it, since the intention is to only expose services to that network.
At Bitnami, we decided to use Cloud VPN with AWS VPN Connections to connect these GCP networks to our AWS Virtual Private Clouds (VPCs). We already have a VPN server for our employees running in AWS, so by connecting the two clouds and setting up the right routes we automatically make the GKE clusters reachable to all employees.
To restrict access to the Kubernetes API, we can push a route to its IP address to our VPN clients, and then use the master authorized networks feature to restrict access at the network level, so that only requests coming from the VPN server IP can actually contact the API.
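In Terraform, that restriction can be expressed on the cluster resource itself. A minimal sketch (the VPN server IP below is hypothetical):
resource "google_container_cluster" "cluster" {
  # ... same configuration as in the extract above ...

  # Only requests coming from the VPN server's public IP can reach the API.
  master_authorized_networks_config {
    cidr_blocks {
      cidr_block   = "198.51.100.25/32"   # hypothetical VPN server IP
      display_name = "vpn-server"
    }
  }
}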
Regarding Load Balancers, GKE was recently updated to support GCP Internal Load Balancers. These Internal Load Balancers only have private IPs in the network, which means that services using them won't be publicly exposed.
There's something to keep in mind, though: if you're using Cloud VPN, your Internal Load Balancers won't be reachable through the Cloud VPN connection. Fortunately, Google is already working on a new feature to support balancing traffic coming from Cloud VPN using Internal Load Balancers. We applied for this private alpha feature, and as soon as it was enabled on our GCP project, our Internal Load Balancers immediately started receiving traffic coming from Cloud VPN, with no further configuration required.
According to the Google Cloud Platform documentation, you only need to add an annotation to a Kubernetes Service to make it use an Internal Load Balancer:
apiVersion: v1
kind: Service
metadata:
  name: [SERVICE-NAME]
  annotations:
    cloud.google.com/load-balancer-type: "internal"
  labels:
    app: echo
spec:
  type: LoadBalancer
  loadBalancerIP: [IP-ADDRESS]
  ports:
  - port: 9000
    protocol: TCP
  selector:
    app: echo
For us, that's not ideal, because forgetting to add that annotation would mean that the service gets publicly exposed. One potential solution to this issue would be to use a tool like kube-svc-watch to automatically delete Service resources that don't have that annotation in place.
In future releases we might instead use an Initializer or an External Admission Webhook to either auto-annotate services as internal or reject services without the annotation, but since those are alpha features right now, we can't use them yet. They are available in GKE alpha clusters in case you want to try them, but keep in mind that alpha clusters are automatically deleted after thirty days.
Regarding Ingress resources, we decided to deploy the same nginx ingress controller we use on our AWS clusters. We just had to add the annotation mentioned above to its Service resource so it gets a GCP Internal Load Balancer instead. The good thing about this is that it exposes services only internally, and it works basically out of the box. Developers don’t even need to know we’re not using the GKE Ingress Controller anymore!
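The Service in front of the nginx ingress controller would look something like this (a sketch; the exact names, labels and ports depend on how the controller is deployed):
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-controller
  annotations:
    # Same annotation as above: the controller gets a GCP Internal
    # Load Balancer instead of a public one.
    cloud.google.com/load-balancer-type: "internal"
  labels:
    app: nginx-ingress-controller
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    targetPort: 80
    protocol: TCP
  - name: https
    port: 443
    targetPort: 443
    protocol: TCP
  selector:
    app: nginx-ingress-controller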
Once we have the nginx ingress controller in place, developers can create Ingress rules just as they were doing before, with no additional changes, but with an important difference: services exposed through an Ingress rule aren't public either.
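For example, a developer could keep writing a plain Ingress rule like the one below (the hostname and backend are hypothetical), and the resulting endpoint is only reachable through the internal load balancer:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-feature                  # hypothetical name
  annotations:
    # Make sure the nginx ingress controller handles this Ingress.
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: my-feature.internal.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: my-feature
          servicePort: 80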
This is what the scenario described above looks like in a diagram:

Conclusion
With that, all services deployed in our GKE clusters are private by default, and we can guarantee that no one will mistakenly expose something they didn't intend to.
Although setting all of this up is a bit of work, we get some big benefits from using GKE. To name a few, we're glad we don't need to worry about the master node anymore, since it is fully managed and highly available by default on GKE, and we're also looking forward to trying the one-command cluster upgrade, the cluster autoscaler, and the auto-repair features.
I believe most of the changes described above are workarounds for small gaps in GKE, and my personal guess is that GKE will eventually offer some sort of private topology out of the box, freeing cluster admins from the pain of having to figure all of this out on their own.