Vitess exposes a few RPC services, and internally also uses RPCs. These RPCs may use secure transport options. This document explains how to use these features.
The following diagram represents all the RPCs we use in a Vitess cluster:
There are two main categories:
A few features in the Vitess ecosystem depend on authentication, like Caller ID and table ACLs. We'll explore the Caller ID feature first.
The encryption and authentication scheme used depends on the transport used. With gRPC (the default for Vitess), TLS can be used to secure both internal and external RPCs. We'll detail what the options are.
Caller ID is a feature provided by the Vitess stack to identify the source of queries. There are two different Caller IDs, the Immediate Caller ID and the Effective Caller ID:
When using gRPC transport, Vitess can use the usual TLS security features (familiarity with SSL / TLS is necessary here):
With these options, it is possible to use TLS-secured connections for all parts of the system. This enables the server side to authenticate the client, and / or the client to authenticate the server.
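As a rough illustration of what "the server authenticates the client" and "the client authenticates the server" mean in TLS terms, the sketch below uses Python's standard `ssl` module rather than Vitess itself. The helper names and their parameters are hypothetical and exist only for this example; Vitess configures the equivalent behavior through its own options.

```python
import ssl


def make_server_context(cert=None, key=None, client_ca=None,
                        require_client_cert=False):
    """Build a TLS server context (hypothetical helper, not a Vitess API)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if cert and key:
        # The server's own certificate: clients use it to authenticate the server.
        ctx.load_cert_chain(certfile=cert, keyfile=key)
    if client_ca:
        # CA that signed the client certificates the server is willing to trust.
        ctx.load_verify_locations(cafile=client_ca)
    if require_client_cert:
        # Refuse clients that do not present a certificate the server can verify.
        ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def make_client_context(server_ca=None, cert=None, key=None):
    """Build a TLS client context (hypothetical helper, not a Vitess API)."""
    # PROTOCOL_TLS_CLIENT verifies the server certificate and hostname by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if server_ca:
        ctx.load_verify_locations(cafile=server_ca)
    if cert and key:
        # Optional client certificate, needed when the server requires one.
        ctx.load_cert_chain(certfile=cert, keyfile=key)
    return ctx
```

Server-side and client-side verification are independent switches here, which mirrors the "and / or" in the paragraph above: either direction can be enabled on its own.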
Note that TLS is not enabled by default, because the different Vitess servers usually run on a private network (in a cloud environment, for instance, local traffic is usually already secured over a VPN).
Additionally, if a client uses a certificate to connect to Vitess (vtgate), the common name of that certificate is passed to vttablet as the Immediate Caller ID. It can then be used by table ACLs, to grant read, write or admin access to individual tables. This should be used if different clients should have different access to Vitess tables.
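The flow described above can be sketched in a few lines of Python. This is a simplified model, not Vitess code: the `TABLE_ACLS` layout, the role names, and both function names are assumptions made for illustration. The certificate dictionary follows the shape returned by Python's `ssl.SSLSocket.getpeercert()`.

```python
def common_name(peercert):
    """Return the subject commonName from a getpeercert()-style dict, or None."""
    for rdn in peercert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


# Hypothetical ACL table: table name -> role -> caller names granted that role.
TABLE_ACLS = {
    "orders": {
        "readers": {"reporting"},
        "writers": {"checkout"},
        "admins": {"dba"},
    },
}


def is_allowed(immediate_caller_id, table, role):
    """Check whether the caller holds the given role on the table."""
    return immediate_caller_id in TABLE_ACLS.get(table, {}).get(role, set())


# Example: a client whose certificate has CN=checkout may write to "orders".
cert = {"subject": ((("countryName", "US"),), (("commonName", "checkout"),))}
caller = common_name(cert)
print(caller, is_allowed(caller, "orders", "writers"))
```

In this model each role is checked independently. Whether one role implies another is left out on purpose, since the text above does not specify it.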
In a private network, where SSL security is not required, it might still be desirable to use table ACLs as a safety mechanism to prevent a user from accessing sensitive data. The gRPC connector provides the grpc_use_effective_callerid flag for this purpose: if specified when running vtgate, the Effective Caller ID's principal is copied into the Immediate Caller ID, and then used throughout the Vitess stack.
Important: this is not secure. Any user code can provide any value for the Effective Caller ID's principal, and therefore access any data. This is intended as a safety feature to make sure some applications do not misbehave. Therefore, this flag is not enabled by default.
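To make the warning concrete, here is a toy Python model of how the Immediate Caller ID might be derived when the flag is set. The `Request` type and the `immediate_caller_id` function are hypothetical. The precedence given to a certificate's common name when both values are present is an assumption of this sketch, not something stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    # Principal of the Effective Caller ID: set freely by application code.
    effective_principal: Optional[str] = None
    # Common name of the TLS client certificate, if the client presented one.
    cert_common_name: Optional[str] = None


def immediate_caller_id(req, use_effective_callerid=False):
    """Derive the Immediate Caller ID for a request (illustrative model only)."""
    if req.cert_common_name:
        # Assumption: an authenticated certificate identity wins when present.
        return req.cert_common_name
    if use_effective_callerid:
        # Unauthenticated: whatever the application claimed is taken at face value.
        return req.effective_principal
    return None


# Without TLS, any client can claim to be "dba" once the flag is on.
print(immediate_caller_id(Request(effective_principal="dba"), True))
```

Because the copied value is never verified, table ACLs driven by it guard against accidents, not against a client that deliberately supplies another principal.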
For a concrete example, see test/encrypted_transport.py in the source tree. It first sets up all the certificates and some table ACLs, then uses the Python client to connect with SSL. It also exercises the grpc_use_effective_callerid flag by connecting without SSL.