The Hitchhiker’s Guide to Telco Transformation - Part 7
Trust, or rather the lack of it, is the biggest inhibitor of the adoption of cloud-based services. The fear that others can access, view or abuse our personal or business data continues to be a concern, even as more and more services and applications move to the cloud. We just can’t seem to shake the feeling that no one will protect our data as we would. And perhaps there is a reason: after all, the number of hacks and inroads into our personal data grows on a daily basis. And yet cloud applications provide customers with centralized, network-based access to data with less overhead than is possible with a local application.
One of the biggest priorities for a prospective cloud architect is creating a SaaS data architecture that is both robust and secure enough to satisfy stakeholders, while also being efficient and cost-effective to administer and maintain. With regard to security, a number of factors must be taken into consideration when choosing or designing a SaaS data architecture. For example, the architecture must provide a degree of data isolation that matches the tenants' security concerns.
Experienced data architects are used to considering a broad spectrum of choices when designing an architecture to meet a specific set of challenges, and cloud apps are certainly no exception. But deciding which cloud architecture best fits our specific security needs, and weighing the benefits of each, is sometimes difficult. In this blog I will look at the different architectural options, and the pros and cons of each. And in my next blog, we can discuss how to choose the multi-tenant approach that is right for you.
Option 1: Separate Databases
Storing tenant (user/company) data in separate databases is the simplest approach for data isolation. Compute resources and application code are generally shared between all of the tenants, but each tenant has its own set of data that remains logically isolated from data that belongs to all other tenants. Metadata associates each database with the correct tenant, and database security prevents any tenant from accidentally or maliciously accessing other tenants' data.
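To make the idea of metadata routing each tenant to its own database concrete, here is a minimal sketch in Python. It uses SQLite files as stand-ins for real database servers, and the names `TenantCatalog`, `add_order`, `list_orders` and the `orders` table are all hypothetical, chosen only for illustration. It also assumes tenant IDs are trusted, file-safe strings.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


class TenantCatalog:
    """Hypothetical metadata store that maps each tenant to its own database file."""

    def __init__(self, root):
        self.root = Path(root)
        self.databases = {}  # tenant ID -> path of that tenant's database

    def provision(self, tenant_id):
        # Every tenant gets the same tables, but in a database of its own.
        # Assumption: tenant_id is a trusted, file-safe string.
        path = self.root / f"{tenant_id}.db"
        with closing(sqlite3.connect(path)) as conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, item TEXT)"
            )
            conn.commit()
        self.databases[tenant_id] = path

    def connect(self, tenant_id):
        # Unknown tenants are rejected instead of being routed to a default database.
        if tenant_id not in self.databases:
            raise KeyError(f"unknown tenant: {tenant_id}")
        return sqlite3.connect(self.databases[tenant_id])


def add_order(catalog, tenant_id, item):
    # The application code is shared; only the connection differs per tenant.
    with closing(catalog.connect(tenant_id)) as conn:
        conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        conn.commit()


def list_orders(catalog, tenant_id):
    with closing(catalog.connect(tenant_id)) as conn:
        rows = conn.execute("SELECT item FROM orders ORDER BY id").fetchall()
    return [item for (item,) in rows]
```

Because the queries never mention a tenant, the isolation comes entirely from which database the catalog hands back. In a production system the database's own access controls would back this up.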
Giving each tenant its own database makes it easy to extend the application's data model to meet tenants' individual needs, and restoring a tenant's data from backups in the event of a failure is a relatively simple procedure. Unfortunately, this approach tends to lead to higher costs of maintenance and backup. Hardware costs are also higher than they are under alternative approaches, as the number of tenants that can be housed on a given database server is limited by the number of databases that the server can support.
Separating tenant data into individual databases is the "premium" approach (like AWS reserved instances), and the relatively high hardware and maintenance requirements and costs make it appropriate for customers that are willing to pay extra for added security and customizability, such as customers in finance, medicine or even government records.
Option 2: Shared Database, Separate Schemas
Another approach involves housing multiple tenants in the same database, with each tenant having its own set of tables that are grouped into a schema created specifically for the tenant. When a customer first subscribes to the service, the provisioning subsystem creates a discrete set of tables for the tenant and associates them with the tenant's own schema.
After the schema is created, it is set as the default schema for the tenant account. A tenant account can access tables within its default schema by specifying the table name, instead of using the SchemaName.TableName convention. This way, a single set of SQL statements can be created for all tenants, and each tenant uses them to access its own data.
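The provisioning step described above can be sketched as a small Python function that builds the SQL for a new tenant. This is only an illustration under stated assumptions: it assumes PostgreSQL-style syntax (where the default schema is set through `search_path`), it assumes a database role with the same name as the schema already exists, and the table definitions and the `provisioning_statements` helper are hypothetical.

```python
import re

# Lower-case letters, digits and underscores only, starting with a letter or underscore.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")

# Hypothetical table definitions shared by every tenant.
TABLE_TEMPLATES = {
    "orders": "id INTEGER PRIMARY KEY, item TEXT NOT NULL",
    "invoices": "id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL",
}


def provisioning_statements(tenant):
    """Return the SQL a provisioning subsystem might run for a new tenant."""
    # Identifiers cannot be bound as query parameters, so validate them strictly.
    if not _IDENTIFIER.fullmatch(tenant):
        raise ValueError(f"invalid tenant identifier: {tenant!r}")
    schema = f"tenant_{tenant}"
    # Assumption: a role named like the schema was created beforehand.
    statements = [f"CREATE SCHEMA {schema} AUTHORIZATION {schema}"]
    for table, columns in TABLE_TEMPLATES.items():
        statements.append(f"CREATE TABLE {schema}.{table} ({columns})")
    # Make the tenant's schema the default so unqualified table names resolve to it.
    statements.append(f"ALTER ROLE {schema} SET search_path TO {schema}")
    return statements


# The same unqualified statement then serves every tenant account.
LIST_ORDERS = "SELECT id, item FROM orders ORDER BY id"
```

On other database engines the last statement would differ (SQL Server, for instance, sets a default schema per user), but the shape of the provisioning flow stays the same.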
Like the isolated approach, the separate‐schema approach is relatively easy to implement, and tenants can extend the data model as easily as with the separate‐database approach. This approach offers a moderate degree of logical data isolation for security‐conscious tenants, though not as much as a completely isolated system would, and can support a larger number of tenants per database server.
A significant drawback of the separate-schema approach is that tenant data is harder to restore in the event of a failure. If each tenant has its own database, restoring a single tenant's data means simply restoring the database from the most recent backup. With a separate-schema application, restoring the entire database would mean overwriting the data of every tenant on the same database with backup data, regardless of whether each one has experienced any loss or not. Therefore, to restore a single customer's data, the database administrator may have to restore the database to a temporary server, and then import the customer's tables into the production server, which is a complicated and potentially time-consuming task.
The separate-schema approach is appropriate for applications that use a relatively small number of database tables, on the order of 100 tables per tenant or fewer. This approach can typically accommodate more tenants per server than the separate-database approach can, so you can offer the application at a lower cost, as long as your customers will accept having their data co-located with that of other tenants.
Option 3: Shared Database, Shared Schema
A third approach involves using the same database and the same set of tables to host data from multiple tenants. A given table can include records from multiple tenants stored in any order; a Tenant ID is then associated with every record. Of the three approaches explained here, the shared schema approach has the lowest hardware and backup costs, because it allows you to serve the largest number of tenants per database server.
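As a rough illustration of a Tenant ID on every record, the sketch below uses Python's built-in `sqlite3` module. The `orders` table and the `TenantOrders` wrapper are hypothetical names. The point is that every query is filtered by the tenant ID the wrapper was created with, so application code never writes that filter by hand.

```python
import sqlite3


def create_shared_schema(conn):
    # Every row carries the ID of the tenant that owns it.
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " tenant_id TEXT NOT NULL,"
        " item TEXT NOT NULL)"
    )
    # An index on tenant_id keeps per-tenant lookups from scanning every row.
    conn.execute("CREATE INDEX idx_orders_tenant ON orders (tenant_id)")


class TenantOrders:
    """Hypothetical data-access wrapper bound to exactly one tenant."""

    def __init__(self, conn, tenant_id):
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, item):
        cur = self._conn.execute(
            "INSERT INTO orders (tenant_id, item) VALUES (?, ?)",
            (self._tenant_id, item),
        )
        return cur.lastrowid

    def items(self):
        rows = self._conn.execute(
            "SELECT item FROM orders WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [item for (item,) in rows]

    def get(self, order_id):
        # Filter on tenant_id as well, so another tenant's row ID finds nothing.
        row = self._conn.execute(
            "SELECT item FROM orders WHERE id = ? AND tenant_id = ?",
            (order_id, self._tenant_id),
        ).fetchone()
        return row[0] if row else None
```

Centralizing the filter in one place is a design choice, not a guarantee. Some database engines also offer row-level security policies that enforce the same rule inside the database itself.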
However, because multiple tenants share the same database tables, this approach may require additional development efforts to increase security, and to ensure that tenants can never access each other’s data, even in the event of unexpected bugs or attacks. The procedure for restoring data for a tenant is similar to that for the separate-schema approach, with the additional complication that individual rows in the production database must be deleted and then reinserted from the temporary database. If there is a very large number of rows in the affected tables, this can cause performance to suffer noticeably for all tenants served.
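The row-level restore described above can be sketched as follows, again with Python's `sqlite3` module. The `restored` connection stands in for the backup restored to a temporary server, and the `orders` table and the `restore_tenant_rows` function are hypothetical. The sketch assumes row IDs are never reused, which is why the illustrative schema uses `AUTOINCREMENT`.

```python
import sqlite3

# Hypothetical shared table; AUTOINCREMENT keeps SQLite from reusing deleted row IDs.
SCHEMA = (
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " tenant_id TEXT NOT NULL,"
    " item TEXT NOT NULL)"
)


def restore_tenant_rows(production, restored, tenant_id):
    """Replace one tenant's rows in production with those from a restored backup."""
    # Read the tenant's rows from the temporary (restored) database first.
    rows = restored.execute(
        "SELECT id, tenant_id, item FROM orders WHERE tenant_id = ?",
        (tenant_id,),
    ).fetchall()
    # Delete and reinsert in one transaction so a failure rolls back cleanly.
    with production:
        production.execute("DELETE FROM orders WHERE tenant_id = ?", (tenant_id,))
        production.executemany(
            "INSERT INTO orders (id, tenant_id, item) VALUES (?, ?, ?)", rows
        )
    return len(rows)
```

Only the affected tenant's rows are touched, but the delete and reinsert still run against tables that every tenant shares, which is where the performance impact on other tenants comes from.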
The shared‐schema approach is appropriate when it is important that the application be capable of serving a large number of tenants with a small number of servers, and prospective customers are willing to surrender data isolation in exchange for the lower costs that this approach makes possible.
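So to summarize the options:
Separate databases: the strongest data isolation, with easy per-tenant customization and restore, but the highest hardware, maintenance and backup costs.
Shared database, separate schemas: a moderate degree of logical isolation and more tenants per server, but restoring a single tenant's data is harder.
Shared database, shared schema: the lowest hardware and backup costs and the most tenants per server, but the least isolation and the most involved restore.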
In my next blog I will discuss choosing an approach that is right for you.