Best Practices for Scaling Reference Databases

The premise of this document is to provide reader with ideas on how to manage a growing number of reference databases as teams adopt Datical.


Introduction

During early phase of Datical adoption, teams typically setup Reference Databases on their own, i.e., they may not be using centralized infrastructure from day 1. This allows teams flexibility to manage and configure their Reference Databases as well as underlying servers. However, as Datical usage grows, the approach above would create a sprawl of Reference Databases. To better manage sprawl, it would be ideal to move Reference Databases to centralized/shared infrastructure. 

Note that Reference Databases are typically structure-only databases without any significant amount of data. You may still need lookup data or test data to validate objects.

Reference Databases for SQL Server

You could put a large number of Reference Databases on single SQL Server instance. The number of Reference Database per instance would be dictated by the size of each Reference Database (size depends on number of objects in the database and the amount of data needed in the database).

Database name collisions could occur when different teams use the same database name. In that case, a separate database instance would be needed.

If dealing with multiple versions of SQL Server (2008, 2012, 2014, 2016, etc.). You could setup different versions of SQL Server on the same server. Each version of SQL Server would run as a separate Database Engine instance.

The best practice, however, for scaling out SQL Server-based Reference Databases would be to arrange separate servers for each version of SQL Server. And then run only one version of Database Engine (SQL Server instance) per server. It would be possible to set up multiple servers for the same version of Database Engine to support the case for database name collisions. 


Reference Database for Oracle 11g

An Oracle Reference Database is essentially one or more schemas in an Oracle instance.

In a pipeline that consists of DEV, QA, UAT and PROD environments, the schema name (e.g., "appSchema") would be consistent across each environment. As such, there would be multiple Oracle instances configured for each environment and each instance would have "appSchema" defined. 

Because schema names have to be unique within an Oracle instance, this creates the need to setup new Oracle instances for each numbered Reference Databases.

Each Datical project could have one or more pipelines. This allows application teams to have multiple concurrent release streams. Datical requires that each pipelines has a dedicated Reference Database in the following manner. The table below shows sample pipelines for one Datical project


Environments in the pipelineNotes
Pipeline 1REF1 → DEV1 → QA1 → UAT → PROD

Reference Databases (numbered) for each pipeline.

DEV1-5, QA1-5, UAT and PROD are existing databases in pipelines. 


Pipeline 2REF2 → DEV2 → QA2 → UAT → PROD
Pipeilne 3REF3 → DEV3 → QA3 → UAT → PROD
Pipeline 4REF4 → DEV4 → QA4 → UAT → PROD
Pipeline 5REF5 → DEV5 → QA5 → UAT → PROD

Each Datical project will have similar pipelines as shown in the table above where projects will require one or more numbered Reference Databases. Each of the numbered Reference Databases will be housed in their designated Oracle 11g instance. For example, all REF1 databases from all Datical projects could be housed in a single Oracle 11g instance. Similarly, all REF2 databases from all Datical projects could be housed in the second Oracle 11g instance. 

Numbered Reference DatabasesOracle 11g instances
REF1 databases for all Datical projectsInstance #1
REF2 databases for all Datical projectsInstance #2
REF3 databases for all Datical projectsInstance #3
REF4 databases for all Datical projectsInstance #4
REF5 databases for all Datical projectsInstance #5


Shared services team who own centralized infrastructure will need to figure out how many numbered Reference Databases are going to be used and from there decide how many Oracle 11g instances will be needed.

There could be a situation where two projects have the same schema name and therefore reference databases for these two projects can not be housed together on the same Oracle 11g instance. This will require additional Oracle 11g instances, perhaps in the following manner:

Numbered Reference DatabasesOracle 11g instances

REF1 databases for all Datical projects

Instance #1
REF1 databases for Datical projects with identical schema namesInstance #2
REF2 databases for all Datical projectsInstance #3
REF2 databases for Datical projects with identical schema namesInstance #4
REF3 databases for all Datical projectsInstance #5
REF3 databases for Datical projects with identical schema namesInstance #6
REF4 databases for all Datical projectsInstance #7
REF4 databases for Datical projects with identical schema namesInstance #8
REF5 databases for all Datical projectsInstance #9
REF5 databases for Datical projects with identical schema namesInstance #10


The best practice, for scaling out Oracle 11g based Reference Databases would be to setup one Oracle 11g instance per server and have separate servers for each Oracle 11g instance. In the example above, you would setup 5 servers, each with a single Oracle 11g instance. The number of Reference Database per Oracle 11g instance would be dictated by the size of each Reference Database (size depends on number of objects in the database and the amount of data needed in the database).

Reference Databases for Oracle 12c

Oracle 12c has the concept of pluggable databases (PDBs). A PDB is a portable collection of schemas, schema objects, and non-schema objects that appear to an Oracle Net client as a separate database. One or more PDBs together are called a container database (CDB).

Because each PDB is a separate database, you can configure all Reference Databases as PDBs on a single Oracle 12c container. The number of PDBs (hence number Reference Databases) per Oracle 12 container would be dictated by the size of each Reference Database (size depends on number of objects in the database and the amount of data needed in the database). Creating a PDB is very similar to creating a database instance in 11g from a resource standpoint (i.e., memory, storage, processes – although some are shared at the container level). Managing large number of Reference PDBs presents the same challenges as sprawl proliferation mentioned in the Introduction section above, except they would be more concentrated with the container concept.

With that said, the shared REF environments model described in the Oracle 11g section could be applied to Oracle 12c PDBs. Numbered Reference Databases would be PDBs in Oracle 12c containers. To address situations where two projects have the same schema name you would need additional PDBs, without requiring additional Oracle 12c containers.


Numbered Reference DatabasesOracle 12c containers

REF1 PDBs for all Datical projects (including projects with identical schema names)

Container on instance #1
REF2 PDBs for all Datical projects (including projects with identical schema names)Container on instance #2
REF3 PDBs for all Datical projects (including projects with identical schema names)Container on instance #3
REF4 PDBs for all Datical projects (including projects with identical schema names)Container on instance #4
REF5 PDBs for all Datical projects (including projects with identical schema names)Container on instance #5


Best Practices for High Availability

Because possibly large number of databases will be running on few servers, it would be necessary to consider High Availability options for each databaseReference Databases are critical components of Datical and if these databases go offline you are unable to package database changes until those database are back online.

The key consideration here is for SQL Server instances. The out-of-the-box configuration for SQL Server requires elevated privileges which requires SQL Server instances to be configured in single user mode. However, a custom, non-elevated setup can be used which can work with SQL Server instances configured in Availability Groups (AG).




Copyright © Datical 2012-2020 - Proprietary and Confidential