THE CHALLENGE

Our client has a main Data Base used by the operations of its Platform.

This Database due to its growth and high volume of records was causing efficiency problems in the system, which by its nature must meet the high-performance requirement.

THE SOLUTION

After analyzing the system and its needs it was decided:

  • To establish a process of dumping the old and unnecessary information for the daily operation of the Platform in a DataBase that would make the functions of Data Warehouse.
  • Establish a purging process on the Main Database with the aim of maintaining an appropriate volume that will not compromise the system performance.
  • Detach the Platform Reporting System from that Main Database and use the Data Warehouse to improve the general performance in the daily platform operation.

TECHNOLOGICAL SOLUTION

As a starting point for the data warehouse, a main DB backup with one-year gap was used. The databases are Oracle 12.

Dump Process

A Java dump process using Spring libraries was implemented with several configurable parameters. It was coded to be compatible with Java 1.7, Java 1.8 and Java 1.11, and so that it could be integrated into the SW or used independently of it.

Some of its features include:

  • Configurable dump time period. The main DB dump was adapted so that its period did not cause an efficiency problem in the implementation and the process did not impact on the main DB. In addition, the dump was scheduled at system off-peak time (night period).
  • Configurable Number of tables to be aligned in parallel. This allowed to adjust the resource consumption.
  • The process, being based on Oracle DBs, takes advantage of Oracle Data Pump technology, which provides high speed in the data movement from one database to another. However, the implementation contemplates the alternative of using JDBC to become independent from the Database Manager.
  • Record Collision Detection System. If there is an overlap (dates), the system detects during a record collision which of the records is the most up-to-date, discarding the other one.
  • Control of storage times and adaptation of the process based on them. The process records the times it takes to store each table. In the following cycle, the order of tables dump depends on the average times of those tables in previous executions, so that those that take more time will be the first to be executed, improving the global response time.

Purging Process

A Java process was implemented for the Platform’s Database Purging, being able to parameterize aspects such as entities to be purged and age of the data to be purged.