Adaptive Business Analytics Framework | DOCUMENTATION

This wiki will contain all of the information needed to understand what the ABA Framework is and how it can help with common BI tasks.

What is ABA Framework?

SolidQ Adaptive BA Framework is a set of tools used to build data warehouse projects from the source to the data warehouse.
Developing analytic systems today raises many problems: load processes are heavy, automation is lacking, and common patterns are rarely defined. ETL work consumes 60 to 80 percent of all BI projects.
ABA Framework tries to solve some of these problems by finding patterns that let us automate processes. It creates a consistent development environment, ensuring easy maintainability, the use of best practices and fast development.

ABA Framework is able to synchronize the data from the helper to the data warehouse, passing through the staging area, performing the well-known ETL process.
What’s more, it also automates testing, checking whether the ETL process has completed successfully, and automatically generates documentation in HTML (and/or Markdown in the 2017 version).

In the new version, multiple helpers, staging areas and data warehouses can be added to the process at the same time. Each origin is identified as a project.
Inside the framework, you can specify the route the data follows from each project through its helper and staging area until it reaches its data warehouse.
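
The routing described above can be pictured as a small configuration structure. The following is a minimal sketch only: the `Route` class, its `stages` method and all database names are hypothetical and are not taken from the framework's actual metadata model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Path the data follows for one project (all names are hypothetical)."""
    project: str
    helper: str
    staging: str
    data_warehouse: str

    def stages(self) -> list[str]:
        # Order in which the data moves: project -> helper -> staging -> DW.
        return [self.project, self.helper, self.staging, self.data_warehouse]


# Two projects that share one staging area and one data warehouse.
routes = [
    Route("Sales", "Sales_Helper", "Staging_Main", "DW_Main"),
    Route("HR", "HR_Helper", "Staging_Main", "DW_Main"),
]

for route in routes:
    print(" -> ".join(route.stages()))
```

Keeping one route per project makes it easy to see which projects converge on the same staging area or data warehouse.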

 

ABA Framework is based on SQL Server Integration Services (SSIS) packages.
However, it is more than a set of pre-created packages that can perform some operations.

With the help of the Business Intelligence Markup Language (BIML), it adapts to each case by generating all the needed SSIS packages dynamically.
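
To give a feel for what generating packages dynamically means, the sketch below builds one package definition per table from a list of table names. It is an illustration written in Python rather than BimlScript, and the `build_packages` helper, the `Load_` prefix and the table names are assumptions made for this example. It does not reproduce the templates that ABA Framework actually uses.

```python
import xml.etree.ElementTree as ET


def build_packages(table_names):
    """Return an XML document with one Package element per table.

    The element layout only loosely mirrors a BIML file; the naming
    pattern "Load_<table>" is a hypothetical convention.
    """
    root = ET.Element("Biml")
    packages = ET.SubElement(root, "Packages")
    for table in table_names:
        ET.SubElement(packages, "Package", Name=f"Load_{table}")
    return ET.tostring(root, encoding="unicode")


print(build_packages(["Customer", "Product"]))
```

The point of the pattern is that adding a table to the input list is enough to obtain a new package definition, with no package edited by hand.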

However, using ABA Framework does not mean treating it as a black box that performs tasks without any supervision. In every part of the process, ABA Framework can be complemented with business rules and modifications that tailor it to each project, without losing the power of using patterns to automate important tasks.

 

General process of a BI solution

Databases

Databases are crucial in a BI solution. We can identify several areas whose databases are essential to carrying it out successfully.

Helper

A helper database prevents the original data from being manipulated during the processes. Keeping the data source intact is a vital condition that any BI solution must fulfil.
Additionally, to enable ABA Framework to automate the different processes, we need to follow the naming conventions to the letter. The helper database also plays a part in this task.

Staging area

So that data is not synchronized from the helper straight to the data warehouse, we use an intermediate area called the staging area. It contains one database (or more in the 2017 version). This area helps us perform data cleansing tasks and improve data quality.

Data warehouse

The data warehouse is the last area. It houses a denormalized database (or more than one in the 2017 version) designed for fast response times. Each database contains both fact and dimension tables and is designed following one of the two most common schemas: star schema or snowflake schema. These schemas differ a lot from those of transactional systems.
According to Bill Inmon, a data warehouse is characterized by being:

  • Subject-Oriented
  • Integrated
  • Time-Variant
  • Non-volatile
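
As an illustration of the star schema mentioned above, the following sketch creates one fact table surrounded by two dimension tables in an in-memory SQLite database and runs a typical analytic query. The table and column names (`FactSales`, `DimProduct`, `DimDate`) and the sample rows are assumptions chosen for the example, and SQLite stands in for SQL Server only to keep the sketch self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the context of each fact.
    CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, Name TEXT, Category TEXT);
    CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER);

    -- The fact table stores measures plus one key per dimension.
    CREATE TABLE FactSales (
        ProductKey INTEGER REFERENCES DimProduct(ProductKey),
        DateKey INTEGER REFERENCES DimDate(DateKey),
        Amount REAL
    );

    INSERT INTO DimProduct VALUES (1, 'Bike', 'Vehicles'), (2, 'Helmet', 'Accessories');
    INSERT INTO DimDate VALUES (20170101, 2017, 1), (20170201, 2017, 2);
    INSERT INTO FactSales VALUES (1, 20170101, 500.0), (2, 20170101, 40.0), (1, 20170201, 450.0);
""")

# A single join from the fact table to a dimension answers the question.
sales_by_category = conn.execute("""
    SELECT p.Category, SUM(f.Amount)
    FROM FactSales AS f
    JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
    GROUP BY p.Category
    ORDER BY p.Category
""").fetchall()

print(sales_by_category)
```

In a snowflake schema, `Category` would be moved out of `DimProduct` into its own table, at the cost of one more join per query.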

ETL

ETL (Extract, Transform and Load) is the process of gathering all the data from the multiple origins, transforming it and loading it into the end databases.

  • Extract: The data is extracted from the origins.
  • Transform: Some operations are applied to make the data trustworthy and suitable for the target.
  • Load: Finally, the data is loaded into the final target database.
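
The three steps can be sketched as three small functions. This is a minimal illustration that uses in-memory SQLite databases. The `orders` and `orders_clean` tables, their columns and the cleaning rules are assumptions made for the example and do not describe the SSIS packages generated by ABA Framework.

```python
import sqlite3


def extract(source):
    """Extract: read the raw rows from the origin."""
    return source.execute("SELECT id, name, amount FROM orders").fetchall()


def transform(rows):
    """Transform: drop rows without an amount and normalize the rest."""
    cleaned = []
    for order_id, name, amount in rows:
        if amount is None:
            continue  # unusable row, skipped in this sketch
        cleaned.append((order_id, name.strip().title(), round(float(amount), 2)))
    return cleaned


def load(target, rows):
    """Load: write the cleaned rows into the target database."""
    target.executemany("INSERT INTO orders_clean VALUES (?, ?, ?)", rows)
    target.commit()


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, name TEXT, amount TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "  alice ", "10.5"), (2, "BOB", None), (3, "carol", "7")],
)

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE orders_clean (id INTEGER, name TEXT, amount REAL)")

load(target, transform(extract(source)))
loaded = target.execute("SELECT * FROM orders_clean ORDER BY id").fetchall()
print(loaded)
```

Note that the origin is only read, never modified, which matches the requirement that the source data stays intact.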

Cubes

A cube is a technology in which the data is stored in a specific way. It contains both a series of measures (facts) and dimensions, each of which is a group of attributes that represents an area of interest with respect to the facts.

The aim of the cube is to speed up queries run over vast amounts of data.
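
One way to see why a cube speeds up queries is that aggregates can be computed once, ahead of time, for every combination of dimensions. The sketch below shows that idea on a tiny data set. The `build_cube` function, the sample facts and the dimension names are assumptions for illustration, and real OLAP engines use far more elaborate storage and aggregation strategies.

```python
from collections import defaultdict
from itertools import combinations

# Each fact carries one measure (amount) and two dimension attributes.
facts = [
    {"year": 2017, "country": "ES", "amount": 100},
    {"year": 2017, "country": "UK", "amount": 50},
    {"year": 2016, "country": "ES", "amount": 70},
]
dimensions = ("year", "country")


def build_cube(rows, dims, measure):
    """Pre-aggregate the measure for every subset of the dimensions."""
    cube = defaultdict(int)
    for size in range(len(dims) + 1):
        for group in combinations(dims, size):
            for row in rows:
                # The key records which attribute values were grouped on.
                key = tuple((d, row[d]) for d in group)
                cube[key] += row[measure]
    return dict(cube)


cube = build_cube(facts, dimensions, "amount")

# Queries become dictionary lookups instead of scans over the facts.
print(cube[()])                                    # grand total
print(cube[(("year", 2017),)])                     # total for 2017
print(cube[(("year", 2017), ("country", "ES"))])   # 2017 in ES
```

The trade-off is extra storage and build time in exchange for answers that no longer require scanning the fact rows.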

Testing

ABA Framework performs a suite of tests automatically so as to make sure the data has been synchronized properly throughout all the processes that have been carried out.
There are three types of tests:

  • Primary keys
  • ERR tables
  • Group by and Aggregate tests

A series of tests is generated and executed automatically, and the results are loaded. Additionally, you can create your own tests.
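
Two of these test types can be sketched in a few lines: a primary key test that looks for duplicated keys, and a group by and aggregate test that compares totals between two areas. The functions `duplicated_keys` and `aggregates_match` and the tables `stg_sales` and `dw_sales` are hypothetical names for this example. The sketch does not show how ABA Framework generates its own tests, and ERR tables are left out because their structure is not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (id INTEGER, region TEXT, amount REAL);
    CREATE TABLE dw_sales (id INTEGER, region TEXT, amount REAL);
    INSERT INTO stg_sales VALUES (1, 'North', 10.0), (2, 'North', 5.0), (3, 'South', 8.0);
    INSERT INTO dw_sales VALUES (1, 'North', 10.0), (2, 'North', 5.0), (3, 'South', 8.0);
""")


def duplicated_keys(conn, table, key):
    """Primary key test: return key values that appear more than once.

    Table and column names are interpolated into the query, so this
    sketch must only be used with trusted, hard-coded names.
    """
    query = f"SELECT {key} FROM {table} GROUP BY {key} HAVING COUNT(*) > 1"
    return [row[0] for row in conn.execute(query)]


def aggregates_match(conn, source, target, group_col, measure):
    """Group by test: compare row counts and sums per group in both tables."""
    def totals(table):
        query = (
            f"SELECT {group_col}, COUNT(*), SUM({measure}) "
            f"FROM {table} GROUP BY {group_col} ORDER BY {group_col}"
        )
        return conn.execute(query).fetchall()

    return totals(source) == totals(target)


print(duplicated_keys(conn, "dw_sales", "id"))
print(aggregates_match(conn, "stg_sales", "dw_sales", "region", "amount"))
```

A custom test would follow the same shape: a query that returns no rows, or equal results on both sides, when the synchronization is correct.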

Documentation

ABA Framework also automates documentation generation, producing documentation for every process we have carried out. Documentation is generated for:

  • Framework
  • Databases
  • Cubes

The documentation can be generated in three widespread standard formats:

  • HTML
  • Markdown (for the 2017 version)
  • PDF (it’s generated automatically along with Markdown)
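
Generating the same documentation in several formats usually means rendering one set of metadata through different templates. The sketch below shows that idea for Markdown and HTML. The `tables` dictionary and the `to_markdown` and `to_html` functions are assumptions for illustration and do not reflect the layout of the documentation that ABA Framework produces.

```python
from html import escape

# Hypothetical metadata: table names mapped to their columns.
tables = {
    "DimProduct": ["ProductKey", "Name"],
    "FactSales": ["ProductKey", "Amount"],
}


def to_markdown(tables):
    """Render one heading per table followed by a bullet per column."""
    lines = []
    for table, columns in tables.items():
        lines.append(f"## {table}")
        lines.extend(f"- {column}" for column in columns)
    return "\n".join(lines)


def to_html(tables):
    """Render the same metadata as HTML, escaping every value."""
    parts = []
    for table, columns in tables.items():
        items = "".join(f"<li>{escape(column)}</li>" for column in columns)
        parts.append(f"<h2>{escape(table)}</h2><ul>{items}</ul>")
    return "\n".join(parts)


print(to_markdown(tables))
print(to_html(tables))
```

Because both renderers read the same metadata, the formats cannot drift apart from each other.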