Datasets & Reports
Data Mart models are based on OmniFi reports, which are used both to generate the appropriate table layout as well as by the ETL pipeline to extract and transform data.
Report considerations
OmniFi Reporting supports grouped data sets, mixed data sets (i.e. different data points in a column can have different data types), sets etc., and it is possible to create reports which data structure cannot be represented in a relational database. For use with OmniFi Data Mart, the report must comply with a database table layout, with a fixed set of columns of specific data types.
-
Using e.g. Report Generator as a data source, it is recommended to base the query on a report type or a report layout without grouping. If the layout is grouped or produces mixed column data, the data must be transformed into a table in the report layer.
-
Horizontal grouping and transpose operations cause the column structure to be created from input rows. This type of structure cannot be used at all with the data mart functionality.
Parameterization
Properly parameterizing the report makes it more flexible and re-usable. Aside from necessary input to source APIs, parameters are often used to limit the data volume. If the expected data size for each execution of the ETL pipeline exceeds what is practically manageable to retrieve from the data source or, process through the transformation layer, parameters can be used to retrieve, transform and load smaller chunks of data.
Runtime parameters are ultimately included in the primary key of the mart data table. Since the primary key is not easily changed, the number of slices the data is loaded in needs to be carefully considered to meet the volume requirements also in the foreseeable future.
Naming convention
The mart data table will use the column names from your report unmodified, the naming convention is set already in the report. A naming convention should be defined for the mart and implemented in the report.
Mixed data and null handling
Unlike OmniFi reports, a database column can only contain a single, pre-defined type, optionally nullable. Whereas OmniFi Reporting allows a DateTime column to contain the occasional string value, the database does not. Some data sources supported by OmniFi Reporting module can produce mixed and/or grouped data under certain circumstances.
To handle this, the report must be configured with the appropriate setting for Mixed Data Handling.
- None is the default and is used with Excel-based reports.
- Null replaces any data point that deviates from the column data type with a null value.
- Error stops the report from running when detecting mixed data. When running as an Excel report, the error is instead output in place of the deviating value.
Mixed data in reports will cause the ETL pipeline to fail, and reports with the setting None cannot be used.
Assuming mixed data will not appear in the report, the Error mode is preferred. If the assumption is correct, the ETL pipeline runs through and loads the data. If in some edge case there can still be mixed data, the ETL pipeline will halt, and the administrator will be notified to handle the problem.
Default values can be provided either in the report using the Column modifier, or by configuring default values on the model.
Updated over 1 year ago