Project Structure

Data models in bippLang are organized as projects, which can be accessed through the user interface. A project can connect to several data sources and reference tables from them. The project file defines the data model in bippLang. A project may consist of multiple datasets, and each dataset may contain multiple tables with their columns. Tables and columns may be included directly from the data source or defined by the analyst as operations on the original table or column. This structure is illustrated in the following diagram.

Figure: bippLang Project Structure


The bippLang project consists of the following components.

  1. Data Model: The data model is the definition of the project in bippLang. A project may consist of several files and artefacts, which are compiled together to create a data model with datasets. A project also serves as a unit for source control. The model is compiled according to the rules of the language. Operations defined in the model are executed when users run reports based on the model.

  2. Data Source: This is the original database from which the model gets its data. One project/model can connect to multiple data sources. The model can then define views on the tables available in these data sources using joins and SQL operations.

  3. Dataset: A model may have multiple dataset definitions. A dataset is a logical group of tables, columns, relationships and operations that are likely to be accessed together. For example, consider the data for a food delivery service: separate datasets may be defined in the data model to serve end-of-day reports and monthly or yearly reports.
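The relationship between these components can be sketched as plain Python data classes. This is only an illustration of the hierarchy described above, not part of bippLang itself: the class names, field names, and the food-delivery values (`orders_db`, `catalog_db`, `end_of_day`, `monthly`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    name: str
    sql: Optional[str] = None   # None: taken directly from the source column
    type: Optional[str] = None  # column data type, if declared


@dataclass
class Table:
    name: str
    data_source: str            # the data source the table reads from
    sql: Optional[str] = None   # None: included directly from the data source
    columns: List[Column] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    tables: List[str] = field(default_factory=list)  # names of tables grouped together
    joins: List[str] = field(default_factory=list)   # join definitions between them


@dataclass
class Project:
    name: str
    datasets: List[Dataset] = field(default_factory=list)
    tables: List[Table] = field(default_factory=list)

    def data_sources(self) -> List[str]:
        """Distinct data sources referenced by the project's tables."""
        return sorted({table.data_source for table in self.tables})


# Hypothetical food-delivery project with two datasets over two data sources.
project = Project(
    name="food_delivery",
    tables=[
        Table("orders", data_source="orders_db"),
        Table("restaurants", data_source="catalog_db", columns=[Column("city")]),
    ],
    datasets=[
        Dataset("end_of_day", tables=["orders", "restaurants"]),
        Dataset("monthly", tables=["orders"]),
    ],
)
```

Here datasets refer to tables by name while the detailed table definitions live at project level, which mirrors how one project can serve several reporting needs from the same tables.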

Syntax and Semantics

Like any other language, bippLang has syntax rules that must be followed. The basic skeleton of a model file is as follows.

project <project name>      //No indentation
    dataset <dataset name>  //indentation = 1
        table <table_name1> <table_name2> //indentation = 2, list of tables separated by space
        table <table_name3> <table_name4> //split into multiple lines 
        join <table1> left outer join <table2> //indentation = 2, join definitions for selected tables.
            on <table1.x> = <table2.y>    //indentation = 3, join conditions
        join <table2> inner join <table3>  
            on <table2.y> = <table3.z>     
            then right outer join <table4>
                on <table2.x> = <table4.w> AND <table3.h> = <table4.i>
    table <table name> //indentation = 1, Detailed table definition
        data_source <data_source> //indentation = 2, data source for table
        sql <sql for table>  //indentation = 2
        column <column name>  //indentation = 2, column definition
            sql <sql for column> //indentation = 3, column SQL
            type <type>  //indentation = 3, column data type

The bippLang model file must follow the strict indentation rules shown in the structure above. Indentation reflects the hierarchical structure of the model: if you think of the hierarchy as a tree, all children of a node must have exactly the same indentation. (The indentation rules are mostly identical to those of Python, except that bippLang uses 4 spaces in place of a tab. Mixing spaces and tabs is unwise.) Note that any indentation error causes errors when the file is committed or processed.
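The indentation rules can be checked mechanically before a file is committed. The following is a minimal sketch in Python, not the actual bippLang compiler: the function name `check_indentation` is hypothetical, it assumes every level is exactly 4 spaces, it treats any tab in the indentation as an error, and it does not handle multi-line triple-quoted values.

```python
from typing import List


def check_indentation(text: str, indent: int = 4) -> List[str]:
    """Return a list of problems found; an empty list means the layout looks fine."""
    problems = []
    prev_level = -1  # so that the first non-blank line must have no indentation
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no structure
        leading = line[: len(line) - len(line.lstrip())]
        if "\t" in leading:
            problems.append(f"line {number}: tab used in indentation")
            continue
        if len(leading) % indent:
            problems.append(
                f"line {number}: {len(leading)} spaces is not a multiple of {indent}"
            )
            continue
        level = len(leading) // indent
        if level > prev_level + 1:
            # A child may be at most one level deeper than the line above it.
            problems.append(f"line {number}: indented more than one level deeper")
        prev_level = level
    return problems


good_model = (
    "project sales\n"
    "    dataset daily\n"
    "        table orders customers\n"
    "    table orders\n"
    "        data_source warehouse\n"
)
bad_model = (
    "project sales\n"
    "  dataset daily\n"            # 2 spaces: not a multiple of 4
    "\tdataset weekly\n"           # tab instead of spaces
    "            table orders\n"   # jumps from level 0 to level 3
)
```

Calling `check_indentation(good_model)` returns an empty list, while `check_indentation(bad_model)` reports one problem for each of the three faulty lines.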

In general, each line in the file can be broken into 4 pieces, as follows:

  1. Indentation

  2. Keyword: One of the keywords from the list of allowed keywords.

  3. Value (Optional): Values are strings that may be specified in one of the following formats

    a. "some string" (Use \ to escape ")

    b. 'some string' (Use \ to escape ')

    c. """some string""" (useful for multi-line values)

    d. some string (end of line or trailing comment signifies the end of the string. Leading and trailing whitespace is stripped from the value when evaluating the expression)

  4. Comment starting with # or // (Optional)
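The four pieces above can be illustrated with a small line splitter. This is a hedged sketch in Python, not the real bippLang parser: `split_line` and `ParsedLine` are hypothetical names, the code handles only values that fit on a single line, assumes every line carries a keyword, and assumes an unquoted value ends at the first `#` or `//`.

```python
from typing import NamedTuple, Optional, Tuple

INDENT = 4  # spaces per indentation level, per the text above


class ParsedLine(NamedTuple):
    level: int              # indentation depth (0 = no indentation)
    keyword: str
    value: Optional[str]
    comment: Optional[str]


def _read_quoted(text: str) -> Tuple[str, str]:
    """Split text that starts with a quote into (value, remainder)."""
    quote = '"""' if text.startswith('"""') else text[0]
    out, i = [], len(quote)
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i + 1])  # backslash escapes the next character
            i += 2
        elif text.startswith(quote, i):
            return "".join(out), text[i + len(quote):]
        else:
            out.append(text[i])
            i += 1
    raise ValueError("unterminated quoted value")


def split_line(line: str) -> ParsedLine:
    body = line.lstrip(" ")
    spaces = len(line) - len(body)
    if body.startswith("\t") or spaces % INDENT:
        raise ValueError(f"indentation must be a multiple of {INDENT} spaces")
    parts = body.rstrip().split(None, 1)
    if not parts:
        raise ValueError("line has no keyword")
    keyword = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    if rest[:1] in ('"', "'"):
        value, rest = _read_quoted(rest)
    else:
        # Unquoted value: it ends at the first comment marker or at end of line.
        hits = [i for i in (rest.find("#"), rest.find("//")) if i != -1]
        cut = min(hits) if hits else len(rest)
        value, rest = rest[:cut].strip() or None, rest[cut:]
    rest = rest.strip()
    if rest and not rest.startswith(("#", "//")):
        raise ValueError("unexpected text after quoted value")
    comment = rest.lstrip("#/").strip() if rest else None
    return ParsedLine(spaces // INDENT, keyword, value, comment)
```

For instance, `split_line("    dataset orders  // indentation = 1")` yields level 1, keyword `dataset`, value `orders` and comment `indentation = 1`. Quoting a value is what lets it contain `#` or `//` without being cut short.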

With this background on the structure of a data model, let us dive in and see what other features the bippLang data model supports.