bippLang Reference

CONTENTS

Home

Here is a reference of the common keywords used in bippLang. They can be categorized as one of these types:

  • Attribute keywords used in the bippLang data model.

  • bipp specific functions to extend standard SQL.

  • meta functions for working with derived tables. These are similar to macros, where they are based on a parameter and evaluated at run-time before the SQL statement is auto-generated.

The Category is included in each entry.

Verifying Queries

The SQL queries and expressions used in the bippLang model are evaluated only when visualizing data in Sheets. To ensure the queries are correct before during development, use the SQL Runner tool. You can also use the Run a Query option:

  1. Open Datasources Datasources.

  2. Click Run Query to the right of the Datasource and select Run Query. The Query window opens using the selected Datasource. You can change the Datasource with the dropdown.

  3. Write your query.

  4. Click Run query. You can click Save in a project if you anticipate re-using the query.

  5. When you are finished, click Close query to close the Query window.

Keyword Reference

allowed_value

Category: Attribute for parameter
Where used: table
Required? Optional
Sub-Options: optional attribute label

Syntax

allowed_value value

Allowed values specify the values a user can choose. For example, if you specify value of True and False, a user can only pick one of those options.

At run-time, a parameter can only hold a single allowed_value.

Example

table t  
    parameter p
    allowed_value True 
    allowed_value False
    label Enter a Value

See Also

default_value, label, parameter

apply_filter

Category: attribute for drilldown
Where used: table
Required? Optional

Syntax

apply_filter value

Enables the filter option for visualizations using drilldown option. If value is true then filter is available with drilldown.

Example

    column region_countrycode  
        sql region  
        type string  
        label Region_  
        drilldown  
        table hrc_fa_forecast_accuracy
    column country_code
    apply_filter false

See Also

filter, drilldown

bipp_cumulative_percent

Category: bipp specific SQL window function
Where used: column
Required? Optional

Syntax

    bipp_cumulative_percent(numericExpression)

For every row of the returned result, this is the cumulative sum of the numericExpression from the first row through this row. It is expressed as a percentage of the total sum of the numericExpression over all rows in the result.

Example

    column VCCumulativePercent
        sql """ bipp_cumulative_percent(VictimCount) """
        type float
        format auto%
        label CumulativeVictimCountPercentage

See Also

bipp_percent, bipp_total, bipp_cumulative_total

bipp_cumulative_total

Category: bipp specific SQL window function
Where used: column
Required? Optional

Syntax

    bipp_cumulative_total(numericExpression)

For every row of the returned result, this is the cumulative sum of the numericExpression from the first row through this row.

Example

    column VCCTotal
        sql """ bipp_cumulative_total(VictimCount) """
        type int
        label VictimCountTillNow

See Also

bipp_percent, bipp_total, bipp_cumulative_percent

$bipp_equals

Category: bipp-specific meta function SQL
Where used: derived table
Required? Optional

Syntax

$bipp_equals(string1, string2)

Creates a Boolean condition in the SQL for a derived table. Typically used in conjunction with $bipp_if.

Evaluates to True if the case-sensitive string1 and string2 are equal.

Example

    sql """$bipp_if( $bipp_equal("$kind", "TAPE),
        "{tape_transactions}", "{scissor_transactions}" )"""

See Also
$bipp_if

$bipp_if

Category: bipp-specific meta function SQL
Where used: Table SQL
Required? Optional

Creates an if-then-else construct in the SQL for a table. An example is to make SQL conditional on parameter values.

Syntax

$bipp_if(Condition, Then, Else)

Condition: defines the condition to be met.
Then: value to use if Condition is True. Can be a nested $bipp_if condition.
Else: value to use if Condition is False. Can be a nested $bipp_if condition.

Example

sql """$bipp_if( $bipp_equal("$kind", "TAPE"),
    "tape_transactions",
    "scissor_transactions" )"""

Example using nested bipp_if

sql """$bipp_if( $bipp_equal("$kind", "TAPE"),
    "tape_transactions",
    $bipp_if ($bipp_equal("$kind", "SCISSOR"),
    "scissor_transactions",
    "other_table") ))"""

See Also

$bipp_equals

$bipp_join

Category: bipp-specific meta function SQL
Where used: Table SQL
Required? Optional

Creates a new string by concatenating the list of strings enclosed in brackets [ ] separated by the specified delimiterString.

Syntax

    $bipp_join([string1, string2,...], delimiterString)

Example

    sql """$bipp_join( ["a", "b"], "x") )

Evaluates to: "axb"

Example with $bipp_if, $bipp_equal and $bipp_join

In the Learning bipp tutorials, you created the Tennis Stats dataset and visualizations. This example demonstrates using the $bipp functions to determine wins and losses for any player.

The ResultType parameter is added to the dataset to enable users to specify wins or losses when visualizing the results. The MatchesWon and MatchesLost views get the respective data from the Matches table in the database.

table MatchesLost
    data_source TennisData
    sql """ SELECT m.tournament, m.Surface, p.fname, p.lname, 'Lost' as Result
    FROM players.atp_matches m join players.atp_players p on m.LoserId = p.Id """
    //column definitions
table MatchesWon
    data_source TennisData
    sql """ SELECT m.tournament, m.Surface, p.fname, p.lname, 'Won' as Result
    FROM players.atp_matches m join players.atp_players p on m.winnerId = p.Id """
    //column definitions
table PlayerMatchResult
    data_source TennisData
    sql """ $bipp_if(
            $bipp_equal(
            $bipp_join(
            ['atp_players.$ResultType', ''],
            ''
            ),
            'Won'
            ),
            '{MatchesWon -}',
            '{MatchesLost -}'
    ) """

See Also

$bipp_equals, $bipp_if

bipp_percent

Category: bipp-specific meta function SQL
Where used: Table SQL
Required? Optional

Syntax

    bipp_percent(numericExpression)

For every row of the returned result, this is the value of the numeric expression for the row. It is expressed as a percentage of the sum of the numeric expression over all rows in the returned result.

Example

See Also

bipp_cumulative_percent, bipp_cumulative_total, bipp_total

bipp_total

Category: bipp-specific meta function SQL
Where used: Table SQL
Required? Optional

Syntax

    bipp_total(numericExpression)

For every row of returned result, this will be the sum of the numeric expression over all rows in the returned result. Note that by definition, the value of this will be the same in each row of the returned result.

Example

    column VCPercent
        sql """ bipp_percent(VictimCount) """
        type float
        format auto%
        label VictimCountPercentage

See Also

bipp_cumulative_percent, bipp_cumulative_total, bipp_percent

color_format

Category: Attribute
Where used: Column Property
Required? Optional

color_format changes the cell color when data is presented in the the sheet. In its most basic form, the property applies color based on the value of data in the cell. There are advanced options you can use to define the format:

  • Format conditions based on a fixed value or range of possible values of data.

  • Color can be specified as a single color or a range of hex codes for different colors.

  • Color can be applied to the text or the background.

  • Can apply color based on the value of a different column.

  • Value of the data may be treated as a color to be applied.

  • Can pass a fixed text value to be used to fill the cell.

Additional functions and attributes used to define color_format:

  • val: Defines the data values to be compared. Function argument may be a fixed value or range of values. The - sign may be used at either end if the range does not have an upper or lower limit. For example:

    • val(1000): Condition satisfied if data cell value = 1000

    • val(1001 to 5000): Condition satisfied if data cell value is between 1001 and 5000

    • val(100 to -): Condition satisfied if data cell value is greater than or equal to 100

    • val(- to 600): Condition satisfied if data cell value is less than or equal to 600

  • color: Defines the color or range of colors to use. When specified as a range, the color to be applied is calculated based on how far the actual value of data is from the minimum value specified in the val function. For example:

          color_format """val(100 to 500) color(#00FF00 to #0000FF)""" 
    

    If the data value is 450, the color used is closer to #0000FF than it is to #00FF00

  • background: Specifies the color to be applied to the cell background. By default, the color identified by color is applied to the cell text.

  • source: If the column specified as the value of this property is present in the data, the value that gets compared against the val() function will be taken from the column.

  • datacolor: Instructs bippLang to ignore the val and color functions and use the value in the data cell itself as the color value to apply.

  • transpose: Defines a text value to write to the cell if the criteria specified by the val function is satisfied. For example, you can specify certain symbols or constant display in the cell.

Example 1

column SEIndicator
    sql """ SUM(TimeDiff)/COUNT(DISTINCT PL) """
    type float
    color_format """ val(0.0 to 34.0) color(red) """
    color_format """ val(35.0 to 44.0) color(greenyellow) """
    color_format """ val(45.0 to -) color(green) """

These examples combine the functions and attributes for advanced results:

""" val(100) color(green) background """

If the value is 100, the background of the containing data cell is green.

""" val(10001 to 20000) color(#FF0000) transpose(⬤) """

If the value is between 10001 to 20000, change the text in the cell to ⬤.

""" datacolor """

Read the data cell value and apply the corresponding text color.

""" transpose(▇) """

Change all data cell values to ▇.

Example 2

""" val(745) color(green) background source(sum_fare_amount) transpose(⬤) """

Show ⬤ in the cell. If the value of sum_fare_amount is 745 for the row, apply green background to the cell.

column

Category: Attribute
Where used: Table
Required? Optional

Each instance of column corresponds to a data column, either native to the base table or derived from a SQL expression composed of other columns, constants and SQL operators.

The base table itself may correspond to a table in the backend database or a select statement.

These are the additional column attributes:

color_format
default_aggregate
drilldown
format
hidden
join_as
label
order_by
partition_by, partition_by_excluding, partition_by_including
sql
type

Example

    column amount
        sql sum(invoice_qnty*unit_price)
        type float
        format $0.00
        label ItemTotal

count_distinct

Category: bipp-specific SQL aggregation function
Where used: Table SQL
Required? Optional

Syntax

    count_distinct(expression)

Counts the first instance of expression, eliminating the subsequent appearance of the same data. This function is the same as standard SQL: count(distinct expression)

Example

This example counts the distinct number of countries from the retail transactions table:

column number_countries
    sql count_distinct(country)

See Also

count_if

count_if

Category: bipp-specific SQL aggregation function
Where used: Table SQL
Required? Optional

Syntax

    count_if(condition)

Counts the item if the condition is met. ThIs function is the same as standard SQL:

sum(case when *condition* then 1 else 0 end)

Example

    column c1
        sql count_if(c2 = "something")

See Also

count_distinct

data_source

Category: attribute
Where used: Table
Required? Optional

Syntax

data_source identifier

Specifies the data source for the table. The value is the identifier for a defined data source.

Example

table stocks
    data_source retail_ds

dataset

Category: attribute
Where used: Table
Required? Required

Syntax

dataset name

Defines a dataset called name to use in your bippLang data model. A dataset is a collection of tables and optional join rules.

Example

This example creates a dataset named retailDataset, containing the tables stocks and transactions:

dataset retailDataset
    table stocks transactions

default_aggregate

Category: attribute
Where used: Column
Required? Optional

Syntax

default_aggregate type

Automatically applies the specified aggregate type on a numeric column when the column is selected in the Data Viewer. These are standard SQL aggregate functions. If no aggregate is needed, specify none.

Here are the values for type:

sum: total of the values. This is the default value if type is not specified.
count: number of items found.
max: maximum value found.
min: minimum value found.
avg: average of the values.
none: no aggregate applied.

Example

column customer_id
    sql customer_id
    type int
    default_aggregate none

default_value

Category: attribute
Where used: Parameter
Required? Required

Syntax

default_value value

Sets the default value of a parameter to value. The value must be consistent with the specified type (int, string, float).

Example

    table transactions
        parameter item
        type string
        default_value "none selected"

See Also

allowed_value, parameter

drilldown

Category: attribute
Where used: Column
Required? Optional

Syntax

    drilldown column_name

Datasets often contain hierarchical relationships between data elements. For example, they can represent a geographical hierarchy or an organizational hierarchy.

The drilldown attribute is used to indicate this relationship between two columns. Once the relationship is defined, users can drilldown the data for these columns when visualizing the dataset.

Example

column Region
    sql Region
    type string
    drilldown dl_region
        table casedata
        column Country
column Country
    sql Country
    type string
    drilldown dl_region
        table casedata
        column State
column State
    sql State
    type string

filter

Category: attribute
Where used: Table
Required? Optional

Syntax

filter string

If string is specified, any use of the table always includes the filter in its where and/or having clauses.

format

Category: attribute
Where used: Column
Required? Optional

Syntax

format specification

Indicates how the numeric or date column data should be formatted. bippLang number formatting is based on Microsoft Excel number formats.

Custom number formatting can be achieved using the format specification:

prefix###000.000###,suffix

where:

prefix: static information to appear at the beginning of the result, for example: $
# indicates optional digits
0 indicates required digits
. indicates the position of the decimal point
, adds a comma when specified in the ### sequence (#,##0) or divides by 1000 if added at the end of the number string prior to a suffix (0.00,, M)
suffix: static information to appear at the end of the result, for example: M

For example, to specify decimal formatting for currency:

column SEIndicator
    sql SEIndicator
    type float
    format $0.00

Using this expression, a column value of 1.122e2 is displayed as $112.20. Here are some additional examples:

Value Format Result
123456789 0.00,, M 123.45 M
1.122e12 $#,##0.00,, M $1,122,000.00 M
123456789 0.00##,, M 123.4567 M
1234.56 $0.00, K $1.23 K

Custom date format is specified using a combination of letters to indicate days, month, year and time. Here are the options:

d: Day of the month as digits; no leading zero for single-digit days.
dd: Day of the month as digits; leading zero for single-digit days.
ddd: Day of the week as a three-letter abbreviation.
dddd: Day of the week as its full name.
m: Month as digits; no leading zero for single-digit months.
mm: Month as digits; leading zero for single-digit months.
mmm: Month as a three-letter abbreviation.
mmmm: Month as its full name.
yy: Year as last two digits; leading zero for years less than 10.
yyyy: Year represented by four digits.
h: Hours; no leading zero for single-digit hours (12-hour clock).
hh: Hours; leading zero for single-digit hours (12-hour clock).
H: Hours; no leading zero for single-digit hours (24-hour clock).
HH: Hours; leading zero for single-digit hours (24-hour clock).
M: Minutes; no leading zero for single-digit minutes.
MM: Minutes; leading zero for single-digit minutes.
s: Seconds; no leading zero for single-digit seconds.
ss: Seconds; leading zero for single-digit seconds.
l: Milliseconds; gives 3 digits.
L: Milliseconds; gives 2 digits.
t: Lowercase, single-character time marker string: a or p.
tt: Lowercase, two-character time marker string: am or pm.
T: Uppercase, single-character time marker string: A or P.
TT: Uppercase, two-character time marker string: AM or PM.

Here are some examples:

Value Format Result
2020-04-06T19:05:02Z ddd mmm dd yyyy HH:MM:ssM Mon Apr 06 2020 19:05:02
2020-04-06T19:05:02Z mmmm d, yyyy April 6, 2020
2020-04-06T19:05:02Z HH:MM:ss 19:05:02
2020-04-06T19:05:02.456Z HH:MM:ss L 19:05:02 46
See Also

type

hidden

Category: attribute (boolean)
Where used: Column
Required? Optional

Syntax

hidden value

Hides the column in the Data Viewer if value is true. If value is not specified, it defaults to false.

Example

column _invoice_date
    sql invoice_date
    type string
    hidden true

inherits

Category: attribute
Where used: Table
Required? Optional

Syntax

inherits table_name

Applies all of the attributes from the specified table_name to the table you are defining. Any further specifications within your table definition overwrite the specific inherited attribute.

Example

In this example, table t2 inherits all attributes of table t1.

table t2
    inherits t1

join

Category: attribute
Where used: Dataset
Required? Optional

Syntax

join table1 joinType1 table2
  on sqlExpression1
  [then joinType2 table3
    on sqlExpression2 ...]

where joinType can be any of the following:

  • inner, inner join, join OR <->

  • left, left join, left outer join OR -->

  • right, right join, right outer join OR <--

  • full, full join, full outer join OR >-<

Defines a SQL JOIN relationship between two or more tables. The auto-generated SQL automatically uses any appropriate JOIN relationships, if defined.

Example 1

project Operations
    dataset OpsDS
        table efficiency_data shipping_statement_view
        join shipping_statement_view join efficiency_data
          on shipping_statement_view.WeekYear = efficiency_data.WeekYear

Example 2

project p
    dataset ds
        table t1 t2 t3
        join t1 inner t2
          on t1.c1 <= t2.c2
          then outer t3
            on t1.c1=t3.c3 AND t2.c2=t3.c3

join_as

Category: attribute
Where used: Column
Required? Optional

Syntax

join_as name

This is a shortcut for a simple inner join using the equality (=) operator. If specified, the join_as name for columns implicitly induce a join relationship if, and only if, no explicit join relationships already exist in the corresponding dataset.

Typically, you specify explicit join relationships in a dataset and join_as is not used.

Example 1

The data model contains:

table t1
    column c1
        join_as x
table t2
    column c2
        join_as x

The dataset is defined as:

dataset ds
    table t1 t2

The two instances join_as (containing the same name x) to product the join relationship:

join t1 inner t2
    on t1.c1 = t2.c2

Example 2

In this example, the dataset has an explicit join, so the join_as instances instances are ignored. No new join relationship with t1.c1 or t2.c2 is created.

The dataset is defined as:

dataset ds
    table t1 t2 t3
    join t1 outer t3
        on someSQLexpression

See Also

join

label

Category: attribute
Where used: Dataset, Table, Column, Parameter, allowed_value
Required? Optional

Syntax

label name

Applies the specified name to the item where it is defined. The name is used in the UI when the item is used. It can be nested table, dataset, parameter, column, or allowed_value. If label is not specified, a label is automatically created from the name or value of the item.

Example

column SLIndicator
    label SLIndicator in %

on

Category: attribute
Where used: join, or then nested under a join
Required? Optional

Syntax

on expression

Specifies columns and condition to define the join.

Example

table tape_transactions
    sql """SELECT t.invoice_num AS invoice_num,
            t.stock_code AS stock_code,
            t.unit_price AS unit_price, t.country AS country
        FROM `online_retail`.`stocks` AS s
        JOIN `online_retail`.`transactions` AS t
        ON s.code = t.stock_code
            WHERE s.description like '%TAPE%'"""

order_by

Category: attribute
Where used: column
Required? Optional

This is used for sorting a column in data model based on another column.

Example

column account
    label Account
    sql account
    type string
    order_by _
        table hrc_gp_cnis
        column seq

parameter

Category: attribute
Where used: table
Required? Optional

Syntax

parameter Name
    type type
    default_value DefaultValue
    label Label
    allowed_value Allowed_Value_1
    string_format Format

where

type (required): can be int, float, datetime, string or bool.
default_value (required): is the value if no value is specified. label (optional): specifies a description label to use in the Data Viewer.
allowed_value (optional): one or more allowed values for user selection.
string_format (optional, used with type string): specifies the format to use for the string.

A parameter behaves like a required filter. A parameter’s value is passed on to the auto-SQL generator engine as a filter like tableName.parameterName = value.

Parameters can be shared across tables. If parameter p is defined in table t, it can be referred to in the SQL for a column in another table as t.p.

Example

table tapes_or_scissors
    parameter kind
        type string
        default_value TAPE
        allowed_value TAPE
        allowed_value SCISSOR

partition_by, partition_by_excluding, partition_by_including

Category: attribute
Where used: column
Required? Optional

Syntax

column columnName
    sql AggregateSQLexpression
    partition_by* ListOfColumns

where

partition_by* indicates one of the three options: partition_by, partition_by_excluding, partition_by_including.
ListOfColumns: column list separated by white space. The list of columns can be empty only for partition_by. The list cannot be empty for partition_by_excluding or partition_by_including.

The partition constructs enable bipp to support window functions, also called Analytic Functions. These functions are dynamic and agnostic of the SQL dialect. There are more details and examples in the Dynamic Analytic Functions section.

Example 1

column baf1
    sql sum(c1)
    partition_by c2, c3

Computes baf1 partitioned by (c2, c3)

Example 2

column baf2
    sql sum(c1)
    partition_by_excluding c4

Computes baf2 by partitioning on user-selected columns after excluding c4 if c4 was selected by the user.

Example 3

column baf3
    sql sum(c1)
    partition_by_including c2, c4

Computes baf3 by partitioning on user-selected columns after including c2 and c4 if they are not already selected by the user.

reftable

Category: attribute
Where used: dataset
Required? Optional

Syntax

reftable table_name

Specifies a table to be used for internal reference in the dataset. The table does not appear in the Data Viewer.

Example

dataset DataAnalysis
    table PlayerMatchResult
    reftable MatchesWon MatchesLost

sql

Category: attribute
Where used: table or column
Required? Optional

Syntax

sql AggregateSQLexpression

The sql statement can be used in both the table and column definitions:

Table: the sql statement should be either the name of a backend database in a format required by the backend, or a valid SELECT statement.

Column: the sql statement should be a valid SQL-fragment that is either the (unqualified) name of a backend column, or a valid SQL expression in the SQL-dialect of the backend. How the statement is handled for a column depends on the specification:

  • If the sql specification is atomic (consists of a single SQL token such as the name of a backend column, string, or numeric/date constant), it is interpreted in the backend namespace.

  • If the sql specification is non-atomic (consists of more than one SQL token), any column names are interpreted in the bippLang data model namespace.

Example

1 column c1
2 sql c1
3 column c2
4 sql c1 + 42

where

c1 in line 2 refers to a column in the backend namespace
c1 in line 4 refers to a column in the bippLang data model

The backend name for the bippLang column c1 also happens to be c1. If the column definition for c1 (starting in line 1) was to be removed from the data model, line 4 would produce an error although there is a backend column called c1.

type

Category: attribute
Where used: column, parameter
Required? Optional

Syntax

type data_type

where

data_type can be: int, float, datetime, string or bool.

Specifies the alphanumeric data type the column contains, or the type of value for a parameter.

Example

column _invoice_date
    sql invoice_date
    type string
    hidden true
column invoice_date
    sql cast(_invoice_date as DATE)
    type datetime
column unit_price
    sql unit_price
    type float
    format $0.00
column customer_id
    sql customer_id
    type int
    default_aggregate none

See Also

parameter, format

variables, var, value

Category: attribute
Where used: any bippLang property
Required? Optional

Syntax

variables
var name
value value

This is a mechanism for macro-substitution in bippLang.

Variables are defined with var-value pairs under a variables section.

Variables are referred to by their names enclosed in {{ and }}.

Example

table t
    variables
        var x
            value 42
        var y
            value "fortytwo"
    column c1
        sql {{x}}
    column c2
        sql concat({{y}}, "foo")

expands to:

table t
    column c1
        sql 42
    column c2
        sql concat("fortytwo", "foo")

Advanced Features

Dynamic Analytic Functions

Analytic Functions (also known as Window Functions or Partitioning Functions) are part of ANSI SQL and widely supported by many dialects, including BigQuery, Microsoft, Oracle, and Snowflake.

An analytic function operates on a group of rows, returning a single result for each row. An aggregate function also operates on a group of rows, returning a single result for the group.

Basic Analytic Function

Here is an example of an analytic function, using BigQuery syntax.

Data Table Internally grouped by ID.

ID X Region Country M
1 10 NA USA 1
1 20 NA USA 10
2 30 NA USA 100
2 40 NA USA 1000
2 50 NA USA 10000
3 60 NA USA 100000
3 70 NA USA 1000000

Query

select Region, sum(M) over (partition by ID) as S

Result

The sum aggregation is executed on the rows by ID. The number of rows in the result is the same as the data table.

Region S
NA 11
NA 11
NA 11100
NA 11100
NA 11100
NA 1100000
NA 1100000

Dynamic Column Specification

Using special constructs in bippLang enables you to dynamically specify the columns in the internal group by.

bippLang Table Definition

table T
    column c1
        sql ...
        partition_by_including c2
    column c2
    column c3
        sql ...
        partition_by c4 c5
    column c4
    column c5
    column c6
        sql ...
        partition_by_excluding c4

Column c1 and c6 have a dynamic list of columns for the partitioning:

  • c1: partition by c2 PLUS all other columns selected from table T

  • c6: partition by all columns selected from table T MINUS c4

Column c3 is the expected partitioning as illustrated in the Basic Analytic Function example, using an aggregate function in the sql expression:

... over (partition_by c4, c5) ...

Constraints

  1. If a column has a partition specification, the sql should be an aggregate (measure), though the column is considered a non-aggregate (analytic) column.

  2. Columns listed in a partition specification should be non-aggregates (dimensions or analytics).

  3. The keyword partition_by can be followed by an empty list -- in which case all rows of the table form a single group in the internal GROUP BY. But dynamic partitions cannot have empty lists.

Dynamic User Selections

bippLang Table Definition

table t1
    column c1
    column c2
    column c3
    column m1
        sql min(c3)
        partition_by_including c2

Here is the implementation when a user selects c1 and sum(m1)
or a visualization:

select t1.c1 as c1, sum(t11.m1) as sum_m1
    from t1
    inner join (
        select c1, c2, min(c3) as m1 from t1 group by c1, c2
    ) as t11
    on t1.c1 = t11.c1 AND t1.c2 = t11.c2
    group by c1

Using the inner join construct is advantageous over avoiding the join and using an outer select around the inner select:

  • at least as fast

  • works for partition_by_excluding and partition_by

  • generalizes naturally to simultaneous selection of multiple analytic functions

  • handles cases where multiple tables are joined in the original query

Recursive Analytic Functions

Analytic functions can be recursive, in other words, they can refer to other analytic functions.

bippLang Table Definition

table t1
    column c1
    column c2
    column c3
    column m1
        sql min(c3)
        partition_by_including c2
    column m2
        sql avg(m1)
        partition_by_excluding c4
    column c4

In this case, the user selects c1, c4 and sum(m2). The solution is to define a table corresponding to the selection of c1, c4 and m1. bippLang automatically creates the t1_c1_c4_m1 table for you when sum(m2) is selected.

Here is the pseudo-code for created table t1_c1_c4_m1 so you can see the underlying implementation logic:

table t1_c1_c4_m1
    sql """select t1.c1 as c1, t1.c4 as c4, sum(t11.m1) as m1
        from t1
        inner join (
            select c1, c2, c4, min(c3) as m1 from t1
            group by c1, c2, c4
        ) as t11
        on t1.c1 = t11.c1 AND t1.c2 = t11.c2 AND t1.c4=t11.c4
        group by c1"""
    column c1
    column c4
    column m1
    column m2
        sql avg(m1)
        partition_by_excluding c4

Now select c1, c4 and sum(m2) from the new table using the same algorithm as the bippLang Dynamic User Selections example.

Level of Detail (LOD) Expressions

If you use Tableu’s LODs, you will find the bippLang semantics are different. LODs are non-SQL constructs that can behave differently from a mathematical/SQL perspective. bippLang handles LOD expressions completely in SQL. Tableau takes a meta-SQL approach that is not quite a SQL equivalent.

Here is an example using bippLang to perform the same tasks as an LOD in Tableau:

LOD_MIN_M ← { FIXED [ID] : MIN([M]) }

If you are using Tableau, you cannot select Region and LOD_MIN_M, as LODs must be aggregated. For this example, we are assuming that expression would work and return this result:

Region LOD_MIN_M
NA 1
NA 1
NA 100
NA 100
NA 100
NA 100000
NA 100000

If you select Region and SUM(LOD_MIN_M) in Tableau, the results might not match your expectations:

Region SUM(LOD_MIN_M) mathematical SUM(LOD_MIN_M) Tableau
NA 200302 100101

To simulate this function in bippLang, define a count(*) column, with an identical partition type and partitioning columns:

bippLang Table Definition

table T
    column ID
    column X
    column Region
    column Country
    column M
    column A1
        sql MIN(M)
        partition_by T
    column A2
        sql COUNT(*)
        partition_by T

Tableau’s behavior for selecting Region and sum(MIN(M over ID)) is simulated by:

select Region, sum(A1/A2) from T. You would truncate or round the resulting floating-point numbers as needed.