A bipp Quickstart
In this quickstart we will start by providing a step-by-step walkthrough from datasource creation to dashboard generation using bipp. As an example we will be using information related to ATP tennis players as our datasource. In this datasource we have information about ATP tournament players like their name, height, birthdate, and country. We also have current rankings for players and data from ATP tournaments held in 2019. We will use this data to visualize the following
- Country wise distribution of top rated players.
- Plot for relation of height, age to player rank.
- Wins by type of surface.
- Distribution of left Vs right handed players.
We then follow it up by creating and including additional insights to our dashboard.
Login to bipp
To start using bipp you need to create a bipp account or login using your Google account. Register here to create a new bipp account. Click here, to login using your bipp account or your Google account.
To ensure that you have different types of visualizations enabled for your tenant profile, click on the Settings main menu option followed by the Customize Tenant tab. Ensure that the Google Map visualization is enabled as shown since we are going to need it when creating sheets.
With those basics covered, let’s get started.
1. Create a new datasource
A datasource in bipp is essentially a connection to an external database. These connections can be managed on the Datasources screen that can be accessed from the main menu. To create a new datasource click on the New Datasource button on the Datasources screen. Provide the database connection details on this screen. For our example we will call our datasource
TennisData and enter the connection details to our
Players database. This is a MySQL database, hence the dialect chosen is MySQL.
After entering the required credentials, we click on the Create button to create the datasource. It will now be listed on the Datasources screen.
To ensure that bipp can connect successfully to the new datasource, use the Ping option available in the datasource menu as shown.
A connection successful message should be shown. If bipp is not able to connect to the database, an error message will be shown. Click on the Edit option to recheck the credentials entered in this case.
2. Create Project
The next step is to create a datamodel using bippLang. This can be done by creating a bipp project. You can find detailed information about the structure of a bipp project here. A project may contain multiple datasets. A dataset in a bipp project allows you to redefine the required tables and fields from the source database in bippLang. You can also define custom or derived tables and fields. In order to create a project, go to the Projects screen from the main menu. Click the New Project button on this screen. Enter the name of the project on the pop-up screen as shown. You will be asked to enter the Git repository details for the project.
You can either provide details of your own repository or select the I don’t have a repository option. A new blank project will be created on click of the Create button as shown.
The project consists of a README.md file and a bling.sample.txt file which provides a sample structure of a datamodel.
3. Create Dataset
Let us now define our dataset based on the tables in the TennisData datasource.
To start defining the dataset, click on the Files - Add - Table from Datasource option on the screen. The Add Table from Datasource screen is shown. Select the
TennisData datasource from the Datasource dropdown at the top and select the Players schema. Select tables from this schema as shown below.
Click on the Create File(s) button to return to the main screen. The bippLang definition for the included tables will automatically be generated and included in the project in seperate files. A dataset file will also be added with its name defaulted to the project name as shown.
We edit the dataset so as to transform the raw data that is coming in from our datasource and present it in our visualizations. We will now look at a few ways in which we can tailor our tennis players data. Following images show the changes made to each of the files added and to the dataset file.
Note that we have used MySQL syntax to define these transformations in SQL because we are using a MySQL database.
atp_players: Following are the changes made to this table
A PlayerName column has been created by concatenating the FirstName and Last Name of the players available in the database.
The database had the data ‘L’ for left-handed players and ‘R’ for right-handed players. A descriptive column has been introduced to help users interpret this data.
The database contains the birthdate as an
yyyymmddformat. This has been transformed to
datetimedatatype and the age of the player calculated as difference in years between birthdate and current date.
atp_rankings: A custom column RankRange has been included in this table to define a description for the rank as shown.
player_stats: Change the column name Ht to “Height” .
Dataset ATPTennisData: The table names are already included in the dataset file when we add tables. We will now define the joins between the tables so that information may be retrieved from multiple tables when viewing data.
The required changes can be made incrementally as and when you find that you need to format some data or introduce a new column. Once the changes are saved and committed, the datamodel can be deployed using the Deploy Datamodel button as shown in the following image. You can then visualize the model using sheets by clicking on the Visualize button.
If you have already created sheets corresponding to a project, few of them might be invalidated when you deploy the latest version of the datamodel. To ensure that existing sheets refer to the latest version of the deployed datamodel you will be prompted to upgrade these sheets when deploying the datamodel. Select the affected sheets or all sheets and click on Upgrade.
You can also sync your changes with the master repo from time to time using the Push to master function available at the top-left of the screen as shown below.
Similarly when multiple team members are working on the same project, changes made by others can be merged to your branch by using the Pull from master feature
4. Create Sheets
Sheets allow you to view the output of a datamodel in different ways so that you can tweak it as necessary. You can also incorporate data visualizations created in sheets on dashboards.
To access/view data for a particular dataset for the first time, you can click on the Visualize button at the top-right of the project screen and select the dataset you wish to visualize. In this case it will be the ATPTennisData dataset. Alternately datasets that have been pushed to the master and deployed can also be visualized using the Datasets main menu option as shown.
Clicking on the ATPTennisData dataset will take you to the sheet editor for the dataset. Let us now use this dataset to create the 4 visualizations that we mentioned in the introduction.
Once you are on the sheets screen, you will notice the tables and columns in the dataset listed at the left of the screen as shown below.
The Dimensions columns are displayed first followed by the measures columns. Non-numeric columns are categorized under dimension columns by default as these will be usually displayed directly or will form the X-axis of a graph or a chart. Numeric columns will be available as Measures by default as these can be aggregated using functions like Sum, Count, Average etc. However you can change the dimension column to a measure and vice versa.
Country wise distribution of top rated players
For this visualization we need the country names and number of players from each country. We add the CountryName and Id columns from the navigation panel on the left.
The aggregation Sum is applied on measure Id by default. Since we need a count of players from each country, we will change this to Count. This can be configured in the panel at the right of the screen under the corresponding Measure.
Since we want to see the distribution of only top-rated players, we include a custom filter for
Atp_Rankings_Current.Ranking <= 10in the Filter panel on the right.
Click on Fetch to see the data retrieved from the model. The table view helps us to verify that we are getting the required data to generate the visualization.
We can now visualize this data using a
gMapvisualization. For this click on the gMap tab. To generate the map, we need to specify the field to be used for mapping to countries on the map and the field to be used as a measure. In our case, the field
Countryand the field
Count Idindicates the
Measure. On applying these settings, the distribution map will be generated as shown.
We can now save our first sheet by clicking on the Save As option from the Sheets menu.
Additional sheets may be added for the same dataset by clicking on the + button on the bar showing existing sheet names.
Plot for relation of height, age to player rank
In this visualization we will use the available data on players height, age and rank to check if there is any relation between players rank and the other two parameters. Add a new blank sheet by clicking on the + (plus) button in the tab panel at the bottom of the sheets screen. We add the respective columns for Ranking, Height and Age as shown below. Change Ranking to a Dimension, and use the
Averageaggregation for Height and Age just in case the same rank is shared between multiple players. Enable Auto Format under settings to remove the decimal places for Height and Age after calculating the average.
To visualize this data, we will use a line chart that can be generated on the Bar tab. We adjust the settings as follows to get the output as a line chart.
- Ranking is mapped to the left axis by default while the two measures, Height and Age are mapped on the Y-axis.
- Line chart type is selected for both measures on the Y-axis by using the dropdown under the measure name.
- The range for height is mapped on the left side of the Y-axis while that for age is mapped on the right side.
- Different colors are used for each measure on the Y-axis.
- We limit the Y-axis range to 160 cm to 210 cm for height and 20 years to 40 years for age. This allows us to zoom into the data.
Wins by type of surface
We will now see how many matches were won on different types of surfaces (clay, grass, hard) in 2019 by top ranked players. For this we use the
RankRangecolumn that we defined in our datamodel to group different ranks together. We also
Sumthe wins by different players in the same
We can now visualize this data as a stacked bar chart as shown. The Stack All in the settings panel may be used to stack the bars one on top of the other.
Distribution of left Vs right handed players.
Finally we want to see the number of right-handed versus left-handed players. For this we just fetch the count of players for each type as shown.
We then visualize it on a pie-chart as shown.
5. Generate Dashboard
We will now see how these different visualizations could be clubbed on a single dashboard. Dashboards can be managed from the Dashboards screen that can be accessed from the main menu.
Click on the New Dashboard button to create a new dashboard. Enter the name for the new dashboard. We will call our dashboard
Click on the Create button to create the dashboard. You will be taken to the dashboard editor, where you can see the sheets created previously in the left panel. If you have sheets corresponding to multiple datasets, you can use the project or dataset dropdowns to filter them as shown
Select the sheets to add them to the dashboard. You can see them in the editor as shown.
You can preview the dashboard or save the dashboard and view it. You will be required key-in a commit message when saving the dashboard.
You can edit the dashboard using the dashboard UI editor or by editing the bippDash model corresponding to the dashboard. To edit a dashboard using the editor, click on the Edit option from the Dashboard menu on the dashboard listing screen or from the dashboard menu options when viewing the dashboard. To view/edit the bippDash model, append a “/model” in the URL for the dashboard.
Inorder to demonstrate both methods, we will use the bippDash editor to rearrange tiles in the dashboard and the Dashboard editor to add a filter to the dashboard.
We will first rearrange the dashboard tiles in the editor for a better view. For this we adjust the heights and widths (
basis) of certain tiles and change their positions in the container tile.
We increase the basis of the map and line-chart visualization types to 100 so that they can occupy the full-width available on the screen. We also set the panel visibility to hide so that the sheet names are not displayed on the dashboard.
We commit the dashboard model and view the generated dashboard after it is rearranged.
We will now add a dashboard level filter which can be applied on the visualizations. Let us say we want to filter the statistics displayed on the dashboard with respect to specific players. For this we need to add a
filter widget for
PlayerName on the dashboard. We can do this both in the bippDash editor or the UI Editor. The following image shows the use of the UI editor for adding a PlayerName dropdown. Click on the Filters tab at the left and select the table/column name for which you wish the filter to be added.
Adjust the width of the filter to occupy the full width of the screen and save the dashboard. The filter will now be displayed on the dashboard. We can select specific player names in this and view the data. Click on the Apply button to apply the selected filters as shown here.
Try it yourself
Here is a small exercise that you can try yourself with respect to the ATP Tennis data.
- Edit your project to add a custom column
Player_Statstable, that calculates the total wins as sum of clay, grass and hard court wins. Commit your changes and deploy the datamodel.
- Create a new sheet that gets data for player Ranking and TotalWins Visualize it as a bar chart and save.
- Add this sheet to the Tennis Stats dashboard and position it to occupy full width just under the gmap visualization.
Now that you are acquainted with basic bipp functions, let us learn some additional bipp features while enhancing our dashboard at the same time. For this we will be using a fourth table from our database which contains information about individual ATP matches. We follow the same path that we took the first time i.e., change the datamodel, create a sheet and then add it to a dashboard. Lets begin.
Working with parameters
Parameters are used when data output will be generated based on the values input by the user. We can then define the list of allowed values available to users through by creating a parameter in our datamodel. More information on parameters is available here
Add Table and Parameter:
We will open the project ATPTennisData. Click on the Table from Datasource option under Files/Add and select the
ATP_Matchestable from the players database.
A file corresponding to the table will be added to the project. Let us now edit our datamodel as follows
Dataset: Link the newly added table to the dataset in the dataset file as shown.
atp_players table: Our matches table has the information about tournaments and rounds when the match was played. We will include RoundType as a parameter in the atp_players table. Following is the code we add for defining a
RoundTypeparameter in atp_players file.
atp_matches table: The data in the database contains abbreviations for round names. We expand these to supply the proper names in our atp_matches table by defining the
RoundNamecolumn. We will also include a column
RoundWinthat will consider the match only if the round when it is played is the same as the value of the input parameter RoundType.
After making these changes, save and commit the project, deploy the data model and upgrade the existing sheets.
Add new sheet
Visualize the ATPTennisData Dataset next. You will see the existing sheets created in the sheets editor. You will notice that these still work as they are in no way dependent on the table or parameter we added. Add a new sheet now. To this we will add the columns
ATP_Matches->WinsByRound. Fetch the data with different values for the
RoundTypeparameter which is available in the Filters section as shown below.
Visualize this sheet as a bar chart and include a row for grand total of all wins as shown.
Save this sheet and proceed to the next step.
We will edit the TennisStats dashboard we created earlier to add this new sheet. You can add the new sheet using the UI editor. To provide the input parameter value to the sheet, you also need to add the corresponding filter widget to the dashboard. For this include the filter widget code as shown. Note the is_param variable is set to true.
Adjust the widths of the different tiles to generate the final dashboard as shown.
Working with derived tables and aliases
Derived tables allow us to add a table to the datamodel by defining a custom SQL to fetch data for the table. Aliases are used when we are not able to define a join between two tables but data from both the tables is presented on the same dashboard with the same filters applied. More information on aliases is available here.
Create a derived table
To add a derived table to your bippLang datamodel, click on the New Table option in the Files/Add menu. Name the table
MatchesAndPlayers. A blank file will be added to the editor. The
atp_matchestable in our database contains Id’s of both winners and losers. Hence we wish to join it to two instances of the
atp_playerstable, for both the winner name and the loser name. The SQL shown in the following image helps us to get both. We also define the columns that we want to expose through the data model.
We will also include this table in the dataset by changing the dataset definition file as shown. Note that we are not going to define any joins corresponding to this table.
Remember to save and commit the project, deploy the data model and upgrade the existing sheets.
Add new sheet
We will create a simple sheet with tabular data visualization and most of the key columns from the new
MatchesAndPlayerstable as shown.
Modify the dashboard
First we will just add the new sheet to the dashboard under the filters panel and save the dashboard. All looks good on the rendered dashboard as shown here.
Now select something in the Player Name filter dropdown and apply these filters. Oops! As you can see, all is not well here.
Let us now edit the dashboard to create an alias so that the filter can be easily applied. Click on the settings for the Matches sheet and click on the Aliases tab. Enter the table and column names as shown below. This will match the filter values to the WinnerName column in the sheet.
Save the dashboard and apply the filter. You can see that it now works.
You would notice that while all the other sheets on the dashboard are related to the Player Name filter, the sheet for RankRange Vs Wins on different types of Surfaces (RankRangeVsWins) should not be affected by what player names we chose in the filter. To fix this, we can use Alias such that the player name filter will be ignored for this sheet.
Edit the dashboard and go to the settings for the RankRangeVsWins sheet. Add an alias from the
PlayerNamecolumn to itself. This will ensure that any filter on this column will be ignored for this sheet.
Save the dashboard. When viewing the dashboard, apply some filters on Player Name and verify the result. You will notice that this time the RankRangeVsWins sheet is not affected by the filter.
IN THIS PAGE