
Install DBT on Windows to target Azure SQL and Synapse - Part 1

By Valdas Maksimavicius
Published in Data Engineering
January 25, 2021
3 min read

Inspired by the growing popularity of the Data Build Tool (DBT), I decided to set it up on my PC and explore the tool against Azure data stores. You can read more about DBT here.

The series of articles about DBT is a part of my Azure Data Platform Landscape overview project.

DBT in Azure Data Platform

  • Part 1 - Get familiar with DBT and get started with Azure SQL - You are here
  • Part 2 - Use DBT with Azure Synapse

dbt does the T in ELT (Extract, Load, Transform) processes – it doesn’t extract or load data, but it’s extremely good at transforming data that’s already loaded into your warehouse.

DBT also comes in a DBT Cloud version with a nice UI. However, why not try the VS Code and DBT CLI command-line experience? :)

There are a few options to get DBT installed on Windows:

  • Local installation
  • Docker
  • Windows Subsystem for Linux

The installation steps are based on the official DBT installation instructions with pip, using a local installation of DBT version 0.18.1.

This blog post documents the steps taken, with all the errors, to help you solve issues if you face similar problems.

Local installation

1. Install Python (I am using Python 3.7.9)

Be aware that DBT version 0.18 has issues with Python 3.9.

2. Set up a Python virtual environment to isolate the installation

 C:\dev\dbt>python -m venv dbt-env

 C:\dev\dbt>.\dbt-env\Scripts\activate.bat

 (dbt-env) C:\dev\dbt>

3. Install DBT

 (dbt-env) C:\dev\dbt>pip install dbt

After a while I receive the error: "pyopenssl 20.0.1 has requirement cryptography>=3.2, but you’ll have cryptography 2.9.2 which is incompatible."


The installation seems to complete successfully regardless. After running pip install cryptography, all packages seem to be fine.

Let’s keep this in mind in case SSL connections start failing at later stages.


4. Initialize demo project
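The init command itself can be sketched as follows (the project name my_dbt_demo is just an example, any name works):

```shell
# Run from inside the activated virtual environment.
# dbt init scaffolds a new project with example models and configuration.
dbt init my_dbt_demo
cd my_dbt_demo
```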


And here is my example project ready in VSCode


Connect to Azure SQL

5. Install dbt-sqlserver adapter

In my demo, I want to connect to Azure SQL. The SQL Server adapter is not an official one, so I need to install it manually. You can also use the officially supported targets, e.g. Snowflake, Redshift, or BigQuery.

 pip install dbt-sqlserver

6. Create Azure SQL instance


7. Configure profile to include Azure SQL connectors

 start C:\Users\<<your directory>>\.dbt

The default profiles.yml file contains only generic properties for Redshift. The configuration file contains placeholders for development and production environments. You can have more outputs if needed.
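As a sketch, here is roughly what my profiles.yml ends up looking like for the dbt-sqlserver adapter. The profile name, server, database, and credentials below are placeholders, not the real values from my demo:

```yaml
# profiles.yml
dbt_demo:
  target: dev
  outputs:
    dev:
      type: sqlserver
      driver: 'SQL Server'   # must match an entry in the ODBC Data Source Administrator view
      server: myserver.database.windows.net
      port: 1433
      database: my_database
      schema: dbo
      user: my_user
      password: my_password
```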


For my demo, I want to add a connection specific to our target database and modify the dbt_project.yml file.
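The relevant change in dbt_project.yml is the profile reference, which must match the profile name in profiles.yml. A sketch, with assumed names:

```yaml
# dbt_project.yml (excerpt)
name: 'my_dbt_demo'
version: '1.0.0'
profile: 'dbt_demo'   # must match the top-level key in profiles.yml
```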


A tiny thing to remember about the driver setting in profiles.yml: make sure it has a corresponding entry in the ODBC Data Source Administrator view. If you are missing the driver, install the latest version from the Microsoft repository. I use SQL Server, but there might be other names, depending on the package version.


8. Check connectivity with your target database

 dbt debug


9. Modify example DBT models

DBT has two example models located in the models/example directory.

My_first_dbt_model will be materialized as a table, and my_second_dbt_model should be deployed as a view.
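The materialization is controlled by a config block at the top of each model file. From memory, the first starter model looks roughly like this, so treat it as a sketch rather than an exact copy:

```sql
-- models/example/my_first_dbt_model.sql
{{ config(materialized='table') }}

with source_data as (
    select 1 as id
    union all
    select null as id
)

select * from source_data
```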


As I am using the basic Azure SQL tier, which does not support columnstore indexes, I need to add the “as_columnstore: false” setting.
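In dbt_project.yml terms, the setting looks something like this (project and folder names are the assumed ones from my setup):

```yaml
# dbt_project.yml (excerpt)
models:
  my_dbt_demo:
    example:
      materialized: view
      as_columnstore: false   # basic Azure SQL tiers don't support columnstore indexes
```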

10. Deploy models to Azure SQL

Before any deployment is done, here’s a view of my Azure SQL objects (the table is empty):


To deploy our models and populate them with data, execute the command:

 dbt run

It should create schema change scripts for all models and also move the data.


Our Azure SQL database finally contains the objects with some dummy data.


11. Generate and serve docs

At the end, I would like to verify whether document generation and serving work well. First, execute:

 dbt docs generate

The command collects all metadata about models and dependencies and saves it in a JSON document called catalog.json.


What I would like to do then is to explore my documentation in a browser window:

 dbt docs serve

It spins up a local server and opens a browser.


And here is all our documentation, schemas and lineage (running on localhost).



In part 1 we kicked things off with DBT on Windows. You now have instructions on how to install DBT and connect to Azure SQL. Read part 2, where I explore how DBT works with Azure Synapse.


Tags

#dataops #engineering #ELT
