Featured resources

Check out my collection of Azure Data Platform best practices! A simple blog post evolved into a 25+ page guide with 75+ different recommendations.

The document aims to speed up the implementation of the Modern Data Platform in Azure. Whether you are a beginner or an advanced user, you will find valuable input.

Ensure you get the most out of Azure Cloud, Azure Data Lake Gen2, Azure Databricks, Delta, Azure Data Factory and other services.

Cloud computing is the future. Cloud services slash development time and enable novel possibilities. At the same time, they expose you to new risks.

I have created a security checklist for all data professionals working in Azure. Whether you are a data engineer, a data scientist or an architect, you will find useful tips.

Download my security checklist and don't risk your reputation with a crappy implementation.

Featured articles

Making Data Scientists Productive in Azure

In this post, you will learn about Azure Machine Learning Studio, Azure Machine Learning, Azure Databricks, Data Science Virtual Machine, and Cognitive Services. 

What tools and services can we choose based on a problem definition, skillset or infrastructure requirements?

Read more

Azure Data Analytics Privacy Implementation Examples

This article covers a few privacy pattern implementation examples that you might use in your solutions.

From the engineer’s perspective, there are 3 privacy categories that we need to be aware of: security, design and process automation.

Read more

A glimpse into DataOps

A rising number of companies undertaking data analytics projects eventually face the complexity of growing data pipelines.

Data projects have always been tricky: data pipelines becoming data silos, no collaboration or reuse, constant data quality issues, and slow delivery times.

My overview of DataOps processes and technologies.

Read more

Avoid these two Azure pricing mistakes!

Here is a case study about two pricing mistakes I made in Q4 2019. I am sharing the details in the hope that you can avoid them.

By the way, even though my examples are about specific Azure services, you can make similar mistakes with other offerings or other cloud providers.

Read more

Here Is a Way to Prevent a Data Breach

Cloud computing is the future. Cloud services slash development time and enable novel possibilities. At the same time, they expose you to new risks.

Do not rely blindly on a cloud provider or the security department. Understand security threats and be able to mitigate them. 

My tips on how all team members should treat security.

Read more

Introduction to Data Governance in Databricks

Setting up Data Governance in Databricks is not straightforward. There are many moving parts that require custom implementation.

Here is my approach to security and user access, lineage, quality and the data life cycle.

Read more

Read more articles about Azure, data and software engineering on my blog

Featured videos

Pondering Distributed Data Lakes Idea

My thoughts about distributed data lakes, or in other words, data mesh architecture.

I analyze data mesh, data-driven design, and product and platform thinking.


Data Engineering Patterns and Principles

There are patterns for things such as domain-driven design, enterprise architectures, continuous delivery, microservices, and many others.

But where are the data science and data engineering patterns?

Sometimes, data engineering reminds me of cowboy coding - many workarounds, immature technologies and lack of market best practices.

Let's investigate whether we can standardize some decisions by applying patterns.


Data Governance From an Engineering Perspective

For a long time, data governance seemed to me like an empty phrase meaning everything and nothing at the same time. I was on the fence, not sure how to approach it from an engineering perspective.

In this talk, I will split data governance into “bite-size” pieces. What are the tools, processes, and skills needed to build a compliant data platform?