Although Mapping Data Flows in Azure Data Factory is still in its infancy, there are already a lot of good resources out there for it. I will try to keep this post updated as more blogs appear.
Mark Kromer: https://kromerbigdata.com/tag/mapping-data-flows/
Cathrine Wilhelmsen: https://www.cathrinewilhelmsen.net/tag/data-flows/
Azure Documentation: https://docs.microsoft.com/en-us/azure/data-factory/concepts-data-flow-overview
SQL Player: https://sqlplayer.net/tag/adfdf/
List of videos: https://github.com/kromerm/adfdataflowdocs/tree/master/videos
Hopefully this is a useful reference. Let me know of any more in the comments.
As you may remember from previous posts, I was well into SQL DW architecture and query optimisation, which was a fundamental part of my job bringing analytics into Snow Software as a company.
After this inaugural year at Snow, things have changed. Microsoft have moved so quickly that it is hard to keep up with everything. But the one game changer is how Mark Kromer and his Azure Data Factory team are transforming ETL in the cloud.
They have a new paradigm in play for ETL to replace on-prem SSIS, and that is ADF data flows. This is a drag-and-drop interface that hides the complexity of one of the biggest names in the industry, Databricks.
Using data flows to do the MPP transform, rather than SQL DW, means that loading an Azure SQL DB (especially Hyperscale) opens up options for massive data and real-time reporting that SQL DW cannot handle with its query concurrency limits.
The ADF team have been absolutely amazing in supporting requests for new features, and I look forward to them implementing some of the awesome stuff they have in their pipeline (to which I may have contributed some ideas…).