Going back to 2014, we would have seen highly skilled software engineers delivering data pipelines with a variety of low-level tools.
As these projects have matured, a number of other open source tools (I use the term West Coast for these) have appeared on the market to bring some enterprise stability and management to data pipelines. They cater for relatively mature enterprise capabilities such as data pipeline execution, monitoring, and reruns.
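To make "execution, monitoring, and reruns" concrete, here is a minimal Python sketch of the three capabilities. It is not modelled on any particular product, open source or commercial. The names `Pipeline`, `StepRun`, and `max_attempts` are hypothetical and chosen only for illustration, and a real scheduler would add dependencies, calendars, alerting, and persistent state on top of this.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StepRun:
    """One recorded attempt of a pipeline step (the monitoring record)."""
    step: str
    attempt: int
    status: str  # "succeeded" or "failed"
    detail: str = ""


@dataclass
class Pipeline:
    """Hypothetical in-memory runner: ordered steps, a run log, and reruns."""
    steps: Dict[str, Callable[[], None]]
    max_attempts: int = 3
    log: List[StepRun] = field(default_factory=list)

    def run(self) -> bool:
        """Execute steps in insertion order; stop at the first step that never succeeds."""
        for name, task in self.steps.items():
            if not self._run_step(name, task):
                return False
        return True

    def _run_step(self, name: str, task: Callable[[], None]) -> bool:
        """Run one step, rerunning it on failure up to max_attempts times."""
        for attempt in range(1, self.max_attempts + 1):
            try:
                task()
            except Exception as exc:
                self.log.append(StepRun(name, attempt, "failed", str(exc)))
            else:
                self.log.append(StepRun(name, attempt, "succeeded"))
                return True
        return False


# Illustrative steps: "load" fails on its first attempt, then succeeds.
calls = {"load": 0}


def extract() -> None:
    pass


def load() -> None:
    calls["load"] += 1
    if calls["load"] == 1:
        raise RuntimeError("target unavailable")


pipeline = Pipeline(steps={"extract": extract, "load": load})
ok = pipeline.run()
history = [(r.step, r.attempt, r.status) for r in pipeline.log]
for entry in history:
    print(*entry)
```

The run log is the important part: it records every attempt, so an operator can see that `load` failed once and was rerun, rather than only seeing the final outcome.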
Over the past 30 years, there has been a stable enterprise capability in this area (I like the term East Coast for this), with products such as BMC Control-M and IBM Tivoli. What is less well known is that these enterprise products also provide Big Data pipeline execution, monitoring, and rerun capabilities.
Using enterprise capabilities that companies already have in their application stack reduces reliance on new open source products or home-grown capabilities, increases reuse of existing license agreements, and de-risks Big Data projects.