We are upgrading our data infrastructure and bringing on a data lake in the second half of the year. I’m looking into different middleware pipelines for syncing data into the lake, and I’m curious whether anyone is using Make to load data into a data lake as part of a large data infrastructure.
This is not the direction I’m leaning, but I wanted to see if anyone is doing so. I’m hesitant for a few reasons:
- The Make platform has been less reliable over the last 6 months, which makes me nervous about having more core infrastructure rely on it.
- There’s a lot of opportunity for operation creep if the systems we’re syncing don’t support bulk upload, since each record would then have to be pushed through the scenario individually.
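
To put rough numbers on the operation creep concern, here is a back-of-the-envelope sketch. It assumes each module run counts as one operation, which is my understanding of how usage is counted but worth checking against your plan. The function name, record count, module count, and batch size are all hypothetical placeholders, not figures from our setup.

```python
import math


def estimate_operations(records, modules_per_record, batch_size=1):
    """Rough count of module runs needed to sync `records` rows.

    Assumption: one operation per module run per batch. This is an
    illustrative model, not a confirmed billing rule.
    batch_size=1 models a target system with no bulk upload.
    """
    if records < 0 or modules_per_record < 1 or batch_size < 1:
        raise ValueError("records must be >= 0; other arguments must be >= 1")
    batches = math.ceil(records / batch_size)
    return batches * modules_per_record


# Hypothetical daily sync: 100,000 records through a 3-module scenario.
per_record = estimate_operations(100_000, modules_per_record=3)
bulk = estimate_operations(100_000, modules_per_record=3, batch_size=500)

print(f"no bulk upload: {per_record:,} operations")
print(f"bulk of 500:    {bulk:,} operations")
```

Under those assumptions the same sync costs 300,000 operations without bulk upload versus 600 with batches of 500, which is the gap I’m worried about.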