My team and I ran into this recently: data inconsistency became a real problem. As our platform grew, different systems started sending us data through all sorts of channels, like REST APIs or SFTP drops, pretty much the works! Every time a new system had to be integrated for some urgent business requirement, we'd whip up something custom and move on. But those one-off integrations add up, and over the months they pile up like snowflakes.
Seriously though, how do y'all handle this? I've been thinking about building a reusable framework to tackle this head-on instead of starting from scratch every time something new comes up. Has anyone tried something similar, or have any tips on how you're managing data consistency across your lakehouse environments without going crazy with custom solutions every single round?
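To make the idea a bit more concrete, here's a rough sketch of the shape I'm picturing. Everything in it is hypothetical: the names (`SourceConfig`, `register`, `ingest`), the stubbed fetchers, and the "required fields" check are just placeholders for illustration, not a real library or a finished design. The point is that each source type plugs into one registry and every source goes through the same validation step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Fetcher = Callable[[dict], Iterable[Record]]

# Hypothetical registry: maps a source kind ("rest", "sftp", ...) to a fetcher.
FETCHERS: Dict[str, Fetcher] = {}


def register(kind: str) -> Callable[[Fetcher], Fetcher]:
    """Decorator that adds a fetcher function to the registry under `kind`."""
    def wrap(fn: Fetcher) -> Fetcher:
        FETCHERS[kind] = fn
        return fn
    return wrap


@dataclass
class SourceConfig:
    """Per-source settings; adding a new source should only mean adding one of these."""
    name: str
    kind: str
    options: dict = field(default_factory=dict)
    required_fields: List[str] = field(default_factory=list)


@register("rest")
def fetch_rest(options: dict) -> Iterable[Record]:
    # Stub: a real version would call the HTTP API described in `options`.
    return options.get("sample", [])


@register("sftp")
def fetch_sftp(options: dict) -> Iterable[Record]:
    # Stub: a real version would download and parse files from the SFTP drop.
    return options.get("sample", [])


def ingest(cfg: SourceConfig) -> Tuple[List[Record], List[Record]]:
    """Fetch records for one source and split them into accepted and rejected."""
    fetch = FETCHERS[cfg.kind]
    accepted: List[Record] = []
    rejected: List[Record] = []
    for rec in fetch(cfg.options):
        missing = [f for f in cfg.required_fields if rec.get(f) is None]
        if missing:
            rejected.append(rec)
        else:
            # Tag each accepted record with where it came from.
            accepted.append({**rec, "_source": cfg.name})
    return accepted, rejected


if __name__ == "__main__":
    crm = SourceConfig(
        name="crm",
        kind="rest",
        options={"sample": [{"id": 1, "email": "a@example.com"}, {"id": 2}]},
        required_fields=["id", "email"],
    )
    good, bad = ingest(crm)
    print(len(good), "accepted,", len(bad), "rejected")
```

With something like this, onboarding a new system would mostly be a new config entry (plus a new fetcher only if it's a transport we haven't seen before), and the consistency rules live in one place instead of being reinvented per integration.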
found this here:
https://dzone.com/articles/reusable-api-ingestion-framework-lakehouse