My ex-company had kept all the data customers shared with us for more than 10 years. Structured and standardized, it should have been easy peasy.
Somehow they were “appending wrong” in some way, and the data was useless. I think they were trying to reduce the size by aggregating a bit, but they did it in a way that rendered the data useless.
Of course the CEO wanted to train models with it anyway…
10 years and no one bothered to pull some information at random? I mean, companies generally have a schedule of assessments to ensure record integrity, even if it’s as simple as a checksum.
The thing is, they had data that was expected to be slightly aggregated, so not a 1:1 copy. The problem came when you tried to use the data for analysis and realized it didn’t make any sense.
I like train models