Tuesday, September 13, 2022

Don't Assume the Data Type

 This happened to me this week while working on a data migration. 

The source system stored contacts in a table with an integer id column. Pretty standard stuff. We also had another bunch of related people that were stored in a different table, so the approach was to load the contacts table into the destination migration table, get the max id and add 1000, then insert the people from the other tables, creating new ids and doing the associated relationships.

It didn't work as expected. It seemed to work, but when we loaded the migration data into the test system there was all kinds of weird groupings.

The problem? The destination table data type was varchar rather than int, so max did a text sorted max, rather than a numeric one (but helpfully allowed adding 1000 to the answer). Result? 999 was deemed bigger than 14400 which completely messed up the relationships.

Easy fix to cast to numeric before the max, and a good reminder to not assume data types just by looking at the results of a select.

No comments:

Post a Comment

Azure Data Factory Metadata-Driven Pipelines - 1

 I've had a bit of free time work wise lately so figured I'd finally get onto revamping one of our Azure Data Factory (ADF) extract ...