Wednesday, January 21, 2026

Local Testing an Azure Function with a Timer Trigger

Following on from the last post, Azure Function Deployed But Not Visible, while investigating that issue I wanted a way to test the function without having to deploy it to Azure. Here's how I did it, using Visual Studio Code and Postman.
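
For context, the function in question is a timer trigger written in Python. A minimal sketch of that kind of function, using the v2 programming model with made-up names and an example schedule (not the actual function from the post), looks something like this:

import logging
import azure.functions as func

app = func.FunctionApp()

# Runs every 5 minutes; the schedule is a standard NCRONTAB expression.
@app.timer_trigger(schedule="0 */5 * * * *", arg_name="mytimer")
def MyTestFunction(mytimer: func.TimerRequest) -> None:
    if mytimer.past_due:
        logging.info("Timer is running late")
    logging.info("Timer trigger fired")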

Extensions 

First up was to install a couple of extra tools in VS Code: Azurite and Azure Functions Core Tools.

Azurite is an extension that lets you "mock" Azure Blob Storage functionality locally, and can be installed from the VS Code Extensions tab. More info on installing it can be found at https://learn.microsoft.com/en-us/azure/storage/common/storage-install-azurite?tabs=visual-studio-code%2Cblob-storage, and that page links to further pages on what Azurite is and how to use it.
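
If you want to confirm Azurite is up and responding, you can hit it from Python using the well-known development storage connection string. A quick sketch, assuming the azure-storage-blob package is installed (the container name is just an example):

from azure.storage.blob import BlobServiceClient

# "UseDevelopmentStorage=true" is the shortcut connection string for the
# local storage emulator (Azurite) - no real Azure credentials involved.
client = BlobServiceClient.from_connection_string("UseDevelopmentStorage=true")

# Create a throwaway container and list containers to confirm Azurite responds.
client.create_container("local-test")  # hypothetical container name
for container in client.list_containers():
    print(container.name)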

Azure Functions Core Tools is the tool set that lets you run functions locally. You can try installing it by hitting the F1 key and choosing Azure Functions: Install or Update Azure Functions Core Tools, but you may find you need to download and install it manually. The download and further instructions can be found at https://learn.microsoft.com/en-us/azure/azure-functions/functions-run-local

Running the Function

Before we can run the function we need to start the function runtime environment. To do this, hit the F1 key then find and select Azure Functions: Start. When this runs you'll get prompted to log into Azure and select the subscription and resource group you want to use. Alternatively, enter func start in the terminal window. I've found the Azure Functions: Start command a bit hit and miss, so func start seems to work more consistently.
 
Note that for Python you'll need to activate the project's virtual environment first (if it uses one), otherwise you'll get errors about modules not being found. This will be something like running & <project_path>/.venv/Scripts/Activate.ps1 in the terminal.
 

Postman

Once the function is running in VS Code you'll be able to send a Postman request to trigger it. By default the local runtime listens on port 7071 (I didn't need to change this, so I haven't looked into how), and the Postman call is a POST request to http://localhost:7071/admin/functions/function_name, e.g. http://localhost:7071/admin/functions/MyTestFunction
 
Authentication isn't needed, but the request does need a body, so if the function doesn't take parameters just use an empty JSON object ({}) as the Postman request body.

Here's how the request should look in Postman: a POST to the admin URL above, with a raw JSON body of {}.

All going well, Postman will get an HTTP 202 response, and function output will appear in the VS Code terminal window.
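
If you'd rather script the trigger than click through Postman, the same admin call can be made from Python. A minimal sketch, assuming the requests package and the example function name MyTestFunction:

import requests

# Locally the admin endpoint needs no key; the empty JSON object is the
# required (but unused) request body.
url = "http://localhost:7071/admin/functions/MyTestFunction"
response = requests.post(url, json={})

print(response.status_code)  # expect 202 if the trigger was accepted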

Settings etc.

The functions local runtime uses the settings in the local.settings.json file, so these might need to be changed temporarily for testing. Also, if you use service principals or managed identities for any resource authentication these will probably fail locally, so switching to a different auth method (e.g. a SAS key or SQL login) might be needed for testing.
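
One detail worth knowing when fiddling with local.settings.json: when running locally, everything under the "Values" section is surfaced to the function as environment variables (the same way app settings are in Azure), so a setting is read like this (MyStorageConnection is a made-up setting name):

import os

# Locally this comes from the "Values" section of local.settings.json;
# in Azure it comes from the Function App's application settings.
conn_string = os.environ["MyStorageConnection"]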

 

Friday, December 12, 2025

Azure Function Deployed But Not Visible

Playing around with Azure Functions lately and struck a weird issue when trying to get a function running in Azure (as opposed to just testing on my laptop).

This particular function is a Python script, developed and deployed from Visual Studio Code to an existing Function App. The function behaved correctly in the VS Code dev environment and appeared to deploy fine with no errors; however, when checking in the Azure Portal there was no function to be found. Deleting and recreating the Function App didn't help.

The fix was to add "AzureWebJobsFeatureFlags": "EnableWorkerIndexing" to the local.settings.json file (this feature flag is what lets the runtime index functions written with the Python v2 programming model), so it looks something like:

{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsFeatureFlags": "EnableWorkerIndexing"
  }
}

In addition, the .funcignore file was excluding local.settings.json, so I removed that entry as well.

Once this was done and the function redeployed, everything turned up as expected.

Friday, September 12, 2025

SSIS For Loop AssignExpression Error

 Just a quick note on a frustrating error adding an AssignExpression to an SSIS For Loop Container.

The expression being added was

@[User::LoopCount] = @[User::LoopCount] + 1

which looks fine, but continually gave an error of "the equals (=) sign at position 20 was unexpected". 

Changing the variable formats and making other syntactic changes didn't help. And the cause?

A space at the front of the expression. If I'd counted to position 20 I might have found it earlier, as the = sign was at position 19, but then again, probably not. Removing the space solved the error and allowed the loop to run.

Wednesday, August 20, 2025

Azure Data Factory Metadata-Driven Pipelines - 2

As mentioned in the previous post, we're using some custom tables behind our metadata-driven pipeline to provide some flexibility and, hopefully, make it easier to extend. The schema is still a work in progress and has some obvious limitations that could be improved on once we have more time available.

Overview 

At a high level, the solution has some tables to hold the entities/tables that are to be loaded, along with column mappings, "high water" values for delta loads, and any pre-copy scripts to be run. Here's the ERD, with the tables summarised below.

 

Table                   Purpose
DataFeed                High-level data feed details
DataFeedEntity          Tables to load
ColumnName              Column names, to cut down duplication in table rows
DataFeedColumn          Source and destination columns
DataFeedEntityScript    The pre-copy script to run, if required
HighWaterValue          Column and value used for delta loads

Most of the tables have an "Active" column, which controls which entities and columns are included in the ADF copy activities.

On top of the tables is a view which presents the table data to the ADF pipeline. In theory it should be possible to have multiple views for different data loads, but we haven't tested this yet. The easiest way to create the view is to take the table that the ADF wizard creates and recreate its output in the view; I'll include our current version below for reference. There's also a stored proc to update the high water values, and a table-valued function which returns a count of active entities to be loaded.

Limitations

There are a few baked-in limitations in our solution, which were design choices based on our loads and on keeping it simple (ish) to start with. These should all be easy to adapt and we'll probably look at that in the downtime. The key limitations are:

  • we assume the destination table has the same name as the source entity
  • we also assume the target schema is dbo. This was a bit of laziness and we'll probably add a destination schema column to the DataFeedColumn table in the near future
  • the data type columns in DataFeedColumn are the ADF data types, e.g. String rather than varchar. We have a separate mapping table that we use when populating the table, but this could be added to the schema
  • we don't store connections in these tables, which could be a useful enhancement

Example SQL View

SELECT CONCAT (
            N'{
            "entityName": "',
            dfe.DataFeedEntityName,
            N'"
        }'
            ) AS SourceObjectSettings,

        NULL AS [SourceConnectionSettingsName],

        NULL AS [CopySourceSettings],

        CONCAT (
            N'{
            "schema": "dbo",
            "table": "',
            dfe.DataFeedEntityName,
            N'"
        }'
            ) AS [SinkObjectSettings],

        NULL AS [SinkConnectionSettingsName],

        concat(N'{
            "preCopyScript": ', ISNULL(QUOTENAME(pre.ScriptBody, '"'), 'null') ,
            ',
            "tableOption": null,
            "writeBehavior": ', CASE WHEN dfe.LoadType = 'FullLoad' THEN '"insert"' ELSE '"upsert"' END, ',
            "sqlWriterUseTableLock": true,
            "disableMetricsCollection": false,
            "upsertSettings": {
                "useTempDB": true,
                "keys": [
                    "', keycols.KeyColumnName, N'"
                ]
            }
        }') AS [CopySinkSettings],

        REPLACE(CAST(N'{
            "translator": {
                "type": "TabularTranslator", 
                "mappings": [{X}]
             }
          }' AS NVARCHAR(max)), N'{X}', ca.X) AS [CopyActivitySettings],

        N'MetadataDrivenCopyTask_ftq_TopLevel' AS [TopLevelPipelineName],

        N'[
            "Sandbox",
            "Manual"
        ]' AS [TriggerName],

        CONCAT (
            N'{
            "dataLoadingBehavior": "',
            dfe.LoadType,
            N'",',
            N'"watermarkColumnName": "',
            cm.ColumnName,
            N'",',
            N'"watermarkColumnType": "DateTime",',
            N'"watermarkColumnStartValue": "',
            convert(VARCHAR(40), hv.TimestampValue, 126),
            N'"',
            N'}'
            ) AS [DataLoadingBehaviorSettings],

        dfe.EntityGroup as [TaskId],

        dfe.Active [CopyEnabled],

        ROW_NUMBER() OVER(ORDER BY dfe.[EntityGroup], dfe.[DataFeedEntityId] DESC) AS RowNumber,

        dfe.DataFeedEntityId

    FROM adf.DataFeedEntity dfe
    JOIN (
        SELECT c.DataFeedEntityId,
            STRING_AGG(N'{"source":{"name":"' + cast(c.[SourceColumnName] AS NVARCHAR(max)) + N'",
            "type": "' + c.SourceDataType + '"},"sink":{"name":"' + cast(c.[DestinationColumnName] AS NVARCHAR(max)) + '"}}', ',') X
        FROM adf.vw_ColumnMapping c
        GROUP BY DataFeedEntityId
        ) ca ON ca.DataFeedEntityId = dfe.DataFeedEntityId
    LEFT JOIN adf.HighWaterValue hv ON dfe.DataFeedEntityId = hv.DataFeedEntityId
    LEFT JOIN adf.ColumnName cm ON hv.ColumnNameId = cm.ColumnNameId
    LEFT JOIN (
        SELECT  c.ColumnName as KeyColumnName, 
                dfc.DataFeedEntityId
        FROM [adf].[DataFeedColumn] dfc
        JOIN [adf].[ColumnName] c ON dfc.DestinationColumnNameId = c.ColumnNameId
        WHERE dfc.IsKey = 1
    ) keycols ON dfe.DataFeedEntityId = keycols.DataFeedEntityId
    LEFT JOIN adf.DataFeedEntityScript pre ON dfe.DataFeedEntityId = pre.DataFeedEntityId
        AND pre.ScriptType = 'PreCopy'
    WHERE dfe.Active = 1 

 

 

Part 1: Azure Data Factory Metadata-Driven Pipelines - 1

Thursday, June 26, 2025

SSIS Azure SQL Connection Login Failure

 Issue

Running an SSIS package using environment variables failed when trying to connect to an Azure SQL database. The error returned pointed to a password issue with the SQL user.

Solution

There were a couple of issues at play here.

  1. The server name component of the connection string needed to have the full Azure database name, i.e. my-sql-server.database.windows.net rather than just my-sql-server
  2. We also had to add the Persist Security Info=true parameter to the connection string. I think this is because the user we connect as is a contained user in the specific database.

It's also worth noting that as we use a contained user, the database needs to be specified in the connection string.
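
Putting those pieces together, the connection string ended up looking something like this (server, database, and user names are made up, and the exact parameters depend on the connection manager type):

Data Source=my-sql-server.database.windows.net;Initial Catalog=MyDatabase;User ID=my_contained_user;Password=********;Persist Security Info=True;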

Wednesday, June 18, 2025

Azure Data Factory Metadata-Driven Pipelines - 1

I've had a bit of free time work-wise lately, so I figured I'd finally get onto revamping one of our Azure Data Factory (ADF) extract processes.

This particular one gets data from our Dynamics CRM and exports it to the landing stage of our data warehouse. It was stood up in a hurry because Microsoft retired the previous, CRM-driven, export process and we internally missed the notifications, so by the time we realised what was happening the drop-dead date was almost upon us.

It's a pretty simple process which just dumps data from selected Dynamics entities into SQL tables - no transforms and very little filtering - and it has been pretty stable. We have had to filter a couple of the entities due to size, though, and also reduce the number of attributes/columns, as our customisations have resulted in 2 particular entities having > 500 attributes.

ADF Metadata-Driven copy task 

What I've decided on is to update this to use the Metadata-Driven copy task, as well as cleaning up the extract attributes and implementing some better filtering to reduce the daily data extract quantity.

There's already a lot of info out there about the metadata-driven task, so I'm not going to go too far into it here. Basically it allows a single process to cycle through all the entities to be exported, rather than creating multiple datasets and copy tasks.

I always forget how to get to it though, so here's how. Once you've launched ADF Studio from the portal, you create a new task by creating a new Factory Resource and choosing the Copy Data tool.

 

 This will open a window/blade that lets you select Metadata-driven copy task as the copy task.

A few notes about this - some of my decisions during the create will be explained later:

  • It needs an existing database to create the SQL objects needed for storing the task metadata. The dataset can be created from the blade but the SQL Server and database need to already exist and be accessible to the data factory
  •  If you want the pipelines to run on a specific runtime environment that needs to already exist
  •  Source and destination connections can be created during the wizard steps, or you can use existing ones
  • For this setup I was intending to customise it afterwards, so I just selected 2 tables to use as examples/templates 
  •  I chose Configure for each table separately as the loading behaviour, and then Delta load for one table and left the other as Full load. For delta loading you need to select a Highwater column which will be used to track whether changes have occurred.
  •  Allow the wizard to generate column mappings
  • Also set the destination properties. We want to use Upsert for our Write behaviour so set this with the appropriate key. You can also add a Pre-copy script for example purposes and any other settings that you think might be useful later on

 Once all the selections have been made the wizard will create the ADF resources such as pipelines and datasets, as well as the SQL database tables and stored procs. 

From the ADF side this is 3 pipelines, named xxx_TopLevel, xxx_MiddleLevel and xxx_BottomLevel. The top level pipeline is the main orchestrator and the one which will be scheduled to run. This calls the middle level, which then calls the bottom level one to do the actual export and destination population.

The heart of this SQL side is a table which contains a row for each entity that's being extracted, with columns of JSON values of the metadata that ADF interprets to run the pipelines.

Limitations and Next Steps

I might be missing something, but updating the copy task for changes to existing items, or adding new ones, looks to be a bit of a chore. Seeing as I have time on my hands I'm going to try altering the generated objects to allow using some custom tables - sort of a metadata-driven metadata-driven task. More on this in the next post.

 

 

Friday, May 9, 2025

SSIS ScriptComponent Outputs

 Something else for the "Stuff I Always Forget" category.

There are a couple of tricks when using the SSIS Script Component as a data source.

Nulls

If one of the outputs is null you need to set the corresponding <Column>_IsNull property (e.g. Endpoint_IsNull in the example below) to true.

(N)Varchar Max 

Max-length string columns need to be written using the column's AddBlobData method, e.g. Column.AddBlobData(System.Text.Encoding.UTF8.GetBytes(data))

DateTime 

DateTime output column data types should be of type database timestamp [DT_DBTIMESTAMP]

 

Example

ApiOutputBuffer.AddRow();

// Check nulls - set the <Column>_IsNull property rather than assigning null
if (string.IsNullOrEmpty(fullEndpoint))
{
    ApiOutputBuffer.Endpoint_IsNull = true;
}
else
{
    ApiOutputBuffer.Endpoint = fullEndpoint;
}

// Nvarchar max - write the string via AddBlobData
ApiOutputBuffer.Response.AddBlobData(System.Text.Encoding.UTF8.GetBytes(content));

// DateTime (DT_DBTIMESTAMP) column
ApiOutputBuffer.ResponseDateTime = DateTime.Now;

