Special Columns with SQL Query, SQL Join, and Union

A recent Element Unify update addresses an issue with how special columns are handled by SQL Query, SQL Join, and Union transformations. This fix prevents unexpected behavior when building graphs but may affect some pipelines that were built before it was deployed. If you see an error like "<Column Name> has conflicting column types from the input source" on a Union or certain SQL Join transformations then read on for an explanation as well as instructions for resolving the issue.
What are special columns?
Element Unify can flag certain columns as special, indicating that the values in that column have specific meanings in the Element Graph. These special columns are required for building the Element Graph as can be seen below:
Graph v1: Templates, asset IDs, primary IDs (e.g. tag names), and attributes
Graph v2: Node IDs, node labels, relationship source/sink nodes, and relationship labels
These special columns are created by transformations starting with "Map", such as "Map Templates" and can have a default column name on creation, like TEMPLATE *
, that can be renamed using Rename Columns. Special columns in the preview pane have gray shading and a small icon in their header (see the four columns to the right below):
Some of these columns are constrained: Template and Attribute columns must agree with the template library. That is, the template exists and the attribute exists on that template. This ensures quality and consistency in the data model.
What changed about special columns?
The constraints on Template and Attribute columns are crucial. Without them, the data model will lose its consistency and, therefore, much of its value. It can also cause trouble in the Unify backend. The fix mentioned earlier removed a couple of ways that a user might accidentally create a special column with bad values.
Appending invalid values using Union or SQL Join transformations
- How this might have happened: union of two datasets where one has a special column with good values and the other has a regular column with invalid values but the same name (e.g. both columns called "Asset Name")
- Current behavior: the transformation throws a "conflicting column types" error
- How can I fix this error: add a Map Templates/Attributes transformation to the dataset without a special column, drop the original regular column, then give the auto-generated column ("TEMPLATE *" or "SENSOR *") the same name as the original. See below for a visual example of how you might make a regular "Asset Property" column special.
Altering values of a special column using a SQL Query transformation
- How this might have happened: the query includes a row-wise edit on a special column like REGEXP_REPLACE that creates invalid values
- Current behavior: the SQL Query output will remove the special-ness of any special column edited in the transformation. Similarly, special-ness is removed if you create multiple copies of a special column.
- What if I need to use SQL Query on a special column? Duplicate the special column with a Parse or Assign Values transformation and perform the query against that new regular column. See below for a simple Parse configuration that copies the values of the special column "TEMPLATE *" to a regular "templateCopy" column.
- Note: special-ness will be preserved if the query does not alter the contents of the column (e.g.
SELECT <column>
andSELECT <column> AS <newName>
are both fine).
Hopefully this helps if anyone runs into this issue. Please feel free to comment with any questions.