This is an advanced topic that requires proficiency with the Coalesce platform as well as with YAML, Jinja2, and SQL.
Before building and configuring your own Custom Node Types, work with your Coalesce team to determine whether they have already built a Custom Node Type that fulfills your use case. If you're unsure who to reach out to, we encourage you to open a support request.
Introduction to Custom Node Types (UDNs)
Custom Node Types, also known as User Defined Nodes or UDNs, allow you to extend your Coalesce data pipeline for use cases that are not addressed by the out-of-the-box Node Types provided with every new Coalesce organization. They let you handle data modeling scenarios such as Streams, Dynamic Tables, and External Tables.
In the following video, we delve into a visual tour of Custom Node Types and explain how you can use them to accelerate your data pipeline automation.
Creating Custom Node Types
To create a Custom Node Type in Coalesce:
- Navigate to the Project in which you wish to add the Custom Node Type, and launch the Workspace you wish to add it to.
- From within the Workspace, open the Build Settings using the gear icon located at the bottom of the left sidebar.
- From here, you can:
  - Duplicate an existing Node Type to use it as a starting point to modify according to your needs
  - Create a net-new Node Type to build a Custom Node Type from scratch
Components of Custom Node Types
Once in the Node Type editor, you'll be introduced to the core components of the Node Type, which include the:
- Node Definition. Written in YAML, the Node Definition is where you specify the UI elements and other configurations that should be available for Nodes of this Node Type. This includes aspects like the Node's color, its materialization options (e.g., table or view), its business keys, and more.
- Create Template (optional). Written in Jinja2 wrapping SQL, the Create Template defines how the Data Definition Language (DDL) is written for Nodes of this Node Type, so that the DDL follows a consistent methodology and format across all of them. DDL defines the structure of the object in the database and is used during Create (for Workspaces) and Deploy (for Environments) operations. This component is relevant only to Nodes that require a DDL operation in the database.
- Run Template (optional). Written in Jinja2 wrapping SQL, the Run Template defines how the Data Manipulation Language (DML) is written for Nodes of this Node Type, so that the DML follows a consistent methodology and format across all of them. DML defines how the data is transformed and loaded in the database and is used during Run (for Workspaces) and Refresh (for Environments) operations. This component is relevant only to Nodes that require a DML operation in the database.
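To make the division of labor between these three components concrete, here is a minimal Python sketch. It is not Coalesce code: real Node Definitions are written in YAML, and real templates are Jinja2 wrapping SQL. The `Node` and `Column` classes and the `render_create` and `render_run` functions are hypothetical names invented for this illustration. They only mirror the idea that node metadata feeds one template that emits DDL and another that emits DML.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Column:
    name: str
    data_type: str
    transform: str = ""  # SQL expression; empty means "pass the source column through"


@dataclass
class Node:
    # Hypothetical stand-in for the metadata a Node Definition exposes.
    name: str
    source: str
    materialization: str  # "table" or "view"
    columns: list[Column] = field(default_factory=list)


def _expr(col: Column) -> str:
    # Use the column's transform if one is set, otherwise the bare column name.
    return col.transform or col.name


def render_create(node: Node) -> str:
    """Plays the role of a Create Template: returns DDL for the node."""
    if node.materialization == "view":
        select_list = ", ".join(f"{_expr(c)} AS {c.name}" for c in node.columns)
        return f"CREATE OR REPLACE VIEW {node.name} AS SELECT {select_list} FROM {node.source}"
    col_defs = ", ".join(f"{c.name} {c.data_type}" for c in node.columns)
    return f"CREATE OR REPLACE TABLE {node.name} ({col_defs})"


def render_run(node: Node) -> Optional[str]:
    """Plays the role of a Run Template: returns DML, or None when no DML is needed."""
    if node.materialization == "view":
        return None  # a view is fully defined by its DDL
    names = ", ".join(c.name for c in node.columns)
    exprs = ", ".join(_expr(c) for c in node.columns)
    return f"INSERT INTO {node.name} ({names}) SELECT {exprs} FROM {node.source}"


node = Node("DIM_CUSTOMER", "STG_CUSTOMER", "table",
            [Column("ID", "NUMBER"), Column("NAME", "VARCHAR", "UPPER(NAME)")])
print(render_create(node))
print(render_run(node))
```

In this sketch a view needs only the create step, which mirrors why the Run Template is optional for Nodes that do not require a DML operation.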
Prerequisites for Building and Configuring Custom Node Types
Custom Node Types are defined in YAML (Node Definition) and in Jinja2 wrapping SQL (Create and Run Templates). In addition to being proficient with the Coalesce platform, a developer undertaking this work must therefore be proficient in all three languages.
We have provided several resources below to help you build skills in each language. Note: these resources are provided by third parties not associated with Coalesce.
- YAML
  - What is YAML? A beginner's guide via CircleCI's blog
  - YAML 101 via Wriju Ghosh's blog
  - YAML 101 via Micah Akpan's blog
  - YAML - Basics from tutorialspoint
- Jinja2
  - Intro to Jinja via a blog by Fabio Barbazza
  - Jinja2 Templating via a blog by Hasini Witharana
  - Jinja2 Tutorials via Przemek Rogala's blog
- SQL
- Playground for testing YAML definitions + Jinja templates via Infrastructure as Code.
Advanced Custom Node Type Functionality
Custom Node Types offer extensive control, flexibility, and extensibility, allowing database developers to cater to specific needs within their data pipelines.
The detailed configuration of a Custom Node Type includes, but is not limited to, the following tasks:
- Specifying the Coalesce UI elements available across all Nodes of this Node Type
- Defining the materialization options and SQL configuration based on the materialization type
- Customizing the behavior of system columns and defining additional ones
- Pre-populating transform logic
- Supporting non-table/view Snowflake object types
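Two of the tasks above, defining additional system columns and pre-populating transform logic, can be sketched in a few lines of Python. This is an illustration only, not how Coalesce implements them: the `Column` class, the `SYSTEM_COLUMNS` tuple, the column names in it, and both functions are hypothetical. The idea shown is that a Node Type can append columns with a default transform to every Node, so individual developers do not have to add them by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str
    transform: str           # SQL expression that populates the column
    is_system: bool = False  # True for columns supplied by the Node Type itself


# Hypothetical system columns with pre-populated transform logic.
SYSTEM_COLUMNS = (
    Column("SYSTEM_CREATE_DATE", "TIMESTAMP", "CURRENT_TIMESTAMP()", is_system=True),
    Column("SYSTEM_UPDATE_DATE", "TIMESTAMP", "CURRENT_TIMESTAMP()", is_system=True),
)


def with_system_columns(columns: list[Column]) -> list[Column]:
    """Return the columns plus any system column that is not already present."""
    existing = {c.name.upper() for c in columns}
    extra = [c for c in SYSTEM_COLUMNS if c.name not in existing]
    return list(columns) + extra


def select_list(columns: list[Column]) -> str:
    """Build the SELECT list a run step would use, one column per line."""
    return ",\n  ".join(f"{c.transform} AS {c.name}" for c in columns)


user_columns = [Column("CUSTOMER_ID", "NUMBER", "SRC.CUSTOMER_ID")]
print("SELECT\n  " + select_list(with_system_columns(user_columns)))
```

Calling `with_system_columns` a second time on its own result adds nothing, so applying it repeatedly does not duplicate columns.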
The following video gives a detailed walkthrough of advanced Custom Node Type functionality, and demonstrates how Node Types can be validated on the fly as they are developed.
Custom Node Type Metadata Schema
Custom Node Types have an underlying metadata schema that is leveraged in the Node Definition, Create Template, and Run Template. The templates use this metadata to define and execute behavior dynamically, based on the inputs that developers provide when creating and configuring Nodes of the given Node Type.
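The flow from definition to user input to generated SQL can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the Coalesce metadata schema: the `NODE_DEFINITION` structure, its `attribute`, `options`, and `default` keys, the `truncate_before_load` option, and both functions are hypothetical. The definition declares which options a Node exposes, the developer's choices are validated against it, and the run step branches on the resolved values.

```python
# Hypothetical node definition: each config item names an attribute,
# its allowed values, and a default. Real Node Definitions are YAML.
NODE_DEFINITION = {
    "config": [
        {"attribute": "materialization", "options": ["table", "view"], "default": "table"},
        {"attribute": "truncate_before_load", "options": [True, False], "default": False},
    ],
}


def resolve_config(definition: dict, user_input: dict) -> dict:
    """Merge the developer's choices with the defaults and validate them."""
    resolved = {}
    for item in definition["config"]:
        value = user_input.get(item["attribute"], item["default"])
        if value not in item["options"]:
            raise ValueError(
                f"{item['attribute']}: {value!r} is not one of {item['options']}"
            )
        resolved[item["attribute"]] = value
    return resolved


def render_run(target: str, source: str, config: dict) -> list[str]:
    """Return the DML statements for one run, driven by the resolved config."""
    if config["materialization"] == "view":
        return []  # nothing to load for a view
    statements = []
    if config["truncate_before_load"]:
        statements.append(f"TRUNCATE TABLE {target}")
    statements.append(f"INSERT INTO {target} SELECT * FROM {source}")
    return statements


config = resolve_config(NODE_DEFINITION, {"truncate_before_load": True})
for statement in render_run("DIM_CUSTOMER", "STG_CUSTOMER", config):
    print(statement)
```

Validating input against the definition before rendering means a template never has to handle a value the Node Type did not declare.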
The Coalesce Product Documentation on Node Config Options and User Defined Nodes includes foundational information about this metadata and its usage. This documentation is currently being enhanced to expand the schema documentation and provide contextual usage examples. If you have further questions about building and configuring Custom Node Types, we encourage you to open a support request so our team can connect you with the resources relevant to your requirements.
Conclusion
Building and configuring Custom Node Types in Coalesce is an advanced endeavor that requires proficiency in YAML, Jinja2, and SQL, but it adds considerable flexibility and extensibility to your data pipelines. We encourage those with the necessary technical expertise to explore this area while relying on the Coalesce Product Documentation for assistance. If you need Custom Node Types but do not have the technical expertise on staff to build them, we're here to help! Please reach out to our team so we can connect you with the relevant resources.