A zero-dependency, browser-based SQL Studio for writing, testing, visualising, and monitoring streaming pipelines — connects to any Flink SQL Gateway.

Single HTML file · MIT License · Flink 1.16+ · Live Kafka results · Kubernetes Operator · Docker ready · Confluent Cloud · Admin Session · PDF Reports

About Str:::lab Studio

Str:::lab Studio is a browser-based SQL Studio that connects to any Flink SQL Gateway. It provides a complete development, monitoring, and administration experience — without the overhead of a full platform deployment. Works with local Docker, Kubernetes, and any self-hosted Flink cluster.

What it is

A zero-dependency single-file HTML application served by nginx. All SQL execution is driven by the Flink SQL Gateway REST API. Results stream live into the browser. Session state is managed client-side in localStorage.

Who it's for

Streaming data engineers and platform teams who want a proper IDE experience for Apache Flink SQL — query editing, live result streaming, real-time job monitoring, checkpoint tracking, and professional reporting — all from a browser.

Open source

Created by Nestor A. A — open source project under the MIT license. Community contributions and issue reports welcome at github.com/coded-streams/strlabstudio.

Kube Operator

The Str:::lab Studio Kube Operator manages Studio deployments as first-class Kubernetes CRDs. Install via Helm from strlabstudio-operator.

Flink 1.16+ · Zero dependencies · MIT License · Docker · K8s · Cloud · Confluent Cloud

Quickstart

git clone https://github.com/coded-streams/strlabstudio
cd strlabstudio
docker compose up -d
open http://localhost:3030

Starts: JobManager, TaskManager(s), SQL Gateway, CORS proxy, and the Studio nginx container. On first load, a Tips & Concepts modal walks you through key features.

python3 -m http.server 8080
# Open studio/index.html — use Direct Gateway mode

helm repo add strlabstudio https://coded-streams.github.io/strlabstudio-operator/charts
helm install strlabstudio-operator strlabstudio/strlabstudio-operator \
  --namespace flinksql-system --create-namespace

kubectl apply -f - <<EOF
apiVersion: codedstreams.io/v1alpha1
kind: StrlabStudio
metadata:
  name: my-studio
  namespace: flink
spec:
  gateway:
    host: flink-sql-gateway
    port: 8083
  jobmanager:
    host: flink-jobmanager
    port: 8081
EOF
💡
First thing every session: open Snippets → CONFIG → ⚡ Recommended Streaming Config and click Insert. It sets runtime mode, parallelism, checkpointing, state TTL, and MiniBatch in one shot.

Connection Modes

| Mode | Best For | Authentication |
| --- | --- | --- |
| Proxy Mode | Local Docker or Kubernetes clusters. Routes through the nginx reverse proxy — zero config required. | None |
| Direct Gateway | Remote clusters, Confluent Cloud, Ververica, AWS Managed Flink. Enter the gateway URL directly. | Bearer token, Basic Auth, or None |
| Admin Session 🛡 | Platform operators and administrators. Full cluster job visibility, advanced reports, cross-session oversight. | Admin passcode (default: admin1234) |
🛡
Existing sessions — auto-loaded. On the connect screen, click Test Connection first — the IDE will auto-populate a dropdown of existing sessions on the gateway so you can reconnect without copy-pasting a UUID.

🏗 Architecture

Str:::lab Studio is a static HTML file served by nginx. All SQL execution happens via the Flink SQL Gateway REST API. The browser manages all session state client-side.

```mermaid
flowchart TD
    B["🌐 Browser\nlocalhost:3030"]
    N["nginx\nflink-studio container"]
    GW["Flink SQL Gateway :8083"]
    JM["Flink JobManager :8081"]
    TM1["TaskManager 1 · 10 slots"]
    TM2["TaskManager 2 · 10 slots"]
    K["Apache Kafka"]
    S["MinIO / S3"]
    B -->|"/flink-api/"| N
    B -->|"/jobmanager-api/"| N
    N -->|"proxy"| GW
    N -->|"proxy"| JM
    GW --> JM
    JM --> TM1
    JM --> TM2
    TM1 --> K
    TM2 --> K
    TM1 --> S
    style B fill:#1a2130,stroke:#00d4aa,color:#f0f6fc
    style N fill:#1a2130,stroke:#4fa3e0,color:#f0f6fc
    style GW fill:#1a2130,stroke:#9b72cf,color:#f0f6fc
    style JM fill:#1a2130,stroke:#9b72cf,color:#f0f6fc
    style TM1 fill:#1a2130,stroke:#3fb950,color:#f0f6fc
    style TM2 fill:#1a2130,stroke:#3fb950,color:#f0f6fc
    style K fill:#1a2130,stroke:#f0a500,color:#f0f6fc
    style S fill:#1a2130,stroke:#f0a500,color:#f0f6fc
```

State & Session Management

| State | Storage | Survives Reload? |
| --- | --- | --- |
| Editor tabs & SQL content | localStorage (auto-saved) | ✅ Yes |
| Query history | localStorage (auto-saved) | ✅ Yes |
| Theme preference | localStorage | ✅ Yes |
| Admin passcode | localStorage (encrypted default) | ✅ Yes |
| SQL Gateway session handle | In-memory only | ❌ Reconnect required |
| Result rows | In-memory only | ❌ Re-run query |
| Job–session mapping | In-memory (session lifetime) | ❌ Lost on reload |
| Full workspace | JSON file (export/import) | ✅ Via import |

All Features

Editor

🗂

Multi-Tab Editor

Write SQL across unlimited named tabs. Each tab has isolated editor content, results, logs, and performance metrics. Double-click to rename.

▶️

Run Selection

Highlight any portion of SQL and click ▶ Selection to run only that fragment. Essential for multi-statement tabs.
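For example, with several statements in one tab, you can highlight a single statement and run just that fragment (table name illustrative):

```sql
SET 'parallelism.default' = '2';

SHOW TABLES;

-- Highlight only this statement and click ▶ Selection
SELECT symbol, price FROM trade_events WHERE price > 100;
```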

🔁

Format & Explain

Auto-format normalises whitespace. EXPLAIN Plan shows Flink's logical and physical execution plan before job submission.
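A quick sketch of the EXPLAIN flow — prefix any query with EXPLAIN to inspect the plan without submitting a job (table name illustrative):

```sql
-- Shows Flink's logical and physical plan instead of running the query
EXPLAIN
SELECT symbol, COUNT(*) AS trade_count
FROM trade_events
GROUP BY symbol;
```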

🚫

Duplicate Submission Guard

The IDE tracks running INSERT INTO statements. If you attempt to submit the same pipeline twice, execution is blocked with a warning. Check the Job Graph before resubmitting.

Sessions

🔄

Session Persistence

Sessions never expire from the IDE side due to inactivity — the heartbeat keeps them alive. Only explicit disconnect or connection loss ends a session.

📋

Load Existing Sessions

After testing the connection, the IDE fetches all existing sessions from the gateway and presents them in a dropdown — no UUID copy-paste needed.

🔗

Session-Scoped Jobs

Each regular session sees only the jobs it submitted. This prevents clutter from other users' pipelines. Admin sessions see all jobs across all sessions.

🏷

Job Tagging

Every INSERT INTO pipeline automatically tags the Flink job name with the session label via pipeline.name, making jobs traceable in the Flink UI and Str:::lab Studio.
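Under the hood this is equivalent to setting pipeline.name before submission — something you can also do by hand for a custom label (session label illustrative; the IDE sets this automatically):

```sql
-- Illustrative manual tag; matches the format the IDE applies
SET 'pipeline.name' = '[my-session] Str:::lab Studio Pipeline';

INSERT INTO trade_events_kafka SELECT * FROM trade_gen;
```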

Job Graph

🔀

Live DAG

Real-time directed acyclic graph of any running Flink job. Nodes show operator type, parallelism, and record counts updating live.

🔍

Node Detail Modal

Double-click any node for full metrics: status, records in/out/s, backpressure, JVM heap, subtask table. Includes a live throughput streaming chart.

Cancel Job — with Confirmation

Cancel button shows a confirmation dialog ("Are you sure?") before sending the cancel signal to the JobManager. Cannot be undone after confirmation.

📅

Vertex Timeline

Timeline view shows each operator's state transitions (CREATED → RUNNING → FINISHED) as a Gantt-style chart. Click any bar for subtask status detail.

Performance Monitoring

📊

Per-Query Timing

Every statement timed from submission to first result. Bar charts show relative query durations with row count and status.

🔋

Checkpoint Health

Checkpoint tab shows total/completed/failed counts, last checkpoint duration and state size, and a history sparkline. Configure it with a single snippet.
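The checkpoint configuration that snippet applies amounts to a couple of SET statements — a minimal sketch with illustrative values:

```sql
-- Checkpoint every 10 s with exactly-once semantics (values illustrative)
SET 'execution.checkpointing.interval' = '10000';
SET 'execution.checkpointing.mode' = 'EXACTLY_ONCE';
```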

🖥

Cluster Resources

Cluster tab shows CPU cores, physical memory, JVM heap, Flink version, task manager count, and slot utilisation gauges from the JobManager REST API.

📈

Live Throughput

Throughput tab shows records in/out per second as animated sparklines with min/max labels. Toggle ▶ Live for auto-refresh every 2 seconds.

Reports

📄

Session Report (Standard)

Generate a PDF report of your session: queries executed, timing history, job list, cluster info, and checkpoint metrics. Includes the session ID and timestamp.

🛡

Admin Report — Technical

Deep technical report: all active sessions, per-session job attribution, vertex-level metrics, JVM heap and CPU stats, checkpoint durations. For engineering teams.

📊

Admin Report — Business / Management

Self-explaining metrics report with plain-language annotations. Focuses on resource utilisation, job health percentages, and throughput summaries. No SQL jargon.

🖨

Job Graph Report

Per-job PDF report from the Job Graph tab — captures the DAG topology, vertex states, record counts, backpressure levels, and checkpoint history for a single job.

Themes & UX

Project Manager

Save, load, run, and delete named Str:::lab projects. Each project stores all editor tabs, SET configuration, catalog context, and metadata. Export to JSON for sharing or backup. Tracks storage usage per project.

UDF Manager

Browse all registered functions, register JAR-based or Python UDFs, create SQL UDFs without writing a single line of Java, and insert professional UDF templates directly into the editor.

💡

Tips & Concepts Modal

On first connect, a paginated modal walks through IDE features, Flink architecture, window types, state, connectors, and performance tips — like IntelliJ's "Tip of the Day".

🎨

6 Themes

Dark (default), Light, Monokai, Dracula, Nord, Contrast. Preference persisted across sessions.

🛡 Admin Session

The Admin Session mode provides a privileged view of the entire Flink cluster — across all users and sessions. Access it from the connect screen by selecting the Admin mode and entering the admin passcode.

Capabilities

  • Full cluster job visibility — see ALL jobs running on the cluster, not just the ones submitted in this session.
  • Per-session activity — the Sessions Activity modal shows each registered session's submitted jobs, query count, and running job status.
  • Cross-session job attribution — jobs are tagged with their session name (visible in the Job Graph select dropdown).
  • Cancel any job — admins can cancel any running job on the cluster, with a confirmation dialog.
  • Admin-grade PDF reports — choose between a Technical Report (engineering detail) or Business/Management Report (self-explanatory metrics). Both carry the admin's name and session ID.
  • Change admin passcode — click the 🛡 ADMIN badge in the topbar to open Admin Settings where you can update your display name and change the passcode.
🔒
Default passcode: the default admin passcode is admin1234. Change it immediately after first login via the Admin Settings modal (click the 🛡 badge in the topbar). The new passcode is persisted in browser localStorage.

Session isolation model

| Feature | Regular Session | Admin Session |
| --- | --- | --- |
| Jobs visible in Job Graph | Only jobs submitted in this session | All jobs on the cluster |
| Cancel job | Only own jobs | Any job on the cluster |
| Session Activity modal | Not available | Full cross-session breakdown |
| Report type | Standard session report | Technical or Business/Management report |
| Job tagging | Tagged with session name | Tagged as admin session |

💡 Tips & Concepts Modal

On first connection, Str:::lab Studio displays a paginated Tips & Concepts modal — inspired by JetBrains' "Tip of the Day". It covers 16 topics across 6 categories:

| Category | Topics Covered |
| --- | --- |
| Getting Started | Running your first query, keyboard shortcuts |
| IDE Tips | Multi-tab workflows, run selection, workspace export, Performance tab, Job Graph |
| Flink Architecture | JobManager + TaskManagers, slots, dataflow graphs, SQL Gateway |
| Flink Concepts | Watermarks, Tumble/Hop/Session windows, state, checkpointing, exactly-once, connectors |
| Performance Tips | MiniBatch aggregation, avoiding duplicate pipeline submissions |
| Admin | Admin session capabilities and cluster visibility |

Users can navigate tips with Prev/Next buttons or click category tags to jump directly to that section. A "Don't show on startup" checkbox suppresses the modal on future connections. The modal can be reopened at any time via the topbar — the ℹ About button also includes a tips launcher.

Snippets Library

30+ built-in SQL templates accessible from the toolbar. Click any snippet to insert into the active editor tab.

| Tag | Snippets |
| --- | --- |
| SETUP | Common Flink SQL Mistakes, Use Default Catalog, Create In-Memory Catalog, Use Hive Catalog |
| CONFIG | ⚡ Recommended Streaming Config (all-in-one), Set Runtime Mode, Set Parallelism, Set State TTL |
| DDL | Show Catalogs, Show Databases, Show Tables, Show Full Tables, Describe Table, Show Create Table, Check If Table Exists, Show Jobs |
| WINDOW | Tumble Window, Hop Window, Session Window, Cumulate Window |
| CONNECTOR | Kafka Source, Datagen Source, Elasticsearch Sink, MinIO / S3 Sink, Print Sink |
| PERF | Enable MiniBatch, Local-Global Aggregation, Idle Source Timeout |
| PATTERN | Top-N per Group (ROW_NUMBER), Deduplication, Temporal Join, Interval Join |
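As a flavour of the PATTERN group, the Top-N per Group snippet follows the standard Flink ROW_NUMBER pattern (table and column names illustrative):

```sql
-- Top 3 trades per symbol by price
SELECT symbol, price, event_time
FROM (
  SELECT symbol, price, event_time,
         ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY price DESC) AS rn
  FROM trade_events
)
WHERE rn <= 3;
```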

🔌 Connector JARs

🚨
Version mismatch = NoClassDefFoundError. Connector JAR versions must match your Flink runtime exactly; every INSERT job will fail with a classloading error if the version is wrong.
| Flink Version | Kafka Connector JAR |
| --- | --- |
| 1.20.x | flink-sql-connector-kafka-3.3.0-1.20.jar |
| 1.19.x ← recommended | flink-sql-connector-kafka-3.3.0-1.19.jar |
| 1.18.x | flink-sql-connector-kafka-3.3.0-1.18.jar |
| 1.17.x | flink-sql-connector-kafka-3.2.0-1.17.jar |
| 1.16.x | flink-sql-connector-kafka-3.1.0-1.16.jar |
.\scripts\download-connectors.ps1
# Auto-detects Flink version from running container.
# Override: .\scripts\download-connectors.ps1 -FlinkVersion 1.19.1

./scripts/download-connectors.sh
# Override: ./scripts/download-connectors.sh --flink-version 1.19.1

FROM flink:1.19.1-scala_2.12-java11
USER root
RUN apt-get update && apt-get install -y libgomp1 && rm -rf /var/lib/apt/lists/*
COPY ./connectors/*.jar /opt/flink/lib/
USER flink

🔀 Full Pipeline Demo

A complete multi-stage streaming pipeline built entirely in Str:::lab Studio: datagen source → Kafka → enrichment with branching → windowed aggregation → live preview.

```mermaid
flowchart LR
    DG["⚙ datagen"]
    TE["trade-events\nKafka"]
    EN["enrich + route\nJob 2"]
    ET["enriched-trades\nKafka"]
    AL["risk-alerts\nKafka"]
    AG["TUMBLE 10s\nJob 3"]
    TS["trade-summary\nKafka"]
    PR["print\nstdout"]
    DG -->|"INSERT\nJob 1"| TE
    TE --> EN
    EN -->|"Branch A"| ET
    EN -->|"Branch B\nHIGH+CRITICAL"| AL
    ET --> AG
    AG --> TS
    AG --> PR
    style DG fill:#1a2130,stroke:#9b72cf,color:#f0f6fc
    style TE fill:#1a2130,stroke:#f0a500,color:#f0f6fc
    style EN fill:#1a2130,stroke:#00d4aa,color:#f0f6fc
    style ET fill:#1a2130,stroke:#f0a500,color:#f0f6fc
    style AL fill:#1a2130,stroke:#e05c5c,color:#f0f6fc
    style AG fill:#1a2130,stroke:#00d4aa,color:#f0f6fc
    style TS fill:#1a2130,stroke:#f0a500,color:#f0f6fc
    style PR fill:#1a2130,stroke:#3fb950,color:#f0f6fc
```

Step 0 · Session Config — Snippets → ⚡ Recommended Streaming Config

SET 'execution.runtime-mode' = 'streaming';
SET 'parallelism.default' = '1';
SET 'pipeline.operator-chaining' = 'false';
SET 'execution.checkpointing.interval' = '10000';

Step 1 · DDL Tab — Create all source and sink tables

CREATE TEMPORARY TABLE trade_gen (
  trade_id STRING, symbol STRING, side STRING,
  quantity DOUBLE, price DOUBLE, exchange STRING,
  event_time TIMESTAMP(3),
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH ('connector' = 'datagen', 'rows-per-second' = '5');
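The trade_events_kafka sink used in the next step isn't shown above — a minimal sketch of what it might look like, assuming a broker at kafka:9092 and JSON serialisation:

```sql
CREATE TABLE trade_events_kafka (
  trade_id STRING, symbol STRING, side STRING,
  quantity DOUBLE, price DOUBLE, exchange STRING,
  event_time TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'trade-events',
  'properties.bootstrap.servers' = 'kafka:9092',  -- assumed broker address
  'format' = 'json'
);
```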

Step 2 · Producer Tab — Start datagen → Kafka (Job 1)

INSERT INTO trade_events_kafka
SELECT trade_id, symbol, side, quantity, price, exchange, event_time
FROM trade_gen;

Job Graph tab auto-opens. Studio tags the Flink job as [session-name] Str:::lab Studio Pipeline.

Step 3 · Enrichment Tab — Branching pipeline (Job 2)

EXECUTE STATEMENT SET BEGIN
INSERT INTO enriched_trades_kafka
SELECT trade_id, symbol, side,
  ROUND(quantity * price, 2) AS trade_value,
  CASE WHEN quantity * price >= 100000 THEN 'CRITICAL'
       WHEN quantity * price >=  10000 THEN 'HIGH'
       ELSE 'LOW' END AS risk_level, event_time
FROM trade_events_source;

INSERT INTO branch_alerts_sink
SELECT trade_id, symbol, ROUND(quantity * price, 2) AS trade_value, event_time
FROM trade_events_source WHERE quantity * price >= 10000;
END;

Step 4 · Aggregation Tab — Windowed summary (Job 3)

INSERT INTO trade_summary_kafka
SELECT window_start, window_end, symbol, side,
  COUNT(*) AS trade_count,
  ROUND(SUM(quantity * price), 2) AS total_volume
FROM TABLE(TUMBLE(TABLE enriched_trades_source,
  DESCRIPTOR(event_time), INTERVAL '10' SECOND))
GROUP BY window_start, window_end, symbol, side;

Step 5 · Preview Tab — Live SELECT from Kafka (rows stream into Results)

SELECT trade_id, symbol, side, trade_value, risk_level, event_time
FROM enriched_trades_view;
💡
Always create a separate consumer-mode view to preview live data. Sink tables can't be queried. Create a duplicate table with scan.startup.mode = 'earliest-offset' and a consumer group ID for SELECT queries.
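A minimal sketch of such a consumer-mode preview table, assuming the enriched-trades topic and a broker at kafka:9092:

```sql
-- Consumer-mode duplicate of the sink table, for SELECT preview only
CREATE TEMPORARY TABLE enriched_trades_view (
  trade_id STRING, symbol STRING, side STRING,
  trade_value DOUBLE, risk_level STRING,
  event_time TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'enriched-trades',
  'properties.bootstrap.servers' = 'kafka:9092',  -- assumed broker address
  'properties.group.id' = 'studio-preview',       -- dedicated consumer group
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);
```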

☸️ Str:::lab Studio Kube Operator

The Str:::lab Studio Kube Operator manages Str:::lab Studio deployments as first-class Kubernetes CRDs. Deploy and lifecycle-manage Studio instances with a single kubectl apply. Built with Kopf (Python).

Repository: github.com/coded-streams/strlabstudio-operator

```mermaid
flowchart TD
    CR["StrlabStudio CR\nkubectl apply"]
    OP["Str:::lab Studio\nKube Operator\nKopf controller"]
    DEP["Deployment\nnginx pods"]
    SVC["Service\nClusterIP/NodePort/LB"]
    POD["nginx Pod\nIDE + reverse proxy"]
    GW["Flink SQL Gateway"]
    JM["Flink JobManager"]
    CR -->|"reconcile"| OP
    OP -->|"creates / owns"| DEP
    OP -->|"creates / owns"| SVC
    DEP --> POD
    POD --> GW
    POD --> JM
    style CR fill:#1a2130,stroke:#00d4aa,color:#f0f6fc
    style OP fill:#1a2130,stroke:#9b72cf,color:#f0f6fc
    style DEP fill:#1a2130,stroke:#4fa3e0,color:#f0f6fc
    style SVC fill:#1a2130,stroke:#4fa3e0,color:#f0f6fc
    style POD fill:#1a2130,stroke:#3fb950,color:#f0f6fc
    style GW fill:#1a2130,stroke:#f0a500,color:#f0f6fc
    style JM fill:#1a2130,stroke:#f0a500,color:#f0f6fc
```

CRD Reference — StrlabStudio

Short name: kubectl get fss  ·  Group: codedstreams.io/v1alpha1

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| spec.gateway.host | string | — (required) | Flink SQL Gateway hostname |
| spec.gateway.port | integer | 8083 | SQL Gateway port |
| spec.jobmanager.host | string | localhost | JobManager REST hostname |
| spec.jobmanager.port | integer | 8081 | JobManager REST port |
| spec.image | string | codedstreams/strlabstudio:latest | IDE Docker image |
| spec.replicas | integer | 1 | Pod count (1–10) |
| spec.service.type | string | ClusterIP | ClusterIP, NodePort, or LoadBalancer |
| spec.service.port | integer | 80 | Service port |
| spec.service.nodePort | integer | — | NodePort (30000–32767) |
| spec.resources.requests | object | {cpu:50m, memory:64Mi} | Pod resource requests |
| spec.resources.limits | object | {cpu:200m, memory:128Mi} | Pod resource limits |
helm repo add strlabstudio https://coded-streams.github.io/strlabstudio-operator/charts
helm repo update
helm install strlabstudio-operator strlabstudio/strlabstudio-operator \
  --namespace flinksql-system --create-namespace

kubectl get pods -n flinksql-system
kubectl get crd strlabstudio.codedstreams.io
apiVersion: codedstreams.io/v1alpha1
kind: StrlabStudio
metadata:
  name: dev-studio
  namespace: flink
spec:
  gateway:
    host: flink-sql-gateway
    port: 8083
  jobmanager:
    host: flink-jobmanager
    port: 8081
---
apiVersion: codedstreams.io/v1alpha1
kind: StrlabStudio
metadata:
  name: dev-studio
  namespace: flink
spec:
  gateway:
    host: flink-sql-gateway
  jobmanager:
    host: flink-jobmanager
  service:
    type: NodePort
    nodePort: 30303
---
apiVersion: codedstreams.io/v1alpha1
kind: StrlabStudio
metadata:
  name: prod-studio
  namespace: flink
spec:
  image: codedstreams/strlabstudio:v1.0.22
  replicas: 2
  gateway:
    host: flink-sql-gateway
    port: 8083
  jobmanager:
    host: flink-jobmanager
    port: 8081
  service:
    type: LoadBalancer
    port: 80
  resources:
    requests: {cpu: "100m", memory: "128Mi"}
    limits:   {cpu: "500m", memory: "256Mi"}

🚀 Deployment Options

🐳

Docker Compose

Full stack in one command — Studio, Flink cluster, SQL Gateway, Kafka.

docker compose up -d
📄

Standalone HTML

Serve studio/index.html from any static server. Use Direct Gateway mode.

python3 -m http.server 8080
☸️

Str:::lab Studio Kube Operator

CRD-based deployment via Helm. GitOps-friendly. No PVC required.

helm install strlabstudio-operator \
  strlabstudio/strlabstudio-operator
☁️

Cloud Flink (Remote)

Connect to Confluent Cloud, Ververica Cloud, or AWS Managed Flink with Direct Gateway + bearer token auth.

Project Manager

The Project Manager lets you save, load, run, and organise your Str:::lab work as named projects. Access it from the topbar via the ⬡ Projects button. Projects are stored in browser localStorage and can be exported and imported as JSON files.

What a project stores

| Field | Description |
| --- | --- |
| SQL tabs | All editor tab titles and SQL content, snapshotted at save time |
| SET config | All SET ... statements from query history — runtime mode, checkpointing, mini-batch, etc. |
| Catalog / Database | Active USE CATALOG and USE context at save time |
| Metadata | Name, description, created/modified timestamps, run count, version tag |

Three tabs

📁

Projects tab

Lists all saved projects with size, tab count, last modified, and run count. Actions per project: Load (restore to editor), ▶ Run (load + execute), ⬇ Export (download JSON), 🗑 Delete. Search bar filters by name or description.

New Project tab

Create a named project with optional description. Checkboxes control whether to snapshot current editor tabs and current SET config from history. The active project is shown in the topbar.

🗄

Storage tab

Shows a progress bar of localStorage usage (5 MB limit). Per-project storage breakdown. Buttons to export all, import from file, or clear all projects. Export before clearing.

Workflow

  1. Write your SQL across tabs. Run SET statements to configure the session.
  2. Open ⬡ Projects → + New Project. Name it, tick snapshot tabs + config, click ⬡ Create Project.
  3. The project appears in the list with an ACTIVE badge. The topbar shows the project name.
  4. Continue working. Click 💾 Save Current State (modal footer) to update the active project.
  5. To restore: open ⬡ Projects → Load on any project. This replaces the editor tabs.
  6. To share: click ⬇ Export on any project row to download a .json file. Import it on any other Studio instance.
Storage limit: Browser localStorage is limited to ~5 MB. Each project is typically 2–20 KB. Export projects you want to keep permanently — localStorage is cleared when browser data is cleared. For large SQL files, keep scripts in version control and use projects for session config and metadata only.

UDF Manager

The UDF Manager is a dedicated feature for the complete User-Defined Function lifecycle — from building to uploading to registering to using — without leaving Str:::lab Studio. Access it from the topbar via the ⨍ UDFs button. It has five tabs.

What are UDFs?

UDFs extend Flink SQL with custom logic that built-in functions cannot express. Flink supports four types:

f(x)

Scalar (UDF)

One row in → one value out. Data masking, classification, format conversion. Can be written in pure SQL (no JAR), Java, Scala, or Python.

Table (UDTF)

One row in → many rows out. Used with CROSS JOIN LATERAL TABLE(...) to explode arrays or parse delimited fields.

Aggregate (UDAGG)

Many rows in → one value out. Maintains an accumulator per group. Supports retraction for streaming tables. Example: VWAP.

Async Table

Non-blocking external enrichment. Issues many concurrent requests without blocking the pipeline. Ideal for REST API or HBase lookups.
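For instance, a table function is invoked with a lateral join — a sketch assuming a registered split_tags(STRING) UDTF that emits one row per tag:

```sql
-- One input row fans out to one row per tag (split_tags is illustrative)
SELECT o.order_id, t.tag
FROM orders AS o,
     LATERAL TABLE(split_tags(o.tags)) AS t(tag);
```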

📚 Library Tab — Browse & Quick-Insert

Click ⟳ Refresh to run SHOW FUNCTIONS and SHOW USER FUNCTIONS against the active session. Results split into User-Defined (blue) and Built-in. Use the search bar and type filter. Click any function name to insert it at the cursor in the editor with parentheses ready.

⬆ Upload JAR Tab — Deploy Directly to the Cluster

Str:::lab Studio can upload a JAR file directly to the Flink JobManager via the same REST endpoint that the Flink Web UI uses: POST /jars/upload. No SSH, no manual file copying. Once uploaded, the JAR is available to all TaskManagers immediately.

  1. Drag-and-drop your .jar file onto the drop zone, or click to browse.
  2. Click ⬆ Upload JAR to Flink Cluster — a live progress bar tracks the upload.
  3. The JARs on cluster list refreshes automatically. Click Use → on any entry to pre-fill the Register tab.
  4. Optionally delete JARs from the cluster using the Delete button on each row.
Prerequisites: The Flink Web UI port (8081) must be proxied at /jobmanager-api/ in your nginx config. If you see a 404 error, check that the proxy is configured. Self-hosted Docker clusters using the included docker-compose.yml have this configured automatically.

⬡ Maven / Gradle Tab — Dependency & Build Config

Maven and Gradle resolve your UDF's dependencies and package everything into a single deployable JAR. This tab generates the correct build configuration and commands so you never have to guess at dependency scopes or version strings.

<dependencies>

    <!--
      Flink APIs — scope="provided"
      These JARs are already on the cluster. DO NOT bundle them.
      "provided" means: compile against them, but exclude from the output JAR.
    -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-table-api-java</artifactId>
        <version>${flink.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-table-common</artifactId>
        <version>${flink.version}</version>
        <scope>provided</scope>
    </dependency>

    <!-- Your own third-party deps — these ARE bundled (default scope) -->
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>32.1.3-jre</version>
    </dependency>

</dependencies>

<!-- Maven Shade Plugin: creates the shaded (fat) JAR -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.5.1</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals><goal>shade</goal></goals>
            <configuration>
                <shadedArtifactAttached>true</shadedArtifactAttached>
                <shadedClassifierName>shaded</shadedClassifierName>
            </configuration>
        </execution>
    </executions>
</plugin>

<!-- Build command: -->
<!-- mvn clean package -DskipTests -->
<!-- Output: target/my-udfs-1.0.0-shaded.jar -->
plugins {
    id 'java'
    // Shadow plugin: creates the fat JAR
    id 'com.github.johnrengelman.shadow' version '8.1.1'
}

dependencies {
    /*
     * compileOnly = Maven "provided"
     * These JARs are already on the Flink cluster.
     * DO NOT use 'implementation' for Flink APIs — it bundles them
     * into your JAR and causes ClassLoader conflicts.
     */
    compileOnly "org.apache.flink:flink-table-api-java:${flinkVersion}"
    compileOnly "org.apache.flink:flink-table-common:${flinkVersion}"

    // Your own third-party deps — use 'implementation' (bundled)
    implementation 'com.google.guava:guava:32.1.3-jre'
}

shadowJar {
    archiveClassifier = 'shaded'
    mergeServiceFiles()
    exclude 'META-INF/*.SF', 'META-INF/*.DSA', 'META-INF/*.RSA'
}

// Build command:
// ./gradlew shadowJar
// Output: build/libs/my-udfs-1.0.0-shaded.jar
| Concept | Explanation |
| --- | --- |
| provided / compileOnly | Flink's own JARs (flink-table-api-java, flink-streaming-java, etc.) exist on every TaskManager. Mark them provided so they compile your code but are not bundled into the output JAR. Bundling them causes ClassCastException at runtime. |
| Shaded / fat JAR | Your UDF JAR must contain all third-party dependencies (e.g. Guava, Jackson, HTTP client) inside a single file. The Maven Shade Plugin and Gradle Shadow Plugin achieve this. The cluster just needs one file. |
| Relocation | If two JARs on the classpath contain the same class (dependency conflict), use Maven Shade relocation or Gradle Shadow relocate() to rename one package. Common with Guava and Jackson. |
| Flink version match | The flink-table-api-java version in your pom.xml must match the cluster's Flink version exactly. A mismatch causes AbstractMethodError or silent wrong behaviour. |
| Python UDFs | Python UDFs don't use Maven or Gradle. They require PyFlink installed on all TaskManagers (pip install apache-flink==<version>). Register them with LANGUAGE PYTHON pointing to a module path. |
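Registering a Python UDF from the last row above is then a one-liner — a sketch assuming a module my_udfs with a function mask_ssn shipped to the cluster (names and path illustrative):

```sql
-- Ship the module to the job (illustrative path)
SET 'python.files' = '/opt/flink/udfs/my_udfs.py';

-- Module-path registration, no JAR involved
CREATE TEMPORARY FUNCTION mask_ssn AS 'my_udfs.mask_ssn' LANGUAGE PYTHON;
```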
⬡ Maven / Gradle tab in the IDE: Open ⨍ UDFs → ⬡ Maven / Gradle to get a generated pom.xml or build.gradle — fill in your group ID, artifact ID, Flink version, Java version, and any extra dependencies, then copy the config and the build commands. The tab explains exactly which file to upload after building.

+ Register Tab — Activate a JAR Function

After uploading a JAR (via the Upload JAR tab or manually), use this tab to write the CREATE FUNCTION statement that makes the class available as a SQL function.

  1. Enter the function name (what you will call in SQL).
  2. Enter the fully-qualified class path (e.g. com.yourcompany.udf.MaskEmail).
  3. Choose Language: Java, Scala, or Python.
  4. Choose Scope: Temporary (session only), Temporary System (all catalogs), or Permanent (catalog-persisted).
  5. Click ⚡ Execute Registration to send the CREATE FUNCTION to the gateway, or Insert into Editor to review first.
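The three scope choices map directly to three CREATE FUNCTION variants (class path taken from the example above):

```sql
-- Temporary: this session only
CREATE TEMPORARY FUNCTION mask_email
  AS 'com.yourcompany.udf.MaskEmail' LANGUAGE JAVA;

-- Temporary System: visible across all catalogs, this session only
CREATE TEMPORARY SYSTEM FUNCTION mask_email
  AS 'com.yourcompany.udf.MaskEmail' LANGUAGE JAVA;

-- Permanent: persisted in the current catalog
CREATE FUNCTION mask_email
  AS 'com.yourcompany.udf.MaskEmail' LANGUAGE JAVA;
```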

✎ SQL UDF Tab — No JAR, No Java

Write a UDF entirely in Flink SQL. No JAR, no build step. Supported in Flink 1.17+.

CREATE TEMPORARY FUNCTION classify_risk(score DOUBLE)
RETURNS STRING LANGUAGE SQL AS $$
  CASE
    WHEN score >= 0.8 THEN 'CRITICAL'
    WHEN score >= 0.5 THEN 'WARNING'
    WHEN score >= 0.2 THEN 'LOW'
    ELSE 'NORMAL'
  END
$$;

SELECT event_id, score, classify_risk(score) AS tier
FROM risk_events;
CREATE TEMPORARY FUNCTION fmt_bytes(b BIGINT)
RETURNS STRING LANGUAGE SQL AS $$
  CASE
    WHEN b >= 1073741824 THEN CONCAT(CAST(b/1073741824 AS STRING), ' GB')
    WHEN b >= 1048576    THEN CONCAT(CAST(b/1048576 AS STRING),    ' MB')
    WHEN b >= 1024       THEN CONCAT(CAST(b/1024 AS STRING),       ' KB')
    ELSE CONCAT(CAST(b AS STRING), ' B')
  END
$$;

SELECT node_id, fmt_bytes(bytes_transferred) AS readable_size
FROM network_events;
CREATE TEMPORARY FUNCTION safe_region(region STRING, fallback STRING)
RETURNS STRING LANGUAGE SQL AS $$
  CASE
    WHEN region IS NULL OR TRIM(region) = '' THEN fallback
    ELSE UPPER(TRIM(region))
  END
$$;

SELECT event_id, safe_region(region, 'UNKNOWN') AS normalised_region
FROM events;

⊞ Templates Tab — Production-Ready Examples

| Group | Templates included |
| --- | --- |
| Scalar Functions | SQL scalar (no JAR), Java scalar with open/close lifecycle, Python scalar with PyFlink decorator |
| Table Functions (UDTF) | Java SplitTags CSV exploder, SQL JSON array UNNEST pattern |
| Aggregate Functions (UDAGG) | Volume-weighted average price (VWAP) with full accumulator and retraction support |
| Async & Lookup Functions | Non-blocking async HTTP enrichment UDF, temporal lookup join with JDBC caching |
| Utility & Best Practices | SHOW FUNCTIONS reference, temporary vs permanent scope guide, full best-practices template |

Complete JAR UDF Workflow

  1. Write the UDF in Java/Scala/Python. Extend ScalarFunction, TableFunction, or AggregateFunction.
  2. Generate build config — open ⬡ Maven / Gradle, fill in your details, copy the generated pom.xml or build.gradle.
  3. Build the shaded JAR: mvn clean package -DskipTests or ./gradlew shadowJar.
  4. Upload to cluster — open ⬆ Upload JAR, drag the shaded JAR, click Upload. Live progress bar. No SSH.
  5. Register — open + Register UDF, fill in function name and class path, click ⚡ Execute Registration.
  6. Use in SQL — open 📚 Library, click Refresh, click the function name to insert at cursor. Write your pipeline.

Demo — Fraud Detection Pipeline with SQL UDFs

The UDF Manager ships with a complete demo SQL file: udf-fraud-detection-demo.sql. It demonstrates all five tabs using a real-time payment fraud detection scenario — 50 transactions per second, 7 SQL UDFs, 4 running Flink jobs, 5 Kafka topics.

| Tab | Content | UDFs used |
| --- | --- | --- |
| Tab 1 — Setup | Session config: runtime mode, checkpointing, mini-batch | — |
| Tab 2 — Register UDFs | 7 CREATE TEMPORARY FUNCTION statements (SQL, no JAR) | All 7 defined here |
| Tab 3 — Tables | datagen source + 4 Kafka sinks + 1 Kafka source (cleansed) | — |
| Tab 4 — Verify | SHOW USER FUNCTIONS + live datagen SELECT with all UDFs | All 7 previewed |
| Tab 5 — Pipelines 1 & 2 | PII masking → Kafka; risk classification → Kafka | mask_card, mask_email, normalize_currency, classify_risk, amount_band, velocity_flag |
| Tab 6 — Pipelines 3 & 4 | TUMBLE(1 min) fraud KPI window; SLA scoring pipeline | classify_risk, amount_band, sla_tier, velocity_flag |
| Tab 7 — Live Monitor | All 7 UDFs in one SELECT — enable ✦ Colour Describe for row colouring | All 7 applied |
-- Register the 7 SQL UDFs — no JAR, no Java required (4 of the 7 shown here)

CREATE TEMPORARY FUNCTION classify_risk(score DOUBLE)
RETURNS STRING LANGUAGE SQL AS $$
  CASE WHEN score >= 0.80 THEN 'CRITICAL'
       WHEN score >= 0.55 THEN 'HIGH'
       WHEN score >= 0.30 THEN 'MEDIUM'
       ELSE 'LOW' END $$;

CREATE TEMPORARY FUNCTION mask_card(card STRING)
RETURNS STRING LANGUAGE SQL AS $$
  CASE WHEN card IS NULL OR CHAR_LENGTH(card) < 8 THEN '****-****-****-****'
       ELSE CONCAT(SUBSTRING(card,1,4),'-****-****-',
                   SUBSTRING(card,CHAR_LENGTH(card)-3,4)) END $$;

CREATE TEMPORARY FUNCTION mask_email(email STRING)
RETURNS STRING LANGUAGE SQL AS $$
  CASE WHEN email IS NULL OR POSITION('@' IN email) <= 2 THEN '**@***.***'
       ELSE CONCAT(SUBSTRING(email,1,2),'***',
                   SUBSTRING(email,POSITION('@' IN email))) END $$;

CREATE TEMPORARY FUNCTION amount_band(amt DOUBLE)
RETURNS STRING LANGUAGE SQL AS $$
  CASE WHEN amt >= 10000 THEN 'WHALE'
       WHEN amt >= 2500  THEN 'LARGE'
       WHEN amt >= 500   THEN 'MEDIUM'
       WHEN amt >= 50    THEN 'SMALL'
       ELSE 'MICRO' END $$;

-- Verify all 7 are registered:
SHOW USER FUNCTIONS;
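The three remaining demo UDFs (normalize_currency, velocity_flag, sla_tier) follow the same CASE-expression shape. The sketches below are hypothetical: the bodies and thresholds are invented for illustration, and the shipped udf-fraud-detection-demo.sql defines the real ones.

```sql
-- Hypothetical sketches: NOT the shipped definitions; thresholds are invented
CREATE TEMPORARY FUNCTION velocity_flag(latency_ms INT)
RETURNS STRING LANGUAGE SQL AS $$
  CASE WHEN latency_ms > 800 THEN 'SLOW'
       WHEN latency_ms > 300 THEN 'DEGRADED'
       ELSE 'NORMAL' END $$;

CREATE TEMPORARY FUNCTION sla_tier(latency_ms INT, score DOUBLE)
RETURNS STRING LANGUAGE SQL AS $$
  CASE WHEN latency_ms <= 200 AND score < 0.30 THEN 'GOLD'
       WHEN latency_ms <= 500 THEN 'SILVER'
       ELSE 'BRONZE' END $$;
```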
-- Pipeline 1: Cleanse raw events — mask PII, normalise fields
-- UDFs: mask_card(), mask_email(), normalize_currency()
INSERT INTO cleansed_sink
SELECT
    txn_id,
    customer_id,
    mask_card(card_number)           AS card_masked,
    mask_email(email)                AS email_masked,
    ROUND(amount_usd, 2)             AS amount_usd,
    normalize_currency(currency_raw) AS currency,
    merchant_id,
    CASE merchant_country
        WHEN 0 THEN 'US'  WHEN 1 THEN 'EU'
        WHEN 2 THEN 'UK'  WHEN 3 THEN 'APAC'
        WHEN 4 THEN 'LATAM' ELSE 'MEA'
    END                              AS merchant_country,
    CASE channel_raw
        WHEN 0 THEN 'ONLINE' WHEN 1 THEN 'MOBILE'
        WHEN 2 THEN 'POS'    ELSE 'ATM'
    END                              AS channel,
    ROUND(risk_score, 4)             AS risk_score,
    latency_ms,
    event_ts
FROM raw_transactions;
-- Pipeline 3: TUMBLE(1 min) fraud KPI aggregation
-- UDFs used inside COUNT(CASE ...) aggregations
INSERT INTO fraud_kpi_sink
SELECT
    TUMBLE_START(event_ts, INTERVAL '1' MINUTE)  AS window_start,
    TUMBLE_END(event_ts,   INTERVAL '1' MINUTE)  AS window_end,
    merchant_country,
    channel,
    COUNT(*)                                      AS total_txns,
    ROUND(SUM(amount_usd), 2)                     AS total_volume_usd,
    ROUND(AVG(risk_score), 4)                     AS avg_risk_score,
    -- UDF inside aggregation: count CRITICAL transactions per window
    COUNT(CASE WHEN classify_risk(risk_score) = 'CRITICAL' THEN 1 END) AS critical_count,
    COUNT(CASE WHEN classify_risk(risk_score) = 'HIGH'     THEN 1 END) AS high_count,
    -- UDF inside aggregation: count whale transactions per window
    COUNT(CASE WHEN amount_band(amount_usd) = 'WHALE' THEN 1 END)     AS whale_count,
    ROUND(
      CAST(COUNT(CASE WHEN classify_risk(risk_score) IN ('CRITICAL','HIGH') THEN 1 END) AS DOUBLE)
      / NULLIF(COUNT(*), 0) * 100.0
    , 2)                                          AS fraud_rate_pct
FROM cleansed_source
GROUP BY TUMBLE(event_ts, INTERVAL '1' MINUTE), merchant_country, channel;
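Pipeline 4 (the SLA scoring pipeline from Tab 6) is not reproduced here. A hypothetical shape, with an invented sink name and column list, could look like the following; only the UDF names and cleansed_source come from the demo itself:

```sql
-- Hypothetical Pipeline 4 sketch: sink name and column list are illustrative
INSERT INTO sla_scored_sink
SELECT
    txn_id,
    customer_id,
    sla_tier(latency_ms, risk_score)  AS sla_tier,
    velocity_flag(latency_ms)         AS velocity,
    classify_risk(risk_score)         AS risk_tier,
    latency_ms,
    event_ts
FROM cleansed_source;
```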
-- Tab 7: All 7 UDFs in a single live SELECT
-- Enable "✦ Colour Describe" — risk_tier and sla_tier drive row coloring:
--   🔴 Red    = CRITICAL  🟡 Yellow = HIGH/MEDIUM  🟢 Green = LOW
SELECT
    txn_id,
    customer_id,
    card_masked,
    ROUND(amount_usd, 2)                AS amount_usd,
    currency,
    merchant_country,
    channel,
    ROUND(risk_score, 3)                AS risk_score,
    classify_risk(risk_score)           AS risk_tier,
    amount_band(amount_usd)             AS amount_band,
    velocity_flag(latency_ms)           AS velocity,
    sla_tier(latency_ms, risk_score)    AS sla_tier,
    latency_ms,
    event_ts
FROM cleansed_source;
Also in Tips & Concepts: The UDFs category (💡 Tips → UDFs) includes a 3-step auto-advancing walkthrough (Write → Register → Use), a SQL UDF quick-start, and a production best-practices summary — all with runnable SQL code examples.

📝 Changelog

v1.0.22 March 2026 (current)
  • Brand rename — product renamed from FlinkSQL Studio to Str:::lab Studio across all UI, docs, localStorage keys (strlabstudio_*), export filenames, and Kubernetes CRDs; Apache Flink trademark attribution added to footer
  • PDF report heading fixed — window.open interceptor rewrites any residual brand references in the generated PDF before printing, including the report H1 heading and all inline text
  • LinkedIn banner — 1200×627px SVG brand image generated for social media (Str:::lab Studio wordmark with feature pills and mock terminal)
v1.0.21 March 2026
  • ⬡ Project Manager — create, load, run, save, and delete named Str:::lab projects; stores SQL tabs, SET config, catalog context, run count; export/import as JSON; per-project storage breakdown; name uniqueness enforced on create and import
  • Tooltips on all Project Manager buttons, checkboxes, and tab headers
  • JAR Upload fix — FormData now sends application/x-java-archive Content-Type (fixes Flink HTTP 500); configurable JobManager URL override field for non-standard ports
  • Maven / Gradle tab — generates complete pom.xml / build.gradle with correct provided/compileOnly scopes, shade/shadow plugin, and annotated build commands
v1.0.20 March 2026
  • ⬡ Project Manager — create, load, run, save, and delete Str:::lab projects; each project stores SQL tabs, SET config, catalog/database context, and metadata; export/import as JSON; storage usage tracking with per-project breakdown
  • JAR Upload fix — FormData now sends application/x-java-archive Content-Type on the jarfile part (fixes HTTP 500 from Flink JobManager); configurable JobManager URL override field added for direct/remote clusters on non-standard ports
  • _getJmBase() rewrite — correctly resolves JobManager URL in proxy mode (nginx swap), direct mode (port swap to 8081), and remote mode (user override field); no longer crashes with red error on tab open when not connected
  • Maven / Gradle tab — generates complete pom.xml / build.gradle with correct provided/compileOnly scopes, shade/shadow plugin, and build commands; explains why Flink JARs must not be bundled
v1.0.19 March 2026
  • Admin Session — privileged connect mode with full cluster job visibility, admin badge in topbar, session activity modal, technical and business/management PDF reports
  • Admin passcode management — default admin1234, changeable via Admin Settings modal after connecting
  • Job scoping — regular sessions only see their own jobs; admin sees all cluster jobs with session attribution
  • Duplicate submission guard — INSERT INTO / pipeline queries blocked if same statement already running
  • Job tagging — pipeline.name SET injected automatically before INSERT statements, with a session label prefix
  • Cancel job confirmation modal — explicit "Are you sure?" dialog before sending cancel signal to JobManager
  • Tips & Concepts modal — 16 tips across 7 categories: IDE features, Flink architecture, window types, state, connectors, performance, admin
  • Load existing sessions — connect screen auto-fetches sessions from gateway after Test Connection for dropdown selection
  • Session persistence — heartbeat keeps sessions alive indefinitely; no idle expiry unless explicitly disconnected
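The job tagging feature from v1.0.19 boils down to a `pipeline.name` SET issued just before the pipeline statement; `pipeline.name` is a standard Flink configuration key, but the label format below is illustrative, not the Studio's exact scheme:

```sql
-- Injected automatically by the Studio before an INSERT (label format is illustrative)
SET 'pipeline.name' = 'session-a1b2 :: fraud_kpi_pipeline';
-- The next INSERT is then submitted to the cluster under that job name
INSERT INTO fraud_kpi_sink
SELECT * FROM cleansed_source;  -- illustrative pipeline body
```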
v1.0.18 March 2026
  • Results display fixed — NOT_READY / EOS handling; Kafka SELECT rows reliably stream into the Results tab
  • Mouse wheel zoom on Job Graph (20%–300%) with zoom-toward-cursor and % indicator
  • Catalog context fix — sidebar refresh no longer hijacks active USE CATALOG context
  • Named workspace export, session isolation, Cancel Job button, full session export
  • ⚡ Recommended Streaming Config snippet, INSERT auto-redirect to Job Graph
v1.0.17 February 2026
  • Job Graph DAG — drag-to-pan, animated edges, fault highlighting, node double-click detail
  • Tab rename (double-click), per-session workspace isolation
v1.0.1 – v1.0.16 December 2025 – January 2026
  • Initial build: SQL editor, session management, Results / Log / Operations tabs
  • Snippet library, query history, catalog browser, performance tracking panel
  • Dark/light theme toggle, workspace export/import, Job Graph tab — first visual pipeline monitoring