12 posts tagged with "Bug Fix"

Airflow PostgresHook Multi-Statement SQL Silently Drops Results? Split by Semicolon and Execute One by One

June 14, 2026 · 6 min read

When an Airflow DAG reads a .sql template file as a single string and passes it to PostgresHook.get_pandas_df(), prior SELECT results are silently dropped — the DAG reports "SQL query returned no results", but copying the same SQL into psql returns data normally.

Encountered this while building AI Analytics — an LLM-powered analytics pipeline where an Airflow DAG reads multi-query report templates from .sql files and executes them.

TL;DR

PostgresHook.get_pandas_df(sql) internally calls pandas.read_sql(sql, conn) → psycopg2 cursor.execute(sql). When sql is a single string with multiple ;-separated SELECTs, the DBAPI only exposes the cursor of the last result set — prior query results are silently dropped with no error. Fix: split by top-level semicolons into a list[str] and call get_pandas_df per statement, or pass the list directly so DbApiHook runs them sequentially.

Symptom

The DAG task executing shop_monthly_overview.sql reports "SQL query returned no results":

sql_count = 1   ← template clearly contains 4 queries
result = "❌ SQL query returned no results"

But the same SQL pasted into psql against the same database with the same parameters returns data for all 4 SELECTs.

Reproduction

Verify get_pandas_df behavior with multi-statement SQL directly inside the Airflow container:

from airflow.providers.postgres.hooks.postgres import PostgresHook

hook = PostgresHook(postgres_conn_id="postgres_default")

# Three SELECTs concatenated into one string
sql = "SELECT 1 AS a; SELECT 2 AS b; SELECT 99 AS c WHERE 1=0;"

df = hook.get_pandas_df(sql)
print(df.columns.tolist())  # ['c']  ← only got the last statement's columns
print(df)                    # Empty  ← the last statement itself returns 0 rows

Expected three result sets, got only the last one (SELECT 99 ... WHERE 1=0, 0 rows). The first two completely disappear with no error or warning.

Root Cause

The call chain is PostgresHook.get_pandas_df → DbApiHook.get_pandas_df → pandas.io.sql.read_sql → psycopg2 cursor.execute(sql).

The DBAPI protocol (PEP 249) allows execute to accept a string with multiple ;-separated statements. PostgreSQL executes all of them, but the cursor only exposes the last result set — this is inherent PostgreSQL wire protocol behavior, not an Airflow or pandas bug.

┌────────────────────────────────────────────────────┐
│ SELECT 1;  ← executed, result set 1 dropped at once │
│ SELECT 2;  ← executed, result set 2 dropped at once │
│ SELECT 99 WHERE 1=0;  ← executed, result set 3 exposed │
└────────────────────────────────────────────────────┘
                       ↓
            pandas.read_sql only fetches result set 3

The root cause in our code is task_execute_sql reading the entire .sql file as a single string and passing it to get_pandas_df:

# ❌ Problematic code
sql_text = open(sql_path).read()  # full text with 4 SELECTs
df = pg_hook.get_pandas_df(sql_text)  # only gets the last result

Why does psql return data? Because the psql frontend actively iterates through all result sets and prints them one by one, while a DBAPI cursor does not.

Solution

Option A (Recommended): Split by top-level semicolon, execute one by one

Suitable for .sql template files — they contain comments, quotes, and multiple queries that need robust splitting.

def split_sql_statements(sql: str) -> list:
    """
    Split SQL by top-level semicolons, correctly handling:
    - Semicolons inside single-quoted strings ('a;b' is not split)
    - SQL-standard '' escape ('it''s' is not split)
    - Semicolons inside -- line comments (-- note; not split is not split)
    """
    statements = []
    buf = []
    i, n = 0, len(sql)
    in_quote = False

    while i < n:
        ch = sql[i]

        # Inside a single-quoted string
        if in_quote:
            buf.append(ch)
            if ch == "'":
                # '' = literal single quote, does not end the string
                if i + 1 < n and sql[i + 1] == "'":
                    buf.append(sql[i + 1])
                    i += 2
                    continue
                in_quote = False
            i += 1
            continue

        # Top level
        if ch == "'":
            in_quote = True
            buf.append(ch)
        elif ch == '-' and i + 1 < n and sql[i + 1] == '-':
            # Line comment, swallow to end of line (; inside is not a split point)
            while i < n and sql[i] != '\n':
                buf.append(sql[i])
                i += 1
            continue
        elif ch == ';':
            stmt = ''.join(buf).strip()
            if stmt:
                statements.append(stmt)
            buf = []
            i += 1
            continue
        else:
            buf.append(ch)
        i += 1

    # Trailing block without a final semicolon
    stmt = ''.join(buf).strip()
    if stmt:
        statements.append(stmt)

    return statements


# Caller
sql_text = open(sql_path).read()
statements = split_sql_statements(sql_text)

# Execute one by one, collect all results
all_results = []
for idx, stmt in enumerate(statements, start=1):
    df = pg_hook.get_pandas_df(stmt)
    if not df.empty:
        all_results.append({
            "sql_index": idx,
            "sql": stmt,
            "data": df.to_dict("records"),
            "columns": df.columns.tolist(),
            "row_count": len(df),
        })

Option B: Pass a list directly to DbApiHook

Airflow's DbApiHook.run and get_records accept list[str] and execute sequentially — but get_pandas_df return behavior in list mode is inconsistent across providers. For production, Option A gives you full control.

Why not `sqlparse.split`?

Community answers often recommend sqlparse.split(sqlparse.format(sql, strip_comments=True)), but strip_comments=True discards comments. If your downstream processor depends on metadata in comments (e.g. -- dimension: shop), you lose context. A hand-rolled splitter preserves the original comment text and gives you control.

Caveats

Do not use sql.split(';') — it will mis-cut semicolons inside quoted strings like WHERE name = 'a;b', and inside -- comment; line comments
split_sql_statements only handles single-quoted strings and -- line comments; if your SQL uses /* block comments */ or dollar-quoted strings ($$...$$), extend the splitter
After the fix, the semantics of sql_index for downstream processors change (1-based sequential index); audit all df.iloc[sql_index] style usages
If your SQL is program-generated rather than file-read, the safer pattern is to build a list at generation time rather than split later
A related trap: if you've also hit issues with SQL expressions being silently parameterized in Drizzle ORM, see Drizzle sql template mixing parameterized values with SQL expressions — same family of "the framework did a transformation you didn't expect" bugs

FAQ

How do I execute multiple SQL statements in Airflow PostgresHook?

Pass list[str] instead of a single string. DbApiHook.get_pandas_df and run accept sql as a list and execute sequentially; a single string with semicolon-separated statements causes psycopg2 to return only the last result set. For production, split yourself and call per-statement so you control result aggregation and sql_index.

Why does get_pandas_df only return the last result for multi-statement SQL?

pandas.io.sql.read_sql calls psycopg2 cursor.execute with the full string; the DBAPI protocol only exposes the cursor of the last result set for multi-statement execution, and prior SELECT results are dropped by the server immediately, with no error or warning. psql returns data because its frontend actively iterates all result sets, while a DBAPI cursor does not.

How do I split SQL by semicolon safely with comments and quotes?

Scan character by character and split only at top-level semicolons outside single-quoted strings and -- line comments. Single-quote literals use the SQL-standard '' escape; do not use str.split(';'), it will mis-cut semicolons inside comments and strings. If you use sqlparse.split, note that strip_comments=True discards the original comment text.

CCLEE

Independent developer, 24 years in e-commerce, focused on grounding AI in real business scenarios.

Work with me

WordPress REST API Image Upload Returns 405? Check Your Hostinger CDN

May 26, 2026 · 4 min read

CCLEE

While building a WooCommerce product import tool for a client, POST /wp-json/wp/v2/media would succeed for the first few images, then suddenly return 405 Not Allowed for all subsequent requests.

TL;DR

Hostinger CDN (hcdn) blocks POST /wp-json/wp/v2/media requests by default. The response headers server: hcdn and x-hcdn-request-id are the smoking gun. Disable CDN or contact Hostinger support to whitelist /wp-json/* POST requests.

The Problem

Uploading images to WordPress Media Library via REST API:

curl -X POST 'https://example.com/wp-json/wp/v2/media' \
  -u 'user:app_password' \
  -H 'Content-Disposition: attachment; filename="product-01.jpg"' \
  -H 'Content-Type: image/jpeg' \
  --data-binary @image.jpg

The first 2-4 images return 201 Created, then all subsequent requests fail with:

<html>
<head><title>405 Not Allowed</title></head>
<body>
<center><h1>405 Not Allowed</h1></center>
<hr><center>nginx</center>
</body>
</html>

This "partial success" pattern is misleading — it looks like rate limiting, but the real cause is entirely different.

Root Cause

Using curl -v to inspect the full response headers revealed:

< HTTP/2 405
< server: hcdn
< x-hcdn-request-id: cfc5ad1198938cd9f1e02ce71ed0ae61-kul-edge1

Key findings:

server: hcdn — This is Hostinger's custom CDN (hcdn), not the origin nginx server
x-hcdn-request-id — CDN edge node ID (kul-edge1 = Kuala Lumpur), confirming the request was blocked at the CDN layer before reaching WordPress

Hostinger CDN's default security rules block POST requests to /wp-json/wp/v2/media. The initial successes were likely due to CDN rule cold-start or cache misses.

Solution

Option 1: Disable CDN (Quick Fix)

Go to Hostinger hPanel → Website → CDN → Disable.

This takes effect immediately but removes CDN acceleration. Suitable for staging environments or emergency fixes.

Option 2: Contact Hostinger Support to Whitelist API Paths (Recommended)

Submit a support ticket requesting to whitelist POST requests to /wp-json/*. Hostinger's Manage panel currently doesn't offer custom CDN rule configuration — you must go through support.

Option 3: Add Retry Logic in Code (Defensive Measure)

Even with correct CDN configuration, retry logic handles occasional CDN throttling:

import time
import random

def upload_image(url, image_bytes, filename, auth, max_retries=3):
    for attempt in range(max_retries):
        resp = httpx.post(
            url,
            content=image_bytes,
            headers={
                "Content-Disposition": f'attachment; filename="{filename}"',
                "Content-Type": "image/jpeg",
            },
            auth=auth,
            timeout=30,
        )
        if resp.status_code != 405:
            return resp
        delay = 3 * (attempt + 1) + random.uniform(0, 2)
        time.sleep(delay)
    resp.raise_for_status()

Troubleshooting Journey

This issue led down several dead ends. Here's the full排查 path for reference:

Hypothesis	Action	Result
WP plugin blocking	Disabled Speed Optimizer / Auto Upload Images	Still 405, ruled out
Rate limiting	Added 2-5s delay between uploads + retry	Still 405, ruled out
REST API disabled	GET `/wp-json/wp/v2/settings`	Returned normally, ruled out
Auth credentials	WC Test Connection	Succeeded, ruled out
CDN blocking	curl -v to inspect response headers	`server: hcdn` confirmed CDN blocking

The turning point was using curl -v and spotting server: hcdn — only then did we realize the requests never reached WordPress.

Important Notes

After disabling CDN, DNS cache may take a few minutes to refresh — don't retry immediately
If your site is on Hostinger and uses REST API for batch operations, test CDN behavior before going live
WooCommerce WC API (/wc/v3/products) uses different authentication (Consumer Key) and is typically unaffected; this mainly impacts WP REST API (/wp-json/wp/v2/*) write operations

FAQ

Why does WordPress REST API image upload return 405 Not Allowed?

Check the server field in response headers. If it shows hcdn (Hostinger CDN) or another CDN identifier, the request is being blocked at the CDN layer before reaching WordPress. Disable the CDN or contact your hosting provider to whitelist the endpoint.

How to tell if 405 comes from CDN or WordPress?

Use curl -v and inspect response headers: a server value of hcdn, cloudflare, or other CDN identifiers indicates CDN-level blocking; a server value of nginx/apache with X-WP-* or X-RateLimit-* headers means the request reached WordPress.

Encountered this issue while building a WooCommerce product import tool for a client. If you're also developing with Hostinger + WordPress and running into REST API issues, reach out.

CCLEE

Independent developer, 24 years in e-commerce, focused on grounding AI in real business scenarios.

Work with me

WordPress Block Theme Changes Not Taking Effect? FSE Development Troubleshooting Guide

April 17, 2026 · 7 min read

CCLEE

Encountered these five issues repeatedly while developing WordPress Block Themes for clients. Each one took significant debugging time. This guide covers the root causes and provides ready-to-use solutions.

TL;DR

Five issues ranked by frequency: file changes not applying (database cache overrides files), block nesting errors (unclosed comments), child theme content not rendering (missing post-content block), SVG icons disappearing (WP_Filesystem polluted by plugins), and WP-CLI mail failures (SMTP plugins don't hook in CLI). Each scenario includes copy-paste diagnostic commands.

Embedding a Timeline in SITE123 Event Pages? First Dodge These 5 Platform Limits

April 10, 2026 · 5 min read

CCLEE

TL;DR

Embedding a self-hosted timeline into a SITE123 event page hits 5 platform walls: Custom Code can't target a page or position, scripts run before DOM is ready, selectors hit hidden elements, float layout pushes adjacent blocks out of place, and platform cache injects the script twice. Fix: JS DOM manipulation + horizontal fishbone layout + DOMContentLoaded wrapper.

Fix React List Key Duplication Causing DOM Errors

March 19, 2026 · 2 min read

CCLEE

Encountered this issue while building an AI Agent chat interface. Here's the root cause and solution.

TL;DR

Date.now() millisecond timestamps can duplicate within the same millisecond. When used as React list keys, this causes DOM errors. Fix by adding a random suffix or using crypto.randomUUID().

Debugging Frontend Deployment Not Updating on Production

March 7, 2026 · 3 min read

TL;DR

Frontend code pushed to Git but new feature missing on production? The root cause is usually outdated build artifacts on the server. Compare local and server dist/ directory timestamps to confirm, then run npm run build on the server.

Fix jose Library JWT Key Format Error in Node.js v24

March 7, 2026 · 2 min read

Encountered this issue while upgrading Node.js version for an enterprise SaaS system. Documenting the root cause and solution.

TL;DR

Node.js v24 changed Web Crypto API implementation. The jose library now requires keys to be KeyObject or CryptoKey type. Wrap your key with crypto.createSecretKey() to fix.

Fix Node.js Environment Variables Reading as undefined

March 7, 2026 · 2 min read

TL;DR

Module-level const URL = process.env.SERVICE_URL executes before dotenv loads, resulting in undefined. Use getter functions const getUrl = () => process.env.SERVICE_URL for lazy evaluation.

Fix Docusaurus i18n Links Losing Language Prefix

March 3, 2026 · 3 min read

TL;DR

In Docusaurus i18n sites, using <a href="/docs/xxx"> absolute paths in English pages redirects to Chinese. Solution: Use <Link to="/docs/xxx"> for page components, and relative paths ./xxx in MD documents.

Fix Docusaurus 3.9 Translation File Validation Error

March 1, 2026 · 2 min read

TL;DR

Docusaurus 3.9 has strict requirements for translation file format. Values in code.json and current.json must be { "message": "xxx" } object format, not plain strings or nested objects.

TL;DR​

Symptom​

Reproduction​

Root Cause​

Solution​

Option A (Recommended): Split by top-level semicolon, execute one by one​

Option B: Pass a list directly to DbApiHook​

Why not sqlparse.split?​

Caveats​

Caveats

FAQ​

How do I execute multiple SQL statements in Airflow PostgresHook?​

Why does get_pandas_df only return the last result for multi-statement SQL?​

How do I split SQL by semicolon safely with comments and quotes?​

TL;DR​

The Problem​

Root Cause​

Solution​

Option 1: Disable CDN (Quick Fix)​

Option 2: Contact Hostinger Support to Whitelist API Paths (Recommended)​

Option 3: Add Retry Logic in Code (Defensive Measure)​

Troubleshooting Journey​

Important Notes

FAQ​

Why does WordPress REST API image upload return 405 Not Allowed?​

How to tell if 405 comes from CDN or WordPress?​

TL;DR​

TL;DR​

TL;DR​

TL;DR​

TL;DR​

TL;DR​

TL;DR​

TL;DR​

TL;DR

Symptom

Reproduction

Root Cause

Solution

Option A (Recommended): Split by top-level semicolon, execute one by one

Option B: Pass a list directly to DbApiHook

Why not `sqlparse.split`?

Caveats

FAQ

How do I execute multiple SQL statements in Airflow PostgresHook?

Why does get_pandas_df only return the last result for multi-statement SQL?

How do I split SQL by semicolon safely with comments and quotes?

TL;DR

The Problem

Root Cause

Solution

Option 1: Disable CDN (Quick Fix)

Option 2: Contact Hostinger Support to Whitelist API Paths (Recommended)

Option 3: Add Retry Logic in Code (Defensive Measure)

Troubleshooting Journey

FAQ

Why does WordPress REST API image upload return 405 Not Allowed?

How to tell if 405 comes from CDN or WordPress?

TL;DR

TL;DR

TL;DR

TL;DR

TL;DR

TL;DR

TL;DR

TL;DR