Saving a Post Returns a 502? Synchronous `media_handle_sideload()` Blows Past the Edge Proxy Timeout

On a WordPress-based editorial workflow I worked on, incoming posts often referenced external images: URLs pointing at files on somebody else's server. To keep every asset tidy in one place, I wrote code that ran on save: fetch each remote image with download_url() and import it into the media library with media_handle_sideload(). It is a very convenient pair. Two calls and everything is handled: downloading the file, moving it into uploads, and generating every thumbnail size registered by the theme and plugins.

One detail turned out to be fatal: all of that ran synchronously, inside the save request itself. As long as posts carried one or two images, nobody noticed anything. The moment posts started arriving with several images at once, the reports came in: a user hits save, the spinner grinds on, and what comes back is a 502 page.

A 502 on save is the most terrifying kind of error for a user, not because of the technical details but because of what it appears to mean: the post looks like it failed to save. Some people panicked, some hammered the save button, some rewrote their work from scratch. Yet when I checked the database, the post had almost always been saved correctly. It was not the save that died. It was the response.

This is roughly what the offending code looked like:

add_action('save_post', function ($post_id) {
    $remote_urls = extract_remote_image_urls($post_id);
 
    foreach ($remote_urls as $url) {
        // Download + copy into uploads + generate
        // every registered thumbnail size. Synchronously.
        $tmp = download_url($url);
        media_handle_sideload(
            ['name' => basename($url), 'tmp_name' => $tmp],
            $post_id
        );
    }
});

Why this happens

The math is simple and brutal. Sideloading one image means an outbound HTTP request to download it, a write into the uploads folder, and then a resize for every registered thumbnail size. And image resizing is not cheap CPU work. A single large image can take several seconds on its own. Multiply by N images per post and M thumbnail sizes per image, and one save request can run for tens of seconds to minutes.
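
To put rough numbers on that, here is a back-of-the-envelope sketch. The function name and every timing figure in it are illustrative assumptions, not measurements from the incident.

```php
<?php
// Hypothetical estimator: all timings passed in are assumed, not measured.
function estimate_save_seconds(
    int $images,
    int $sizes_per_image,
    float $download_seconds,
    float $resize_seconds
): float {
    // Each image costs one download plus one resize per registered size.
    $per_image = $download_seconds + $sizes_per_image * $resize_seconds;

    return $images * $per_image;
}

// Assumed example: 8 images, 6 thumbnail sizes,
// 1.5 s per download, 0.5 s per resize.
$estimate = estimate_save_seconds(8, 6, 1.5, 0.5);

printf("Estimated save time: %.1f s\n", $estimate); // Estimated save time: 36.0 s
```

With those assumed figures a single save takes 36 seconds, squarely in the "tens of seconds" range, and the cost grows linearly with both the number of images and the number of registered sizes.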

Meanwhile, in front of the server sits an edge proxy with a hard timeout. The proxy does not care that PHP is still busy resizing; once the limit passes, it cuts the connection and hands the user a 502. PHP itself usually keeps running to completion behind the scenes, which is exactly why the post was saved even though the user saw an error. The root cause was not a bug in any one function. It was an architectural decision: seconds-to-minutes of bulk media work stuffed into a single request a human is waiting on, behind a proxy with a finite supply of patience.

The fix

The fix is one word: decouple. The save request has to return as fast as possible, and the heavy work has to move behind the scenes. I split it into three steps.

First, on save, do only the cheap thing. Store the list of remote URLs in post meta and let the content render those images hotlinked for now. The post displays normally, the images are still visible, and the save response is instant again:

add_action('save_post', function ($post_id) {
    // Autosaves and revisions also fire save_post; skip them.
    if (wp_is_post_autosave($post_id) || wp_is_post_revision($post_id)) {
        return;
    }

    $remote_urls = extract_remote_image_urls($post_id);

    if (empty($remote_urls)) {
        return;
    }

    // Cheap: remember the URLs, keep hotlinking them for now.
    update_post_meta($post_id, '_pending_image_imports', $remote_urls);

    // Heavy work happens later, outside this request.
    wp_schedule_single_event(time() + 10, 'import_post_images', [$post_id]);
});
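
The extract_remote_image_urls() helper is not shown above. As a rough idea of what such a helper has to do, here is a minimal sketch that works on an HTML string instead of a post ID. The function name find_remote_image_urls() and its parameters are my own assumptions for illustration, and a regex is only a crude stand-in for a real HTML parser.

```php
<?php
// Hypothetical helper: returns the <img> URLs in $html whose host differs
// from $site_host. Not the production extract_remote_image_urls().
function find_remote_image_urls(string $html, string $site_host): array
{
    // Match quoted src attributes on <img> tags (single or double quotes).
    $pattern = '/<img\b[^>]*\ssrc\s*=\s*(["\'])(.*?)\1/is';

    if (!preg_match_all($pattern, $html, $matches)) {
        return [];
    }

    $urls = [];
    foreach ($matches[2] as $src) {
        // Attribute values are HTML-encoded (e.g. &amp; in query strings).
        $src  = html_entity_decode($src);
        $host = parse_url($src, PHP_URL_HOST);

        // Relative URLs have no host; same-host URLs are already local.
        if (!is_string($host) || strcasecmp($host, $site_host) === 0) {
            continue;
        }

        $urls[$src] = true; // Array keys deduplicate repeated images.
    }

    return array_keys($urls);
}
```

In WordPress the HTML would presumably come from the post's content and the host from the site URL. Either way the function does no network I/O, so it stays cheap enough to run inside the save request.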

Second, the real import runs as an async job via wp_schedule_single_event(). The handler sideloads the images one by one, swaps the URLs in the content and meta as each one finishes, and if it starts approaching the PHP time limit, it saves its progress and schedules itself again.

Third, the job has to be idempotent: every imported attachment gets a meta entry recording its source URL, and before sideloading, the handler first checks whether that URL was already imported. If so, skip it. That way the job can die mid-run and be re-run at any time without duplicating images:

add_action('import_post_images', function ($post_id) {
    // WP-Cron runs outside wp-admin, where these helpers are not loaded.
    require_once ABSPATH . 'wp-admin/includes/file.php';
    require_once ABSPATH . 'wp-admin/includes/media.php';
    require_once ABSPATH . 'wp-admin/includes/image.php';

    $started = time();
    $pending = get_post_meta($post_id, '_pending_image_imports', true);

    if (empty($pending) || !is_array($pending)) {
        return;
    }

    foreach ($pending as $index => $url) {
        // Idempotent: skip URLs already imported, keyed by a
        // source-url meta stored on the attachment.
        if (find_attachment_by_source_url($url)) {
            unset($pending[$index]);
            continue;
        }

        // Close to the PHP time limit? Save progress and re-schedule.
        if (time() - $started > 20) {
            update_post_meta($post_id, '_pending_image_imports', array_values($pending));
            wp_schedule_single_event(time() + 10, 'import_post_images', [$post_id]);
            return;
        }

        $tmp = download_url($url);

        if (is_wp_error($tmp)) {
            continue; // Download failed: stays pending, a later run retries it.
        }

        // Name the file from the URL path only, so a query string
        // does not end up in the file extension.
        $name = basename((string) wp_parse_url($url, PHP_URL_PATH));
        $attachment_id = media_handle_sideload(
            ['name' => $name, 'tmp_name' => $tmp],
            $post_id
        );

        if (is_wp_error($attachment_id)) {
            wp_delete_file($tmp); // Do not leave the temp file behind.
            continue; // Stays pending, a later run retries it.
        }

        update_post_meta($attachment_id, '_source_url', $url);
        replace_image_url_in_post($post_id, $url, wp_get_attachment_url($attachment_id));

        unset($pending[$index]);
        update_post_meta($post_id, '_pending_image_imports', array_values($pending));
    }

    if (empty($pending)) {
        delete_post_meta($post_id, '_pending_image_imports');
    } else {
        // Something failed: keep the rest pending and retry later.
        update_post_meta($post_id, '_pending_image_imports', array_values($pending));
        wp_schedule_single_event(time() + 60, 'import_post_images', [$post_id]);
    }
});
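
The 20-second figure in the handler is a hard-coded guess at "close to the PHP time limit". One way to derive it from the configured limit instead is sketched below. The helper name import_time_budget(), the two-thirds headroom, and the 20-second cap are all assumptions for illustration.

```php
<?php
// Hypothetical helper: how many seconds one run may spend importing
// before it should save progress and re-schedule itself.
function import_time_budget(int $max_execution_time, int $cap = 20): int
{
    // 0 (or less) means "no limit" in PHP; fall back to the cap.
    if ($max_execution_time <= 0) {
        return $cap;
    }

    // Use two thirds of the limit, leaving headroom for the image
    // that is still in flight when the check runs.
    $budget = intdiv($max_execution_time * 2, 3);

    // Never below 1 second, never above the cap.
    return max(1, min($cap, $budget));
}

// ini_get() returns a string such as "30"; cast it before use.
$budget = import_time_budget((int) ini_get('max_execution_time'));
```

The handler would then compare time() - $started against $budget rather than a literal. The check only runs between images, so the headroom has to cover the slowest single image.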

Note that the pending list is updated after every single image, not once at the end. That is what makes the job resumable: if the process dies after image three, the next run starts at image four instead of from zero. In the production version I also added a per-URL retry cap, so a single dead link cannot keep the job spinning forever. The end result: saves return in a few hundred milliseconds, images are visible from the start as hotlinks, and a few minutes later everything has been replaced by local copies from the media library, with no human waiting on any of it.
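
The retry cap is not part of the handler shown above. A minimal sketch of the idea follows, with the bookkeeping kept in a plain array so it runs on its own. The constant, the function name, and the limit of three attempts are assumptions, not the production values.

```php
<?php
// Hypothetical sketch of a per-URL retry cap. $attempts maps
// URL => number of failed tries so far; in WordPress it could
// live in post meta next to the pending list.
const MAX_IMPORT_ATTEMPTS = 3; // Assumed value.

/**
 * Records one failed attempt for $url.
 *
 * @return array Two elements: the updated counts, and whether
 *               the URL should stay pending for another try.
 */
function record_failed_import(array $attempts, string $url): array
{
    $attempts[$url] = ($attempts[$url] ?? 0) + 1;

    return [$attempts, $attempts[$url] < MAX_IMPORT_ATTEMPTS];
}

$attempts = [];
$url = 'https://example.com/dead.jpg';

[$attempts, $retry] = record_failed_import($attempts, $url); // 1st failure: retry
[$attempts, $retry] = record_failed_import($attempts, $url); // 2nd failure: retry
[$attempts, $retry] = record_failed_import($attempts, $url); // 3rd failure: give up

var_dump($retry); // bool(false)
```

When the flag comes back false, the handler would drop the URL from the pending list and leave that image hotlinked, rather than re-scheduling itself for a link that will never resolve.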

The takeaway

The checklist I took home from this incident:

Never do bulk media work inside a request a user is waiting on. Downloads and thumbnail generation are seconds-to-minutes work, not milliseconds.
Save fast, process later. The response only needs to confirm the data is safe, not wait for every derivative to be generated.
Hotlink-then-replace is an acceptable intermediate state. Serving images from someone else's server for a few minutes beats a save that dies with a 502.
Async jobs must be resumable and idempotent, because sooner or later they will be interrupted. Design them so re-running is always safe.

And when a user reports a 502 but the data turns out to be saved, I now know exactly where to look: some heavy work is hitching a ride on the wrong request.