Skip to content

FileExistsError during extraction with Netgear firmware #754

Closed
@AndrewFasano

Description

@AndrewFasano

Describe the bug
During extraction of at least 29 NETGEAR firmware images, unblob may try creating the same output file twice triggering an exception. As a result, some files that should be extracted are not.

To Reproduce
Steps to reproduce the behavior:

  1. Download a sample firmware to trigger the bug with: wget https://www.downloads.netgear.com/files/GDC/M4100/M4100-V10.0.2.20.zip
  2. Launch unblob with command unblob -v M4100-V10.0.2.20.zip
  3. See error:
2024-02-10 23:39.13 [error    ] Unknown error happened while extracting chunk pid=2295991
Traceback (most recent call last):
  File "/unblob/unblob/processing.py", line 607, in _extract_chunk
    if result := chunk.extract(inpath, extract_dir):
  File "/unblob/unblob/models.py", line 115, in extract
    return self.handler.extract(inpath, outdir)
  File "/unblob/unblob/models.py", line 452, in extract
    return self.EXTRACTOR.extract(inpath, outdir)
  File "/unblob/unblob/handlers/archive/cpio.py", line 384, in extract
    parser.dump_entries(fs)
  File "/unblob/unblob/handlers/archive/cpio.py", line 215, in dump_entries
    fs.carve(entry.path, self.file, entry.start_offset, entry.size, mode=entry.mode & 0o777)
  File "/unblob/unblob/file_utils.py", line 511, in carve
    carve(safe_path, file, start_offset, size, mode=mode)
  File "/unblob/unblob/file_utils.py", line 294, in carve
    with carve_path.open("xb") as f:
  File "/usr/lib/python3.10/pathlib.py", line 1119, in open
    return self._accessor.open(self, mode, buffering, encoding, errors,
FileExistsError: [Errno 17] File exists: '/tmp/tmp1151iav4/M4100_V10.0.2.20.zip_extract/m4100v10.0.2.20.stk_extract/1201148-2097967.lzma_extract/lzma.uncompressed_extract/lib/libthread_db-1.0.so'

Expected behavior
This error should not be raised, instead additional files should be extracted. I made a simpel change in file_utils.py's carve method (see below) to return early if the target file already exists and with this change an extra 75 files are created in [extract_dir]/m4100v10.0.2.20.stk_extract/1201148-2097967.lzma_extract/lzma.uncompressed_extract. I doubt this is the right fix, but it shows that this bug prevents some files from being extracted.

Environment information:

  • OS: Ubuntu 22.04
  • Docker
Linux b4935d734f27 6.2.2 #3 SMP PREEMPT_DYNAMIC Wed Mar  8 12:03:22 EST 2023 x86_64 x86_64 x86_64 GNU/Linux

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.3 LTS"

The following executables found installed, which are needed by unblob:
    7z                          ✓
    debugfs                     ✓
    jefferson                   ✓
    lz4                         ✓
    lziprecover                 ✓
    lzop                        ✓
    sasquatch                   ✓
    sasquatch-v4be              ✓
    simg2img                    ✓
    ubireader_extract_files     ✓
    ubireader_extract_images    ✓
    unar                        ✓
    zstd                        ✓

Additional context
I found this bug while doing some large-scale evaluations of filesystems produced by binwalk and unblob using fw2tar.

My (likely-incorrect) patch that results in additional files being created:

diff --git a/unblob/file_utils.py b/unblob/file_utils.py
index 21e887b..3db4b98 100644
--- a/unblob/file_utils.py
+++ b/unblob/file_utils.py
@@ -291,6 +291,9 @@ def carve(carve_path: Path, file: File, start_offset: int, size: int):
     """Extract part of a file."""
     carve_path.parent.mkdir(parents=True, exist_ok=True)

+    if carve_path.exists():
+        print(f"Warning not replacing {carve_path}")
+        return
     with carve_path.open("xb") as f:
         for data in iterate_file(file, start_offset, size):
             f.write(data)

After fixing this, I got another error along the same vein in file_utils which I patched with:

diff --git a/unblob/file_utils.py b/unblob/file_utils.py
index 21e887b..3db4b98 100644
--- a/unblob/file_utils.py
+++ b/unblob/file_utils.py

@@ -579,7 +582,8 @@ class FileSystem:
         if safe_link:
             dst = safe_link.dst.absolute_path
             self._ensure_parent_dir(dst)
-            dst.symlink_to(src)
+            if not dst.exists():
+                dst.symlink_to(src)

     def create_hardlink(self, src: Path, dst: Path):
         """Create a new hardlink dst to the existing file src."""

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions