Skip to content

Symlink Troubles #761

Closed
Closed
@AndrewFasano

Description

@AndrewFasano

I believe there are a handful of bugs in how unblob handles symlinks. I'm not too confident about my understanding of unblob internals so I wanted to talk through these instead of just proposing fixes.

I've encountered all these failures with this firmware image: https://legacyfiles.us.dlink.com/DCS-5009L/REVA/FIRMWARE/DCS-5009L_REVA_FIRMWARE_1.00.B1.zip. I'm comparing the unblob output to that generated by running binwalk.

Issue 1: CPIO extractor calls create_symlink with backwards arguments - the source of a symlink is the filename (entry.path) while the destination is where the link points to (I think). Without this fix all the binaries in this firmware that symlink to /bin/busybox are missing in the output archive (i.e., /sbin/arp) as _get_checked_link in file_utils.py sees that /bin/busybox already exists and skips (checking the destination, not the source, since the arguments are swapped).

diff --git a/unblob/handlers/archive/cpio.py b/unblob/handlers/archive/cpio.py
index eacb30a..fecb5dc 100644
--- a/unblob/handlers/archive/cpio.py
+++ b/unblob/handlers/archive/cpio.py
@@ -223,7 +223,7 @@ class CPIOParserBase:
                         self.file[entry.start_offset : entry.start_offset + entry.size]
                     ).decode("utf-8")
                 )
-                fs.create_symlink(src=link_path, dst=entry.path)
+                fs.create_symlink(src=entry.path, dst=link_path)
             elif (
                 stat.S_ISCHR(entry.mode)
                 or stat.S_ISBLK(entry.mode)

Issue 2: create_symlink creates symlinks backwards, they currently point from dst to src. The call to _get_checked_link has it right and correctly checks if the src argument already exists. But, the code to actually create the link would previously create a link from dst -> src instead of from src -> dst. This would raise a FileExistsError if Issue 1 is fixed with no other changes.

diff --git a/unblob/file_utils.py b/unblob/file_utils.py
index 167357d..c6b8677 100644
--- a/unblob/file_utils.py
+++ b/unblob/file_utils.py
@@ -593,12 +593,13 @@ class FileSystem:
             # but they are relocatable
             src = self._path_to_root(dst.parent) / chop_root(src)

-        safe_link = self._get_checked_link(src=dst.parent / src, dst=dst)
+        safe_link = self._get_checked_link(src=src, dst=dst)

         if safe_link:
-            dst = safe_link.dst.absolute_path
-            self._ensure_parent_dir(dst)
-            dst.symlink_to(src)
+            src = safe_link.src.absolute_path
+            self._ensure_parent_dir(src)
+            logger.warning(f"Creating symlink {dst} -> {src}")
+            src.symlink_to(dst)

     def create_hardlink(self, src: Path, dst: Path):
         """Create a new hardlink dst to the existing file src."""

Issue 3: False positives in path traversal detection. This firmware contains a CPIO archive with some valid, non-malicious symlinks, but they're incorrectly flagged as potential path traversals and skipped during extraction. This one has me pretty confused and I'm not sure what the best fix would be.

If I run cpio -itv on the CPIO archive, I see a valid symlink for init to busybox:

lrwxrwxrwx   1 501      501            15 May 14  2014 /sbin/init -> ../bin/busybox

But during unblob extraction, I get a warning and the symlink is skipped:

2024-02-12 05:20.25 [warning  ] Potential path traversal through link Skipped. link_path=sbin/init path=../bin/busybox

This happens for ~20 binaries in this firmware that have relative symlink to busybox that involve .. as part of the paths. I believe this is in part caused by the FSLink constructor failing to create self.dst based on the self.src path, which can maybe be fixed with:

diff --git a/unblob/file_utils.py b/unblob/file_utils.py
index c6b8677..e03d4ec 100644
--- a/unblob/file_utils.py
+++ b/unblob/file_utils.py
@@ -425,7 +425,7 @@ class _FSPath:

 class _FSLink:
     def __init__(self, *, root: Path, src: Path, dst: Path) -> None:
-        self.dst = _FSPath(root=root, path=dst)
+        self.dst = _FSPath(root=root, path=root / src.parent / dst)
         self.src = _FSPath(root=root, path=src)
         self.is_safe = self.dst.is_safe and self.src.is_safe

With this fix, all the files except 2 (e.g., etc_ro/ppp/peers/3g -> /etc/3g )are then created successfully - these remaining 2 have end up with .. in the src field incorrectly after the body of the is_absolute check in create_symlink which incorrectly checks src instead of dst. This can be fixed with:

diff --git a/unblob/file_utils.py b/unblob/file_utils.py
index e03d4ec..5db58d4 100644
--- a/unblob/file_utils.py
+++ b/unblob/file_utils.py
@@ -587,11 +587,21 @@ class FileSystem:
         """Create a symlink dst with the link/content/target src."""
         logger.debug("creating symlink", file_path=dst, link_target=src, _verbosity=3)

-        if src.is_absolute():
-            # convert absolute paths to dst relative paths
-            # these would point to the same path if self.root would be the real root "/"
-            # but they are relocatable
-            src = self._path_to_root(dst.parent) / chop_root(src)
+        if dst.is_absolute():
+            # If the symlink destination is absolute, we need to make it relative to the root
+            # so it can be safely created in the extraction directory.
+            # If the resulting path points to outside of the extraction directory, we skip it.
+            dst = self.root / chop_root(dst)
+            if not is_safe_path(self.root, dst):
+                self.record_problem(
+                    LinkExtractionProblem(
+                        problem="Potential path traversal through symlink",
+                        resolution="Skipped.",
+                        path=str(dst),
+                        link_path=str(src),
+                    )
+                )
+                return

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions