GNU Tar through 1.35 allows file overwrite via directory traversal in crafted TAR archives, with a certain two-step process. First, the victim must extract an archive that contains a ../ symlink to a critical directory. Second, the victim must extract an archive that contains a critical file, specified via a relative pathname that begins with the symlink name and ends with that critical file's name. Here, the extraction follows the symlink and overwrites the critical file. This bypasses the protection mechanism of “Member name contains '..'” that would occur for a single TAR archive that attempted to specify the critical file via a ../ approach. For example, the first archive can contain x -> ../../../../../home/victim/.ssh and the second archive can contain x/authorized_keys. This can affect server applications that automatically extract any number of user-supplied TAR archives and that relied on the blocking of traversal. This can also affect software installation processes in which tar xf is run more than once (e.g., when installing a package can automatically install two dependencies that are set up as untrusted tarballs instead of official packages).
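The second step succeeds because the member name x/authorized_keys contains no ".." at all; the escape happens only when the symlink already on disk is followed. A minimal sketch of a pre-extraction check that an application might run is shown below. The function name resolves_inside is hypothetical, and the check is illustrative only: it is not part of GNU Tar, and it does not close the race between checking and extracting.

```python
import os


def resolves_inside(dest_dir: str, member_name: str) -> bool:
    """Return True if member_name, extracted under dest_dir, would stay inside
    dest_dir after following any symlinks that already exist on disk.

    Hypothetical helper for illustration; not a GNU Tar feature.
    """
    # Resolve the extraction root itself, in case it is reached via a symlink.
    root = os.path.realpath(dest_dir)
    # realpath follows existing symlinks (such as "x") and collapses "..";
    # components that do not exist yet are appended unchanged.
    target = os.path.realpath(os.path.join(root, member_name))
    # The target is contained only if the root is the common ancestor.
    return os.path.commonpath([root, target]) == root
```

With a symlink x pointing outside the extraction directory, resolves_inside(dest, "x/authorized_keys") returns False even though a plain string scan for ".." would accept the name.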
Weakness
The product uses external input to construct a pathname that should be within a restricted directory, but it does not properly neutralize “../” sequences that can resolve to a location that is outside of that directory.
Extended Description
This allows attackers to traverse the file system to access files or directories that are outside of the restricted directory.
The “../” manipulation is the canonical manipulation for operating systems that use “/” as directory separators, such as UNIX- and Linux-based systems. In some cases, it is useful for bypassing protection schemes in environments for which “/” is supported but not the primary separator, such as Windows, which uses “\” but can also accept “/”.
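Because either separator may be honored, a check for parent-directory components needs to split on both. The following is a small sketch under that assumption; the function name has_parent_component is hypothetical, and a check like this is a detection aid rather than a complete defense.

```python
import re


def has_parent_component(name: str) -> bool:
    """Return True if any path component of name is exactly '..'.

    Splits on both '/' and '\\' so that Windows-style input is treated
    the same way as UNIX-style input. Hypothetical helper for illustration.
    """
    parts = re.split(r"[/\\]", name)
    return ".." in parts
```

Comparing whole components avoids false positives on names that merely contain two dots, such as "a..b".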
Potential Mitigations
- Assume all input is malicious. Use an “accept known good” input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does.
- When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, “boat” may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as “red” or “blue.”
- Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code’s environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.
- When validating filenames, use stringent allowlists that limit the character set to be used. If feasible, only allow a single “.” character in the filename to avoid weaknesses such as CWE-23, and exclude directory separators such as “/” to avoid CWE-36. Use a list of allowable file extensions, which will help to avoid CWE-434.
- Do not rely exclusively on a filtering mechanism that removes potentially dangerous characters. This is equivalent to a denylist, which may be incomplete (CWE-184). For example, filtering “/” is insufficient protection if the filesystem also supports the use of “\” as a directory separator. Another possible error could occur when the filtering is applied in a way that still produces dangerous data (CWE-182). For example, if “../” sequences are removed from the “.../...//” string in a sequential fashion, two instances of “../” would be removed from the original string, but the remaining characters would still form the “../” string.
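The contrast between the filtering and allowlist approaches in the mitigations above can be sketched as follows. The names naive_strip, is_allowed_filename, and ALLOWED_EXTENSIONS are hypothetical, and the character set, length limits, and extension list are assumptions that a real application would choose for itself.

```python
import re

# Hypothetical allowlist of extensions; adjust to the application's needs.
ALLOWED_EXTENSIONS = {"txt", "png", "pdf"}

# One base name from a limited character set, exactly one ".", one extension.
# No directory separators of either kind can match.
_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}\.([A-Za-z0-9]{1,8})")


def naive_strip(path: str) -> str:
    """Denylist-style filter: remove '../' in a single left-to-right pass."""
    return path.replace("../", "")


def is_allowed_filename(name: str) -> bool:
    """Allowlist check: accept only names that match the expected shape."""
    match = _NAME_RE.fullmatch(name)
    return match is not None and match.group(1).lower() in ALLOWED_EXTENSIONS


# The single-pass filter leaves a traversal sequence behind.
print(repr(naive_strip(".../...//")))          # '../'
# The allowlist rejects the same input outright.
print(is_allowed_filename(".../...//"))        # False
print(is_allowed_filename("report.pdf"))       # True
```

Rejecting input that does not match the allowlist avoids the need to predict every encoding or separator variant that a filter would have to strip.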
References