What file formats are known to be unsafe?

  • I recently found out that PDFs can contain viruses, and from the impression I got, it's more than a buffer overflow error (I heard a PDF may visit URLs automatically, but the person sounded unsure). What are some formats that I should be wary of until everything is patched? I remember that at one point something in a VB6 project could execute code when the project was loaded, without it being run. That was dangerous.

  • Answer:

    Some formats can be called inherently insecure because of their complexity and their history of use as attack vectors. Adobe PDF and MS Office files come to mind. Any kind of binary executable is certainly problematic unless sandboxing is deployed.

    In general, though, the risk depends on the application used to open the file, not on the file itself. Even simple formats that cannot embed executable code can be parsed incorrectly by an application, leading to bugs and potential vulnerabilities. Similarly, the sandboxing application may have bugs that allow executable code to escalate its privileges, so I would rate sandboxed executables as about as dangerous as complex file formats.

    It may be possible to have relatively secure file formats by using a data format that can be checked automatically, with an automatically generated parser that needs no information about the file type beyond its grammar. I think the ASN.1 format is a candidate for this, but this kind of technology is used almost nowhere.
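To make the idea of a format that can be checked mechanically a little more concrete, here is a minimal sketch of a strict parser for a toy tag-length-value layout. This is not ASN.1. The one-byte tag and one-byte length layout, the `ALLOWED_TAGS` schema, and the names `parse_tlv` and `FormatError` are all assumptions made for illustration. The point is only that every length is checked against the remaining input before any bytes are read, and anything outside the declared schema is rejected.

```python
from typing import List, Tuple

# Hypothetical schema: the only tags this toy format allows.
ALLOWED_TAGS = {0x01, 0x02}


class FormatError(ValueError):
    """Raised when the input does not match the declared structure."""


def parse_tlv(data: bytes) -> List[Tuple[int, bytes]]:
    """Parse records laid out as: 1-byte tag, 1-byte length, value bytes."""
    records = []
    pos = 0
    while pos < len(data):
        # A record header needs two bytes; refuse anything shorter.
        if len(data) - pos < 2:
            raise FormatError("truncated header")
        tag, length = data[pos], data[pos + 1]
        pos += 2
        if tag not in ALLOWED_TAGS:
            raise FormatError(f"unknown tag {tag:#04x}")
        # Never trust the declared length beyond what is actually present.
        if length > len(data) - pos:
            raise FormatError("declared length exceeds remaining input")
        records.append((tag, data[pos:pos + length]))
        pos += length
    return records
```

For example, `parse_tlv(b"\x01\x02hi\x02\x00")` yields two records, while an input whose length byte claims more data than is present raises `FormatError` instead of reading past the end.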

acidzombie24 at Information Security

Other answers

In theory, any format that requires complicated processing or allows embedding of other formats (especially Flash) can be dangerous. The most relevant issues right now, however, are:

  • Any Microsoft Office files (not so much because of Office vulnerabilities but because these files can embed Flash and exploit its vulnerabilities).

  • PDF files.

  • Obviously, any files that can execute by themselves (executables and batch files of all kinds). "Batch" files on Windows are not only *.bat files but also JavaScript files (*.js), Visual Basic Script files (*.vbs), Windows Script files (*.wsf) and PowerShell with its various file extensions.

  • Archive files (mostly ZIP or RAR), because these are commonly used to compress the file types mentioned above and to sneak them past filters.
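As a rough illustration of why archives matter for filtering, the sketch below flags risky extensions both on a file itself and on the members of a ZIP archive. The extension list is an incomplete assumption and the name `risky_members` is hypothetical. It does not handle RAR, nested archives, or renamed files, and an extension check says nothing about what the content really is.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import List

# Assumed, incomplete list based on the file types named above.
RISKY_EXTENSIONS = {
    ".exe", ".bat", ".js", ".vbs", ".wsf", ".ps1",
    ".pdf", ".doc", ".xls", ".ppt",
}


def has_risky_extension(name: str) -> bool:
    """Case-insensitive check of the final extension only."""
    return PurePosixPath(name.lower()).suffix in RISKY_EXTENSIONS


def risky_members(filename: str, content: bytes) -> List[str]:
    """Return flagged names, looking one level inside ZIP archives."""
    hits = []
    if has_risky_extension(filename):
        hits.append(filename)
    buffer = io.BytesIO(content)
    # Detect ZIP by content, not by name, so a renamed archive is still opened.
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            for member in archive.namelist():
                if has_risky_extension(member):
                    hits.append(f"{filename}!{member}")
    return hits
```

A filter that only looked at the outer name `bundle.zip` would pass an archive containing `run.vbs`; this sketch would report it as `bundle.zip!run.vbs`.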

Wladimir Palant

The PDF problem is probably a reference to an old problem whereby pre-installed PDF plugins would automatically execute JavaScript specified in the URL fragment.

There is no comprehensive list of file formats that are dangerous. Not only would such a list be blacklisting, it would also ignore polyglots (http://en.wikipedia.org/wiki/Polyglot_%28computing%29):

    The term is sometimes applied to programs that are valid in more than one language, but do not strictly perform the same function in each.

For example, it is possible to construct a GIF/JavaScript polyglot (http://www.thinkfu.com/blog/gifjavascript-polyglots) and an HTML/JPEG polyglot (http://lcamtuf.coredump.cx/squirrel/). Any file format that is safe in itself, but for which it is possible to write a polyglot with another, unsafe language, is potentially unsafe.

When a server sends a file, it also sends that file's MIME type in a Content-Type header. All is well when the Content-Type the server asserts is consistent with the context in which that content gets used. What happens when the server does not send a Content-Type? What happens when a file with one Content-Type is sent when a different type is expected? Sadness happens.

Some browsers consider the Content-Type the server asserts to be authoritative, and if the content fails to parse as that type, the content is not rendered. Others ignore the server-asserted type and try to guess the type by sniffing the content. This sniffing can take the form of heuristics such as the suffix of the file name in the URL, the "magic" first couple of bytes of the content, or simply trying to parse the file with different parsers until one fits. The parsers tried are sometimes constrained by the particular tag (for instance, content expected by an img tag would only be parsed according to the native image formats supported by the browser).

The problem is further exacerbated by plugins like Java and Flash, and by different types of caches and "file save" features in browsers, which may or may not remember what Content-Type was asserted by the server.

Further, any binary file format can potentially escalate privileges by tickling buffer overflows in the code that decodes it. If you are trying to serve content from untrusted sources, you need to proxy and normalize it.
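The "magic bytes" heuristic mentioned above can be sketched as a check that a declared Content-Type agrees with the leading bytes of the content. The table below is a deliberately tiny assumption, and the names `sniff_type` and `matches_declared_type` are hypothetical. This is only a first gate: a GIF/JavaScript polyglot starts with a valid GIF signature and would pass it, which is why normalizing the content is still needed.

```python
from typing import Optional

# Assumed, deliberately small table of well-known magic-byte prefixes.
MAGIC_PREFIXES = {
    "image/png": [b"\x89PNG\r\n\x1a\n"],
    "image/gif": [b"GIF87a", b"GIF89a"],
    "image/jpeg": [b"\xff\xd8\xff"],
    "application/pdf": [b"%PDF-"],
}


def sniff_type(content: bytes) -> Optional[str]:
    """Guess a MIME type from the leading bytes, or None if unknown."""
    for mime, prefixes in MAGIC_PREFIXES.items():
        if any(content.startswith(prefix) for prefix in prefixes):
            return mime
    return None


def matches_declared_type(declared: str, content: bytes) -> bool:
    """True only if the declared type is in the table and the bytes agree."""
    # Drop parameters such as "; charset=..." before comparing.
    base = declared.split(";", 1)[0].strip().lower()
    return base in MAGIC_PREFIXES and sniff_type(content) == base
```

Here `matches_declared_type("image/png", b"GIF89a...")` is false because the bytes look like a GIF, and any declared type missing from the table is rejected rather than guessed at.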

Mike Samuel
