8.4. Binary Data Types

The bytea data type allows storage of binary strings; see Table 8-6.

Table 8-6. Binary Data Types

NameStorage SizeDescription
bytea4 bytes plus the actual binary stringvariable-length binary string

A binary string is a sequence of octets (or bytes). Binary strings are distinguished from characters strings by two characteristics: First, binary strings specifically allow storing octets of value zero and other "non-printable" octets. Second, operations on binary strings process the actual bytes, whereas the encoding and processing of character strings depends on locale settings.

When entering bytea values, octets of certain values must be escaped (but all octet values may be escaped) when used as part of a string literal in an SQL statement. In general, to escape an octet, it is converted into the three-digit octal number equivalent of its decimal octet value, and preceded by two backslashes. Table 8-7 contains the characters which must be escaped, and gives the alternate escape sequences where applicable.

Table 8-7. bytea Literal Escaped Octets

Decimal Octet ValueDescriptionEscaped Input RepresentationExampleOutput Representation
0zero octet'\\000'SELECT '\\000'::bytea;\000
39single quote'\'' or '\\047'SELECT '\''::bytea;'
92backslash'\\\\' or '\\134'SELECT '\\\\'::bytea;\\

Note that the result in each of the examples in Table 8-7 was exactly one octet in length, even though the output representation of the zero octet and backslash are more than one character.

The reason that you have to write so many backslashes, as shown in Table 8-7, is that an input string written as a string literal must pass through two parse phases in the PostgreSQL server. The first backslash of each pair is interpreted as an escape character by the string-literal parser and is therefore consumed, leaving the second backslash of the pair. The remaining backslash is then recognized by the bytea input function as starting either a three digit octal value or escaping another backslash. For example, a string literal passed to the server as '\\001' becomes \001 after passing through the string-literal parser. The \001 is then sent to the bytea input function, where it is converted to a single octet with a decimal value of 1. Note that the apostrophe character is not treated specially by bytea, so it follows the normal rules for string literals. (See also Section 4.1.2.1.)

Bytea octets are also escaped in the output. In general, each "non-printable" octet is converted into its equivalent three-digit octal value and preceded by one backslash. Most "printable" octets are represented by their standard representation in the client character set. The octet with decimal value 92 (backslash) has a special alternative output representation. Details are in Table 8-8.

Table 8-8. bytea Output Escaped Octets

Decimal Octet ValueDescriptionEscaped Output RepresentationExampleOutput Result
92backslash\\SELECT '\\134'::bytea;\\
0 to 31 and 127 to 255"non-printable" octets\xxx (octal value)SELECT '\\001'::bytea;\001
32 to 126"printable" octetsASCII representationSELECT '\\176'::bytea;~

Depending on the front end to PostgreSQL you use, you may have additional work to do in terms of escaping and unescaping bytea strings. For example, you may also have to escape line feeds and carriage returns if your interface automatically translates these.

The SQL standard defines a different binary string type, called BLOB or BINARY LARGE OBJECT. The input format is different compared to bytea, but the provided functions and operators are mostly the same.