


1. Overview of the Python Imaging Library

PIL is short for Python Imaging Library, an image processing library for Python.



The Python Imaging Library adds image processing capabilities to your Python interpreter.


This library provides extensive file format support, an efficient internal representation, and fairly powerful image processing capabilities.


The core image library is designed for fast access to data stored in a few basic pixel formats. It should provide a solid foundation for a general image processing tool.


Let’s look at a few possible uses of this library:


  • Image Archives

The Python Imaging Library is ideal for image archival and batch processing applications. You can use the library to create thumbnails, convert between file formats, print images, etc.


The current version identifies and reads a large number of formats. Write support is intentionally restricted to the most commonly used interchange and presentation formats.


  • Image Display

The current release includes Tk PhotoImage and BitmapImage interfaces, as well as a Windows DIB interface that can be used with PythonWin and other Windows-based toolkits. Many other GUI toolkits come with some kind of PIL support.

For debugging, there’s also a show method which saves an image to disk, and calls an external display utility.


  • Image Processing

The library contains basic image processing functionality, including point operations, filtering with a set of built-in convolution kernels, and colour space conversions.


The library also supports image resizing, rotation and arbitrary affine transforms.


There’s a histogram method allowing you to pull some statistics out of an image. This can be used for automatic contrast enhancement, and for global statistical analysis.
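As a rough illustration of the histogram-based contrast enhancement mentioned above, the sketch below stretches a greyscale image so that its darkest value maps to 0 and its brightest to 255. It assumes Python 3 with Pillow (the maintained PIL fork, imported as `from PIL import Image`) rather than the classic `import Image` used elsewhere in this text; the synthetic image and the helper name `stretch_contrast` are made up for the example.

```python
from PIL import Image  # assumption: Pillow, the maintained PIL fork


def stretch_contrast(im):
    """Linearly stretch an "L" image so its values span 0..255."""
    hist = im.histogram()  # 256 counts, one per grey level
    used = [level for level, count in enumerate(hist) if count]
    lo, hi = used[0], used[-1]
    if lo == hi:
        return im.copy()  # flat image: nothing to stretch
    # Clamp explicitly so every lookup-table entry stays within 0..255.
    return im.point(lambda v: max(0, min(255, (v - lo) * 255 // (hi - lo))))


# Synthetic 4x2 greyscale image: seven pixels at 200, one pixel at 10.
im = Image.new("L", (4, 2), 200)
im.putpixel((0, 0), 10)

hist = im.histogram()
out = stretch_contrast(im)
```

Here `hist` has one entry per grey level, so `hist[200]` counts the bright pixels, and `out` holds the stretched copy while `im` is left untouched.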



Using the Image Class

The most important class in the Python Imaging Library is the Image class, defined in the module with the same name. You can create instances of this class in several ways; either by loading images from files, processing other images, or creating images from scratch.


To load an image from a file, use the open function in the Image module.


>>> import Image
>>> im = Image.open("lena.ppm")

If successful, this function returns an Image object. You can now use instance attributes to examine the file contents.


>>> print im.format, im.size, im.mode
PPM (512, 512) RGB

The format attribute identifies the source of an image. If the image was not read from a file, it is set to None. The size attribute is a 2-tuple containing width and height (in pixels). The mode attribute defines the number and names of the bands in the image, and also the pixel type and depth. Common modes are “L” (luminance) for greyscale images, “RGB” for true colour images, and “CMYK” for pre-press images.
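To see these attributes on an image that was not read from a file, here is a small sketch. It assumes Python 3 with Pillow (imported as `from PIL import Image`), not the classic PIL import shown above; the variable names are illustrative only.

```python
from PIL import Image  # assumption: Pillow, the maintained PIL fork

# An image created from scratch has no source file, so format is None.
im = Image.new("RGB", (512, 512))

width, height = im.size  # size is a (width, height) 2-tuple
summary = (im.format, im.size, im.mode)
```

An image returned by `Image.open` would report its file format (for example "PPM") in the first position instead of `None`.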


If the file cannot be opened, an IOError exception is raised.


Once you have an instance of the Image class, you can use the methods defined by this class to process and manipulate the image. For example, let’s display the image we just loaded:

>>> im.show()

(The standard version of show is not very efficient, since it saves the image to a temporary file and calls the xv utility to display the image. If you don’t have xv installed, it won’t even work. When it does work though, it is very handy for debugging and tests.)



The following sections provide an overview of the different functions provided in this library.


Reading and Writing Images

The Python Imaging Library supports a wide variety of image file formats. To read files from disk, use the open function in the Image module. You don’t have to know the file format to open a file. The library automatically determines the format based on the contents of the file.


To save a file, use the save method of the Image class. When saving files, the name becomes important. Unless you specify the format, the library uses the filename extension to discover which file storage format to use.


  • Convert files to JPEG
import os, sys
import Image

for infile in sys.argv[1:]:
    f, e = os.path.splitext(infile)
    outfile = f + ".jpg"
    if infile != outfile:
        try:
            # the output format is inferred from the ".jpg" extension
            Image.open(infile).save(outfile)
        except IOError:
            print "cannot convert", infile

A second argument can be supplied to the save method which explicitly specifies a file format. If you use a non-standard extension, you must always specify the format this way:


  • Create JPEG Thumbnails
import os, sys
import Image

size = 128, 128

for infile in sys.argv[1:]:
    outfile = os.path.splitext(infile)[0] + ".thumbnail"
    if infile != outfile:
        try:
            im = Image.open(infile)
            im.thumbnail(size)  # shrink in place to fit within size
            # ".thumbnail" is not a standard extension, so name the format
            im.save(outfile, "JPEG")
        except IOError:
            print "cannot create thumbnail for", infile

It is important to note that the library doesn’t decode or load the raster data unless it really has to. When you open a file, the file header is read to determine the file format and extract things like mode, size, and other properties required to decode the file, but the rest of the file is not processed until later.


This means that opening an image file is a fast operation, which is independent of the file size and compression type. Here’s a simple script to quickly identify a set of image files:


  • Identify Image Files
import sys
import Image

for infile in sys.argv[1:]:
    try:
        im = Image.open(infile)
        print infile, im.format, "%dx%d" % im.size, im.mode
    except IOError:
        pass  # not an image file, or unreadable: skip it

Cutting, Pasting and Merging Images

The Image class contains methods allowing you to manipulate regions within an image. To extract a sub-rectangle from an image, use the crop method.


  • Copying a subrectangle from an image
box = (100, 100, 400, 400)
region = im.crop(box)

The region is defined by a 4-tuple, where coordinates are (left, upper, right, lower). The Python Imaging Library uses a coordinate system with (0, 0) in the upper left corner. Also note that coordinates refer to positions between the pixels, so the region in the above example is exactly 300x300 pixels.
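Because the coordinates sit between pixels, the size of a region is simply the difference of its edges. A minimal sketch of that arithmetic in plain Python (the helper name `box_size` is hypothetical, not part of the library):

```python
def box_size(box):
    """Return (width, height) of a (left, upper, right, lower) box."""
    left, upper, right, lower = box
    return right - left, lower - upper


box = (100, 100, 400, 400)
size = box_size(box)  # (300, 300): 400 - 100 in each direction
```

No "+1" correction is needed, which is what you would have to add if the coordinates named the first and last pixels inclusively.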


The region could now be processed in a certain manner and pasted back.


  • Processing a subrectangle, and pasting it back
region = region.transpose(Image.ROTATE_180)
im.paste(region, box)

When pasting regions back, the size of the region must match the given region exactly. In addition, the region cannot extend outside the image. However, the modes of the original image and the region do not need to match. If they don’t, the region is automatically converted before being pasted (see the section on Colour Transforms below for details).


Here’s an additional example:

  • Rolling an image
def roll(image, delta):
    "Roll an image sideways"

    xsize, ysize = image.size

    delta = delta % xsize
    if delta == 0: return image

    part1 = image.crop((0, 0, delta, ysize))
    part2 = image.crop((delta, 0, xsize, ysize))
    image.paste(part2, (0, 0, xsize-delta, ysize))
    image.paste(part1, (xsize-delta, 0, xsize, ysize))

    return image

For more advanced tricks, the paste method can also take a transparency mask as an optional argument. In this mask, the value 255 indicates that the pasted image is opaque in that position (that is, the pasted image should be used as is). The value 0 means that the pasted image is completely transparent. Values in-between indicate different levels of transparency.
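The mask behaviour can be checked on a tiny image. This sketch assumes Python 3 with Pillow (imported as `from PIL import Image`); it only uses the two extreme mask values, 0 and 255, and all variable names are illustrative.

```python
from PIL import Image  # assumption: Pillow, the maintained PIL fork

base = Image.new("L", (2, 1), 0)     # black background, 2 pixels wide
patch = Image.new("L", (2, 1), 200)  # light grey image to paste

# Mask: left pixel fully transparent (0), right pixel fully opaque (255).
mask = Image.new("L", (2, 1), 0)
mask.putpixel((1, 0), 255)

base.paste(patch, (0, 0), mask)
result = [base.getpixel((0, 0)), base.getpixel((1, 0))]  # [0, 200]
```

The left pixel keeps the background value because the mask is 0 there, while the right pixel takes the pasted value.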


The Python Imaging Library also allows you to work with the individual bands of a multi-band image, such as an RGB image. The split method creates a set of new images, each containing one band from the original multi-band image. The merge function takes a mode and a tuple of images, and combines them into a new image. The following sample swaps the three bands of an RGB image:


  • Splitting and merging bands
r, g, b = im.split()
im = Image.merge("RGB", (b, g, r))

Note that for a single-band image, split returns the image itself. To work with individual colour bands, you may want to convert the image to “RGB” first.


Geometrical Transforms

The Image class contains methods to resize and rotate an image. The former takes a tuple giving the new size, the latter the angle in degrees counter-clockwise.


  • Simple geometry transforms
out = im.resize((128, 128))
out = im.rotate(45) # degrees counter-clockwise

To rotate the image in 90 degree steps, you can either use the rotate method or the transpose method. The latter can also be used to flip an image around its horizontal or vertical axis.


  • Transposing an image
out = im.transpose(Image.FLIP_LEFT_RIGHT)
out = im.transpose(Image.FLIP_TOP_BOTTOM)
out = im.transpose(Image.ROTATE_90)
out = im.transpose(Image.ROTATE_180)
out = im.transpose(Image.ROTATE_270)

There’s no difference in performance or result between transpose(ROTATE) and corresponding rotate operations.


A more general form of image transformations can be carried out via the transform method. See the reference section for details.


Colour Transforms

The Python Imaging Library allows you to convert images between different pixel representations using the convert method.


  • Converting between modes
im = Image.open("lena.ppm").convert("L")

The library supports transformations between each supported mode and the “L” and “RGB” modes. To convert between other modes, you may have to use an intermediate image (typically an “RGB” image).


Image Enhancement

The Python Imaging Library provides a number of methods and modules that can be used to enhance images.


Filters

The ImageFilter module contains a number of pre-defined enhancement filters that can be used with the filter method.


  • Applying filters
import ImageFilter
out = im.filter(ImageFilter.DETAIL)

Point Operations

The point method can be used to translate the pixel values of an image (e.g. image contrast manipulation). In most cases, a function object expecting one argument can be passed to this method. Each pixel is processed according to that function:


  • Applying point transforms
# multiply each pixel by 1.2
out = im.point(lambda i: i * 1.2)

Using the above technique, you can quickly apply any simple expression to an image. You can also combine the point and paste methods to selectively modify an image:


  • Processing individual bands
# split the image into individual bands
source = im.split()

R, G, B = 0, 1, 2

# select regions where red is less than 100
mask = source[R].point(lambda i: i < 100 and 255)

# process the green band
out = source[G].point(lambda i: i * 0.7)

# paste the processed band back, but only where red was < 100
source[G].paste(out, None, mask)

# build a new multiband image
im = Image.merge(im.mode, source)

Note the syntax used to create the mask:


imout = im.point(lambda i: expression and 255)

Python evaluates only as much of a logical expression as is necessary to determine the outcome, and returns the last value examined as the result of the expression. So if the expression above is false (0), Python does not look at the second operand, and thus returns 0. Otherwise, it returns 255.
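The short-circuit behaviour is easy to try outside of any image code. The sketch below is plain Python 3; `mask_value` is a hypothetical name for the lambda used to build the mask. Note that in current Python the "false" result is the object `False`, which compares equal to 0.

```python
def mask_value(i):
    # "and" returns its first operand if that is false,
    # otherwise it returns its second operand (255 here).
    return i < 100 and 255


values = [mask_value(i) for i in (0, 99, 100, 255)]

# An equivalent spelling that always yields an integer:
explicit = [255 if i < 100 else 0 for i in (0, 99, 100, 255)]
```

Both lists compare equal, since `False == 0` in Python; the conditional expression is arguably easier to read if the idiom is unfamiliar.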


Enhancement

For more advanced image enhancement, you can use the classes in the ImageEnhance module. Once created from an image, an enhancement object can be used to quickly try out different settings.


You can adjust contrast, brightness, colour balance and sharpness in this way.


  • Enhancing images
import ImageEnhance

enh = ImageEnhance.Contrast(im)
enh.enhance(1.3).show("30% more contrast")

Image Sequences

The Python Imaging Library contains some basic support for image sequences (also called animation formats). Supported sequence formats include FLI/FLC, GIF, and a few experimental formats. TIFF files can also contain more than one frame.


When you open a sequence file, PIL automatically loads the first frame in the sequence. You can use the seek and tell methods to move between different frames:


  • Reading sequences
import Image

im = Image.open("animation.gif")
im.seek(1) # skip to the second frame

try:
    while 1:
        im.seek(im.tell()+1) # advance to the next frame
        # do something to im
except EOFError:
    pass # end of sequence

As seen in this example, you’ll get an EOFError exception when the sequence ends.


Note that most drivers in the current version of the library only allow you to seek to the next frame (as in the above example). To rewind the file, you may have to reopen it.


The following iterator class lets you use the for-statement to loop over the sequence:


  • A sequence iterator class
class ImageSequence:
    def __init__(self, im):
        self.im = im
    def __getitem__(self, ix):
        try:
            if ix:
                self.im.seek(ix)
            return self.im
        except EOFError:
            raise IndexError # end of sequence

for frame in ImageSequence(im):
    pass # ...do something to frame...

Postscript Printing

The Python Imaging Library includes functions to print images, text and graphics on Postscript printers. Here’s a simple example:


  • Drawing Postscript
import Image
import PSDraw

im = Image.open("lena.ppm")
title = "lena"
box = (1*72, 2*72, 7*72, 10*72) # in points

ps = PSDraw.PSDraw() # default is sys.stdout
ps.begin_document(title)

# draw the image (75 dpi)
ps.image(box, im, 75)

# draw centered title
ps.setfont("HelveticaNarrow-Bold", 36)
w, h, b = ps.textsize(title)
ps.text((4*72-w/2, 1*72-h), title)

ps.end_document()

More on Reading Images

As described earlier, the open function of the Image module is used to open an image file. In most cases, you simply pass it the filename as an argument:


im = Image.open("lena.ppm")

If everything goes well, the result is an Image object. Otherwise, an IOError exception is raised.


You can use a file-like object instead of the filename. The object must implement read, seek and tell methods, and be opened in binary mode.


  • Reading from an open file
fp = open("lena.ppm", "rb")
im = Image.open(fp)

To read an image from string data, use the StringIO class:


  • Reading from a string
import StringIO

im = Image.open(StringIO.StringIO(buffer))
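Under Python 3 the same idea is usually written with `io.BytesIO`, since image data is bytes rather than text. The sketch below assumes Python 3 with Pillow (imported as `from PIL import Image`); the in-memory PNG simply stands in for data that might come from a socket or a database.

```python
import io

from PIL import Image  # assumption: Pillow, the maintained PIL fork

# Encode a small red image into an in-memory PNG to get some bytes.
buffer = io.BytesIO()
Image.new("RGB", (32, 16), (255, 0, 0)).save(buffer, format="PNG")
data = buffer.getvalue()

# BytesIO provides the read, seek and tell methods the library needs.
im = Image.open(io.BytesIO(data))
info = (im.format, im.size, im.mode)
```

The format is detected from the data itself, so nothing about PNG has to be passed to `Image.open`.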

Note that the library rewinds the file (using seek(0)) before reading the image header. In addition, seek will also be used when the image data is read (by the load method). If the image file is embedded in a larger file, such as a tar file, you can use the ContainerIO or TarIO modules to access it.


  • Reading from a tar archive
import TarIO

fp = TarIO.TarIO("Imaging.tar", "Imaging/test/lena.ppm")
im = Image.open(fp)

Controlling the Decoder

Some decoders allow you to manipulate the image while reading it from a file. This can often be used to speed up decoding when creating thumbnails (when speed is usually more important than quality) and printing to a monochrome laser printer (when only a greyscale version of the image is needed).


The draft method manipulates an opened but not yet loaded image so that it matches the given mode and size as closely as possible. This is done by reconfiguring the image decoder.


  • Reading in draft mode
im = Image.open(file)
print "original =", im.mode, im.size

im.draft("L", (100, 100))
print "draft =", im.mode, im.size

This prints something like:

original = RGB (512, 512)
draft = L (128, 128)

Note that the resulting image may not exactly match the requested mode and size. To make sure that the image is not larger than the given size, use the thumbnail method instead.
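For comparison, here is a small sketch of the thumbnail method, which guarantees the upper bound that draft does not. It assumes Python 3 with Pillow (imported as `from PIL import Image`) and uses a blank synthetic image in place of a real file.

```python
from PIL import Image  # assumption: Pillow, the maintained PIL fork

im = Image.new("RGB", (512, 256))

# thumbnail() modifies the image in place and keeps the aspect ratio,
# so the result fits inside the requested 100x100 bounding box.
im.thumbnail((100, 100))
final_size = im.size
```

Because the source is twice as wide as it is tall, the width hits the 100 pixel limit first and the height is scaled to match.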



The Python Imaging Library handles raster images; that is, rectangles of pixel data.



An image can consist of one or more bands of data. The Python Imaging Library allows you to store several bands in a single image, provided they all have the same dimensions and depth.


To get the number and names of bands in an image, use the getbands method.
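A quick way to see what getbands returns for a few common modes is sketched below. It assumes Python 3 with Pillow (imported as `from PIL import Image`); the one-pixel images exist only to have something to query.

```python
from PIL import Image  # assumption: Pillow, the maintained PIL fork

# Map each mode name to the tuple of band names reported for it.
bands = {
    mode: Image.new(mode, (1, 1)).getbands()
    for mode in ("L", "RGB", "RGBA", "CMYK")
}
```

The number of bands is then just the length of each tuple, for example `len(bands["RGBA"])`.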


  • Mode

The mode of an image defines the type and depth of a pixel in the image. The current release supports the following standard modes:


1 (1-bit pixels, black and white, stored with one pixel per byte)

L (8-bit pixels, greyscale)

P (8-bit pixels, mapped to any other mode using a colour palette)

RGB (3x8-bit pixels, true colour)

RGBA (4x8-bit pixels, true colour with transparency mask)

CMYK (4x8-bit pixels, colour separation)

YCbCr (3x8-bit pixels, colour video format)

I (32-bit signed integer pixels)

F (32-bit floating point pixels)


PIL also provides limited support for a few special modes, including LA (L with alpha), RGBX (true colour with padding) and RGBa (true colour with premultiplied alpha). However, PIL doesn’t support user-defined modes; if you need to handle band combinations that are not listed above, use a sequence of Image objects.

You can read the mode of an image through the mode attribute. This is a string containing one of the above values.


Size

You can read the image size through the size attribute. This is a 2-tuple, containing the horizontal and vertical size in pixels.


Coordinate System

The Python Imaging Library uses a Cartesian pixel coordinate system, with (0,0) in the upper left corner. Note that the coordinates refer to the implied pixel corners; the centre of a pixel addressed as (0, 0) actually lies at (0.5, 0.5).


Coordinates are usually passed to the library as 2-tuples (x, y). Rectangles are represented as 4-tuples, with the upper left corner given first. For example, a rectangle covering all of an 800x600 pixel image is written as (0, 0, 800, 600).
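The two conventions above (rectangles as 4-tuples, pixel centres offset by half a pixel) can be written down in a few lines of plain Python. Both helper names are hypothetical and not part of the library.

```python
def full_box(size):
    """Rectangle covering a whole image of the given (width, height)."""
    width, height = size
    return (0, 0, width, height)


def pixel_centre(x, y):
    """Centre of the pixel addressed as (x, y), in corner coordinates."""
    return (x + 0.5, y + 0.5)


box = full_box((800, 600))    # (0, 0, 800, 600)
centre = pixel_centre(0, 0)   # (0.5, 0.5)
```

The lower right corner of the rectangle, (800, 600), is therefore a corner position and not the address of a pixel; the last addressable pixel is (799, 599).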


Palette

The palette mode (“P”) uses a colour palette to define the actual colour for each pixel.


Info

You can attach auxiliary information to an image using the info attribute. This is a dictionary object.


How such information is handled when loading and saving image files is up to the file format handler (see the chapter on Image File Formats). Most handlers add properties to the info attribute when loading an image, but ignore it when saving images.


Filters

For geometry operations that may map multiple input pixels to a single output pixel, the Python Imaging Library provides four different resampling filters.


NEAREST

Pick the nearest pixel from the input image. Ignore all other input pixels.

BILINEAR

Use linear interpolation over a 2x2 environment in the input image. Note that in the current version of PIL, this filter uses a fixed input environment when downsampling.

BICUBIC

Use cubic interpolation over a 4x4 environment in the input image. Note that in the current version of PIL, this filter uses a fixed input environment when downsampling.

ANTIALIAS

(New in PIL 1.1.3). Calculate the output pixel value using a high-quality resampling filter (a truncated sinc) on all pixels that may contribute to the output value. In the current version of PIL, this filter can only be used with the resize and thumbnail methods.


Note that in the current version of PIL, the ANTIALIAS filter is the only filter that behaves properly when downsampling (that is, when converting a large image to a small one). The BILINEAR and BICUBIC filters use a fixed input environment, and are best used for scale-preserving geometric transforms and upsampling.
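To get a feel for how the filter choice changes the result, the sketch below upsamples a two-pixel image with two of the filters. It assumes Python 3 with Pillow (imported as `from PIL import Image`), whose resampling behaviour may differ in detail from the PIL version described here; ANTIALIAS is left out because newer Pillow releases provide that filter under the name LANCZOS instead.

```python
from PIL import Image  # assumption: Pillow, the maintained PIL fork

# A 2x1 greyscale image: one black pixel next to one white pixel.
src = Image.new("L", (2, 1))
src.putpixel((0, 0), 0)
src.putpixel((1, 0), 255)

# Upsample to twice the width with two different resampling filters.
nearest = src.resize((4, 1), Image.NEAREST)
bilinear = src.resize((4, 1), Image.BILINEAR)

nearest_row = [nearest.getpixel((x, 0)) for x in range(4)]
bilinear_row = [bilinear.getpixel((x, 0)) for x in range(4)]
```

With NEAREST each source pixel is simply repeated, so the hard edge survives, while BILINEAR produces a ramp of values between the two source pixels.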




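JPEG (Joint Photographic Experts Group) is a common image format. It was developed by the Joint Photographic Experts Group and standardized as "ISO 10918-1"; JPEG is just the popular name.

JPEG files use the extension ".jpg" or ".jpeg". It is the most widely used image file format and was defined by a software development consortium. It is a lossy compression format that can squeeze an image into very little storage space, but repeated or unimportant data in the image is discarded, so the image data is easily damaged. In particular, an excessively high compression ratio noticeably lowers the quality of the decompressed image, so if high image quality is the goal, avoid very high compression ratios.

That said, JPEG compression is quite advanced: it uses lossy compression to remove redundant image data, achieving very high compression ratios while still rendering rich, vivid images. In other words, it delivers good image quality in very little disk space. JPEG is also a flexible format with adjustable image quality: it allows files to be compressed at different ratios and supports multiple compression levels. Compression ratios typically range from 10:1 to 40:1; the higher the ratio, the lower the quality, and the lower the ratio, the better the quality.


JPEG is currently the most popular image format on the web, and it can compress files to very small sizes. When saving as JPEG, Photoshop offers 11 compression levels, numbered 0 to 10. Level 0 gives the highest compression ratio and the worst image quality. Even at level 10, which preserves nearly all detail, the compression ratio can reach 5:1. For example, an image that takes 4.28 MB as a BMP file takes only 178 KB as a JPG file, a compression ratio of about 24:1. After repeated comparisons, level 8 was found to give the best balance between storage space and image quality. The advantages of JPG files are their small size and good compatibility.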
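The 24:1 figure can be checked from the two file sizes quoted above. The short calculation below is plain Python and assumes 1 MB = 1024 KB; with 1 MB = 1000 KB the result is slightly lower but still rounds to 24.

```python
bmp_kb = 4.28 * 1024  # BMP size in KB (assumption: 1 MB = 1024 KB)
jpg_kb = 178          # JPG size in KB

ratio = bmp_kb / jpg_kb            # roughly 24.6
ratio_decimal = 4.28 * 1000 / 178  # roughly 24.0 with 1 MB = 1000 KB
```

Either way the JPG file is about one twenty-fourth the size of the BMP file, consistent with the ratio given in the text.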

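  • Format applications



  • Format types

As the successor to JPEG, JPEG2000 achieves compression ratios roughly 30% higher than JPEG and supports both lossy and lossless compression. One extremely important feature of JPEG2000 is progressive transmission [1]: the outline of the image is transmitted first, followed by progressively more data, so that the displayed image sharpens from blurry to clear. JPEG2000 also supports a "region of interest" feature: you can specify the compression quality for any region of interest in the image, and you can choose to decompress specified parts first.


  • Compression standard



1. Sequential Encoding


2. Progressive Encoding

When an image takes a long time to transmit, it can be processed in several passes and sent so that it goes from blurry to sharp (similar to the way GIF images appear when loaded over a network).

3. Lossless Encoding

4. Hierarchical Encoding


The source code provided by the Independent JPEG Group includes the jpegtran program, which offers lossless transformations such as Huffman table optimization [2], conversion to progressive encoding, mirroring, and rotation.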


