What is pdftoppm
pdftoppm is a command-line tool, part of the poppler-utils package, that converts the pages of a PDF file to images. It comes pre-installed in Ubuntu 12.04 and above. In this post, you will learn how to use this tool to convert the pages of your PDF files into png, jpg and tiff files.
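If you are not sure whether the tool is present on your machine, a quick check like the one below should tell you. This is only a sketch: the msg variable is my own, and the install hint assumes an Ubuntu or Debian system where the poppler-utils package provides pdftoppm.

```shell
# Report whether pdftoppm is available on this system.
if command -v pdftoppm >/dev/null 2>&1; then
    msg="pdftoppm found at $(command -v pdftoppm)"
else
    # Assumption: an apt-based system such as Ubuntu.
    msg="pdftoppm not found; on Ubuntu, try: sudo apt install poppler-utils"
fi
echo "$msg"
```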
Usage
The general syntax for using pdftoppm is:
pdftoppm [options] PdfFileName.pdf ImageName
For example:
pdftoppm -png file.pdf img
To convert the pdf file to images, go to the folder where your pdf file is placed. Right-click and then select “Open in Terminal” in that directory.
After opening the terminal, enter the commands to convert the PDF file to images. There are several options:
1. Convert the whole pdf file
To convert to png, use the -png option. To convert to jpg, use -jpeg. If file.pdf is the PDF file and I want the names of my images to start with image, followed by the page number, I will use:
For png
pdftoppm -png file.pdf image
For jpg
pdftoppm -jpeg file.pdf image
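As far as I can tell, pdftoppm names each output file prefix-page.ext and pads the page number with zeros to match the number of digits in the document's page count. The helper below is a hypothetical sketch of that rule (expected_name is my own name, not part of pdftoppm). It can be handy when a later script has to find the generated files.

```shell
# Hypothetical helper: predict the output filename for one page.
# Assumes names follow <prefix>-<page>.<ext>, with the page number
# zero-padded to the digit count of the PDF's total page count.
expected_name() {
    local prefix="$1" page="$2" total="$3" ext="$4"
    printf '%s-%0*d.%s\n' "$prefix" "${#total}" "$page" "$ext"
}

expected_name image 3 12 png    # image-03.png
expected_name image 3 9 jpg     # image-3.jpg
```

If the names on your system look different, trust what you see in the folder rather than this sketch.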
2. Convert the first page only
To convert only the first page, use the -singlefile option. The output file is then named without a page number (for example, image.png).
For png
pdftoppm -png -singlefile file.pdf image
For jpg
pdftoppm -jpeg -singlefile file.pdf image
3. Convert any one page
The -f option specifies the page at which conversion starts. To convert a single page, say page 25, combine -f 25 with the -singlefile option, and only page 25 will be converted.
For png
pdftoppm -f 25 -png -singlefile file.pdf image
For jpg
pdftoppm -f 25 -jpeg -singlefile file.pdf image
4. Convert a range of pages
As mentioned above, the -f option specifies the first page to convert. The -l option specifies the last page. For example, to convert pages 5 through 10, I would write -f 5 -l 10:
For png
pdftoppm -png -f 5 -l 10 file.pdf name
For jpg
pdftoppm -jpeg -f 5 -l 10 file.pdf name
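If the pages you need are not next to each other, one option is to call pdftoppm once per page. The sketch below assumes a hypothetical helper named convert_pages. It uses -f N -l N rather than -singlefile so that the page number stays in each output name and the files do not overwrite one another.

```shell
# Hypothetical helper: convert an arbitrary list of pages, one call per page.
# Arguments: PDF file, output name prefix, then the page numbers.
convert_pages() {
    local pdf="$1" prefix="$2" p
    shift 2
    for p in "$@"; do
        pdftoppm -png -f "$p" -l "$p" "$pdf" "$prefix"
    done
}

# Example: pages 3, 7 and 25 of file.pdf (skipped if the file is absent).
if [ -f file.pdf ]; then
    convert_pages file.pdf image 3 7 25
fi
```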
5. Convert only odd pages or even pages
Use -o to convert only the odd-numbered pages and -e to convert only the even-numbered pages. There is something odd about these options, though. When I tested them, the behavior was reversed: -e produced the odd-numbered pages and -o produced the even-numbered ones. That would make sense if the tool counted pages from 0 internally, but it may simply be a bug in the version I used, so check the output on your system before relying on either option. Let me know in the comments if you know why this happens.
Example of odd
pdftoppm -png -o file.pdf name
Example of even
pdftoppm -jpeg -e file.pdf name
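If the flags give you the wrong set of pages, one workaround is to list the page numbers yourself and convert them one at a time. This is a sketch under a few assumptions: pages_of_parity is a hypothetical helper of mine, and the page count is read with pdfinfo, which ships in the same poppler-utils package.

```shell
# Hypothetical helper: print the page numbers of one parity up to the last page.
# "odd" starts at page 1, "even" starts at page 2.
pages_of_parity() {
    local parity="$1" last="$2" start=1
    if [ "$parity" = even ]; then
        start=2
    fi
    seq "$start" 2 "$last"
}

# Convert the odd pages of file.pdf (skipped if the file or pdfinfo is absent).
if [ -f file.pdf ] && command -v pdfinfo >/dev/null 2>&1; then
    last=$(pdfinfo file.pdf | awk '/^Pages:/ {print $2}')
    for p in $(pages_of_parity odd "$last"); do
        pdftoppm -png -f "$p" -l "$p" file.pdf name
    done
fi
```

It is slower than a single call because pdftoppm starts once per page, but you always know exactly which pages you will get.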
6. Adjust the quality of output images
You can adjust the resolution (DPI) of the output images using the -r option. The default is 150 DPI. For example, if I want a jpg at 600 DPI, I would use:
pdftoppm -jpeg -r 600 -singlefile file.pdf image
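To get a feel for how large the images become, remember that PDF page sizes are measured in points, and one point is 1/72 of an inch. The sketch below estimates the pixel size for a given DPI. The function name pixels_at_dpi is my own, and the result is approximate because pdftoppm may round fractional sizes differently.

```shell
# Approximate pixel length of a page side rendered at a given DPI.
# PDF dimensions are in points (1 point = 1/72 inch).
pixels_at_dpi() {
    local points="$1" dpi="$2"
    echo $(( points * dpi / 72 ))
}

pixels_at_dpi 612 600   # US Letter width (612 pt)  -> 5100
pixels_at_dpi 792 600   # US Letter height (792 pt) -> 6600
```

So a US Letter page at 600 DPI comes out at roughly 5100 by 6600 pixels, which explains why high-DPI conversions are slow and produce large files.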
7. Adjust the height and width of the output images
You can specify a custom width and height, in pixels, for the output images. To specify the width, use the -scale-to-x option, and to specify the height, use the -scale-to-y option. For example, if you want your images to have a width of 900 and a height of 1600, you would use:
pdftoppm -jpeg -scale-to-x 900 -scale-to-y 1600 file.pdf image
To specify only the width and let the height adjust automatically according to the aspect ratio of the page, set -scale-to-x to your width and -scale-to-y to -1. For example, to make the width 920 and have the height adjust automatically, I will use:
pdftoppm -jpeg -scale-to-x 920 -scale-to-y -1 file.pdf image
Similarly, you can specify a height in -scale-to-y and set -scale-to-x to -1 to make width automatic.
8. Convert PDFs in bulk
You can use a simple trick to convert many PDF files to images at once: create a bash script and execute it.
- Put your pdf files in one folder
- Right-click and select “Open in Terminal”
- Create a new bash file using the following command.
>bulk.sh
- Use the following command to edit the bash file
gedit bulk.sh
- Enter the following code in the bash file and save it.
#!/bin/bash
# Convert every PDF in the current folder; images are named after each PDF.
for i in *.pdf; do
    pdftoppm -png "$i" "${i%.pdf}"
done
- Change the permissions of the bash file using the following command
chmod a+x bulk.sh
- Execute the bash file
./bulk.sh
We are just using a for loop to execute the command for all the files in the directory. You can use the options which you studied above in this loop. But remember that the options will be applied to all the files.
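As a variation on the loop above, here is a minimal sketch that puts the images of each PDF in a folder of its own and does nothing when the directory contains no PDF files. The function name convert_all, the page prefix and the 150 DPI setting are my own choices for illustration, so adjust them to your needs.

```shell
#!/bin/bash
# Sketch: convert every PDF in the current folder, one output folder per PDF.
convert_all() {
    local pdf base
    for pdf in *.pdf; do
        [ -e "$pdf" ] || continue      # no PDFs here: the pattern stays literal
        base="${pdf%.pdf}"
        mkdir -p "$base"               # e.g. report.pdf -> report/
        pdftoppm -png -r 150 "$pdf" "$base/page"
    done
}

convert_all
```

Keeping one folder per PDF stops the images of different documents from mixing, which helps once you convert more than a handful of files.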
9. Use the cropbox
If some useless information is appearing around the edges of your pages, you can use the -cropbox option to get rid of it. With this option, pdftoppm renders the crop box of each page rather than the media box.
pdftoppm -png -cropbox file.pdf image
10. Get black and white or grayscale images
To get black and white images, use the -mono option:
pdftoppm -png -mono file.pdf image
To get grayscale images, use the -gray option:
pdftoppm -jpeg -gray file.pdf image
11. Get images in tiff format
To get images in tiff format, use the -tiff option instead of -png or -jpeg:
pdftoppm -tiff file.pdf image
12. Get images in ppm format
To get images in ppm format, do not specify any file type option. The default output is ppm.
pdftoppm file.pdf image
I hope this helped you.