Optimize

Optimize inFile by removing redundant page resources such as embedded fonts and images, and write the result to outFile with maximum PDF compression. Have a look at some examples.

Usage

  pdfcpu optimize [-v(erbose)|vv] [-stats csvFile] [-upw userpw] [-opw ownerpw] inFile [outFile]

Flags

name     description      required
verbose  turn on logging  no
vv       verbose logging  no
stats    CSV output file  no
upw      user password    no
opw      owner password   no

Arguments

name     description      required  default
inFile   PDF input file   yes
outFile  PDF output file  no        inFile_new.pdf

Stats

The name of a CSV file. The command appends one CSV line with stats about memory usage, PDF object usage and other useful information for debugging. Optimize a group of PDF input files and consolidate stats into the same CSV file for comparison.
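As a rough sketch of the batch use case, the loop below optimizes several files and points every run at the same CSV file. The function name `optimize_all` and the `_opt` output suffix are assumptions made for this example and are not part of pdfcpu; only the `optimize` command and its `-stats` flag come from the usage line above.

```shell
# Hypothetical helper (not part of pdfcpu): optimize every PDF passed in and
# append one stats line per file to the same CSV file.
# Assumes `pdfcpu` is on PATH; the "_opt" suffix is a choice made here.
optimize_all() {
  stats=$1
  shift
  for in_file in "$@"; do
    out_file="${in_file%.pdf}_opt.pdf"
    # Stop at the first file that fails to optimize.
    pdfcpu optimize -stats "$stats" "$in_file" "$out_file" || return 1
  done
}
```

It could be called as `optimize_all stats.csv a.pdf b.pdf`, after which stats.csv should hold one line per input file.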

The following shows a stats file with its header line and a single stats line:

  cat stats.csv
  name;version;author;creator;producer;src_size (bin|text);src_bin:imgs|fonts|other;dest_size (bin|text);dest_bin:imgs|fonts|other;linearized;hybrid;xrefstr;objstr;pages;objs;missing;garbage;R_Version;R_Extensions;R_PageLabels;R_Names;R_Dests;R_ViewerPrefs;R_PageLayout;R_PageMode;R_Outlines;R_Threads;R_OpenAction;R_AA;R_URI;R_AcroForm;R_Metadata;R_StructTreeRoot;R_MarkInfo;R_Lang;R_SpiderInfo;R_OutputIntents;R_PieceInfo;R_OCProperties;R_Perms;R_Legal;R_Requirements;R_Collection;R_NeedsRendering;P_LastModified;P_Resources;P_MediaBox;P_CropBox;P_BleedBox;P_TrimBox;P_ArtBox;P_BoxColorInfo;P_Contents;P_Rotate;P_Group;P_Thumb;P_B;P_Dur;P_Trans;P_Annots;P_AA;P_Metadata;P_PieceInfo;P_StructParents;P_ID;P_PZ;P_SeparationInfo;P_Tabs;P_TemplateInstantiated;P_PresSteps;P_UserUnit;P_VP;
  test.pdf;1.2;;;;6 KB (67.4% | 32.6%); 0.0% | 0.0% | 100.0%;5 KB (86.6% | 13.4%); 0.0% | 0.0% | 100.0%;false;false;false;false;2;15;;;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;true;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false;false
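Because the file is semicolon-separated, a few columns can be pulled out with awk for a quick side-by-side comparison. The sketch below is illustrative only: the function name `stats_summary` is made up for this example, and it assumes the header occupies the first line and that the column order matches the sample (field 1 is name, 6 is src_size, 8 is dest_size, 14 is pages).

```shell
# Hypothetical helper: print name, source size, destination size and page
# count for every stats line in the given CSV file.
# Assumes line 1 is the header and fields are separated by ";".
stats_summary() {
  awk -F';' 'NR > 1 { printf "%s: %s -> %s, pages=%s\n", $1, $6, $8, $14 }' "$1"
}
```

Running `stats_summary stats.csv` prints one short line per optimized file, which tends to be easier to scan than the raw CSV.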

Examples

Optimize test.pdf and write the result to test_new.pdf:

  pdfcpu optimize test.pdf
  writing test_new.pdf ...

Optimize test.pdf and write the result to test_opt.pdf:

  pdfcpu optimize test.pdf test_opt.pdf
  writing test_opt.pdf ...

Optimize test.pdf, write the result to test_opt.pdf, append stats to stats.csv and log to standard output:

  pdfcpu optimize -verbose -stats stats.csv test.pdf test_opt.pdf
  stats will be appended to stats.csv
  INFO: 2019/02/20 23:20:12 reading upc.pdf..
  INFO: 2019/02/20 23:20:12 PDF Version 1.5 conforming reader
  INFO: 2019/02/20 23:20:12 validating
  INFO: 2019/02/20 23:20:12 optimizing fonts & images
  STATS: 2019/02/20 23:20:12 XRefTable:
  *************************************************************************************************
  HeaderVersion: 1.2
  has 2 pages
  XRefTable:
  Size: 13
  Root object: (11 0 R)
  Info object: (12 0 R)
  ID object: [<81C4A57DF6A1E411BD62885083B053CD> <81C4A57DF6A1E411BD62885083B053CD>]
  XRefTable with 13 entres:
  0: f next= 0 generation=65535
  1: offset= 16 generation=0 pdfcpu.Dict type=Page
  <<
  <Contents, (2 0 R)>
  <Parent, (3 0 R)>
  <Resources, (4 0 R)>
  <Type, Page>
  >>
  2: offset= 102 generation=0 pdfcpu.StreamDict
  <<
  <Filter, LZWDecode>
  <Length, 2652>
  >>
  3: offset= 5117 generation=0 pdfcpu.Dict type=Pages
  <<
  <Count, 2>
  <Kids, [(1 0 R) (8 0 R)]>
  <MediaBox, [0 0 595.27 841.89]>
  <Type, Pages>
  >>
  4: offset= 2828 generation=0 pdfcpu.Dict
  <<
  <ColorSpace, <<
  <CS1, DeviceRGB>
  >>>
  <Font, <<
  <G1F18, (6 0 R)>
  <G1F3, (5 0 R)>
  <G1F6, (7 0 R)>
  >>>
  <ProcSet, [PDF Text]>
  >>
  5: offset= 4942 generation=0 pdfcpu.Dict type=Font subType=Type1
  <<
  <BaseFont, Helvetica>
  <Encoding, <<
  <BaseEncoding, WinAnsiEncoding>
  <Differences, [45 minus]>
  <Type, Encoding>
  >>>
  <Name, G1F3>
  <Subtype, Type1>
  <Type, Font>
  >>
  6: offset= 4761 generation=0 pdfcpu.Dict type=Font subType=Type1
  <<
  <BaseFont, Helvetica-Bold>
  <Encoding, <<
  <BaseEncoding, WinAnsiEncoding>
  <Differences, [45 minus]>
  <Type, Encoding>
  >>>
  <Name, G1F18>
  <Subtype, Type1>
  <Type, Font>
  >>
  7: offset= 4578 generation=0 pdfcpu.Dict type=Font subType=Type1
  <<
  <BaseFont, Helvetica-Oblique>
  <Encoding, <<
  <BaseEncoding, WinAnsiEncoding>
  <Differences, [45 minus]>
  <Type, Encoding>
  >>>
  <Name, G1F6>
  <Subtype, Type1>
  <Type, Font>
  >>
  8: offset= 2964 generation=0 pdfcpu.Dict type=Page
  <<
  <Contents, (9 0 R)>
  <Parent, (3 0 R)>
  <Resources, (10 0 R)>
  <Type, Page>
  >>
  9: offset= 3051 generation=0 pdfcpu.StreamDict
  <<
  <Filter, LZWDecode>
  <Length, 1316>
  >>
  10: offset= 4441 generation=0 pdfcpu.Dict
  <<
  <ColorSpace, <<
  <CS1, DeviceRGB>
  >>>
  <Font, <<
  <G1F18, (6 0 R)>
  <G1F3, (5 0 R)>
  <G1F6, (7 0 R)>
  >>>
  <ProcSet, [PDF Text]>
  >>
  11: offset= 5218 generation=0 pdfcpu.Dict type=Catalog
  <<
  <Pages, (3 0 R)>
  <Type, Catalog>
  >>
  12: offset= 5272 generation=0 pdfcpu.Dict
  <<
  <Author, ()>
  <CreationDate, (D:20150122062117)>
  <Creator, ()>
  <Keywords, ()>
  <Producer, ()>
  <Subject, ()>
  <Title, (Test)>
  >>
  Empty free list.
  Total pages: 2
  Fonts for page 1:
  obj prefix Fontname Subtype Encoding Embedded ResourceIds
  #5 Helvetica Type1 Custom false G1F3
  #6 Helvetica-Bold Type1 Custom false G1F18
  #7 Helvetica-Oblique Type1 Custom false G1F6
  Fonts for page 2:
  obj prefix Fontname Subtype Encoding Embedded ResourceIds
  #5 Helvetica Type1 Custom false G1F3
  #6 Helvetica-Bold Type1 Custom false G1F18
  #7 Helvetica-Oblique Type1 Custom false G1F6
  Fontobjects:
  obj prefix Fontname Subtype Encoding Embedded ResourceIds
  #5 Helvetica Type1 Custom false G1F3
  #6 Helvetica-Bold Type1 Custom false G1F18
  #7 Helvetica-Oblique Type1 Custom false G1F6
  Fonts:
  obj prefix Fontname Subtype Encoding Embedded ResourceIds
  #5 Helvetica Type1 Custom false G1F3
  #6 Helvetica-Bold Type1 Custom false G1F18
  #7 Helvetica-Oblique Type1 Custom false G1F6
  Duplicate Fonts:
  No image info available.
  writing test_opt.pdf ...
  INFO: 2019/02/20 23:20:12 writing to a.pdf
  STATS: 2019/02/20 23:20:12 0 original empty xref entries:
  STATS: 2019/02/20 23:20:12 0 original redundant font entries:
  STATS: 2019/02/20 23:20:12 0 original redundant image entries:
  STATS: 2019/02/20 23:20:12 0 original redundant info entries:
  STATS: 2019/02/20 23:20:12 0 original objectStream entries:
  STATS: 2019/02/20 23:20:12 0 original xrefStream entries:
  STATS: 2019/02/20 23:20:12 0 original linearization entries:
  STATS: 2019/02/20 23:20:12 XRefTable:
  *************************************************************************************************
  HeaderVersion: 1.2
  has 2 pages
  XRefTable:
  Size: 15
  Root object: (11 0 R)
  Info object: (12 0 R)
  ID object: [<81C4A57DF6A1E411BD62885083B053CD> <e4fcab0bb584b4b8d4f5fad43fd63b03>]
  XRefTable with 15 entres:
  0: f next= 0 generation=65535
  1: c => obj:13[0] generation=0
  <<
  <Contents, (2 0 R)>
  <Parent, (3 0 R)>
  <Resources, (4 0 R)>
  <Type, Page>
  >>
  2: offset= 102 generation=0 pdfcpu.StreamDict
  <<
  <Filter, LZWDecode>
  <Length, 2652>
  >>
  3: c => obj:13[7] generation=0
  <<
  <Count, 2>
  <Kids, [(1 0 R) (8 0 R)]>
  <MediaBox, [0 0 595.27 841.89]>
  <Type, Pages>
  >>
  4: c => obj:13[1] generation=0
  <<
  <ColorSpace, <<
  <CS1, DeviceRGB>
  >>>
  <Font, <<
  <G1F18, (6 0 R)>
  <G1F3, (5 0 R)>
  <G1F6, (7 0 R)>
  >>>
  <ProcSet, [PDF Text]>
  >>
  5: c => obj:13[2] generation=0
  <<
  <BaseFont, Helvetica>
  <Encoding, <<
  <BaseEncoding, WinAnsiEncoding>
  <Differences, [45 minus]>
  <Type, Encoding>
  >>>
  <Name, G1F3>
  <Subtype, Type1>
  <Type, Font>
  >>
  6: c => obj:13[4] generation=0
  <<
  <BaseFont, Helvetica-Bold>
  <Encoding, <<
  <BaseEncoding, WinAnsiEncoding>
  <Differences, [45 minus]>
  <Type, Encoding>
  >>>
  <Name, G1F18>
  <Subtype, Type1>
  <Type, Font>
  >>
  7: c => obj:13[3] generation=0
  <<
  <BaseFont, Helvetica-Oblique>
  <Encoding, <<
  <BaseEncoding, WinAnsiEncoding>
  <Differences, [45 minus]>
  <Type, Encoding>
  >>>
  <Name, G1F6>
  <Subtype, Type1>
  <Type, Font>
  >>
  8: c => obj:13[5] generation=0
  <<
  <Contents, (9 0 R)>
  <Parent, (3 0 R)>
  <Resources, (10 0 R)>
  <Type, Page>
  >>
  9: offset= 3051 generation=0 pdfcpu.StreamDict
  <<
  <Filter, LZWDecode>
  <Length, 1316>
  >>
  10: c => obj:13[6] generation=0
  <<
  <ColorSpace, <<
  <CS1, DeviceRGB>
  >>>
  <Font, <<
  <G1F18, (6 0 R)>
  <G1F3, (5 0 R)>
  <G1F6, (7 0 R)>
  >>>
  <ProcSet, [PDF Text]>
  >>
  11: offset= 5218 generation=0 pdfcpu.Dict type=Catalog
  <<
  <Pages, (3 0 R)>
  <Type, Catalog>
  >>
  12: offset= 5272 generation=0 pdfcpu.Dict
  <<
  <Author, ()>
  <CreationDate, (D:20190220232012+01'00')>
  <Creator, ()>
  <Keywords, ()>
  <ModDate, (D:20190220232012+01'00')>
  <Producer, (pdfcpu v0.1.21)>
  <Subject, ()>
  <Title, ()>
  >>
  13: offset=nil generation=0 pdfcpu.ObjectStreamDict
  <<
  <Filter, FlateDecode>
  <First, 45>
  <Length, 327>
  <N, 8>
  <Type, ObjStm>
  >>
  object stream count:8 size of objectarray:0
  14: offset=nil generation=0 pdfcpu.XRefStreamDict
  <<
  <Filter, FlateDecode>
  <ID, [<81C4A57DF6A1E411BD62885083B053CD> <e4fcab0bb584b4b8d4f5fad43fd63b03>]>
  <Index, [0 14]>
  <Info, (12 0 R)>
  <Length, 63>
  <Root, (11 0 R)>
  <Size, 15>
  <Type, XRef>
  <W, [1 2 2]>
  >>
  Empty free list.
  Total pages: 2
  Fonts for page 1:
  obj prefix Fontname Subtype Encoding Embedded ResourceIds
  #5 Helvetica Type1 Custom false G1F3
  #6 Helvetica-Bold Type1 Custom false G1F18
  #7 Helvetica-Oblique Type1 Custom false G1F6
  Fonts for page 2:
  obj prefix Fontname Subtype Encoding Embedded ResourceIds
  #5 Helvetica Type1 Custom false G1F3
  #6 Helvetica-Bold Type1 Custom false G1F18
  #7 Helvetica-Oblique Type1 Custom false G1F6
  Fontobjects:
  obj prefix Fontname Subtype Encoding Embedded ResourceIds
  #5 Helvetica Type1 Custom false G1F3
  #6 Helvetica-Bold Type1 Custom false G1F18
  #7 Helvetica-Oblique Type1 Custom false G1F6
  Fonts:
  obj prefix Fontname Subtype Encoding Embedded ResourceIds
  #5 Helvetica Type1 Custom false G1F3
  #6 Helvetica-Bold Type1 Custom false G1F18
  #7 Helvetica-Oblique Type1 Custom false G1F6
  Duplicate Fonts:
  No image info available.
  STATS: 2019/02/20 23:20:12 Timing:
  STATS: 2019/02/20 23:20:12 read : 0.001s 28.7%
  STATS: 2019/02/20 23:20:12 validate : 0.000s 4.5%
  STATS: 2019/02/20 23:20:12 optimize : 0.000s 1.1%
  STATS: 2019/02/20 23:20:12 write : 0.002s 48.8%
  STATS: 2019/02/20 23:20:12 total processing time: 0.003s
  STATS: 2019/02/20 23:20:12 Original:
  STATS: 2019/02/20 23:20:12 File Size : 6 KB (5884 bytes)
  STATS: 2019/02/20 23:20:12 Total Binary Data : 4 KB (3968 bytes) 67.4%
  STATS: 2019/02/20 23:20:12 Total Text Data : 2 KB (1916 bytes) 32.6%
  STATS: 2019/02/20 23:20:12 Breakup of binary data:
  STATS: 2019/02/20 23:20:12 images : 0.000000 Bytes (0 bytes) 0.0%
  STATS: 2019/02/20 23:20:12 fonts : 0.000000 Bytes (0 bytes) 0.0%
  STATS: 2019/02/20 23:20:12 other : 4 KB (3968 bytes) 100.0%
  STATS: 2019/02/20 23:20:12 Optimized:
  STATS: 2019/02/20 23:20:12 File Size : 5 KB (5034 bytes)
  STATS: 2019/02/20 23:20:12 Total Binary Data : 4 KB (4358 bytes) 86.6%
  STATS: 2019/02/20 23:20:12 Total Text Data : 676.000000 Bytes (676 bytes) 13.4%
  STATS: 2019/02/20 23:20:12 Breakup of binary data:
  STATS: 2019/02/20 23:20:12 images : 0.000000 Bytes (0 bytes) 0.0%
  STATS: 2019/02/20 23:20:12 fonts : 0.000000 Bytes (0 bytes) 0.0%
  STATS: 2019/02/20 23:20:12 other : 4 KB (4358 bytes) 100.0%
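The "Original" and "Optimized" file sizes in the log above (5884 and 5034 bytes) can be turned into a relative saving with a little arithmetic. The helper below is only a sketch; the name `reduction_pct` is an assumption for this example and nothing like it ships with pdfcpu.

```shell
# Hypothetical helper: percentage by which a file shrank, given the original
# and the optimized size in bytes.
reduction_pct() {
  awk -v orig="$1" -v opt="$2" 'BEGIN { printf "%.1f%%\n", (orig - opt) / orig * 100 }'
}

reduction_pct 5884 5034   # byte counts from the log; prints 14.4%
```

For other files, the two byte counts could be taken from `wc -c < test.pdf` and `wc -c < test_opt.pdf` instead of the log.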