DOMPDF + justification + extended ASCII chars

If your documents use so-called extended ASCII characters (those with a code >= 128) and you try to produce a justified PDF with DOMPDF, you are in trouble. There seems to be a problem with this combination right now, at least if you want to stick with the open source PDF rendering engine, R&OS CPDF.

The problem is that not only can you not produce a decent justification, but your text can easily run beyond the edge of the paper. The same is true for table cells, where text can flow into the next cell.

This seems to derive from an incorrect mapping between the extended characters and the numbering used by the AFM file. The problem is described in the R&OS CPDF FAQ along with a possible workaround.
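
To make the failure mode concrete, here is a minimal sketch of what is probably going wrong. It assumes the renderer measures a line by summing per-character widths looked up by glyph name, which is how AFM metrics are organized. The $widths table, the lineWidth() helper and the numbers in it are made up for illustration; none of this is actual CPDF code.

```php
<?php
// Hypothetical AFM-style widths, keyed by glyph name (1/1000 em units).
$widths = array('c' => 500, 'a' => 556, 'f' => 278, 'eacute' => 556);

// Byte code => glyph name. ASCII letters map to themselves in this sketch;
// the extended characters need an explicit entry.
$withoutDiff = array();
$withDiff    = array(233 => 'eacute');   // 233 is "é" in Windows-1252

// Sum the widths of every byte in $text; unknown glyphs count as 0.
function lineWidth($text, array $widths, array $diff)
{
    $total = 0;
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        $code = ord($text[$i]);
        $name = isset($diff[$code]) ? $diff[$code] : chr($code);
        $total += isset($widths[$name]) ? $widths[$name] : 0;
    }
    return $total;
}

$text = "caf" . chr(233);                 // "café" as single-byte Windows-1252

$short = lineWidth($text, $widths, $withoutDiff);
$right = lineWidth($text, $widths, $withDiff);
echo $short, " vs ", $right, "\n";        // prints "1334 vs 1890"
```

Every accented character measured as zero makes the line look shorter than it really is, so too many words are fitted onto it and the text overflows the margin.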

Following the workaround proposed in the FAQ, I tried to make it work under DOMPDF. The workaround is to pass selectFont() a second argument that specifies the correct mapping. A grep shows 4 occurrences of this call, spread over the following files: cpdf_adapter.cls.php and page_cache.cls.php. I therefore changed each call from

$this->_pdf->selectFont($font);

to

$this->_pdf->selectFont($font,
    array('encoding'=>'WinAnsiEncoding',
          'differences'=>self::$diff));

Once I had written down the mapping, it worked well in the tests I ran. So what is the mapping? Here it is:

static $diff = array (
130 => 'quotesinglbase',
131 => 'florin',
132 => 'quotedblbase',
133 => 'ellipsis',
134 => 'dagger',
135 => 'daggerdbl',
136 => 'circumflex',
137 => 'perthousand',
// 138 => '{Underscore}',
139 => 'guilsinglleft',
140 => 'OE',
145 => 'quoteleft',
146 => 'quoteright',
147 => 'quotedblleft',
148 => 'quotedblright',
149 => 'bullet',
150 => 'endash',
151 => 'emdash',
152 => 'tilde',
153 => 'trademark',
// 154 => '{Underscore}',
155 => 'guilsinglright',
156 => 'oe',
159 => 'Ydieresis',
// 160 => '{Nonbreaking space}',
161 => 'exclamdown',
162 => 'cent',
163 => 'sterling',
164 => 'currency',
165 => 'yen',
166 => 'brokenbar',
167 => 'section',
168 => 'dieresis',
169 => 'copyright',
170 => 'ordfeminine',
171 => 'guillemotleft',
172 => 'logicalnot',
// 173 => '{Soft hyphen}',
174 => 'registered',
175 => 'macron',
176 => 'degree',
177 => 'plusminus',
178 => 'twosuperior',
179 => 'threesuperior',
180 => 'acute',
181 => 'mu',
182 => 'paragraph',
183 => 'periodcentered',
184 => 'cedilla',
185 => 'onesuperior',
186 => 'ordmasculine',
187 => 'guillemotright',
188 => 'onequarter',
189 => 'onehalf',
190 => 'threequarters',
191 => 'questiondown',
192 => 'Agrave',
193 => 'Aacute',
194 => 'Acircumflex',
195 => 'Atilde',
196 => 'Adieresis',
197 => 'Aring',
198 => 'AE',
199 => 'Ccedilla',
200 => 'Egrave',
201 => 'Eacute',
202 => 'Ecircumflex',
203 => 'Edieresis',
204 => 'Igrave',
205 => 'Iacute',
206 => 'Icircumflex',
207 => 'Idieresis',
208 => 'Eth',
209 => 'Ntilde',
210 => 'Ograve',
211 => 'Oacute',
212 => 'Ocircumflex',
213 => 'Otilde',
214 => 'Odieresis',
215 => 'multiply',
216 => 'Oslash',
217 => 'Ugrave',
218 => 'Uacute',
219 => 'Ucircumflex',
220 => 'Udieresis',
221 => 'Yacute',
222 => 'Thorn',
223 => 'germandbls',
224 => 'agrave',
225 => 'aacute',
226 => 'acircumflex',
227 => 'atilde',
228 => 'adieresis',
229 => 'aring',
230 => 'ae',
231 => 'ccedilla',
232 => 'egrave',
233 => 'eacute',
234 => 'ecircumflex',
235 => 'edieresis',
236 => 'igrave',
237 => 'iacute',
238 => 'icircumflex',
239 => 'idieresis',
240 => 'eth',
241 => 'ntilde',
242 => 'ograve',
243 => 'oacute',
244 => 'ocircumflex',
245 => 'otilde',
246 => 'odieresis',
247 => 'divide',
248 => 'oslash',
249 => 'ugrave',
250 => 'uacute',
251 => 'ucircumflex',
252 => 'udieresis',
253 => 'yacute',
254 => 'thorn',
255 => 'ydieresis'
);
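
Because self::$diff only resolves inside a class, the array has to be declared as a static property of the class that makes the call, not at the top of the file. The following is a minimal, self-contained sketch of that arrangement. FakePdf and AdapterSketch are stand-ins invented for this example (the real code lives in cpdf_adapter.cls.php), and the array is shortened to three entries.

```php
<?php
// Stand-in for the CPDF object: it only records what it is given.
class FakePdf
{
    public $calls = array();

    public function selectFont($font, $encoding = '')
    {
        $this->calls[] = array($font, $encoding);
    }
}

// Stand-in for the adapter class that owns the call to selectFont().
class AdapterSketch
{
    // Declared inside the class so that self::$diff resolves.
    // Shortened here; the full mapping is the one listed above.
    static $diff = array(
        224 => 'agrave',
        232 => 'egrave',
        233 => 'eacute',
    );

    private $_pdf;

    public function __construct($pdf)
    {
        $this->_pdf = $pdf;
    }

    public function useFont($font)
    {
        $this->_pdf->selectFont($font, array(
            'encoding'    => 'WinAnsiEncoding',
            'differences' => self::$diff,
        ));
    }
}

$pdf = new FakePdf();
$adapter = new AdapterSketch($pdf);
$adapter->useFont('Helvetica');

print_r($pdf->calls[0][1]['differences']);
```

If the array is declared outside the class, PHP cannot resolve self::$diff and the call fails instead of applying the mapping.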

15 Responses to “DOMPDF + justification + extended ASCII chars”

  1. Régis says:

    Very good.
    I’m using this help in 0.5.3 (in Brazil) and it works.

    Thanks :)

  2. Toni says:

    Hello!
    I’ve been reading your post “Printing with DOMPDF” and I found it very useful.
    However, the part which really interests me is this one (DOMPDF + justification + extended ASCII chars), but I can’t get it to work.
    What I do is:
    Open cpdf_adapter.cls.php, replace
    $this->_pdf->selectFont($font);

    with

    $this->_pdf->selectFont($font,
    array('encoding'=>'WinAnsiEncoding',
    'differences'=>self::$diff));

    and add your $diff array at the beginning of the same document.

    Then I try to do the same in page_cache.cls.php, but I can’t find the line $this->_pdf->selectFont($font);.

    What am I missing?

    Please please please, help me! Your help will be truly appreciated!

    Thanks in advance! :)

  3. rod says:

    hello,

    on which version did you do that?
    I tried with v0.5.2, the latest release, and it does not work.

    any idea?
    Best regards

  4. @rod
    Version 0.5.1 (http://dompdf.googlecode.com/files/dompdf-0.5.1.zip)

    I haven’t tried it yet, but there is an upcoming new version, which is still in alpha, that might solve this problem. Look at http://code.google.com/p/dompdf/downloads/list

    Cheers

  5. Régis says:

    I tried with v0.6.0 alpha 2 and it’s working very well!
    Luca, the alpha doesn’t resolve this!

  6. Maicon says:

    yes very good

  7. Mike says:

    I can’t get this to work either. I’ve done everything above but still no joy (version 0.5.1).

  8. @mike – If you provide an example I can use it to test it with the latest code. I have also done some work to make DOMPDF work with another PDF library (TCPDF) and I would like to see how well it works.

  9. Mike says:

    Well, I have made the string replacement (but can’t find the call in page_cache.cls.php) and then added the static $diff array to the top of cpdf_adapter.cls.php, but it makes no difference: the right side just isn’t aligning correctly when I justify the text. Each line is somewhere between 0 and 10 pixels out.

    It’s not much, I know, but you can really notice it.

  10. Mike says:

    I have found the solution. There are some issues in the way the code generates the spacing for each line: it only works for the first and last line, not the rest. I will explain when I get this working.

  11. Mike says:

    ok add something like this

    if (substr($text, -1) == "-" || substr($text, -1) == "." ){
    $this->_lines[$this->_cl]["wc"] += 1;
    }else{
    $this->_lines[$this->_cl]["justify"] = -2;
    }

    to

    block_frame_decorator.cls.php on line 192

    AND

    add this
    $spacing = ($width - ($line["w"] + $line["justify"])) / ($line["wc"] -2);

    to block_frame_reflower.cls.php line 431.

    Basically the maths is wrong. It hasn’t accounted for the extra white space that occurs on each line.

    There will be a tidy way of doing this, and it probably involves trimming the white space on each line, but this highlights the problem.

  12. Mike says:

    in fact this is slightly better

    if ($frame->get_node()->nodeName == "#text")
    $text = trim($frame->get_text());
    $this->_lines[$this->_cl]["wc"] = count(preg_split("/\s+/", $text ));
    $this->_lines[$this->_cl]["w"] += $w;

    if (substr($text, -1) != "-" && substr($text, -1) != "." ){
    $this->_lines[$this->_cl]["justify"] = -2;
    }

    and then

    $spacing = ($width - ($line["w"] + $line["justify"])) / ($line["wc"] -1);

    (to sort out line breaks and full stops).

  13. Albert says:

    Hello, Luca.
    I am using dompdf with the CPDF library and the modifications that you propose. It works well, but I have run into a serious problem. If the last character of a line is a special character, the following lines of the paragraph disappear.
    Can you help me find a solution?
    With this failure dompdf is not usable, since lines disappear in a seemingly random way.
    Thank you very much.

  14. Albert says:

    With version 0.6 on Windows, the problem of the lost lines is solved by the following change:

    text_frame_reflower.cls.php line 153 $offset = strlen($str);

  15. Álvaro says:

    Hi,
    I’m not sure if this is the same problem as the one in this post, but here it goes:
    - Words inside table cells and/or divs are split the wrong way. For example, at the end of the line a word like ‘computer’ is split com (line break here)
    puted.
    I’m using domPDF 0.5.1.
    Any help would be appreciated.
    Thanks in advance :)
    Álvaro

    P.S. – I can also send a URL that creates a PDF with these errors.
