[vbox-dev] Unicode errors in PDF manual 7.0.x

Konstantin Vlasov flint at flint-inc.ru
Wed Aug 9 01:34:19 GMT 2023


Hi,

I've noticed that the PDF manual for the current VBox version contains a few 
issues. Mostly they fall into the following categories:

- sequences "fi" are replaced with the ligature ﬁ;
- likewise, "fl" is replaced with ﬂ;
- many Unicode symbols were apparently converted using an incorrect encoding, 
like "â˘A¸S" instead of a short dash symbol, or "¢a" instead of some spaces 
(probably, non-breaking spaces);
- some internal tag that doesn't look like it should be there (see below).
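
For what it's worth, the first two categories are easy to spot mechanically in
text copied out of the PDF. A minimal Python sketch, assuming the ligatures
arrive in the extracted text as the single code points U+FB01 and U+FB02 (the
helper name find_ligatures and the sample string are mine, purely for
illustration):

```python
import unicodedata

# Latin ligature code points of interest (assumption: the PDF text layer
# exposes them as these single characters).
LIGATURES = {"\ufb01": "fi", "\ufb02": "fl"}

def find_ligatures(text):
    """Return (index, ligature, expansion) for each ligature found in text."""
    return [(i, ch, LIGATURES[ch]) for i, ch in enumerate(text) if ch in LIGATURES]

sample = "con\ufb01guration \ufb02ags"
hits = find_ligatures(sample)

# NFKC normalization folds compatibility ligatures back to plain letters.
plain = unicodedata.normalize("NFKC", sample)

print(hits)
print(plain)
```

Normalizing with NFKC only helps with searching and copy-pasting the extracted
text; it says nothing about how the PDF itself should be generated.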


While the first two could, in principle, be considered a feature rather than a 
bug (I've seen some services where these replacements were deliberate), the 
encoding garbage is obviously wrong. Here are a few examples so you can see 
for yourselves:

a) page 156:
> VBoxManage modifyvm < uuid | vmname > [ --teleporter= on | off ]
> [ --teleporter-port=port ] [ --teleporter-address= address | empty ]
> [ --teleporter-password=password ] [ --teleporter-password-file= filename
> | stdin ] [ --cpuid-portability-level=level ] [ --cpuid-set=leaf [ :subleaf ]
> eax¢aebx¢aecx¢aedx ] [ --cpuid-remove=leaf [ :subleaf ] ]
> [ --cpuid-remove-all ]
(instead of spaces inside "eax ebx...")

b) page 189, description of the option "--nic<N>=none": all the dashes after 
the value names:
> none â˘A¸S No networking present
> null â˘A¸S Not connected to the host system
etc.

c) page 432, names with diacritics:
> Viktor SzathmÃ˛ary
instead of
> Viktor Szathmáry

and so on.
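
To illustrate the kind of mismatch I suspect, here is a minimal Python sketch
in which UTF-8 bytes are decoded as if every byte were a separate character.
Latin-1 is only a stand-in here: the glyphs in the PDF differ from what Latin-1
produces, so the encoding actually applied by the toolchain is evidently
something else, and I don't know which. The helper names misdecode and repair
are made up for this example.

```python
def misdecode(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, then decode the bytes with a one-byte encoding."""
    return text.encode("utf-8").decode(wrong_encoding)

def repair(garbled, wrong_encoding="latin-1"):
    """Reverse misdecode(); only works if no bytes were lost along the way."""
    return garbled.encode(wrong_encoding).decode("utf-8")

name = "Viktor Szathm\u00e1ry"      # the name as it should appear
garbled = misdecode(name)           # one junk character per UTF-8 byte
print(garbled)
print(repair(garbled) == name)

# Multi-byte characters expand into several junk characters:
print(len(misdecode("\u2013")))     # en dash: three UTF-8 bytes
print(len(misdecode("\u00a0")))     # non-breaking space: two UTF-8 bytes
```

The "Ã" at the start of the garbled "á" is the same in this sketch and in the
PDF, which is at least consistent with UTF-8 input being read byte by byte
somewhere in the build.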


Unfortunately, I'm not sure what the reason for this could be. I noticed it
only after I had built my own version of VBox from sources and decided to
compare my PDF with the officially distributed one. If that's of any help, I was
building everything on a Windows 10 machine, using, among others:

* MiKTeX 23.5 package
* DocBook XML DTD 4.5
* DocBook XSL Stylesheets 1.71.0

Also I had to make some changes in doc/manual/Config.kmk (otherwise the build
failed miserably), but the changes were about file path formats, like c:/...,
file:///c:/... and so on; nothing to do with encodings. But if you wish, I can 
send them to you.


And as for the final issue, see page 471, section 17.2.26 "DocBook XML DTD 
License". In the middle of the text you'll see the line:

> $Id: user_ThirdParty.xml 155244 2023-01-17 14:15:46Z bird $

I don't think it was intended to be there.
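
This looks like an expanded Subversion-style $Id$ keyword. If it helps to check
whether other sections have the same leak, here is a small Python sketch that
finds such lines in extracted text. The regular expression is my own guess at
the layout (file, revision, date, time, author), based only on the one line
quoted above:

```python
import re

# Assumed layout: "$Id: <file> <revision> <YYYY-MM-DD> <time> <author> $"
ID_KEYWORD = re.compile(r"\$Id: (\S+) (\d+) (\d{4}-\d{2}-\d{2}) (\S+) (\S+) \$")

def find_id_keywords(text):
    """Return a tuple of fields for every expanded $Id$ keyword in text."""
    return [m.groups() for m in ID_KEYWORD.finditer(text)]

line = "$Id: user_ThirdParty.xml 155244 2023-01-17 14:15:46Z bird $"
found = find_id_keywords(line)
print(found)
```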


All this was checked with version 7.0.10. I also took a quick look at 7.0.0, and 
all the same issues are present there. However, in 6.1.46 I could not find 
anything of the sort. So it seems the breaking change happened somewhere 
during the switch to major version 7.


P. S. I probably should have created a ticket in the bug tracker, but I can't 
log in to my Oracle account for some reason, and the recovery emails are not 
being delivered. Maybe because of the "ru" domain...


-- 
Bye.                                    With best regards,
                                        Konstantin Vlasov.


