-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatic glyph scaling based on unicode width and following character content #4969
Comments
@Tyriar Hmm, while this would help as a general fix it still kinda feels like patching things at the wrong end. I also have some doubts that endusers will accept it - a misaligned glyph will disturb the reading flow a lot up to a point where ppl might not recognize its meaning anymore (ppl are used to read smaller/out-of-raster glyphs as symbols with a different meaning, like ™ has its very own meaning beside being "TM"). For the char/glyph at hand - does it get the correct width with the newer unicode versions @PerBothner included in the grapheme addon? Edit: The situation with glyphs depending on the font creators' goodwill even for monospace fonts (since unicode does not impose strict widths) is really a nuisance - de facto every single monospace font would have to be tested upfront for problematic glyphs dropping out of the monospace grid, it is a really wicked situation. (We could at least build a test script, maybe even with a visual grid output to check for wcwidth --> glyph width conformance of a certain font.) |
The character |
Isn't just making it wide on our side going to cause weird things to happen if conpty or the system thinks it's a narrow character? I end up having these discussions every year and want to just fix it once and for all. An example of another discussion I have is they don't like our beautiful custom powerline glyphs and want a square with a question mark to show instead, despite it being a PUA unicode region 😕 |
Yupp, we cannot fix it by just setting wcwidth to 2 in xterm.js - it would have to be in line with what the OS / app thinks about that char. Btw thats the reason why I think that the terminal interface needs sequences to announce the used unicode version + extensions (+/- grapheme clustering, +/- bidi etc.), so app and terminal side can adopt or refuse that. The issue here - neither of them (app/OS side, TE side) have easy to use building blocks for that currently. It is almost impossible for cmdline devs to properly prepare their apps for that in C/C++, unless they have explicitly linked against a certain libicu version (thats for linux, idk by which lib macos integrates unicode support, for windows I am completely clueless). What we'd need (a perfect world scenario):
We hardly would ever get close to this, esp. with no official authority interested in this... |
FYI the corresponding Windows Terminal issue: microsoft/terminal#16779 |
@lhecker Yes that would halfway solve the broader issue (it would solve the issue at hand though), I also suggested something similar for PUA in the past. But you are right - it should be used for any ambiguous widths. It could also level out differences between unicode versions, at least for newly specced codepoints, that were unspecced before (which is the majority of changes between unicode versions). It still would not help with "active processing" changes, like additions to grapheme handling... |
For ref - link to my PUA suggestion |
Perhaps this can somehow help in this issue. I asked a long time ago why not delegate specifying the size of glyphs to the user side/application: https://gitlab.freedesktop.org/terminal-wg/specifications/-/issues/23 Perhaps there are some pitfalls in this that I am not aware of. A few days ago I started designing/implementing rendering of the vtm text buffer in the GUI to fully understand the viability of this concept. |
As I've mentioned before, I believe the best "fix" is to specify escape sequences or a mode where cursor addressing is in terms of "logical" characters and "logical" lines, independent of character width and wrapping. I.e. the application doesn't know or care how wide a character is or when/if a line gets wrapped. This would also allow use of variable-width fonts, which is more-or-less required by many writing systems. |
If the application doesn't care about the width of the characters, then the terminal itself arranges/places/lays out the glyphs (in its own way, even relying on font metrics). If an application is interested in precise cursor movement and glyph placement, then the app should be able to explicitly somehow specify the required layout/widths/etc. |
@o-sdn-o Yes I remember your proposal. It is really sophisticated in what it can express while still being attached to a grid system. I dont know yet, how that would work in detail, e.g. whats the default assumption, if not explicit fragment size is given? Wouldn't that need a sane default/fallback, thus we are again at the current wcwidth madness? @PerBothner Giving up the strict grid adherence would prolly work for 90% of all cmdline apps, that dont care for a specific screen state or certain alignment. Those are mostly apps just printing to the normal buffer. But more complicated apps on the alternate buffer still would need a more fine-grained control of the screen state (like all curses apps), so I dont see this yet as a general way to overcome the grid system in general, unless the model provides more "layouting primitives" (which opens Pandora's box of CSS-like requirements in the terminal). my stance Suggestion, how to overcome the issue at hand (rough outline):
With this app and TE side would be able to deal with most wcwidth related issues, as long as their wcwidth maps/unicode version is close enough. Somewhat offtopic addressing the broader issue:
The sequence for selecting unicode version is a less pressing issue, as long as unicode versions stay mostly backwards-compatible (which is the case since v10) and the TEs manage to stick to recent versions. Also everything above still does not deal yet with font related issues, like when a font gets a glyph size wrong. But I think this is a separate topic and not the real cause behind the issue above. |
In the absence of explicit specification of glyph sizes, a terminal operating in a GUI environment can set glyph sizes according to its own logic, for example, using the metrics of the font used. In the case of a headless terminal, in the absence of an explicit specification of glyph sizes, it must inject into the output stream additional instructions with glyph sizes that will be used by the TE instance operating in the GUI, which is the final layer in this output chain. The headless TE is reflowing on its side, so it must inject the cluster sizes into the output stream. Shells/Readlines must inject cluster sizes into the output stream as they reflow and position the cursor. TUI-apps must inject cluster sizes into the output stream as they render their UI and position the cursor. Legacy applications are left to rely on the GUI terminal option switch for logic between wcwidths/font metrics, or whatever. Meta data about the current cluster size should exist as a common cell-wise SGR attribute, like the background color, it is either set or missing. We need to wait for this to be implemented in a prototype (vtm's GUI front-end) so that anyone can play with it and better understand the need for such complications in the terminal ecosystem. |
This is a little more time sensitive on our end than trying to standardize something across all terminals like a new escape sequence and definitely beyond the scope of the time I have to dedicate to this problem. I just want a simple innocuous flag that is sufficient for passing GB18030 certification and not needing to revisit this problem every year.
@lhecker This would mostly solve it on Windows only, not sure what OS they use. FWIW the glyph on my setup actually overlaps the 3rd cell as well, so even this would technically not solve the overlapping problem for the keen observer.
@jerch I put together a little prototype that just rescales horizontally, not vertically, and I think it looks decent (ignore the emoji, we wouldn't allow that to happen if the following cell is free): |
@Tyriar Yes, looks like a good compromise to get a less disturbing output w'o waiting for the long and exhaustive spec iteration steps. 👍 |
We have a report related to GB18030 compliance (Chinese government standard) that calls out the rendering of the
Ⅻ
char overlapping with the follow character. This is because we think it's a narrow width char but render it exactly as the font renders.We could handle this case and solve the problem of overlapping character once and for all by scaling down the glyph to fit in the cell. This is what Windows Terminal's old renderer did:
https://github.com/microsoft/terminal/blob/release-1.19/src/renderer/dx/CustomTextLayout.cpp#L795-L836
Of course when talking about overlapping we can't ignore emoji which overlap but often contain a space after as this is what most terminals do. If we started scaling down all emoji I'm sure we would get many bug reports complaining about it. To handle this particular case generically I think we should also consider the following character, if is just whitespace then render normally without scaling,
The text was updated successfully, but these errors were encountered: