@GeneralUser01
Contributor

This PR changes how images in the gif file format are handled when pasted into or received in the chat, so that their animation can be played. Animations do not play by default, but whenever an animation is not running this is indicated by a play icon, which distinguishes it from still images at a glance. Left-clicking an animation toggles whether it is paused, and middle-clicking resets it. Saving a gif image file from the log is also supported, just as it is for other image file formats.

Additionally, video controls can be toggled for an animation via its log context menu. They enable the following features:

  • Jumping to any point of the animation via the video bar
  • Viewing the current time and total time of the animation
  • Switching caching of all frames on or off
  • Switching loop mode between "Unchanged", "Loop" and "No loop"
  • Traversing frame-by-frame backwards or forward
  • Changing and resetting the playback speed

Turning on caching most notably improves performance when jumping between frames that are far apart, because QMovie can only step through frames in order from start to end (and wrap around) when caching is off. Playback is usually fine either way, except when playing an animation with many frames in reverse. Reverse playback is also implemented here: decreasing the speed below zero plays the animation in reverse at that speed. If a speed step would land exactly on zero, a step twice as large is taken instead, so that the animation only stops when it is explicitly paused. As for loop mode, "Unchanged" uses the setting built into the gif image file for how many times it repeats, whereas "Loop" and "No loop" override this behavior accordingly.
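
As a rough illustration of the zero-skipping rule for the playback speed described above, here is a minimal sketch. The function name and the use of whole percentages are assumptions made for this example and are not taken from the PR's code.

```cpp
// Hypothetical helper: speeds are whole percentages, where 100 is normal
// speed and negative values mean reverse playback.
int stepPlaybackSpeed(int currentPercent, int stepPercent) {
    int next = currentPercent + stepPercent;
    // A speed of exactly zero would freeze the animation without pausing it,
    // so take a step twice as large to skip over zero.
    if (next == 0) {
        next = currentPercent + 2 * stepPercent;
    }
    return next;
}
```

With a 5% step, going down from 5% therefore lands on -5% (reverse playback) rather than on 0%.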

The main limitation is the message length limit for images. Currently this is usually set to 128 KB, which is very small for a gif image file. On top of this, the data is sent base64-encoded, which inflates it by about a third, so the file itself has to be roughly 25% smaller than the limit. This limit should be at least ten times larger in order to fit many gif image files, or animations should be given a separate limit. Other than that, pasting image data directly from the clipboard, as opposed to pasting a file path from it, could not be implemented for gif image files, because no compatibly formatted data could be obtained from the MIME data received.
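
To make the base64 overhead concrete, the following sketch computes how many raw bytes fit into a given encoded budget. The function names are hypothetical, and the calculation ignores the surrounding `img` markup, which takes up a little of the budget as well.

```cpp
#include <cstddef>

// Base64 turns every 3 raw bytes into 4 output characters, so an encoded
// budget of N characters holds at most floor(N / 4) * 3 raw bytes.
constexpr std::size_t maxRawBytesForBase64(std::size_t encodedLimit) {
    return (encodedLimit / 4) * 3;
}

// Encoded length for a given raw size, including '=' padding.
constexpr std::size_t base64Length(std::size_t rawBytes) {
    return ((rawBytes + 2) / 3) * 4;
}
```

For a limit of 128 KB (131072 characters), this gives 98304 raw bytes, i.e. 96 KB of actual image data.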

Here are workarounds that can be applied to these limitations:

  • Configure the server to increase or disable the limit on image message length
  • Save copied gif images to file and then copy and paste those files

Lastly, if settings were to be added to this feature, then here are some suggestions for those settings:

  • Play animations by default
  • Show video controls by default
  • Cache all frames by default
  • Specify the default loop mode
  • Specify the default playback speed

@Hartmnt
Member

Hartmnt commented Nov 26, 2024

Thank you very much for this contribution! Very cool!

Please note that this feature - with utmost certainty - will not land in the 1.5.x branch of Mumble, but in 1.6.x. With that being said, we are currently trying to get another 1.5.x release out and lack time to review this properly right now. There will be plenty of time for getting this merged, though!

I have not looked into the details of this PR yet, but since it is touching the user interface I would like you to take a look at the Mumble accessibility checklist and double check the application can still be used with accessibility in mind with your change applied.

Also take a look at the CI pipelines which are currently failing (you can disregard the Windows pipeline for now as we need to fix it in master right now).
The other thing is, we are currently requiring all PRs to be auto-formatted using clang-format in version 10. This is currently also failing for your PR. It will be enough to format everything once all the changes are reviewed and ready.

Again, please be patient for a proper review of this :)

@Hartmnt added the client, ui and feature-request labels Nov 26, 2024
@GeneralUser01
Contributor Author

GeneralUser01 commented Nov 27, 2024

Thank you for being open to merge this PR and for the feedback!

Since the interactive aspects of this feature are part of custom text objects in the log, keyboard-related controls such as tabbing to navigate apply to user messages but it is unclear how this should be able to interact with items in messages if at all. As for contrasts which is another accessibility concern, is there any standard used in this project to test with to check if a contrast is high enough?

The only part of the CI pipelines that I'm unsure about how to resolve are the translations. There are more texts than what this feature brings that have no translation, so how is this to be handled?

@Hartmnt
Member

Hartmnt commented Nov 27, 2024

Since the interactive aspects of this feature are part of custom text objects in the log, keyboard-related controls such as tabbing to navigate apply to user messages but it is unclear how this should be able to interact with items in messages if at all. As for contrasts which is another accessibility concern, is there any standard used in this project to test with to check if a contrast is high enough?

Currently, the individual text messages (objects) are not accessible. This is a TODO on our part. However, I am concerned that your new text objects containing a video player may create focus traps when tab-navigating. Another aspect that needs to be looked at is using a screen reader or text-to-speech for chat messages. It would be bad, for example, if the TTS read out the entire base64 content of the video. Severe things like that. Don't worry too much about the contrast. When we look at this in detail, we can still talk about that.

The only part of the CI pipelines that I'm unsure about how to resolve are the translations. There are more texts than what this feature brings that have no translation, so how is this to be handled?

There is a script in the repository under the scripts folder called updatetranslations.py. If you run this from the repo root, it automatically creates a translation commit. But I would advise doing this at the very end, as your changed translation strings may need to be updated.

@GeneralUser01
Contributor Author

Regarding focus traps, these are absent in my custom text objects since they are drawn using QPainter as usual and interaction with them is handled through a mouse press event on the document, hence there are no other widgets involved. Otherwise the controls would not be integrated with the document and that would among other things require moving all of these widgets depending on scrolling. While it seems this may work since my text objects already occupy their required space in the document, I'm unsure if making widgets act as part of the document overlaying my text objects is the more efficient and/or comprehensive approach.

I suspect there are other ways to add keybindings to this, as well as support for screen readers via connected UI elements somewhere, but with interactive items in the document this would require further time to investigate and implement. How much this is of interest and required for any given feature is unclear. When it comes to text-to-speech at least, it has a limit on how long messages it reads aloud, so no image would likely pass that check.

Does running the script for creating a translation commit at the very end mean running it when all of my changes are ready for now or after a proper review?

@Hartmnt
Member

Hartmnt commented Nov 29, 2024

(you can disregard the Windows pipeline for now as we need to fix it in master right now).

The master branch Windows CI should now be working again, so please rebase your branch against current master.

Regarding focus traps, these are absent in my custom text objects since they are drawn using QPainter as usual and interaction with them is handled through a mouse press event on the document, hence there are no other widgets involved. Otherwise the controls would not be integrated with the document and that would among other things require moving all of these widgets depending on scrolling. While it seems this may work since my text objects already occupy their required space in the document, I'm unsure if making widgets act as part of the document overlaying my text objects is the more efficient and/or comprehensive approach.

I don't really get what you are trying to communicate here. Just try to navigate through the Mumble main window with TAB and SHIFT+TAB to see if your GIFs somehow break navigation. I might consider requiring keyboard interaction with the GIFs (at least play/pause). But I am not sure, yet. Maybe a setting to always/never autoplay GIFs would be enough. For now, just make sure that you can reach every existing UI element by pressing TAB and SPACE+TAB.

When it comes to text-to-speech at least, it has a limit on how long messages it reads aloud, so no image would likely pass that check.

Nah, this is not an option. The TTS should skip GIFs (and images), or say something like "(animated) image" but not attempt to read it. I will test this as soon as I have time for this PR, but I just wanted to let you know that this kind of stuff should be considered.

Does running the script for creating a translation commit at the very end mean running it when all of my changes are ready for now or after a proper review?

I would say both, because the proper review might require you to change some stuff around. This is just a suggestion though. If you feel confident in your git abilities, you might as well create the translation commit now and keep updating the previous commit. You would probably have to drop and recreate the translation commit a few times, so adding it last is always easier in my opinion.

@GeneralUser01
Contributor Author

The master branch Windows CI should now be working again, so please rebase your branch against current master.

Okay, I will rebase my branch soon, as well as apply adjustments which among other things would resolve the CI pipeline checks.

I don't really get what you are trying to communicate here. Just try to navigate through the Mumble main window with TAB and SHIFT+TAB to see if your GIFs somehow break navigation. I might consider requiring keyboard interaction with the GIFs (at least play/pause). But I am not sure, yet. Maybe a setting to always/never autoplay GIFs would be enough.

I meant to convey that GIFs are not automatically a part of the tab-navigation, so they cause no focus traps, but it is so far unspecified if they should be able to receive focus and how focus can then be supported, whether it's through more widgets or something else.
I have added keybindings to the chat now, so by having it in focus and pressing RETURN (goes forward) or BACKSPACE (goes backwards) one can select an animation (highlighted accordingly) and then do any of the following actions:

  • SPACE/K - Play/pause the animation
  • Q - Reset the animation
  • V - Show/hide video controls
  • C - Turn caching of all frames on/off
  • O - Switch to the previous loop mode
  • L - Switch to the next loop mode
  • , - Jump to the preceding frame
  • . - Jump to the next frame
  • N - Jump 5 frames backwards
  • M - Jump 5 frames forward
  • -/S - Decrease playback speed by 5%
  • +/D - Increase playback speed by 5%
  • W - Decrease playback speed by 25%
  • E - Increase playback speed by 25%
  • R - Reset playback speed
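
A key-to-action mapping like the one listed above could be sketched roughly as follows. This is only an illustration covering a subset of the keys. The enum and function names are hypothetical, and plain `char` values stand in for the Qt key events that the real handler presumably receives.

```cpp
#include <cctype>

// Hypothetical action set covering only some of the keybindings listed above.
enum class AnimationAction {
    None,
    TogglePlayback,
    ResetPlayback,
    ToggleVideoControls,
    ToggleCaching,
    PreviousLoopMode,
    NextLoopMode,
    PreviousFrame,
    NextFrame
};

// Maps a pressed key to an action, treating letters case-insensitively.
AnimationAction actionForKey(char key) {
    switch (std::tolower(static_cast<unsigned char>(key))) {
        case ' ':
        case 'k': return AnimationAction::TogglePlayback;
        case 'q': return AnimationAction::ResetPlayback;
        case 'v': return AnimationAction::ToggleVideoControls;
        case 'c': return AnimationAction::ToggleCaching;
        case 'o': return AnimationAction::PreviousLoopMode;
        case 'l': return AnimationAction::NextLoopMode;
        case ',': return AnimationAction::PreviousFrame;
        case '.': return AnimationAction::NextFrame;
        default:  return AnimationAction::None;
    }
}
```

Keeping the mapping in one function separates the question of which key does what from how each action is carried out.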

If I get into implementing any settings, a toggle for GIF autoplay would be the first one, but GIFs can now (once this change is pushed) be interacted with via keyboard anyway.

For now, just make sure that you can reach every existing UI element by pressing TAB and SPACE+TAB.

Yes, I have done this and I know the regular tab-navigation, where TAB goes forward, SHIFT+TAB goes backwards and both wrap around, but what is SPACE+TAB supposed to do? I tried this too but it didn't appear to do anything that had to do with the tab-navigation.

Nah, this is not an option. The TTS should skip GIFs (and images), or say something like "(animated) image" but not attempt to read it. I will test this as soon as I have time for this PR, but I just wanted to let you know that this kind of stuff should be considered.

I agree that relying on TTS settings and message length is not enough, though it makes any issues involving TTS less commonly encountered. However, the GIFs are sent the same way as the regular images and because of it both would be skipped by TTS, as well as making them go by the image message limit instead of the text message limit, therefore TTS would not be an issue with this feature. There is also the screen reader and more perhaps, so indeed the compatibility with all of the affected accessibility features should be considered.
Although it's unrelated to this feature, the TTS tries to read purely visual content if it's ASCII art in a code-block (pre element), though it somewhat depends on the arrangement of characters. If there was a way to set conditions for what TTS attempts to read, then that would help with excluding such content from TTS completely.

I would say both, because the proper review might require you to change some stuff around. This is just a suggestion though. If you feel confident in your git abilities, you might as well create the translation commit now and keep updating the previous commit. You would probably have to drop and recreate the translation commit a few times, so adding it last is always easier in my opinion.

Sure, I agree that I might as well commit the translation now, finishing most of it if not all of it sooner rather than later. I have updated/replaced commits before so that would not be too troublesome. I will commit the translation and update it if needed.

Member

@Krzmbrzl Krzmbrzl left a comment

I dislike that we are parsing HTML documents in a hand-wavy way. This more or less guarantees that it will break at some point in the future simply because this is a drastic simplification of how HTML is structured. We already have a dependency on Poco::XML, so I would suggest using that for parsing and querying purposes.

Furthermore, I believe the code associated with the animation logic should be outsourced into a dedicated class or something like that. It is way too large to just embed into the regular logic of handling log events.

Note: This was more of a high-level review (i.e. I didn't do a full read-through of all changes)

Comment on lines 101 to 106
static const int videoBarHeight = 4;
static const int underVideoBarHeight = 20;
static const int cacheOffsetX = -170;
static const int loopModeOffsetX = -130;
static const int frameTraversalOffsetX = -90;
static const int speedOffsetX = -30;
Member

Hardcoding these values will likely make the UI break with certain display configurations (i.e. different resolution, different scaling factors, whatnot)

Contributor Author

Since all of the positions and sizes of elements in the video controls have hardcoded values, they would not become misaligned relative to each other. I have added a minimum width for when the video controls are active, instead of relying on the in-built width of the given content, ensuring that the video controls have enough space. I have also tested different scaling factors and the UI works with those too. These constants make it easier to use and change values across multiple functions, so that buttons are shown and clickable in the same location, for example.

@GeneralUser01 GeneralUser01 force-pushed the gif branch 4 times, most recently from 586d545 to dff6f33 Compare April 2, 2025 23:20
@Hartmnt
Member

Hartmnt commented Apr 19, 2025

Would you consider this PR ready for review, yet?

@GeneralUser01 GeneralUser01 force-pushed the gif branch 2 times, most recently from e3ee001 to 433a844 Compare May 23, 2025 18:43
@GeneralUser01
Contributor Author

GeneralUser01 commented May 23, 2025

I have fixed the last details and consider this PR ready for review now :)

Checks were failing regarding a submodule, which was accidentally included in a commit after rebasing, but this has been resolved now.

@GeneralUser01 GeneralUser01 force-pushed the gif branch 3 times, most recently from 8894709 to 3f19ede Compare May 30, 2025 16:37
This change alters how images in the `gif` file format are handled when pasted into or received in the chat, so that their animation can be played. Animations do not play by default, but whenever an animation is not running this is indicated by a play icon, which distinguishes it from still images at a glance. Left-clicking an animation toggles whether it is paused, and middle-clicking resets it. Saving a `gif` image file from the log is also supported, just as it is for other image file formats.

Additionally, video controls can be toggled for an animation via its log context menu. They enable the following features:

- Jumping to any point of the animation via the video bar
- Viewing the current time and total time of the animation
- Switching caching of all frames on or off
- Switching loop mode between "Unchanged", "Loop" and "No loop"
- Traversing frame-by-frame backwards or forward
- Changing and resetting the playback speed

Alternatively, animations can be controlled via keybindings as follows:

- ENTER - Select the next animation
- BACKSPACE - Select the previous animation
- CTRL+0-9 - Select animation by the index entered
- ESCAPE - Deselect animations
- SPACE/K - Play/pause
- Q - Reset playback
- V - Show/hide video controls
- C - Turn caching of all frames on/off
- O - Switch to the previous loop mode
- L - Switch to the next loop mode
- , - Jump to the preceding frame
- . - Jump to the next frame
- N - Jump 5 frames backwards
- M - Jump 5 frames forward
- H - Jump 1 second backwards
- J - Jump 1 second forward
- F - Jump 5 seconds backwards
- G - Jump 5 seconds forward
- -/S - Decrease playback speed by 5%
- +/D - Increase playback speed by 5%
- W - Decrease playback speed by 25%
- E - Increase playback speed by 25%
- R - Reset playback speed

Turning on caching most notably improves performance when jumping between frames that are far apart, because `QMovie` can only step through frames in order from start to end (and wrap around) when caching is off. Playback is usually fine either way, except when playing an animation with many frames in reverse. Reverse playback is also implemented here: decreasing the speed below zero plays the animation in reverse at that speed. If a speed step would land exactly on zero, a step twice as large is taken instead, so that the animation only stops when it is explicitly paused. As for loop mode, "Unchanged" uses the setting built into the `gif` image file for how many times it repeats, whereas "Loop" and "No loop" override this behavior accordingly.

The main limitation is the message length limit for images. Currently this is usually set to 128 KB, which is very small for a `gif` image file. On top of this, the data is sent base64-encoded, which inflates it by about a third, so the file itself has to be roughly 25% smaller than the limit. This limit should be at least ten times larger in order to fit many `gif` image files, or animations should be given a separate limit. Other than that, pasting image data directly from the clipboard, as opposed to pasting a file path from it, could not be implemented for `gif` image files, because no compatibly formatted data could be obtained from the MIME data received.

Here are workarounds that can be applied to these limitations:

- Configure the server to increase or disable the limit on image message length
- Save copied `gif` images to file and then copy and paste those files

Lastly, if settings were to be added to this feature, then here are some suggestions for those settings:

- Play animations by default
- Show video controls by default
- Cache all frames by default
- Specify the default loop mode
- Specify the default playback speed
…s adjustments

Made the following changes:
General:
- Move functions into more suitable classes, where `clearDocument` (renamed to `clear`) is placed in the `LogTextBrowser` class and `toggleVideoControls` is placed in the `AnimationTextObject` class
- Add support for `width` and `height` attributes applied to animations
- Add support for use with WEBP, MNG, (A)PNG and AVIF, which are other image formats that may contain animations. WEBP and MNG are included in Qt's additional library "Qt Image Formats", though only WEBP is bundled with official builds of Qt.

CustomElements.cpp:
- Reorganize video controls into a dedicated class, as well as related rather generic functions
- Return default value after `switch` in `loopModeToString`, which would make compilers warn here if not all values of the enum are handled
- Also clear the counter and focus index for custom items when the document is cleared
- Select the last custom item if the number entered (CTRL+[series of 0-9] with the log in focus) is too high
- Set the playback speed to the value it had at the previous reset if the speed is reset again while at the original speed
- Add fullscreen mode, usable from the UI and via keybindings, where F toggles it and ESCAPE exits it
- Add keybinding to invert the playback speed with X
- Switch keybindings for jumping 5 frames backwards or forward to B and N as well as for jumping 5 seconds backwards or forward to U and I
- Account for scrolling on both axes when getting the position of a click
- Make video bar background brighter when caching is on to further distinguish it
- Fix text object size so that it reliably changes the intrinsic size (layout) by applying the max size first (retroactive note: this was a side effect of not updating the layout manually for custom text objects which was properly fixed in the later commit "FIX(client,images): Leave scrolling position unchanged when exiting full screen")
- Scale the animation to a min width for the video controls to ensure they fit
- Use a regular still image for images that may contain an animation but do not have multiple frames
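
The pattern mentioned for `loopModeToString`, with a default return placed after the `switch` rather than in a `default:` label, might look roughly like this. The enumerator names, the fallback string and the `nextLoopMode` helper (including its wrap-around) are assumptions made for this sketch.

```cpp
#include <string>

// Scoped enum: enumerators do not leak into the surrounding scope and do not
// convert implicitly to int.
enum class LoopMode { Unchanged, Loop, NoLoop };

std::string loopModeToString(LoopMode mode) {
    switch (mode) {
        case LoopMode::Unchanged: return "Unchanged";
        case LoopMode::Loop:      return "Loop";
        case LoopMode::NoLoop:    return "No loop";
    }
    // There is deliberately no `default:` label above, so compilers can warn
    // (e.g. with -Wswitch) when a newly added enumerator is not handled.
    return "Undefined";
}

// Hypothetical helper for cycling to the next mode, wrapping around at the end.
LoopMode nextLoopMode(LoopMode mode) {
    switch (mode) {
        case LoopMode::Unchanged: return LoopMode::Loop;
        case LoopMode::Loop:      return LoopMode::NoLoop;
        case LoopMode::NoLoop:    return LoopMode::Unchanged;
    }
    return LoopMode::Unchanged;
}
```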

MainWindow.cpp:
- Use `enum class` instead of `enum` in new code, making it more type safe and specific in scope
- Use function pointers instead of the `SLOT` macro for new actions in the log's context menu
- Save the given image in another format if specified with the file extension for animations as well
- Use precise image detection for context menu options via `formatAt` instead of `cursorAtPosition`, enabling direct detection of a text object at the given position, instead of looking for its detectable half which does not work when there are multiple such text objects in a row, and with more accuracy than just somewhere within its vertical space

Log.cpp:
- Reorganize the log routine for animations, where code having to do with creating and running an animation is placed in its own class, `AnimationTextObject`, while getting data and inserting animations is left to the `Log` class
- Make the log routine for animations generic for all custom text objects, enabling more to be added alongside each other as well as regular HTML in the same message
- Return early from `htmlWithCustomTextObjects` if the text to log is empty or too small to contain any tag
- Use more stable specific reference points when parsing tags containing custom text object data
- Still images use the given image format instead of JPG where possible, enabling transparency and maintaining the highest image quality (may still be reduced if the file size exceeds the given image message limit). The default image format is now PNG to enable transparency even if the format was not detected or supported, where PNG is in the base support for image formats in Qt.
- Retroactively resolved merge conflict: Include upstream change for images to increase the max width from 600 to 1600 and max height from 400 to 1000
…orrectly

When finding animations in a message where some of the images use image formats supporting animation, the search would incorrectly reuse the previous text object data and omit the invalid animation, breaking the loop's progress and missing the previous HTML along with the invalid animation. This change corrects it by only continuing the search for text object data at the start of the `while`-loop and by including the current `img`-tag if it does not contain a valid animation when inserting the previous HTML (custom text objects require that they are inserted separately in the document, hence this approach is used instead of something like replacing parts of a string and then setting the entire string).
…t objects to navigate

When using keybindings to navigate custom text objects in the log, bound-checks were missing for moving to the next or previous item when there are no items. This change adds these bound-checks as well as universal use of `[]` instead of `at`.
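
A minimal sketch of such bounds checks is shown below. The names are hypothetical, and wrapping around at either end is an assumption of this example. The point is that the index is validated against the item count first, after which unchecked `[]` access is safe.

```cpp
// Sentinel meaning that no custom text object is selected.
const int NoSelection = -1;

// Returns the index of the next item, or NoSelection when there are no items.
int nextItemIndex(int current, int itemCount) {
    if (itemCount <= 0) {
        return NoSelection;
    }
    if (current < 0 || current >= itemCount) {
        return 0; // nothing valid selected yet: start at the first item
    }
    return (current + 1) % itemCount;
}

// Returns the index of the previous item, or NoSelection when there are no items.
int previousItemIndex(int current, int itemCount) {
    if (itemCount <= 0) {
        return NoSelection;
    }
    if (current < 0 || current >= itemCount) {
        return itemCount - 1; // nothing valid selected yet: start at the last item
    }
    return (current + itemCount - 1) % itemCount;
}
```
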
…th animations

Given that PNG images may contain an animation and optionally then use the file extension APNG, this change accepts either file extension. The list of file extensions for animations was also slightly rearranged to put WEBP next to AVIF.
… header

Enables getting the file extension from a file header, i.e. the initial content of a file. This works when having an image in its original format, such as when downloading an image or decoding one from base64, but not when having a `QImage` due to it having read the data into its own format. File formats can then be treated differently even without a file path, which could be used to both set up custom objects for certain formats and to preserve the original format if that is supported in a resulting image after it is processed as a `QImage`.
Detection of SVG, ICO and CUR was added and that of several other image formats was corrected. File extensions are now consistently stored in lowercase, in accordance with how Qt provides supported image formats.
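
Detecting a format from a file header generally means comparing the first bytes against known signatures. The sketch below covers only four well-known signatures, and its function names are hypothetical. It is not the `isFileExt` implementation from this PR.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// True if `data` contains the `len` bytes of `sig` starting at `offset`.
static bool hasSignature(const std::string &data, std::size_t offset,
                         const char *sig, std::size_t len) {
    return data.size() >= offset + len
           && std::memcmp(data.data() + offset, sig, len) == 0;
}

// Returns a lowercase file extension guessed from the first bytes of `data`,
// or an empty string when no known signature matches.
std::string extensionFromHeader(const std::string &data) {
    if (hasSignature(data, 0, "GIF87a", 6) || hasSignature(data, 0, "GIF89a", 6)) {
        return "gif";
    }
    // APNG files share this signature with regular PNG files.
    if (hasSignature(data, 0, "\x89PNG\r\n\x1a\n", 8)) {
        return "png";
    }
    if (hasSignature(data, 0, "\xff\xd8\xff", 3)) {
        return "jpg";
    }
    // WEBP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if (hasSignature(data, 0, "RIFF", 4) && hasSignature(data, 8, "WEBP", 4)) {
        return "webp";
    }
    return "";
}
```
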
…mated images

The implementation of file extension association with animations was here changed from a static list to a dynamic one based on the read support in Qt for image formats supporting animation. If another such image format is added then the only part of the application that may require an update to support it is the function `isFileExt` in the class `Log` detecting the file extension based on the file header.
…d fallback to cover too large images in other formats

Due to Mumble having a very limited image message length by default, the default image format is set to JPG again. The fallback for images now also includes images that could be written in the given format but are too large at any quality setting, trying those too in the default format if it did not work at all otherwise.

Notably this means that transparency will be lost (filled with black) when copying images directly (without a file path) since these cannot have their original format made known when only exposed as a `QImage`. The quality may be reduced too but would likely be worse if it is even shown at all in another format with the same given image message length limit.
…ng an image

The prompt for saving an image was here changed from having one filter with a few suggested image formats to multiple filters where there is one for each format with write support as well as one for the format of the data as-is if it is not already included. The filter for the current image's format is selected by default, which is made known when saving an animation and is otherwise JPG.
…ull screen

When using keybindings to exit full screen, the old position and size would be used, resulting in scrolling to near the top of the document instead of to the text object being interacted with. Since the view has already been scrolled to the text object's position in the document when entering full screen, the scroll position can be left unchanged when exiting full screen. This change also tidies up some code regarding the text objects, including updating the layout when their size changes.
By default custom text objects are ignored by Copy and Paste operations in Qt. This change adds support for text objects of animated images when copying selected content from the log. This change also tidies up some code, making a capture of `this` more specific.
The following parts of the video controls can now be scrolled:
- The video bar, jumping 1 second forward or backwards
- The playback speed buttons, adjusting the speed by 5%
- The frame-by-frame buttons, switching to the next or previous frame
- The loop mode, switching to the next or previous mode

These now scrollable controls are in intentionally limited areas so as to avoid accidentally interfering with regular scrolling of the log. When in full screen the view may be scrolled as well to jump 1 second forward or backwards.
This change consistently uses the full name for functions that don't return or set a position or size. This is to improve readability.
This change replaces all occurrences of "Animation" regarding animated images in the name of enums, classes and functions with "ImageAnimation". This is to improve readability.
…andling more uniform for pasted images

This change makes the message shown when the server prohibits images available for translation. It also handles errors where a file cannot be read more uniformly with the other image-pasting errors, replacing only the error cause in the error message.
…th `QTextDocument`

This change removes the dependency on `LogTextBrowser` and enables adding support for animated images, as well as any other custom text objects if they were added in the future, using only the given document.
…text objects

Since percent encoding is used on top of base64 encoding specifically for avatars for some reason, this change handles it by checking for and decoding the percent encoding first, thereby supporting both formats with only slightly more code.
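
Because plain base64 never contains `%`, percent-decoding first is harmless for data that was not percent-encoded. The following is a minimal standard-library sketch of that decoding step with a hypothetical function name. In Qt code, `QByteArray::fromPercentEncoding` followed by `QByteArray::fromBase64` would presumably be used instead.

```cpp
#include <cstddef>
#include <string>

// Returns the value of a hexadecimal digit, or -1 if `c` is not one.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Replaces every valid "%XX" sequence with the byte it encodes and copies
// everything else unchanged.
std::string percentDecode(const std::string &input) {
    std::string output;
    output.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            const int high = hexValue(input[i + 1]);
            const int low  = hexValue(input[i + 2]);
            if (high >= 0 && low >= 0) {
                output.push_back(static_cast<char>(high * 16 + low));
                i += 2;
                continue;
            }
        }
        output.push_back(input[i]);
    }
    return output;
}
```
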
…deo controls are shown

This change gives each document its own switch for showing video controls on animated images, instead of one switch for the entire application. This keeps the state separate between documents and prevents video controls from appearing in tooltips, where they would be inappropriate.
This change makes it clear and consistent when you have reached the end of an animated image by simply limiting jumps by frame to the beginning or end, just as when jumping by time.
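
Limiting a jump to the first or last frame amounts to clamping the target index, as in this small sketch. The function name is hypothetical, and frames are assumed to be indexed from 0.

```cpp
#include <algorithm>

// Returns the frame reached by jumping `delta` frames from `currentFrame`,
// limited to the range [0, frameCount - 1].
int jumpFrames(int currentFrame, int delta, int frameCount) {
    if (frameCount <= 0) {
        return 0; // no frames: stay at the start
    }
    return std::max(0, std::min(currentFrame + delta, frameCount - 1));
}
```
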
…`nullptr` is not a valid input

This is done to improve readability, making it more clear where passing `nullptr` works.
… switch in context menu

This is done to accommodate the context menu of the rich text editor.