version 24.10.29
ddennedy committed Oct 30, 2024
1 parent 699047e commit 9e639f9
Showing 8 changed files with 112 additions and 16 deletions.
53 changes: 53 additions & 0 deletions _posts/2024-10-29-new-release-241029.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
---
layout: post
title: "New Version 24.10: Whisper to a Scream"
author: Dan Dennedy
category: blog
---

Version 24.10.29 is now available for [**DOWNLOAD**]({{ "/download/" | prepend: site.baseurl | prepend: site.url }})!

### Speech to Text

Shotcut gets its first AI feature, based on [OpenAI's Whisper](https://openai.com/index/whisper/),
courtesy of the [whisper.cpp](https://github.com/ggerganov/whisper.cpp) project.
This is available through the **Subtitles > Speech to Text** menu item or button: <kbd>![Speech to Text icon](https://d2t917e3b1b2xy.cloudfront.net/original/3X/8/2/8280d8783625e9b235728767f702e2e80abe3714.png)</kbd>.

- Our builds include a basic model that has decent speed and accuracy but not a big size. (You can think of the model as the brain.)
- You can [download](https://huggingface.co/ggerganov/whisper.cpp/tree/main) a bigger and better brain (model) in `ggml` format and configure it in the **Speech to Text** dialog, but it will be slower.
- The dialog creates two jobs that appear in the **Jobs** panel: one to export audio and another to convert to text.
- The results are added to the **Subtitles** panel as a new top-level Subtitle Track.
- Currently, the only GPU our build supports is Apple Silicon. Otherwise, it is heavily multi-threaded on the CPU.
- Known quirk: subtitle items sometimes start earlier than expected. Timing is provided by the model and tool, and we lack the skills and resources to improve this.
- Expect occasional errors. Like humans in non-ideal conditions, it is not perfect. We will not take action on bug reports about some piece of audio not converting to the expected text.
- OpenAI has issued some [warnings](https://huggingface.co/openai/whisper-large#evaluated-use) about the usage of their Whisper models:
> In particular, we caution against using Whisper models to transcribe recordings of individuals taken without their consent.... We recommend against use in high-risk domains like decision-making contexts, where flaws in accuracy can lead to pronounced flaws in outcomes.
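
The two jobs described in the list above can be pictured as two command lines run one after the other. The following is only a sketch under stated assumptions, not how Shotcut itself launches its jobs: it assumes `ffmpeg` is on the `PATH`, that the whisper.cpp example executable is named `main`, and it uses hypothetical file names (`clip.mp4`, `ggml-base.bin`). The 16 kHz, mono, 16-bit WAV format is what the whisper.cpp example program expects as input.

```python
from pathlib import Path
from typing import List


def speech_to_text_commands(media: Path, model: Path, workdir: Path) -> List[List[str]]:
    """Return two command lines: export the audio, then convert it to text."""
    wav = workdir / (media.stem + ".wav")
    srt_base = workdir / media.stem  # whisper.cpp appends ".srt" to this path

    export_audio = [
        "ffmpeg", "-y", "-i", str(media),
        "-vn",                 # ignore any video stream
        "-ar", "16000",        # resample to 16 kHz
        "-ac", "1",            # downmix to mono
        "-c:a", "pcm_s16le",   # 16-bit PCM WAV
        str(wav),
    ]
    transcribe = [
        "main",                # whisper.cpp example executable (name assumed)
        "-m", str(model),      # model file in ggml format
        "-f", str(wav),        # audio produced by the first job
        "-osrt",               # write an SRT subtitle file
        "-of", str(srt_base),  # output path without the extension
    ]
    return [export_audio, transcribe]


if __name__ == "__main__":
    for cmd in speech_to_text_commands(Path("clip.mp4"), Path("ggml-base.bin"), Path("out")):
        print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)`. Building the commands separately from running them keeps the sketch easy to inspect without either tool installed.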

### Transition Improvements

- **Ripple Delete** a transition restores the entirety of the clips included in the transition.
- **Lift** (non-ripple delete) a transition no longer leaves a gap; the gap is filled with the adjacent clips.
- Moving an adjacent clip away increases the transition duration instead of detaching and leaving a gap.

### Other Improvements

- Removed the **Export > Video > Resample** button. Now, there are simply ignorable inline warnings when making certain changes.
- Added **File > Show Project in Folder** to menu.
- Added a `decimals <number>` option to numeric keywords in the **GPS Text** video filter.
- Changed **Recent Projects** to **Projects**: items in this view no longer disappear as **Recent** reaches its maximum length and old items are removed.
- Added a **Remove** action to the context menu in **Projects**.
- Hide the **Reframe** video filter and button if **GPU Effects** is on.
- Upgraded FFmpeg to version 7.1.

### Fixes

- Fixed a crash when doing more than one **Playlist > menu > Add Selected to Slideshow**. In theory, this could fix other random crashes in **Timeline**.
- Fixed a crash opening a project containing a subtitle track with no items.
- Fixed odd value for computed width in **Reframe** output video filter causes export to fail.
- Fixed **Reframe** visual control can create odd-valued dimensions.
- Fixed AVCHD video frame rate is double (could fix other formats).
- Fixed making a proxy video for an iPhone 16 Pro video containing spatial audio.
- Fixed GPU filters paste below non-GPU filters.
- Fixed **Slideshow Generator** dialog is too tall with vertical video mode.
- Fixed **GPS Offset** would reset in **GPS Text** video filter.
- Fixed the maximum allowed **Time** in the **Time Remap** filter to prevent white frames.
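
The two **Reframe** fixes above concern odd-valued dimensions. Many encoder configurations that use 4:2:0 chroma subsampling require an even width and height, so a computed size is commonly snapped to an even number before export. The helper below is a hypothetical illustration of that idea, not Shotcut's actual fix; the function name and the minimum of 2 are assumptions.

```python
def round_to_even(value: float) -> int:
    """Snap a computed dimension to an even integer, never below 2."""
    return max(2, 2 * round(value / 2))


if __name__ == "__main__":
    # Width of a 9:16 vertical crop taken from a 1080-pixel-high frame.
    computed_width = 1080 * 9 / 16  # 607.5
    print(round_to_even(computed_width))
```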
9 changes: 6 additions & 3 deletions credits.md
@@ -22,9 +22,9 @@ projects:
License](http://www.gnu.org/licenses/gpl-2.0.html)
- [WebM](http://www.webmproject.org/) VP8 and VP9 encoders under a [BSD
License](http://www.webmproject.org/license/software/)
- [aom](https://aomedia.googlesource.com/aom/) AV1 encoder, Copyright (c) 2016, Alliance for Open Media. All rights rerved. Under a [BSD
- [aom](https://aomedia.googlesource.com/aom/) AV1 encoder, Copyright (c) 2016, Alliance for Open Media. All rights reserved. Under a [BSD
License](https://aomedia.googlesource.com/aom/+/refs/heads/master/LICENSE)
- [dav1d](https://www.videolan.org/projects/dav1d.html) AV1 decoder, Copyright © 2018-2019, VideoLAN and dav1d authors. Unr a [BSD
- [dav1d](https://www.videolan.org/projects/dav1d.html) AV1 decoder, Copyright © 2018-2019, VideoLAN and dav1d authors. Under a [BSD
License](https://code.videolan.org/videolan/dav1d/-/blob/master/COPYING)
- [LAME](http://lame.sourceforge.net/) MP3 encoder under the [LGPL
v2.1 License](http://www.gnu.org/licenses/lgpl-2.1.html)
@@ -105,10 +105,13 @@ exption](https://doc.qt.io/qt-5.9/qtnetwork-index.html#licenses-and-attributions
License](http://www.gnu.org/licenses/gpl-2.0.html)
- [Spatial Media](https://github.com/VarolOkan/spatial-media), a C++ port of [Google's Spatial
Media](https://github.com/google/spatial-media), under the [Apache 2.0 License](http://www.apache.org/licenses/LICENSE-2.0)
- [VMAF](https://github.com/Netflix/vmaf), a visual quality measurement tool by Netflix under the [BSD-2-Clause-Patent linse](https://raw.githubusercontent.com/Netflix/vmaf/master/LICENSE)
- [VMAF](https://github.com/Netflix/vmaf), a visual quality measurement tool by Netflix under the [BSD-2-Clause-Patent license](https://raw.githubusercontent.com/Netflix/vmaf/master/LICENSE)
- [gopro2gpx](https://github.com/NetworkAndSoftware/gopro2gpx) and [gpmf-parser](https://github.com/gopro/gpmf-parser)
under the [Apache 2.0 License](http://www.apache.org/licenses/LICENSE-2.0)
- [OpenCV](https://opencv.org) under the [Apache 2.0 License](http://www.apache.org/licenses/LICENSE-2.0)
- [libspatialaudio](https://github.com/videolabs/libspatialaudio) under the [LGPL v2.1
License](http://www.gnu.org/licenses/lgpl-2.1.html)
- [SVT-AV1](https://gitlab.com/AOMediaCodec/SVT-AV1) under the [BSD 3-Clause Clear License](https://spdx.org/licenses/BSD-3-Clause-Clear.html)
- [OpenBLAS](https://www.openblas.net/) under the [BSD-3-Clause license](https://github.com/OpenMathLib/OpenBLAS/blob/develop/LICENSE)
- [whisper.cpp](https://github.com/ggerganov/whisper.cpp) under the [MIT
License](http://opensource.org/licenses/mit-license.php)
18 changes: 9 additions & 9 deletions download.md
@@ -9,7 +9,7 @@ We pledge that our downloads are always free of
malware, spyware, and adware. However, we can only provide that guarantee if you come to this website
to download.

#### Current Version: 24.09.13
#### Current Version: 24.10.29

<div class="OSTEST">
<p>
@@ -39,7 +39,7 @@ To avoid ads and get automatic updates:<br>

{:.win}
|-----------------------|-------------------
| [Windows installer](https://sourceforge.net/projects/shotcut/files/v24.09.13/shotcut-win64-240913.exe/download) | [Windows portable zip](https://sourceforge.net/projects/shotcut/files/v24.09.13/shotcut-win64-240913.zip/download)
| [Windows installer](https://sourceforge.net/projects/shotcut/files/v24.10.29/shotcut-win64-241029.exe/download) | [Windows portable zip](https://sourceforge.net/projects/shotcut/files/v24.10.29/shotcut-win64-241029.zip/download)
{:.withborders}

{:.win}
@@ -50,7 +50,7 @@ To avoid ads and get automatic updates:<br>

{:.win}
|-----------------------|-------------------
| [Windows installer](https://sourceforge.net/projects/shotcut/files/v24.09.13/shotcut-win_ARM-240913.exe/download) | [Windows portable zip](https://sourceforge.net/projects/shotcut/files/v24.09.13/shotcut-win_ARM-240913.zip/download)
| [Windows installer](https://sourceforge.net/projects/shotcut/files/v24.10.29/shotcut-win_ARM-241029.exe/download) | [Windows portable zip](https://sourceforge.net/projects/shotcut/files/v24.10.29/shotcut-win_ARM-241029.zip/download)
{:.withborders}

{:.win}
@@ -76,7 +76,7 @@ To avoid ads and get automatic updates:<br>

{:.mac}
|-----------------------
| [macOS universal](https://sourceforge.net/projects/shotcut/files/v24.09.13/shotcut-macos-240919.dmg/download)
| [macOS universal](https://sourceforge.net/projects/shotcut/files/v24.10.29/shotcut-macos-241029.dmg/download)
{:.withborders}

{:.mac}
@@ -96,7 +96,7 @@ To avoid ads and get automatic updates:<br>
**Important**: If you have a Mac that is from 2013 or earlier you might experience a video preview color problem due to our migration to [Apple Metal](https://developer.apple.com/metal/). In that case, use [version 22.12.21](https://sourceforge.net/projects/shotcut/files/v22.12.21/shotcut-macos-221221.dmg/download).

{:.mac}
An [unsigned app bundle is available](https://sourceforge.net/projects/shotcut/files/v24.09.13/shotcut-macos-unsigned-240913.dmg/download) so that you
An [unsigned app bundle is available](https://sourceforge.net/projects/shotcut/files/v24.10.29/shotcut-macos-unsigned-241029.dmg/download) so that you
can modify the build per the Free Software license agreement.

---
@@ -120,7 +120,7 @@ src='https://raw.githubusercontent.com/snapcore/snap-store-badges/master/EN/%5BE

{:.linux}
|-----------------------|-------------------
| [Linux portable tar](https://sourceforge.net/projects/shotcut/files/v24.09.13/shotcut-linux-x86_64-240913.txz/download) | [Linux AppImage](https://sourceforge.net/projects/shotcut/files/v24.09.13/shotcut-linux-x86_64-240913.AppImage/download)
| [Linux portable tar](https://sourceforge.net/projects/shotcut/files/v24.10.29/shotcut-linux-x86_64-241029.txz/download) | [Linux AppImage](https://sourceforge.net/projects/shotcut/files/v24.10.29/shotcut-linux-x86_64-241029.AppImage/download)
{:.withborders}

{:.linux}
@@ -146,11 +146,11 @@ the portable tar.
##### Other

File checksums for downloads are available in
[md5sum](https://github.com/mltframework/shotcut/releases/download/v24.09.13/md5sums.txt)
or [sha256sum](https://github.com/mltframework/shotcut/releases/download/v24.09.13/sha256sums.txt) format.
[md5sum](https://github.com/mltframework/shotcut/releases/download/v24.10.29/md5sums.txt)
or [sha256sum](https://github.com/mltframework/shotcut/releases/download/v24.10.29/sha256sums.txt) format.
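
A downloaded file can be checked against the published checksum list with a few lines of Python. This is a minimal sketch that assumes `sha256sums.txt` uses the usual `sha256sum` layout of `<hex digest>  <file name>` per line; the helper names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large installers do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(sums_text: str, filename: str) -> Optional[str]:
    """Find the digest listed for `filename`, or None if it is not listed."""
    for line in sums_text.splitlines():
        parts = line.split(maxsplit=1)
        # sha256sum marks binary mode with a leading "*" before the name.
        if len(parts) == 2 and parts[1].lstrip("*").strip() == filename:
            return parts[0].lower()
    return None


def verify(path: Path, sums_text: str) -> bool:
    """True only if the file is listed and its digest matches."""
    expected = expected_digest(sums_text, path.name)
    return expected is not None and sha256_of(path) == expected
```

For example, `verify(Path("shotcut-src-241029.txz"), Path("sha256sums.txt").read_text())` returns `True` only when the archive is listed and its digest matches.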

[Source code
archive](https://github.com/mltframework/shotcut/releases/download/v24.09.13/shotcut-src-240913.txz)
archive](https://github.com/mltframework/shotcut/releases/download/v24.10.29/shotcut-src-241029.txz)
/ [GitHub repository](https://github.com/mltframework/shotcut)

[Older versions](https://github.com/mltframework/shotcut/releases/) are
1 change: 1 addition & 0 deletions features.html
@@ -82,6 +82,7 @@ <h3>Audio Features</h3>
<li>Stereo, mono, 5.1 surround, quad surround, and Ambisonic spatial audio configurations</li>
<li>Pitch compensation for video speed changes</li>
<li>Record directly to timeline for voiceover, for example</li>
<li>Convert spoken word to subtitle text</li>
</ul>
</div>
<div class="col-md-6">
3 changes: 3 additions & 0 deletions notes/configuration/index.md
@@ -50,6 +50,7 @@ Windows registry, a bool is stored as a string: true or false.
| savePath | string | the file system path for the file-save dialog
| recent | string list | list of recent media and XML files with full path: comma-separated in Linux or Windows INI, multi-string in Windows registry, and array of strings in macOS plist (View > Recent)
| | | This is no longer saved here as of version 23.05.07 and moved to a separate `recent.ini` text file.
| projects | string list | list of XML project files with full path: comma-separated in the `recent.ini` text file
| clearRecent | bool | Settings > Clear Recent on Exit
| theme | string | UI theme, one of: dark, light, or system (Settings > Theme)
| titleBars | bool | whether to show the title bar for UI panels (View > Show Title Bars)
@@ -181,6 +182,8 @@ Windows registry, a bool is stored as a string: true or false.
| columns/duration | bool | whether to show the Duration column (default true)
| trackTimeline | bool | Subtitles > menu > Track Timeline Cursor (default true)
| showPrevNext | bool | Subtitles > menu > Show Previous/Next (default true)
| whisperExe | string | override full path to whisper.cpp main example executable
| whisperModel | string | override full path to the whisper model in ggml format
| **notes**
| zoom | real number | Notes > context menu > Decrease/Increase Text Size
{:.withborders}
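
As an illustration of the two new keys, the sketch below reads `whisperExe` and `whisperModel` from INI-style text. It makes several assumptions that are not confirmed by the table: that the keys live in a group named `subtitles`, that the file is in INI form (as on Linux), and the sample paths are invented for the example.

```python
import configparser
from typing import Dict

# Hypothetical excerpt of an INI-style configuration file.
SAMPLE = """\
[subtitles]
whisperExe=/opt/whisper.cpp/main
whisperModel=/home/me/models/ggml-medium.bin
"""


def whisper_overrides(ini_text: str, group: str = "subtitles") -> Dict[str, str]:
    """Return whichever of the two override keys are present in the group."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep camelCase key names instead of lowercasing
    parser.read_string(ini_text)
    if not parser.has_section(group):
        return {}
    section = parser[group]
    return {key: section[key] for key in ("whisperExe", "whisperModel") if key in section}


if __name__ == "__main__":
    print(whisper_overrides(SAMPLE))
```

An empty result means neither override is set in the text that was read.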
2 changes: 1 addition & 1 deletion notes/windowsdev/index.md
@@ -11,7 +11,7 @@ category: notes
If your Qt installer no longer includes this you can [get it from our S3 bucket](https://s3.amazonaws.com/misc.meltymedia/shotcut-build/qt-6.7.1-x64-mingw.txz), and
extract this alongside your other Qt versions, for example `C:\Qt`.
(You can get `tar` and `xz` needed to extract this from `msys2`.)
- [Shotcut SDK (930 MB current version 24.09.13)](https://s3.amazonaws.com/builds.us.meltytech/shotcut/shotcut-win64-sdk-240913.txz)
- [Shotcut SDK (1.0 GB current version 24.10.29)](https://s3.amazonaws.com/builds.us.meltytech/shotcut/shotcut-win64-sdk-241029.txz)
Extract it to `C:\Projects`

1. Extract the Shotcut SDK .zip file to a new folder in `C:\` named "Projects" (`C:\Projects`).
36 changes: 36 additions & 0 deletions releasenotes.md
@@ -17,6 +17,42 @@ These are brief notes about known problems and feature additions. See
log](https://github.com/mltframework/shotcut/commits/master) for more
information.

##### Release 24.10.29

- Removed the **Export > Video > Resample** button. Now, there are simply ignorable inline warnings when making certain changes.
- Added **Subtitles > menu > Speech to Text**:
  - This uses AI, but we did not make it. It uses a C++ port of [OpenAI Whisper](https://openai.com/index/whisper/) called [whisper.cpp](https://github.com/ggerganov/whisper.cpp).
  - Expect occasional errors. Like humans in non-ideal conditions, it is not perfect. We will not take action on bug reports about some piece of audio not converting to the expected text.
  - Our builds include a basic model that has decent speed and accuracy but not a big size. (You can think of the model as the brain.)
  - You can download a bigger and better brain (model) in `ggml` format and configure it in the **Speech to Text** dialog, but it will be slower.
  - The dialog creates two jobs that appear in the **Jobs** panel: one to export audio and another to convert to text.
  - The results are added to the **Subtitles** panel as a new top-level Subtitle Track.
  - Currently, the only GPU our build supports is Apple Silicon. Otherwise, it is heavily multi-threaded on the CPU.
  - Known quirk: subtitle items sometimes start earlier than expected.
- **Transition** improvements:
  - **Ripple Delete** a transition restores the entirety of the clips included in the transition.
  - **Lift** (non-ripple delete) a transition no longer leaves a gap; the gap is filled with the adjacent clips.
  - Moving an adjacent clip can now adjust the transition duration.
  - Known bug: Using **Undo** after any of the above disconnects the transition from its adjacent clips and returns future operations around the transition to the old behavior.
  - The only way to get the old behavior is to first drag the transition to an empty area of another track. Ripple mode should be on if you want ripple delete. Or, simply accept the new behavior and adjust things as needed.
- Added **File > Show Project in Folder** to menu.
- Added a `decimals <number>` option to numeric keywords in the **GPS Text** video filter.
- Changed **Recent Projects** to **Projects**: items in this view no longer disappear as **Recent** reaches its maximum length and old items are removed.
- Added a **Remove** action to the context menu in **Projects**.
- Fixed crash opening a project containing a subtitle track with no items.
- Fixed a crash when doing more than one **Playlist > menu > Add Selected to Slideshow**. In theory, this could fix other random crashes in **Timeline**.
- Fixed odd value for computed width in **Reframe** output video filter causes export to fail.
- Fixed **Reframe** visual control can create odd-valued dimensions.
- Fixed making a proxy video for an iPhone 16 Pro video containing spatial audio.
- Fixed AVCHD video frame rate is double (could fix other formats).
- Fixed GPU filters paste below non-GPU filters.
- Fixed **Slideshow Generator** dialog is too tall with vertical video mode.
- Fixed **GPS Offset** would reset in **GPS Text** video filter.
- Fixed the maximum allowed **Time** in the **Time Remap** filter to prevent white frames.
- Hide the **Reframe** video filter and button if **GPU Effects** is on.
- Upgraded FFmpeg to version 7.1.


##### Release 24.09.13

- Fixed seeking and frozen video with some files or scenarios.
6 changes: 3 additions & 3 deletions version.json
@@ -1,5 +1,5 @@
{
"version_number": 240913,
"version_string": "24.09.13",
"url": "https://shotcut.org/blog/new-release-240913/"
"version_number": 241029,
"version_string": "24.10.29",
"url": "https://shotcut.org/blog/new-release-241029/"
}
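
In this diff the three fields move together: `version_number` is `version_string` with the dots removed, and `url` ends in `new-release-` followed by the same digits. The check below is a small sketch of that observed pattern; it is inferred from this change only and is not a documented rule of the site, and the function name is hypothetical.

```python
import json


def manifest_is_consistent(text: str) -> bool:
    """Check that version_number and url agree with version_string."""
    data = json.loads(text)
    digits = data["version_string"].replace(".", "")
    return (
        data["version_number"] == int(digits)
        and data["url"].rstrip("/").endswith("new-release-" + digits)
    )


if __name__ == "__main__":
    manifest = """{
      "version_number": 241029,
      "version_string": "24.10.29",
      "url": "https://shotcut.org/blog/new-release-241029/"
    }"""
    print(manifest_is_consistent(manifest))
```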
