videox

Download HTML5 videos from web pages that use Media Source Extensions (MSE).

Note:

videox is designed for pages that use the Media Source Extensions (MSE) technique. For pages that use other techniques, for example embedding an HTTP URL directly in the video tag, videox will throw an error.
Some pages serve video ads using the same technique as the actual video content (MSE). videox cannot distinguish between them, so by default it downloads both the ads and the actual video. The easiest way to deal with this is to use a browser with an ad-blocking extension. Alternatively, you can modify this program as needed, since it is just a web crawler based on puppeteer.

Prerequisites

Chrome. Needed if the website serves the video you want as MP4, which is usually the case. Otherwise the Chromium that puppeteer downloads automatically is enough.

Design

https://www.tiaoxingyubolang.com/zh/article/2020-10-09_mediasource

Usage

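const path = require('path') // needed for downloadPath below
const Videox = require('videox')

const targetUrl = 'https://www.youtube.com/watch?v=h32FxBqmu_U'

// The leading semicolon keeps the IIFE from being parsed as a call on the string above.
;(async () => {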
  const videox = new Videox({
    debug: true,
    headless: true,
    downloadBrowser: false,
    logTo: process.stdout,
    browserExecutePath: '/usr/bin/chromium',
    browserArgs: ['--no-sandbox'],
    downloadAsFile: true,
    downloadPath: path.join(__dirname, 'download'),
    checkCompleteLoopInterval: 100,
    waitForNextDataTimeout: 8000,
  })

  await videox.init()

  await videox.get(targetUrl)

  await videox.destroy()
})()

API

Class: Videox

Event: 'data'

objectURL <string> The URL created by URL.createObjectURL; it usually starts with blob:.
mimeCodec <string> The corresponding MIME type and codec string.
chunk <Buffer> The data received from the page.

If options.downloadAsFile is set to false, you must listen for this event to receive the media data.

objectURL and mimeCodec together identify the media file to which chunk belongs.
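A 'data' listener that keeps one output file per (objectURL, mimeCodec) pair could look roughly like the sketch below. It assumes the listener receives objectURL, mimeCodec and chunk in that order, and that the emitter behaves like a Node.js EventEmitter. The helper names collectToFiles and extensionFor and the stream-N file naming are hypothetical and not part of videox.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: take the MIME subtype as a file extension,
// e.g. 'video/webm; codecs="vp9"' gives 'webm'.
function extensionFor(mimeCodec) {
  return mimeCodec.split(';')[0].trim().split('/')[1] || 'bin'
}

// Hypothetical helper: append every chunk to a file chosen by the
// (objectURL, mimeCodec) pair, so each media stream ends up in its own file.
// Assumes listener arguments arrive as (objectURL, mimeCodec, chunk).
function collectToFiles(emitter, outDir) {
  const files = new Map() // key -> output file path
  fs.mkdirSync(outDir, { recursive: true })

  emitter.on('data', (objectURL, mimeCodec, chunk) => {
    const key = `${objectURL}|${mimeCodec}`
    if (!files.has(key)) {
      const name = `stream-${files.size}.${extensionFor(mimeCodec)}`
      files.set(key, path.join(outDir, name))
    }
    fs.appendFileSync(files.get(key), chunk)
  })

  return files
}
```

With an instance created with downloadAsFile: false, this would presumably be attached before calling videox.get, for example collectToFiles(videox, path.join(__dirname, 'download')).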

new Videox([options])

options <object>
- debug <bool> Default: false.
- headless <bool> Default: true.
- downloadBrowser <bool> Default: false.
- logTo <Writable> Default: process.stdout.
- browserExecutePath <string> Default: '/usr/bin/chromium'.
- browserArgs <array> Default: [].
- downloadAsFile <bool> Default: true.
- downloadPath <string> Default: ''.
- checkCompleteLoopInterval <number> The interval between checks for whether the current download is complete, in milliseconds. Default: 100.
- waitForNextDataTimeout <number> The timeout when waiting for the next media data, in milliseconds. Default: 3000.
Returns: <Videox>

Usually downloadBrowser is false and browserExecutePath is set to the path of an installed browser, so that MP4 can be downloaded with a browser other than the default Chromium. See the puppeteer package for more information.
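The two timing options, checkCompleteLoopInterval and waitForNextDataTimeout, suggest a polling loop that treats a download as finished once no new media data has arrived for the timeout period. The sketch below illustrates that idea only. It is not taken from the videox source, and the names isIdle and waitUntilIdle are hypothetical.

```javascript
// Hypothetical predicate: true once at least `timeout` ms have passed
// since the last chunk was seen.
function isIdle(lastDataAt, now, timeout) {
  return now - lastDataAt >= timeout
}

// Hypothetical polling loop: check every `checkCompleteLoopInterval` ms and
// resolve when no 'data' event has fired for `waitForNextDataTimeout` ms.
// Assumes `emitter` behaves like a Node.js EventEmitter.
function waitUntilIdle(emitter, options = {}) {
  const {
    checkCompleteLoopInterval = 100,
    waitForNextDataTimeout = 3000,
  } = options

  return new Promise((resolve) => {
    let lastDataAt = Date.now()
    const onData = () => {
      lastDataAt = Date.now() // every chunk restarts the idle window
    }
    emitter.on('data', onData)

    const timer = setInterval(() => {
      if (isIdle(lastDataAt, Date.now(), waitForNextDataTimeout)) {
        clearInterval(timer)
        emitter.removeListener('data', onData)
        resolve()
      }
    }, checkCompleteLoopInterval)
  })
}
```

Under this reading, a larger waitForNextDataTimeout (such as the 8000 used in the Usage example) tolerates slower or burstier streams at the cost of a longer wait after the last chunk.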

videox.init()

Returns: <Promise>

videox.get(pageUrl)

pageUrl <string> Required.
Returns: <Promise>

videox.destroy()

Returns: <Promise>
