AO3.js

Scrapes data from ao3.org. Now with Types™.

What it is

AO3.js is a Node.js (et al.) API for scraping AO3 (Archive of Our Own) data straight to your own JavaScript (or TypeScript) server. It provides an interface to retrieve information on AO3 tags, works, series, and more!

What it is capable of

| Method | Description | Parameters | Return Type |
| --- | --- | --- | --- |
| `getTag` | Retrieves details for a specific AO3 tag. | `{ tagName: string }` - Name of the tag. | `Promise<Tag>` |
| `getTagNameById` | Gets a tag's name based on its ID. | `{ tagId: string }` - Tag ID to look up. | `Promise<string>` |
| `getWork` | Fetches metadata for an AO3 work. | `{ workId: string, chapterId?: string }` - The work ID, with optional chapter ID. | `Promise<WorkSummary>` \| `Promise<LockedWorkSummary>` |
| `getWorkWithChapters` | Fetches a work and its chapter list. | `{ workId: string }` - The ID of the work. | `Promise<{ title: string; authors: Author[] \| Anonymous; workId: string; chapters: Chapter[] }>` |
| `getSeries` | Retrieves details for a specific series. | `{ seriesId: string }` - The ID of the series. | `Promise<Series>` |
| `getUser` | Fetches profile information for a user. | `{ username: string }` - Username of the user to fetch. | `Promise<User>` |
| `setFetcher` | Sets a custom fetch function for requests. | `{ fetcher: typeof fetch }` - Custom fetch function. | `void` |

Why Override Fetch?

Using setFetcher, you can override the default fetch method used by the library. This can be useful if:

- You need to provide custom headers for authentication or API key access.
- You want to use a different network library for enhanced functionality (e.g., retrying failed requests).
- You are working in a Node environment without native fetch support and need a polyfill.
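
As a rough illustration of the first case, a wrapper like the one below adds default headers to every request before delegating to another fetch function. `withHeaders` is a hypothetical helper made up for this sketch and is not part of AO3.js:

```typescript
// Hypothetical helper, not part of AO3.js: wraps a fetch function so that
// every request carries some default headers.
// Assumes the URL is passed as a string or URL object; headers already set on
// a Request object passed as `input` would be replaced by the ones built here.
function withHeaders(
  baseFetch: typeof fetch,
  defaults: Record<string, string>
): typeof fetch {
  return (input, init) => {
    // Start from whatever headers the caller already provided in `init`.
    const headers = new Headers(init?.headers);
    for (const [name, value] of Object.entries(defaults)) {
      // Only fill in headers the caller did not set explicitly.
      if (!headers.has(name)) headers.set(name, value);
    }
    return baseFetch(input, { ...init, headers });
  };
}
```

The result could then be handed to the library with something like `setFetcher(withHeaders(fetch, { "User-Agent": "my-app/1.0" }))`, where the header value is only a placeholder.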

Data Types

- `Tag`: Details about a tag, including id, name, category, and metadata.
- `WorkSummary` / `LockedWorkSummary`: Summarizes a work, including title, authors, tags, and statistics.
- `Series`: Information on a series, such as title, authors, works, and publication details.
- `User`: Profile information for an AO3 user, including pseudonyms, works, bookmarks, and more.
- `Chapter`: Details about individual chapters within a work.

Sample usage

With yarn

yarn add @bobaboard/ao3.js

or npm

npm install @bobaboard/ao3.js

Then go to town in your JavaScript (or TypeScript) files:

import { getTag, getWork } from "@bobaboard/ao3.js";

const tag = await getTag({
  tagName: "Ever Given Container Ship (Anthropomorphic)",
});
const work = await getWork({ workId: "123456" });

Further explanation of how AO3.js works, along with suggestions for how to add to it, can be found in this comment. Also consider taking a look at the TypeScript types.

Important Notes

Parameters Are Objects!

Most methods in the public interface expect parameters to be passed as objects rather than individual arguments. This allows flexibility in expanding parameters without breaking the interface.
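
To see why this helps, consider a sketch in which an optional field is added to a function after it has shipped. `WorkParams` and `workPath` are hypothetical names for illustration and do not reflect how AO3.js is implemented; the path layout is an assumption based on AO3's public work URLs:

```typescript
// Hypothetical example, not AO3.js code. `chapterId` can be added to the
// parameter object later without breaking callers that only pass `workId`.
type WorkParams = { workId: string; chapterId?: string };

function workPath({ workId, chapterId }: WorkParams): string {
  return chapterId
    ? `/works/${workId}/chapters/${chapterId}`
    : `/works/${workId}`;
}

// Older call sites keep working unchanged...
const wholeWork = workPath({ workId: "123456" });
// ...while newer ones opt in to the extra field by name.
const oneChapter = workPath({ workId: "123456", chapterId: "789" });
```

With positional arguments, every new option would have to go at the end of the argument list, and callers would need to remember the order.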

If you run into CORS errors

This library is meant to run on a server as part of a Node.js application. If you try to run it as part of a browser application, you'll run into a Cross-Origin Resource Sharing (CORS) error. In short, and for your protection, browsers block data requests from one website to another unless the destination website explicitly allows them. AO3 doesn't.

If you want to run a browser application written with this library, users will need a browser extension to allow CORS requests, like this one for Chrome.

Difference between this and the Python library

JavaScript vs Python aside, this is a newer library that is being actively developed, and is not feature complete. If you'd like for us to prioritize a feature, please open an issue.

How It Works

AO3.js uses:

- the fetch API to download the HTML that makes up an AO3 page
- cheerio to parse that HTML into a DOM-like tree the library can query

For an introduction to this kind of scraping, see here.

For the rest of the owl, you can reach us through our Issues tab, or at Fandom Coders.

`ReferenceError`: `fetch` is not defined

This error means your runtime (e.g. Node.js) does not include a fetch implementation. The easiest fix is to switch to a runtime that does. For Node.js, that is version 18 or later.

If you need to use an older version of Node.js, you can polyfill fetch with the node-fetch library.

In your terminal run:

npm install node-fetch@2

Then pass the fetch function exported by node-fetch to setFetcher, so that the library uses it in place of the missing global. See the next section for more details.

import { setFetcher } from "@bobaboard/ao3.js";
import fetch from "node-fetch";

// You MUST call this before calling other ao3.js methods
setFetcher(fetch);
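
If the same code has to run on both older and newer runtimes, one option is to check whether fetch is available before installing the polyfill. A minimal sketch, in which `hasGlobalFetch` and `nodeSupportsFetch` are hypothetical helpers and not part of AO3.js:

```typescript
// Hypothetical helpers, not part of AO3.js.

// True if a fetch function is available as a global in this runtime.
// `typeof` on an undefined global is safe and does not throw.
function hasGlobalFetch(): boolean {
  return typeof fetch === "function";
}

// True if a Node.js version string such as "18.12.1" is 18 or later,
// i.e. a version that ships fetch by default.
function nodeSupportsFetch(version: string): boolean {
  const major = Number.parseInt(version.split(".")[0] ?? "", 10);
  return Number.isFinite(major) && major >= 18;
}
```

The polyfill could then be installed only when `hasGlobalFetch()` returns `false`. In Node.js, `process.versions.node` holds a version string in the form that `nodeSupportsFetch` expects.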

Overriding fetch

If you need more complex fetch logic (for example, to handle rate limiting), you can replace the fetch method with your own implementation by using the exported setFetcher function.

For example, to override fetch with the node-fetch implementation:

import { setFetcher } from "@bobaboard/ao3.js";
import fetch from "node-fetch";

// You MUST call this before calling other ao3.js methods
setFetcher(fetch);

Handling caching + rate limiting

This library doesn't handle caching requests by default. This means that if you call the same method twice, the underlying requests to AO3 will also be made twice.

Similarly, this library doesn't manage rate limiting for you (not yet, at least). If you make too many requests to AO3 too quickly, you'll get errors once AO3 starts asking you to pause requests.

If you want to avoid these issues, you can use the following code to add caching and automatic retrying to the library:

import { setFetcher } from "@bobaboard/ao3.js";

const CACHE = new Map();
setFetcher(async (...params: Parameters<typeof fetch>) => {
  try {
    if (CACHE.has(params[0])) {
      console.log(`Using cached response for request to ${params[0]}`);
      return CACHE.get(params[0]).clone();
    }
    console.log(`Making a new request to ${params[0]}`);
    let response = await fetch(...params);
    console.log(`Request status: ${response.status}`);
    while (response.status === 429) {
      const waitSeconds = response.headers.get("retry-after");
      console.log(
        `Asked to wait ${waitSeconds} seconds request to ${params[0]}`
      );
      if (!waitSeconds) {
        throw new Error(
          "A wait request was made without indication of length."
        );
      }
      console.log(`Waiting ${waitSeconds} seconds`);
      await new Promise((res) => {
        setTimeout(() => res(null), parseInt(waitSeconds) * 1000);
      });
      console.log(`Continuing with request to ${params[0]}`);
      response = await fetch(...params);
    }
    if (response.status === 200) {
      // Remove request from the cache after 5 minutes
      setTimeout(() => {
        console.log(`Clearing cache entry for request ${params[0]}`);
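        // Delete the entry rather than setting it to null: a null entry would
        // still satisfy CACHE.has() and make .clone() throw on the next lookup.
        CACHE.delete(params[0]);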
      }, 1000 * 60 * 5);
      console.log(`Setting cache entry for request ${params[0]}`);
      CACHE.set(params[0], response.clone());
    }
    return response;
  } catch (e) {
    console.error(e);
    throw e;
  }
});

The logging will help you understand what's going on, but it's by no means necessary.
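
The code above reacts once AO3 has already answered with a 429. A complementary approach is to space requests out so that the limit is reached less often. A minimal sketch, assuming a fixed pause between requests is acceptable; `withMinDelay` is a hypothetical helper, not part of AO3.js, and the right delay for AO3 is not specified here:

```typescript
// Hypothetical helper, not part of AO3.js: runs requests one at a time and
// waits `minDelayMs` after each one settles before starting the next.
function withMinDelay(
  baseFetch: typeof fetch,
  minDelayMs: number
): typeof fetch {
  // Resolves when the next request is allowed to start.
  let gate: Promise<unknown> = Promise.resolve();
  return (input, init) => {
    const result = gate.then(() => baseFetch(input, init));
    // Whether this request succeeds or fails, open the gate again only
    // after the delay has passed.
    gate = result
      .catch(() => undefined)
      .then(() => new Promise((resolve) => setTimeout(resolve, minDelayMs)));
    return result;
  };
}
```

Because it takes and returns a fetch function, it could be passed to the library directly, for instance `setFetcher(withMinDelay(fetch, 1000))`, or wrapped around a caching fetcher.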

How do I help?

See [CONTRIBUTE.md](CONTRIBUTE.md).
