AO3.js is a Node.js (et al.) API for scraping AO3 (Archive of Our Own) data straight to your own JavaScript (or TypeScript) server. It provides an interface to retrieve information on AO3 tags, works, series, and more!
Method | Description | Parameters | Return Type |
---|---|---|---|
`getTag` | Retrieves details for a specific AO3 tag. | `{ tagName: string }` - Name of the tag. | `Promise<Tag>` |
`getTagNameById` | Gets a tag's name from its ID. | `{ tagId: string }` - Tag ID to look up. | `Promise<string>` |
`getWork` | Fetches metadata for an AO3 work. | `{ workId: string, chapterId?: string }` - The work ID, with optional chapter ID. | `Promise<WorkSummary>` \| `Promise<LockedWorkSummary>` |
`getWorkWithChapters` | Fetches a work and its chapter list. | `{ workId: string }` - The ID of the work. | `Promise<{ title: string; authors: Author[] \| Anonymous; workId: string; chapters: Chapter[] }>` |
`getSeries` | Retrieves details for a specific series. | `{ seriesId: string }` - The ID of the series. | `Promise<Series>` |
`getUser` | Fetches profile information for a user. | `{ username: string }` - Username of the user to fetch. | `Promise<User>` |
`setFetcher` | Sets a custom fetch function for requests. | `{ fetcher: typeof fetch }` - Custom fetch function. | `void` |
Using `setFetcher`, you can override the default `fetch` method used by the library. This can be useful if:
- You need to provide custom headers for authentication or API key access.
- You want to use a different network library for enhanced functionality (e.g., retrying failed requests).
- You are working in a Node environment without native fetch support and need a polyfill.
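As a rough illustration of the first case, the sketch below wraps a fetch-compatible function so that every request carries extra headers. `withHeaders` is a hypothetical helper written for this example, not part of AO3.js, and the header value shown is a placeholder.

```typescript
// Hypothetical helper (not part of ao3.js): wraps a fetch-compatible function
// so that every request it makes carries the given extra headers.
function withHeaders(
  baseFetch: typeof fetch,
  extraHeaders: Record<string, string>
): typeof fetch {
  return (input, init) => {
    // Start from the caller's headers so per-request values are preserved.
    const headers = new Headers(init?.headers);
    for (const [name, value] of Object.entries(extraHeaders)) {
      headers.set(name, value);
    }
    return baseFetch(input, { ...init, headers });
  };
}

// Possible usage with ao3.js (commented out so the snippet stands alone):
// import { setFetcher } from "@bobaboard/ao3.js";
// setFetcher(withHeaders(fetch, { "User-Agent": "my-fic-tool/1.0" }));
```

Because the wrapper only decorates the request and delegates to whatever `baseFetch` it was given, it should compose with other wrappers (caching, retrying) in any order.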
- `Tag`: Details about a tag, including `id`, `name`, `category`, and metadata.
- `WorkSummary` / `LockedWorkSummary`: Summarizes a work, including title, authors, tags, and statistics.
- `Series`: Information on a series, such as title, authors, works, and publication details.
- `User`: Profile information for an AO3 user, including pseudonyms, works, bookmarks, and more.
- `Chapter`: Details about individual chapters within a work.
With yarn
yarn add @bobaboard/ao3.js
or npm
npm install @bobaboard/ao3.js
Then go to town in your JavaScript (or TypeScript) files:
import { getTag, getWork } from "@bobaboard/ao3.js";
const tag = await getTag({
  tagName: "Ever Given Container Ship (Anthropomorphic)",
});
const work = await getWork({ workId: "123456" });
Further explanation of how AO3.js works, and suggestions for how to add to it, can be found in this comment. Also consider taking a look at the TypeScript types.
Most methods in the public interface expect parameters to be passed as objects rather than individual arguments. This allows flexibility in expanding parameters without breaking the interface.
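To make the benefit concrete, here is a small sketch of the pattern. `GetWorkParams` and `describeWorkPath` are hypothetical names invented for this example and do not reflect how AO3.js is implemented internally; only the parameter names `workId` and `chapterId` come from the table above.

```typescript
// Hypothetical parameter type, mirroring the shape documented for getWork.
type GetWorkParams = { workId: string; chapterId?: string };

// Hypothetical function: builds a path from an object parameter.
// Adding the optional `chapterId` field later does not break callers
// that only pass `workId`.
function describeWorkPath({ workId, chapterId }: GetWorkParams): string {
  return chapterId
    ? `/works/${workId}/chapters/${chapterId}`
    : `/works/${workId}`;
}

// Older call sites keep working unchanged...
const workPath = describeWorkPath({ workId: "123456" });
// ...while newer ones can opt in to the extra field by name.
const chapterPath = describeWorkPath({ workId: "123456", chapterId: "789" });
```

With positional arguments, every new optional value would need a fixed position in the argument list; with an object, callers name only the fields they care about.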
This library is meant to be used as part of a Node.js application and run on a server. If you try to run it as part of a browser application, you'll run into an error about Cross-Origin Resource Sharing (CORS). In short, for your protection, browsers block data requests from one website to another unless the destination website specifically allows such requests. AO3 doesn't.
If you want to run a browser application written with this library, users will need a browser extension to allow CORS requests, like this one for Chrome.
JavaScript vs Python aside, this is a newer library that is being actively developed, and is not feature complete. If you'd like for us to prioritize a feature, please open an issue.
AO3.js uses:
- the fetch API to fetch the HTML making up an AO3 page
- cheerio to parse that HTML into a DOM tree we can query for the data we need.
For an introduction to this kind of scraping, see here.
For the rest of the owl, you can reach us through our Issues tab, or at Fandom Coders.
This error means your runtime (e.g. Node.js) does not include a `fetch` implementation. The easiest fix is to switch to a runtime that supports it. For Node.js, this is version 18 or later.
If you need to use an older version of Node.js, you can polyfill it with the `node-fetch` library.
In your terminal run:
npm install node-fetch@2
You can then use the `setFetcher` method to make the library use the `fetch` provided by `node-fetch`. See the next section for more details.
import { setFetcher } from "@bobaboard/ao3.js";
import fetch from "node-fetch";
// You MUST call this before calling other ao3.js methods
setFetcher(fetch);
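If the same code has to run on both older and newer runtimes, one option is to install the polyfill only when no native `fetch` is present. The check below is a minimal sketch; `hasNativeFetch` is a hypothetical helper, not something AO3.js exports, and its `scope` parameter exists only to make it easy to test.

```typescript
// Hypothetical helper (not part of ao3.js): reports whether the given scope
// (the global object by default) exposes a callable `fetch`.
function hasNativeFetch(
  scope: { fetch?: unknown } = globalThis as { fetch?: unknown }
): boolean {
  return typeof scope.fetch === "function";
}

// Possible usage (commented out so the snippet stands alone):
// import { setFetcher } from "@bobaboard/ao3.js";
// import nodeFetch from "node-fetch";
// if (!hasNativeFetch()) {
//   setFetcher(nodeFetch as unknown as typeof fetch);
// }
```

The cast in the usage comment is an assumption: the types shipped for `node-fetch` may not line up exactly with the built-in `fetch` type, so a cast may or may not be needed in your setup.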
If you wish to provide more complex logic for `fetch` (for example, to handle rate limiting), you can override the fetch method with your own implementation by using the exported `setFetcher` function.
For example, to override `fetch` with the `node-fetch` implementation:
import { setFetcher } from "@bobaboard/ao3.js";
import fetch from "node-fetch";
// You MUST call this before calling other ao3.js methods
setFetcher(fetch);
This library doesn't handle caching requests by default. This means that if you call the same method twice, the underlying requests to AO3 will also be made twice.
Similarly, this library doesn't handle rate limiting for you (not yet, at least). This means that if you make too many requests to AO3 too quickly, you'll get errors once AO3 starts asking you to pause requests.
If you want to avoid these issues, you can use the following code to add caching and automatic retrying to the library:
import { setFetcher } from "@bobaboard/ao3.js";
// Cache of responses, keyed by the first fetch argument (the request URL)
const CACHE = new Map();
setFetcher(async (...params: Parameters<typeof fetch>) => {
  try {
    if (CACHE.has(params[0])) {
      console.log(`Using cached response for request to ${params[0]}`);
      // Return a clone so the cached body can be read again later
      return CACHE.get(params[0]).clone();
    }
    console.log(`Making a new request to ${params[0]}`);
    let response = await fetch(...params);
    console.log(`Request status: ${response.status}`);
    // 429 means AO3 is asking us to slow down: wait, then retry
    while (response.status === 429) {
      const waitSeconds = response.headers.get("retry-after");
      if (!waitSeconds) {
        throw new Error(
          "A wait request was made without indication of length."
        );
      }
      console.log(
        `Asked to wait ${waitSeconds} seconds before retrying request to ${params[0]}`
      );
      console.log(`Waiting ${waitSeconds} seconds`);
      await new Promise((res) => {
        setTimeout(() => res(null), parseInt(waitSeconds, 10) * 1000);
      });
      console.log(`Continuing with request to ${params[0]}`);
      response = await fetch(...params);
    }
    if (response.status === 200) {
      // Remove request from the cache after 5 minutes
      setTimeout(() => {
        console.log(`Clearing cache entry for request ${params[0]}`);
        // Delete the entry (rather than setting it to null) so the
        // CACHE.has() check above correctly reports a miss
        CACHE.delete(params[0]);
      }, 1000 * 60 * 5);
      console.log(`Setting cache entry for request ${params[0]}`);
      CACHE.set(params[0], response.clone());
    }
    return response;
  } catch (e) {
    console.error(e);
    throw e;
  }
});
The logging will help you understand what's going on, but it's by no means necessary.
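The code above assumes the `retry-after` header always holds a number of seconds. The HTTP specification also allows it to hold a date, so a more defensive version could parse both forms. The helper below is a sketch of that idea; `retryAfterToMs` is a hypothetical name, and whether AO3 ever sends the date form is not something this README confirms.

```typescript
// Hypothetical helper (not part of ao3.js): converts a Retry-After header
// value into a wait time in milliseconds. Returns null when the header is
// missing or cannot be understood.
function retryAfterToMs(
  value: string | null,
  now: Date = new Date()
): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  // Form 1: a non-negative integer number of seconds, e.g. "120".
  if (/^\d+$/.test(trimmed)) {
    return parseInt(trimmed, 10) * 1000;
  }
  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const when = Date.parse(trimmed);
  if (Number.isNaN(when)) return null;
  // A date in the past means there is no need to wait.
  return Math.max(0, when - now.getTime());
}
```

In the retry loop, the result could be passed straight to `setTimeout`, with a `null` result treated the same way as a missing header.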
See [CONTRIBUTE.md](CONTRIBUTE.md).