feat: Add "Speeding up migrations" section to "State persistence" #1365

Draft: wants to merge 2 commits into base: master
@@ -18,7 +18,7 @@ Long-running [Actor](../../index.mdx) jobs may need to migrate between servers.
To prevent data loss, long-running Actors should:

- Periodically save (persist) their state.
- Listen for [migration events](/sdk/js/api/apify/class/PlatformEventManager)
- Check for persisted state when starting, allowing them to resume from where they left off.
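
The three steps above can be sketched without the SDK, purely as an illustration. Everything here is an assumption made for the example: `InMemoryStore` is a hypothetical stand-in for the Actor's key-value store (in a real Actor the calls would be `Actor.get_value` and `Actor.set_value`), `run` and `save_every` are invented names, and `stop_after` simulates a migration arriving mid-run.

```python
import asyncio
import json

STATE_KEY = 'my-crawling-state'


class InMemoryStore:
    """Hypothetical stand-in for a key-value store; not part of the Apify SDK."""

    def __init__(self):
        self._data = {}

    async def get_value(self, key):
        raw = self._data.get(key)
        return json.loads(raw) if raw is not None else None

    async def set_value(self, key, value):
        # Serialize so that later in-memory mutations do not leak into the store.
        self._data[key] = json.dumps(value)


async def run(store, urls, stop_after=None, save_every=2):
    # Check for persisted state when starting; fall back to a fresh state.
    state = await store.get_value(STATE_KEY) or {'next_index': 0, 'done': []}
    processed_this_run = 0

    while state['next_index'] < len(urls):
        if stop_after is not None and processed_this_run == stop_after:
            # Simulated migration event: save the state, then stop.
            await store.set_value(STATE_KEY, state)
            return state

        state['done'].append(urls[state['next_index']])
        state['next_index'] += 1
        processed_this_run += 1

        # Periodically save (persist) the state.
        if state['next_index'] % save_every == 0:
            await store.set_value(STATE_KEY, state)

    await store.set_value(STATE_KEY, state)
    return state
```

Calling `run` a second time with the same store picks up at the saved `next_index` instead of starting over, which is the behavior the list describes for a restarted Actor.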

For short-running Actors, the risk of restarts and the cost of repeated runs are low, so you can typically ignore state persistence.
@@ -51,7 +51,7 @@ By default, an Actor keeps its state in the server's memory. During a server swi

The [Apify SDKs](/sdk) handle state persistence automatically.

This is done using the `Actor.on()` method and the `migrating` event.

- The `migrating` event is triggered just before a migration occurs, allowing you to save your state.
- To retrieve previously saved state, you can use the [`Actor.getValue`](/sdk/js/reference/class/Actor#getValue)/[`Actor.get_value`](/sdk/python/reference/class/Actor#get_value) methods.
@@ -81,15 +81,15 @@ await Actor.exit();
<TabItem value="Python" label="Python">

```python
from apify import Actor, Event

async def actor_migrate(_event_data):
    # Persist the state before the Actor is moved to a new server.
    await Actor.set_value('my-crawling-state', {'foo': 'bar'})

async def main():
    async with Actor:
        # ...
        Actor.on(Event.MIGRATING, actor_migrate)
        # ...
```

@@ -128,3 +128,50 @@ async def main():
</Tabs>

For improved Actor performance, consider [caching repeated page data](/academy/expert-scraping-with-apify/saving-useful-stats).

## Speeding up migrations

Once your Actor receives the `migrating` event, the Apify platform shuts it down and restarts it on a new server within one minute.
To speed this up, you can reboot the Actor manually from the `migrating` event handler as soon as you have persisted its state.
Use the `Actor.reboot()` method, available in the [Apify SDK for JavaScript](/sdk/js/reference/class/Actor#reboot) and the [Apify SDK for Python](/sdk/python/reference/class/Actor#reboot).

<Tabs groupId="main">
<TabItem value="JavaScript" label="JavaScript">

```js
import { Actor } from 'apify';

await Actor.init();
// ...
Actor.on('migrating', async () => {
    // ...
    // save state
    // ...
    await Actor.reboot();
});
// ...
await Actor.exit();
```

</TabItem>
<TabItem value="Python" label="Python">

```python
from apify import Actor, Event

async def actor_migrate(_event_data):
    # ...
    # save state
    # ...
    await Actor.reboot()

async def main():
    async with Actor:
        # ...
        Actor.on(Event.MIGRATING, actor_migrate)
        # ...
```

</TabItem>
</Tabs>