fix: avoid permanent redirects (upstream branch) (#1202)
Closes #1101. It's the same PR,
but from an upstream branch rather than from my fork. As discussed on Slack,
this works around the unauthorized npm token on CI.

I also rebased the original branch on top of current master, hopefully
without mistakes. Otherwise the changes should be the same.
honzajavorek authored Sep 10, 2024
2 parents 069d356 + d7894d6 commit b04820e
Showing 71 changed files with 134 additions and 133 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -4,7 +4,7 @@

## Intro

This repository is the home of Apify's documentation, which you can find at [docs.apify.com](https://docs.apify.com/). The documentation is written using [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet). Source files of the [platform documentation](https://docs.apify.com/platform) are located in the [/sources](https://github.com/apify/apify-docs/tree/master/sources) directory. However, other sections, such as SDKs for [JavaScript/Node.js](https://docs.apify.com/sdk/js/), [Python](https://docs.apify.com/sdk/python/), or [CLI](https://docs.apify.com/cli), have their own repositories. For more information, see the [Contributing guidelines](./CONTRIBUTING.md).
This repository is the home of Apify's documentation, which you can find at [docs.apify.com](https://docs.apify.com/). The documentation is written using [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet). Source files of the [platform documentation](https://docs.apify.com/platform) are located in the [/sources](https://github.com/apify/apify-docs/tree/master/sources) directory. However, other sections, such as SDKs for [JavaScript/Node.js](https://docs.apify.com/sdk/js/), [Python](https://docs.apify.com/sdk/python/), or [CLI](https://docs.apify.com/cli/), have their own repositories. For more information, see the [Contributing guidelines](./CONTRIBUTING.md).

## Before you start contributing

2 changes: 1 addition & 1 deletion sources/academy/glossary/concepts/dynamic_pages.md
@@ -11,7 +11,7 @@ slug: /concepts/dynamic-pages

---

Oftentimes, web pages load additional information dynamically, long after their main body is loaded in the browser. A subset of dynamic pages takes this approach further and loads all of its content dynamically. Such style of constructing websites is called Single-page applications (SPAs), and it's widespread thanks to some popular JavaScript libraries, such as [React](https://reactjs.org/) or [Vue](https://vuejs.org/).
Oftentimes, web pages load additional information dynamically, long after their main body is loaded in the browser. A subset of dynamic pages takes this approach further and loads all of its content dynamically. Such style of constructing websites is called Single-page applications (SPAs), and it's widespread thanks to some popular JavaScript libraries, such as [React](https://react.dev/) or [Vue](https://vuejs.org/).

As you progress in your scraping journey, you'll quickly realize that different websites load their content and populate their pages with data in different ways. Some pages are rendered entirely on the server, some retrieve the data dynamically, and some use a combination of both those methods.

2 changes: 1 addition & 1 deletion sources/academy/glossary/concepts/http_headers.md
@@ -47,4 +47,4 @@ HTTP/1.1 and HTTP/2 headers have several differences. Here are the three key dif
2. Certain headers are no longer used in HTTP/2 (such as **Connection** along with a few others related to it like **Keep-Alive**). In HTTP/2, connection-specific headers are prohibited. While some browsers will ignore them, Safari and other Webkit-based browsers will outright reject any response that contains them. Easy to do by accident, and a big problem.
3. While HTTP/1.1 headers are case-insensitive and could be sent by the browsers with capitalized letters (e.g. **Accept-Encoding**, **Cache-Control**, **User-Agent**), HTTP/2 headers must be lower-cased (e.g. **accept-encoding**, **cache-control**, **user-agent**).

> To learn more about the difference between HTTP/1.1 and HTTP/2 headers, check out [this](https://httptoolkit.tech/blog/translating-http-2-into-http-1/) article
> To learn more about the difference between HTTP/1.1 and HTTP/2 headers, check out [this](https://httptoolkit.com/blog/translating-http-2-into-http-1/) article
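
The second and third differences above can be sketched in Python. The helper name `to_http2_headers` is hypothetical, and the set of dropped headers is an assumption based on the connection-specific fields that the HTTP/2 specification prohibits; it is not taken from the linked article.

```python
# Connection-specific header names that HTTP/2 prohibits (illustrative set).
CONNECTION_SPECIFIC = {
    "connection",
    "keep-alive",
    "proxy-connection",
    "transfer-encoding",
    "upgrade",
}


def to_http2_headers(headers: dict[str, str]) -> dict[str, str]:
    """Lower-case header names and drop connection-specific ones."""
    result = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered in CONNECTION_SPECIFIC:
            # Sending these over HTTP/2 can cause the message to be rejected.
            continue
        result[lowered] = value
    return result


if __name__ == "__main__":
    http1_headers = {
        "Accept-Encoding": "gzip",
        "Connection": "keep-alive",
        "User-Agent": "example-scraper",
    }
    print(to_http2_headers(http1_headers))
```

Header values are left untouched; only the names are normalized, since that is the part the list above describes.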
4 changes: 2 additions & 2 deletions sources/academy/glossary/concepts/robot_process_automation.md
@@ -27,7 +27,7 @@ In a traditional automation workflow, you
2. Program a bot that does each of those chunks.
3. Execute the chunks of code in the right order (or in parallel).

With the advance of [machine learning](https://en.wikipedia.org/wiki/Machine_learning), it is becoming possible to [record](https://www.nice.com/rpa/rpa-guide/process-recorder-function-in-rpa/) your workflows and analyze which can be automated. However, this technology is still not perfected and at times can even be less practical than the manual process.
With the advance of [machine learning](https://en.wikipedia.org/wiki/Machine_learning), it is becoming possible to [record](https://www.nice.com/info/rpa-guide/process-recorder-function-in-rpa/) your workflows and analyze which can be automated. However, this technology is still not perfected and at times can even be less practical than the manual process.

## Is RPA the same as web scraping? {#is-rpa-the-same-as-web-scraping}

@@ -39,6 +39,6 @@ An easy-to-follow [video](https://www.youtube.com/watch?v=9URSbTOE4YI) on what R

To learn about RPA in plain English, check out [this](https://enterprisersproject.com/article/2019/5/rpa-robotic-process-automation-how-explain) article.

[This](https://www.cio.com/article/3236451/what-is-rpa-robotic-process-automation-explained.html) article explains what RPA is and discusses both its advantages and disadvantages.
[This](https://www.cio.com/article/227908/what-is-rpa-robotic-process-automation-explained.html) article explains what RPA is and discusses both its advantages and disadvantages.

You might also like to check out this article on [12 Steps to Automate Workflows](https://quandarycg.com/automating-workflows/).
2 changes: 1 addition & 1 deletion sources/academy/glossary/tools/postman.md
@@ -13,7 +13,7 @@ slug: /tools/postman

[Postman](https://www.postman.com/) is a powerful collaboration platform for API development and testing. For scraping use-cases, it's mainly used to test requests and proxies (such as checking the response body of a raw request, without loading any additional resources such as JavaScript or CSS). This tool can do much more than that, but we will not be discussing all of its capabilities here. Postman allows us to test requests with cookies, headers, and payloads so that we can be entirely sure what the response looks like for a request URL we plan to eventually use in a scraper.

The desktop app can be downloaded from its [official download page](https://www.postman.com/downloads/), or the web app can be used with a signup - no download required. If this is your first time working with a tool like Postman, we recommend checking out their [Getting Started guide](https://learning.postman.com/docs/getting-started/introduction/).
The desktop app can be downloaded from its [official download page](https://www.postman.com/downloads/), or the web app can be used with a signup - no download required. If this is your first time working with a tool like Postman, we recommend checking out their [Getting Started guide](https://learning.postman.com/docs/introduction/overview/).

## Understanding the interface {#understanding-the-interface}

@@ -22,7 +22,7 @@ The **Dockerfile** is a file which gives the Apify platform (or Docker, more spe

If your project doesn’t already contain a Dockerfile, don’t worry! Apify offers [many base images](/sdk/js/docs/guides/docker-images) that are optimized for building and running Actors on the platform, which can be found [here](https://hub.docker.com/u/apify). When using a language for which Apify doesn't provide a base image, [Docker Hub](https://hub.docker.com/) provides a ton of free Docker images for most use-cases, upon which you can create your own images.

> Tip: You can see all of Apify's Docker images [on DockerHub](https://hub.docker.com/r/apify/).
> Tip: You can see all of Apify's Docker images [on DockerHub](https://hub.docker.com/u/apify).
At the base level, each Docker image contains a base operating system and usually also a programming language runtime (such as Node.js or Python). You can also find images with preinstalled libraries or install them yourself during the build step.

@@ -36,7 +36,7 @@ In one of the later lessons, we'll be learning how to integrate our Actor on the

### Docker {#docker}

Docker is a massive topic on its own, but don't be worried! We only expect you to know and understand the very basics of it, which can be learned about in [this short article](https://docs.docker.com/get-started/overview/) (10 minute read).
Docker is a massive topic on its own, but don't be worried! We only expect you to know and understand the very basics of it, which can be learned about in [this short article](https://docs.docker.com/guides/docker-overview/) (10 minute read).

### The basics of Actors {#actor-basics}

@@ -230,7 +230,7 @@ await Actor.exit();

**A:** The Apify client mimics the Apify API, so there aren't any super significant differences. It's super handy as it helps with managing the API calls (parsing, error handling, retries, etc) and even adds convenience functions.

The one main difference is that the Apify client automatically uses [**exponential backoff**](/api/client/js#retries-with-exponential-backoff) to deal with errors.
The one main difference is that the Apify client automatically uses [**exponential backoff**](/api/client/js/docs#retries-with-exponential-backoff) to deal with errors.
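
As a rough illustration of the idea (not the Apify client's actual implementation), a retry loop with exponential backoff might look like the sketch below. The function name `retry_with_backoff` and its parameters are hypothetical.

```python
import time


def retry_with_backoff(operation, max_retries=4, base_delay=0.5, sleep=time.sleep):
    """Call `operation`, doubling the wait after each failed attempt.

    `sleep` is injectable so the waiting can be replaced in tests.
    """
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                # Out of retries: let the last error propagate to the caller.
                raise
            # Waits base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

A production client would typically also cap the delay, add random jitter, and retry only on errors that are worth retrying, such as rate limiting or network failures.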

**Q: How do you pass input when running an Actor or task via API?**

6 changes: 3 additions & 3 deletions sources/academy/platform/get_most_of_actors/actor_readme.md
@@ -16,7 +16,7 @@ slug: /get-most-of-actors/actor-readme
- Whenever you build an Actor, think of the original request/idea and the "use case" = "user need" it should solve, please take notes and share them with Apify, so we can help you write a blog post supporting your Actor with more information, more detailed explanation, better SEO.
- Consider adding a video, images, and screenshots to your README to break up the text.
- This is an example of an Actor with a README that corresponds well to the guidelines below:
- https://apify.com/dtrungtin/airbnb-scraper
- [apify.com/tri_angle/airbnb-scraper](https://apify.com/tri_angle/airbnb-scraper)
- Tip no.1: if you want to add snippets of code anywhere in your README, you can use [Carbon](https://github.com/carbon-app/carbon).
- Tip no.2: if you need any quick Markdown guidance, check out https://www.markdownguide.org/cheat-sheet/

@@ -74,12 +74,12 @@ Aim for sections 1–6 below and try to include at least 300 words. You can move
- Refer to the input tab on Actor's detail page. If you like, you can add a screenshot showing the user what the input fields will look like.
- This is an example of how to refer to the input tab:

> Twitter Scraper has the following input options. Click on the [input tab](https://apify.com/vdrmota/twitter-scraper/input-schema) for more information.
> Twitter Scraper has the following input options. Click on the [input tab](https://apify.com/quacker/twitter-scraper/input-schema) for more information.
7. **Output**

- Mention "You can download the dataset extracted by (Actor name) in various formats such as JSON, HTML, CSV, or Excel.”
- Add a simplified JSON dataset example, like here: https://apify.com/drobnikj/crawler-google-places#output-example
- Add a simplified JSON dataset example, like here: [apify.com/compass/crawler-google-places#output-example](https://apify.com/compass/crawler-google-places#output-example)

8. **Tips or Advanced options section**
- Share any tips on how to best run the Actor, such as how to limit compute unit usage, get more accurate results, or improve speed.
2 changes: 1 addition & 1 deletion sources/academy/platform/get_most_of_actors/index.md
@@ -12,7 +12,7 @@ slug: /get-most-of-actors

---

[Apify Store](https://apify.com/store) is home to hundreds of public Actors available to the Apify community. Anyone is welcome to [publish Actors](/platform/actors/publishing) in the store, and you can even [monetize your Actors](https://get.apify.com/monetize-your-code).
[Apify Store](https://apify.com/store) is home to hundreds of public Actors available to the Apify community. Anyone is welcome to [publish Actors](/platform/actors/publishing) in the store, and you can even [monetize your Actors](https://apify.com/partners/actor-developers).

In this section, we will go over some of the practical steps you can take to ensure the high quality of your public Actors. You will learn:

@@ -152,7 +152,7 @@ Getting new users can be an art in itself, but there are **two proven steps** yo

Don’t underestimate your own network! Your social media connections can be a valuable ally in promoting your Actor. Not only can they use your tool to enrich their own professional activities, but also support your work by helping you promote your Actor to their network.

For inspiration, you can check Apify’s [Twitter](https://twitter.com/apify), [Facebook](https://www.facebook.com/apifytech/), and [LinkedIn](https://linkedin.com/company/apifytech) pages, and **don’t forget to tag Apify on your posts** we will retweet and share your posts to help you reach an even broader audience.
For inspiration, you can check Apify’s [Twitter](https://twitter.com/apify) or [LinkedIn](https://www.linkedin.com/company/apifytech/) pages, and **don’t forget to tag Apify on your posts** we will retweet and share your posts to help you reach an even broader audience.

- **YouTube**

@@ -17,7 +17,7 @@ Naming your Actor can be tricky. Especially when you've spent a long time coding
## Scrapers {#scrapers}

For Actors such as [YouTube Scraper](https://apify.com/bernardo/youtube-scraper) or [Amazon Scraper](https://apify.com/vaclavrut/amazon-crawler), which scrape web pages, we usually have one Actor per domain. This helps with naming, as the domain name serves as your Actor's name.
For Actors such as [YouTube Scraper](https://apify.com/streamers/youtube-scraper) or [Amazon Scraper](https://apify.com/junglee/amazon-crawler), which scrape web pages, we usually have one Actor per domain. This helps with naming, as the domain name serves as your Actor's name.

GOOD:

@@ -102,11 +102,11 @@ Now that you’ve created a cool new Actor, let others see it! Share it on your
- Try to publish an article about your Actor in relevant external magazines like [hackernoon.com](https://hackernoon.com/) or [techcrunch.com](https://techcrunch.com/). Do not limit yourself to blogging platforms.
- If you publish an article in external media (magazine, blog etc.), be sure to include backlinks to your Actor and the Apify website to strengthen the domain's SEO.
- It's always better to use backlinks with the [`dofollow` attribute](https://raventools.com/marketing-glossary/dofollow-link/).
- Always use the most relevant URL as the backlink's landing page. For example, when talking about Apify Store, link to the Store page (https://apify.com/store), not to Apify homepage (https://apify.com).
- Always use the most relevant URL as the backlink's landing page. For example, when talking about Apify Store, link to the Store page ([apify.com/store](https://apify.com/store)), not to Apify homepage ([apify.com](https://apify.com)).
- Always use the most relevant keyword or phrase for the backlink's text. This can boost the landing page's SEO and help the readers know what to expect from the link.

> **GOOD**: Try the [Facebook scraper](https://apify.com/pocesar/facebook-pages-scraper) now.
> <br/> **AVOID**: Try the Facebook scraper [here](https://apify.com/pocesar/facebook-pages-scraper).
> **GOOD**: Try the [Facebook scraper](https://apify.com/apify/facebook-pages-scraper) now.
> <br/> **AVOID**: Try the Facebook scraper [here](https://apify.com/apify/facebook-pages-scraper).
### Social media and forums

2 changes: 1 addition & 1 deletion sources/academy/platform/getting_started/actors.md
@@ -23,7 +23,7 @@ Once an Actor has been pushed to the Apify platform, they can be shared to the w
## Actors on the Apify platform {#actors-on-platform}

For a super quick and dirty understanding of what a published Actor looks like, and how it works, let's run an SEO audit of **apify.com** using the [SEO audit Actor](https://apify.com/drobnikj/seo-audit-tool).
For a super quick and dirty understanding of what a published Actor looks like, and how it works, let's run an SEO audit of **apify.com** using the [SEO audit Actor](https://apify.com/misceres/seo-audit-tool).

On the front page of the Actor, click the green **Try for free** button. If you're logged into your Apify account which you created during the [**Getting started**](./index.md) lesson, you'll be taken to the Apify Console and greeted with a page that looks like this:

2 changes: 1 addition & 1 deletion sources/academy/platform/running_a_web_server.md
@@ -236,4 +236,4 @@ When we deploy and run this Actor on the Apify platform, then we can open the **

With that, we're done! And our application works like a charm :)

The complete code of this Actor is available [here](https://www.apify.com/apify/example-web-server). You can run it there or copy it to your account.
The complete code of this Actor is available [here](https://apify.com/apify/example-web-server). You can run it there or copy it to your account.
@@ -14,7 +14,7 @@ import TabItem from '@theme/TabItem';

The most popular way of [integrating](https://help.apify.com/en/collections/1669769-integrations) the Apify platform with an external project/application is by programmatically running an [Actor](/platform/actors) or [task](/platform/actors/running/tasks), waiting for it to complete its run, then collecting its data and using it within the project. Follow this tutorial to have an idea on how to approach this, it isn't as complicated as it sounds!

> Remember to check out our [API documentation](/api/v2) with examples in different languages and a live API console. We also recommend testing the API with a desktop client like [Postman](https://www.getpostman.com/) or [Insomnia](https://insomnia.rest).
> Remember to check out our [API documentation](/api/v2) with examples in different languages and a live API console. We also recommend testing the API with a desktop client like [Postman](https://www.postman.com/) or [Insomnia](https://insomnia.rest).

Apify API offers two ways of interacting with it:
@@ -78,7 +78,7 @@ Via API, let's quickly try to run [Web Scraper](https://apify.com/apify/web-scra
https://api.apify.com/v2/acts/apify~web-scraper/runs?token=YOUR_TOKEN
```

Here is how it looks in [Postman](https://www.getpostman.com/):
Here is how it looks in [Postman](https://www.postman.com/):

![Run an Actor via API in Postman](./images/run-actor-postman.png)
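
The endpoint above can also be assembled in code. This is a minimal sketch using only the Python standard library; the helper names `build_run_url` and `build_run_request` are hypothetical, the token is a placeholder, and it assumes the run is started with a POST request carrying a JSON input body. It only constructs the request and does not send it.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"


def build_run_url(actor_id: str, token: str) -> str:
    """Build the 'run Actor' endpoint URL for the given Actor and token."""
    # The URL above addresses the Actor as "apify~web-scraper",
    # so a slash in the Actor name is replaced with a tilde.
    safe_id = quote(actor_id.replace("/", "~"), safe="~")
    return f"{API_BASE}/acts/{safe_id}/runs?{urlencode({'token': token})}"


def build_run_request(actor_id: str, token: str, input_json: bytes = b"{}") -> Request:
    """Prepare (but do not send) a POST request that would start a run."""
    return Request(
        build_run_url(actor_id, token),
        data=input_json,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request("apify/web-scraper", "YOUR_TOKEN")
    print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would actually start a run, so swap in a real token only when that is intended.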
