Merge pull request #155 from dpordomingo/troubleshooting
Document error handling
dpordomingo authored Jul 19, 2019
2 parents 5f48768 + 9b1010b commit 7d3e9ad
Showing 5 changed files with 261 additions and 62 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -57,7 +57,7 @@ To run it you only need:

If you want more details of each step, you will find in the [**Quick Start Guide**](docs/quickstart/README.md) all the steps to get started with **source{d} CE**, from the installation of its dependencies to running SQL queries to inspect git repositories.

If you have any problem running **source{d} CE** you can take a look at our [FAQ & Troubleshooting](docs/faq-troubleshooting.md) section, and our [source{d} Forum](https://forum.sourced.tech), where you can also ask for help when using **source{d} CE**. If you spotted a bug, or have a feature request, please [open an issue](https://github.com/src-d/sourced-ce/issues) to let us know about it.
If you have any problem running **source{d} CE** you can take a look at our [Troubleshooting](docs/troubleshooting.md) section, and our [source{d} Forum](https://forum.sourced.tech), where you can also ask for help when using **source{d} CE**. If you spotted a bug, or have a feature request, please [open an issue](https://github.com/src-d/sourced-ce/issues) to let us know about it.


## Architecture
3 changes: 2 additions & 1 deletion docs/README.md
@@ -12,7 +12,8 @@

## Learn More

* [FAQ & Troubleshooting](./faq-troubleshooting.md)
* [FAQ](./faq.md)
* [Troubleshooting](./troubleshooting.md)
* [Architecture](./architecture.md)
* [Contribute](./CONTRIBUTING.md)
* [License](../LICENSE.md)
98 changes: 40 additions & 58 deletions docs/faq-troubleshooting.md → docs/faq.md
@@ -1,21 +1,23 @@
# FAQ and Troubleshooting
# FAQ

_For tips and advice on dealing with unexpected errors, please refer to the [Troubleshooting guide](./troubleshooting.md)._

## Index

- [Where Can I Find More Assistance to Run source{d} or Notify You About Any Issue or Suggestion?](#where-can-i-find-more-assistance-to-run-source-d-or-notify-you-about-any-issue-or-suggestion)
- [How Can I Update My Version Of source{d} CE?](#how-can-i-update-my-version-of-source-d-ce)
- [How Can I Update My Version Of **source{d} CE**?](#how-can-i-update-my-version-of-source-d-ce)
- [How To Restore Dashboards and Charts to Defaults](#how-to-restore-dashboards-and-charts-to-defaults)
- [How Can I See Logs of Running Components?](#how-can-i-see-logs-of-running-components)
- [How to Update the Data from the Organizations That I'm Analyzing](#how-to-update-the-data-from-the-organizations-being-analyzed)
- [Can I Query Gitbase or Babelfish with External Tools?](#can-i-query-gitbase-or-babelfish-with-external-tools)
- [Where Can I Read More About the Web Interface?](#where-can-i-read-more-about-the-web-interface)
- [When I Try to Create a Chart from a Query, Nothing Happens.](#when-i-try-to-create-a-chart-from-a-query-nothing-happens)
- [When I Try to Export a Dashboard, Nothing Happens.](#when-i-try-to-export-a-dashboard-nothing-happens)
- [The Dashboard Takes a Long to Load, and the UI Freezes.](#the-dashboard-takes-a-long-to-load-and-the-ui-freezes)


## Where Can I Find More Assistance to Run source{d} or Notify You About Any Issue or Suggestion?

If this documentation was not enough, you could also try:
_If you're dealing with an error, or with something that you think may be caused
by an unexpected error, please refer to our [Troubleshooting guide](./troubleshooting.md).
With the information you obtain by following those steps, you may be able to fix
the problem, or at least explain it better in the following channels:_

* [open an issue](https://github.com/src-d/sourced-ce/issues), if you want to
suggest a new feature, if you need assistance with a contribution, or if you
@@ -81,19 +83,40 @@ $ sourced restart
```


## How Can I See Logs of Running Components?
## How to Update the Data from the Organizations Being Analyzed

```shell
$ cd ~/.sourced/workdirs/__active__
$ docker-compose logs -f [components...]
```
There is no way to update imported data, and
[when a scraper is restarted](./troubleshooting.md#how-can-i-restart-one-scraper),
it proceeds as follows:

### gitcollector

Organizations and repositories are downloaded independently, so if one of them
fails, the process continues until all the organizations and repositories have
been processed.

Where `-f` will keep the connection opened, and the logs will appear as they come
instead of exiting after the last logged one.
If `gitcollector` is restarted, it will download more repositories, but it won't
update any of the already existing ones. You can see the progress of the new process
in the welcome dashboard; since already existing repositories won't be updated,
they will appear as `failed` in the progress status.
Where you can pass a space separated list of component names to see only their
logs (i.e. `sourced-ui`, `gitbase`, `bblfsh`, `gitcollector`, `ghsync`, `metadatadb`, `postgres`, `redis`).
If you do not pass any component name, there will appear the logs of all of them.
### ghsync
`ghsync` imports metadata a bit differently: it works sequentially, one
organization at a time, so if any step fails, the whole import for that
organization fails.
Pull requests, issues, and users of the same organization are imported in that
order, each in a separate transaction; if one transaction fails, the process
stops and the following ones are not processed.

Once the three different entities have been imported, the organization is
considered "done", and restarting `ghsync` won't update its data.
If `ghsync` is restarted, it will only import data from organizations that could
not be finished according to the rules explained above. The progress of `ghsync`
is updated in the welcome dashboard, and if an organization was already
imported, it will appear as "nothing imported" in the status chart.
## Can I Query Gitbase or Babelfish with External Tools?
@@ -114,44 +137,3 @@ the [Architecture documentation](./architecture.md#docker-networking)
The user interface is based on the open source [Apache Superset](http://superset.apache.org),
so you can also refer to [Superset tutorials](http://superset.apache.org/tutorial.html)
for advanced usage of the web interface.


## When I Try to Create a Chart from a Query, Nothing Happens.

Charts can be created from the SQL Lab, using the `Explore` button once you
run a query. If nothing happens, the browser may be blocking the new window that
should open to edit the new chart. You should configure your browser to let the
source{d} UI open pop-ups (e.g. in Chrome, allow `127.0.0.1:8088` to handle
`pop-ups and redirects` from the `Site Settings` menu).


## When I Try to Export a Dashboard, Nothing Happens.

If nothing happens when pressing the `Export` button from the dashboard list, then
you should configure your browser to let the source{d} UI open pop-ups (e.g. in
Chrome, allow `127.0.0.1:8088` to handle `pop-ups and redirects` from the
`Site Settings` menu).


## The Dashboard Takes a Long to Load and the UI Freezes.

_This is a known issue that we're trying to address, but here is more info about it._
In some circumstances, loading the data for the dashboards can take some time,
and the UI can be frozen in the meantime. This can happen on big datasets
the first time you access the dashboards, or when they are refreshed.
There are some limitations with how Apache Superset handles long-running SQL
queries, which may affect the dashboard charts. Since most of the charts of the
Overview dashboard load their data from gitbase, their queries can take longer
than the UI expects.
When it happens, the UI can be frozen, or you can get this message in some charts:
>_Query timeout - visualization queries are set to timeout at 300 seconds.
Perhaps your data has grown, your database is under unusual load, or you are
simply querying a data source that is too large to be processed within the timeout
range. If that is the case, we recommend that you summarize your data further._
When this occurs, you should wait until the UI is responsive again, and separately
refresh each failing chart with its `force refresh` option (in its top-right corner).
With some big datasets, it took 3 refreshes and 15 minutes to get data for all charts.
4 changes: 2 additions & 2 deletions docs/quickstart/4-explore-sourced.md
@@ -1,9 +1,9 @@
# Explore source{d} CE Web Interface

_If you have any problem running **source{d} CE** you can take a look at our [FAQ & Troubleshooting](docs/faq-troubleshooting.md) section, and at our [source{d} Forum](https://forum.sourced.tech), where you can also ask for help when using **source{d} CE**. If you spotted a bug, or have a feature request, please [open an issue](https://github.com/src-d/sourced-ce/issues) to let us know about it._
_If you have any problem running **source{d} CE** you can take a look at our [Troubleshooting](../troubleshooting.md) section, and at our [source{d} Forum](https://forum.sourced.tech), where you can also ask for help when using **source{d} CE**. If you spotted a bug, or have a feature request, please [open an issue](https://github.com/src-d/sourced-ce/issues) to let us know about it._

_In some circumstances, loading the data for the dashboards can take some time, and the UI can be frozen in the meantime. This can happen on big datasets the first time you access the dashboards, or when they are refreshed. Please take a look at our
[FAQ & Troubleshooting](docs/faq-troubleshooting.md#the-dashboard-takes-a-long-to-load-and-the-ui-freezes)
[Troubleshooting](../troubleshooting.md#the-dashboard-takes-a-long-to-load-and-the-ui-freezes)
to get more info about this exact issue._

Once **source{d} CE** has been [initialized with `sourced init`](./3-init-sourced.md), it will automatically open the web UI. If the UI is not automatically opened, you can use `sourced web` command, or visit http://127.0.0.1:8088.
216 changes: 216 additions & 0 deletions docs/troubleshooting.md
@@ -0,0 +1,216 @@

# Troubleshooting

_For commonly asked questions and their answers, you can refer to the [FAQ](./faq.md)_

Currently, **source{d} CE** does not expose or log all errors directly in the
UI. At the current stage of **source{d} CE**, following these steps is the
best way to find out whether something is failing and why, and how to recover the
app from some problems. The first two steps are usually mandatory:

1. **[To see if any component is broken](#how-can-i-see-the-status-of-source-d-ce-components)**
1. **[To see the logs of the running components](#how-can-i-see-logs-of-the-running-components)**
1. [To know if scrapers finished their job](#how-can-i-see-what-happened-with-the-scrapers)
   - [To restart one scraper](#how-can-i-restart-one-scraper)
1. [To restart or initialize **source{d} CE** again](#how-to-restart-source-d-ce)
1. [To ask for help if the issue could not be solved](./faq.md#where-can-i-find-more-assistance-to-run-source-d-or-notify-you-about-any-issue-or-suggestion)

Other issues that we detected, and which are strictly related to the UI are:

- [When I Try to Create a Chart from a Query, Nothing Happens.](#when-i-try-to-create-a-chart-from-a-query-nothing-happens)
- [When I Try to Export a Dashboard, Nothing Happens.](#when-i-try-to-export-a-dashboard-nothing-happens)
- [The Dashboard Takes a Long to Load, and the UI Freezes.](#the-dashboard-takes-a-long-to-load-and-the-ui-freezes)


## source{d} CE Fails During Its Initialization

The initialization can fail fast if there is any port conflict, or missing config
file, etcetera; those errors are clearly logged in the terminal when they appear.

If, when initializing **source{d} CE**, all the required components appear as created
but the loading spinner keeps spinning forever (more than 1 minute can be a symptom),
there may be an underlying problem preventing the UI from opening. In this
situation you should:

1. **[See if any component is broken](#how-can-i-see-the-status-of-source-d-ce-components)**
1. **[See app logs or certain component logs](#how-can-i-see-logs-of-the-running-components)**
1. [Restart or initialize **source{d} CE** again](#how-to-restart-source-d-ce)
1. [Ask for help if the issue could not be solved](./faq.md#where-can-i-find-more-assistance-to-run-source-d-or-notify-you-about-any-issue-or-suggestion)


## How Can I See the Status of source{d} CE Components?

To see the status of **source{d} CE** components, just run:

```
$ sourced status
Name                      Command                     State          Ports
------------------------------------------------------------------------------
srcd-xxx_sourced-ui_1     /entrypoint.sh              Up (healthy)   :8088->8088
srcd-xxx_gitbase_1        ./init.sh                   Up             :3306->3306
srcd-xxx_bblfsh_1         /tini -- bblfshd            Up             :9432->9432
srcd-xxx_bblfsh-web_1     /bin/bblfsh-web -addr ...   Up             :9999->8080
srcd-xxx_metadatadb_1     docker-entrypoint.sh ...    Up             :5433->5432
srcd-xxx_postgres_1       docker-entrypoint.sh ...    Up             :5432->5432
srcd-xxx_redis_1          docker-entrypoint.sh ...    Up             :6379->6379
srcd-xxx_ghsync_1         /bin/sh -c sleep 10s ...    Exit 0
srcd-xxx_gitcollector_1   /bin/dumb-init -- /bi ...   Exit 0
```

It will report the status of all **source{d} CE** components. All components should
be `Up`, except the scrapers: `ghsync` and `gitcollector`; these exceptions are
explained in [How Can I See What Happened with the Scrapers?](#how-can-i-see-what-happened-with-the-scrapers)

If any component is not `Up` (except the scrapers), here are some key points to
understand what might be happening:

- All the components (except the scrapers) are restarted by Docker Compose
automatically, a process that can take some seconds; if the component
enters a restart loop, something is wrong.
- When any component is failing, or has died, you should
[see its logs to understand what is happening](#how-can-i-see-logs-of-the-running-components)

When one of the required components fails, an error is usually shown in the UI,

e.g. `lost connection to mysql server during query` while running a query might
mean that `gitbase` went down.

e.g. `unable to establish a connection with the bblfsh server: deadline exceeded`
in SQL Lab might mean that `bblfsh` went down.

If the failing component is not successfully restarted in a few seconds, or if it
goes down when running certain queries, it could be a good idea to [open an issue](https://github.com/src-d/sourced-ce/issues)
describing the problem.
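
The check described in this section can be sketched as a small filter. This is only an illustration: `broken_components` is a hypothetical helper (not part of `sourced`), and it assumes that the status output keeps the layout shown above, with container names starting with `srcd-` and the state (`Up`, `Exit 0`...) as a separate word.

```shell
# Hypothetical helper, not part of the sourced CLI.
# Reads `sourced status` output on stdin and prints the name of every
# component that is not "Up", ignoring the two scrapers (which may exit).
broken_components() {
  awk '
    /^srcd-/ {
      # Scrapers are allowed to finish, so skip them.
      if ($1 ~ /_(ghsync|gitcollector)_[0-9]+$/) next
      # Any other component is expected to report the "Up" state.
      if ($0 !~ / Up( |$)/) print $1
    }
  '
}

# Usage (requires a running source{d} CE):
#   sourced status | broken_components
```

An empty output would mean that every non-scraper component reports `Up`; any name printed is a candidate for a closer look at its logs.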


## How Can I See Logs of The Running Components?

```shell
$ sourced logs [-f] [components...]
```

Adding `-f` will keep the connection open, and the logs will appear as they
come instead of exiting after the last logged one.

You can pass a space-separated list of component names to see only their logs
(i.e. `sourced-ui`, `gitbase`, `bblfsh`, `gitcollector`, `ghsync`, `metadatadb`, `postgres`, `redis`).
If you do not pass any component name, the logs of all of them will be shown.

Currently, there is no way to filter by error level, but you can use `grep`,
e.g.

```shell
sourced logs gitcollector | grep error
```

will output only the log lines in which the word `error` appears.
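
A slightly broader variant of the same idea is sketched below. `only_errors` is a hypothetical helper, and the extra keywords (`fatal`, `panic`) are an assumption about what the components might log, not a documented log format; adjust the pattern to what you actually see in your logs.

```shell
# Hypothetical helper, not part of the sourced CLI.
# Keeps only the stdin lines that mention an error-like keyword,
# ignoring case so that "ERROR" and "Error" are matched too.
only_errors() {
  grep -i -E 'error|fatal|panic'
}

# Usage (requires a running source{d} CE):
#   sourced logs gitcollector ghsync | only_errors
```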


## How Can I See What Happened with the Scrapers?

_When **source{d} CE** is initialized with `sourced init local`, the repositories
to analyze come from your local data, so the status of `ghsync` and `gitcollector`
is not relevant in this case._

When running **source{d} CE** to analyze data from a list of GitHub organizations,
the `gitcollector` component is in charge of fetching GitHub repositories and the `ghsync`
component is in charge of fetching GitHub metadata (issues, pull requests...)

Once the UI is opened, you can see the progress of the import in the welcome
dashboard, reporting the data imported, skipped, failed and completed. The process
can take many minutes if the organization is big, so be patient. You can manually
refresh both charts to confirm that the process is progressing and is not stuck.
If you believe that there may be a problem during the process, the best way
to find out what is happening is:

- **[check the components status](#how-can-i-see-the-status-of-source-d-ce-components)
with `sourced status`**; `gitcollector` and `ghsync` should be `Up` (the process
has not finished yet), or `Exit 0` (the process finished successfully). They are
independent components, so they can finish in a different order depending on how
many repositories or how much metadata needs to be processed.

- **[check the logs](#how-can-i-see-logs-of-the-running-components) of the failing component with `sourced logs [-f] {gitcollector,ghsync}`**
to get more info about the errors found.
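
The interpretation of the scraper states described above can be sketched as follows. `scraper_state` is a hypothetical helper (not part of `sourced`), and it assumes the `sourced status` layout shown earlier in this guide; any state other than `Up` or `Exit 0` is reported as `failed`.

```shell
# Hypothetical helper, not part of the sourced CLI.
# $1: scraper name (gitcollector or ghsync); stdin: `sourced status` output.
# Prints "running" for Up, "finished" for Exit 0, and "failed" otherwise.
scraper_state() {
  awk -v name="$1" '
    index($1, "_" name "_") {
      if ($0 ~ / Up( |$)/)          print "running"
      else if ($0 ~ / Exit 0( |$)/) print "finished"
      else                          print "failed"
    }
  '
}

# Usage (requires a running source{d} CE):
#   sourced status | scraper_state gitcollector
```

When the result is `failed`, the logs of that scraper are the next place to look.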


## How Can I Restart One Scraper?

_Restarting a scraper should be done to recover from temporary problems like
connectivity loss or lack of disk space, not
[to update the data you're analyzing](./faq.md#how-to-update-the-data-from-the-organizations-being-analyzed)_

**source{d} CE** does not provide a way to start only one scraper. The recommended way
to restart them is [to restart the whole **source{d} CE**](#how-to-restart-source-d-ce),
which is fast and safe for your data. In order to restart **source{d} CE**, run:

```shell
$ sourced restart
```

_Read more about [which data will be imported after restarting a scraper](./faq.md#how-to-update-the-data-from-the-organizations-being-analyzed)_

If you feel comfortable enough with Docker Compose, you could also try restarting
each scraper separately, running:

```shell
$ cd ~/.sourced/workdirs/__active__
$ docker-compose run gitcollector # to restart gitcollector
$ docker-compose run ghsync # to restart ghsync
```


## How to Restart source{d} CE

Restarting **source{d} CE** can fix some errors and is also the official way to
restart the scrapers. It is also needed after downloading a new config (by running
`sourced compose download`). **source{d} CE** is restarted with the command:

```shell
$ sourced restart
```

It only recreates the component containers, keeping all your data, like charts,
dashboards, repositories, and GitHub metadata.


## When I Try to Create a Chart from a Query, Nothing Happens.

Charts can be created from the SQL Lab, using the `Explore` button once you
run a query. If nothing happens, the browser may be blocking the new window that
should open to edit the new chart. You should configure your browser to let the
source{d} UI open pop-ups (e.g. in Chrome, allow `127.0.0.1:8088` to handle
`pop-ups and redirects` from the `Site Settings` menu).


## When I Try to Export a Dashboard, Nothing Happens.

If nothing happens when pressing the `Export` button from the dashboard list, then
you should configure your browser to let the source{d} UI open pop-ups (e.g. in
Chrome, allow `127.0.0.1:8088` to handle `pop-ups and redirects` from the
`Site Settings` menu).


## The Dashboard Takes a Long to Load and the UI Freezes.

_This is a known issue that we're trying to address, but here is more info about it._

In some circumstances, loading the data for the dashboards can take some time,
and the UI can be frozen in the meantime. This can happen on big datasets
the first time you access the dashboards, or when they are refreshed.

There are some limitations with how Apache Superset handles long-running SQL
queries, which may affect the dashboard charts. Since most of the charts of the
Overview dashboard load their data from gitbase, their queries can take longer
than the UI expects.

When it happens, the UI can be frozen, or you can get this message in some charts:
>_Query timeout - visualization queries are set to timeout at 300 seconds.
Perhaps your data has grown, your database is under unusual load, or you are
simply querying a data source that is too large to be processed within the timeout
range. If that is the case, we recommend that you summarize your data further._

When this occurs, you should wait until the UI is responsive again, and separately
refresh each failing chart with its `force refresh` option (in its top-right corner).
With some big datasets, it took 3 refreshes and 15 minutes to get data for all charts.
