kleineAnfragen.

Collecting kleine Anfragen (minor parliamentary interpellations) from the parliaments' documentation systems to make them easy to search and link to.

Development

For a simple and quick development environment, docker-compose is used. Install docker and docker-compose, then run:

docker-compose up

docker-compose downloads the required services (postgres, elasticsearch, redis, ...) as Docker containers and links them to the app. If you want to inspect postgres or elasticsearch directly, uncomment the ports section in docker-compose.yml.
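
Once those ports are exposed, you can talk to the services directly from the host. A minimal sketch, assuming the default ports and the kleineanfragen database user used later in this README:

# Elasticsearch (default port 9200, matching ELASTICSEARCH_URL in the configuration section)
curl http://localhost:9200/
# Postgres console (default port 5432)
psql -h localhost -p 5432 -U kleineanfragen kleineanfragen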

You may need to run the database migrations and seeds:

docker-compose run web rails db:migrate
docker-compose run web rails db:seed

To get a rails console, run:

docker-compose run web rails c

Importing papers from the public database dump

If you want to develop with already scraped data, you can use the publicly available data dumps from the kleineAnfragen.de data page. Download the latest kleineanfragen-....sql.bz2 from there and put it into tmp/dump/.
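
For example (a sketch; the exact filename and URL are whatever the data page currently links, the ones below are placeholders):

mkdir -p tmp/dump
# placeholder URL: replace with the latest dump linked on the kleineAnfragen.de data page
wget -P tmp/dump https://kleineanfragen.de/.../kleineanfragen-....sql.bz2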

To import the data, first start a throwaway container:

docker run -v $(pwd)/tmp/dump:/tmp/dump --rm --network kleineanfragen_default -it kleineanfragen_database bash

Inside this one-off container, import the data with the following commands:

# restore the dump into the helper database "import"
bzcat /tmp/dump/kleineanfragen-*.sql.bz2 | psql -h database -U kleineanfragen import
# copy only the data from "import" into the application database
pg_dump -h database -U kleineanfragen -d import --data-only | psql -h database -U kleineanfragen -d kleineanfragen
# reset the helper database again
psql -h database -U kleineanfragen import -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public; GRANT ALL ON SCHEMA public TO postgres; GRANT ALL ON SCHEMA public TO public;"
exit
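
To check that the import worked, you can count the imported records from the app. A minimal sketch (the Paper model name is an assumption based on the papers:* tasks below):

docker-compose run web rails runner 'puts Paper.count'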

Normalizing Names with Nomenklatura

For normalizing names of people, parties and ministries, we use Nomenklatura.

If you want to use nomenklatura while developing, you need to edit docker-compose.yml:

  • Uncomment the nomenklatura link
  • Uncomment the NOMENKLATURA_ environment variables
  • Uncomment the whole nomenklatura image section
  • Set GITHUB_CLIENT_ID and GITHUB_CLIENT_SECRET to those of a new GitHub OAuth application.

After your next docker-compose up, log in to your nomenklatura instance (reachable at http://localhost:8080) and get the API key from the profile link. Insert it into docker-compose.yml.

kleineAnfragen needs several datasets in Nomenklatura, created with the following identifiers:

  • ka-parties
  • ka-people-XX (replace XX with the two-letter state abbreviation, e.g. BE)
  • ka-ministries-XX (replace XX with the two-letter state abbreviation, e.g. BE)

Troubleshooting

You just git pulled and now kleineanfragen doesn't start anymore? Try docker-compose rm web followed by docker-compose build web; this rebuilds the container the application runs in.

Dependencies

  • ruby 2.5.5
  • postgres
  • elasticsearch (for search)
  • redis (for sidekiq)
  • nodejs (for asset compiling)
  • tika (for extracting text from pdfs)
  • Nomenklatura (for normalization of people names, ministries and parties)
  • Poppler / pdftoppm (for thumbnailing)
  • image_optim binaries (for compressing thumbnails)
  • S3-compatible storage like s3ninja (see contrib/s3ninja for the modified dockerized version)

Configuration

  • config/application.rb

    Please set config.x.user_agent to your own email address.

  • .env

    In development, the environment variables are set in docker-compose.yml. For development without docker-compose (or for production), create a .env file and fill it with the following (a sketch for loading it follows after this list):

    export DATABASE_URL="postgres://user:pass@localhost/kleineanfragen"
    export ELASTICSEARCH_URL="http://127.0.0.1:9200/"
    export SECRET_KEY_BASE="FIXME"
    export SECRET_SUBSCRIPTION_SALT="FIXME"
    export S3_ACCESS_KEY="FIXME"
    export S3_SECRET_KEY="FIXME"
    export REDIS_URL="redis://localhost:6379"
    export TIKA_SERVER_URL="http://localhost:9998"
    export NOMENKLATURA_HOST="http://localhost:9000"
    export NOMENKLATURA_APIKEY="FIXME"
    
  • config/fog.yml

    This file contains the connection details for your s3 server/bucket. The test environment uses the tmp folder, so it does not need a connection to a running S3-compatible storage.
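
Since .env uses plain export statements, one way to run the app without docker-compose is to source the file and start the usual Rails and Sidekiq processes (a sketch, assuming a local Ruby setup with all dependencies installed):

# load the environment variables into the current shell
source .env
# start the web app and the job worker
bundle exec rails server
bundle exec sidekiq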

Jobs

Jobs are run by ActiveJob / Sidekiq.

You may need to prefix them with bundle exec so that the correct gems are used.

The typical arguments are [State, LegislativeTerm, Reference].
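
For example, to run the import task from below inside the development container (a sketch combining the docker-compose setup above with the bundle exec prefix):

docker-compose run web bundle exec rails 'papers:import_new[BE, 17]'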

  • Import new papers

    rails 'papers:import_new[BE, 17]'
    
  • Import single paper

    rails 'papers:import[BE, 17, 1234]'
    
  • Other

    The two import tasks should be enough for daily usage. If you need to (re-)upload the papers to s3 or extract the text / names again, you can use these:

    rails 'papers:store[BE, 17, 1234]'
    rails 'papers:extract_text[BE, 17, 1234]'
    rails 'papers:extract_originators[BE, 17, 1234]'
    rails 'papers:extract_answerers[BE, 17, 1234]'
    
