Home
This repository lets a user index a list of doctors across cities in the USA. For this particular assignment, the area of focus is New Jersey.
The data used in this application is extracted from the website https://health.usnews.com and stored in an Elasticsearch index. The aim is to present this data as a report that meets the following requirements:
- Total number of doctors by city
- Total number of doctors by specialty (element g of the scraped elements)
- Total number of doctors by experience range (ranges: 0–4 years, 5–10 years, 11–16 years, 17–20 years, and above 20 years)
- Total number of doctors by zip code (the last five digits of the address; numeric field), e.g., 222 New Rd, Linwood, NJ 08221, where 08221 is the zip code
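The zip code and experience-range requirements above can be sketched as two small helpers. This is only an illustration, not the repository's actual code: the function names `extract_zipcode` and `experience_bucket` and the bucket labels are hypothetical.

```python
import re

# Hypothetical labels mirroring the experience ranges listed in the requirements.
EXPERIENCE_BUCKETS = [
    (0, 4, "0-4 years"),
    (5, 10, "5-10 years"),
    (11, 16, "11-16 years"),
    (17, 20, "17-20 years"),
]


def extract_zipcode(address):
    """Return the trailing five digits of an address as a string, or None."""
    match = re.search(r"(\d{5})\s*$", address)
    return match.group(1) if match else None


def experience_bucket(years):
    """Map years of experience to one of the report's ranges, or None if negative."""
    for low, high, label in EXPERIENCE_BUCKETS:
        if low <= years <= high:
            return label
    return "above 20 years" if years > 20 else None


if __name__ == "__main__":
    print(extract_zipcode("222 New Rd, Linwood, NJ 08221"))
    print(experience_bucket(12))
```

Note that storing the zip code as a numeric field, as the requirement suggests, drops any leading zero (`int("08221")` is `8221`), so the string form may be worth keeping for display.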
Output of the generated report: https://github.com/RastogiAbhijeet/python_elasticsearch_businesscase/blob/master/summ2.png?raw=true
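Counts like those in the report could be obtained with Elasticsearch aggregations. The sketch below only builds a request body as a plain dictionary; it does not contact a server. The field names (`city.keyword`, `specialty.keyword`, `zipcode`, `experience`) and the bucket limit of 1000 are assumptions, since the actual index mapping is not shown here.

```python
import json


def build_report_query():
    """Build a search body with one aggregation per report requirement."""
    return {
        "size": 0,  # return only aggregation results, no documents
        "aggs": {
            # Field names below are hypothetical; adjust them to the real mapping.
            "by_city": {"terms": {"field": "city.keyword", "size": 1000}},
            "by_specialty": {"terms": {"field": "specialty.keyword", "size": 1000}},
            "by_zipcode": {"terms": {"field": "zipcode", "size": 1000}},
            "by_experience": {
                "range": {
                    "field": "experience",
                    # In a range aggregation "from" is inclusive and "to" is exclusive.
                    "ranges": [
                        {"key": "0-4", "from": 0, "to": 5},
                        {"key": "5-10", "from": 5, "to": 11},
                        {"key": "11-16", "from": 11, "to": 17},
                        {"key": "17-20", "from": 17, "to": 21},
                        {"key": "above 20", "from": 21},
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_report_query(), indent=2))
```

The resulting body can be sent to the index's `_search` endpoint, for example through the `search(index=..., body=...)` method of the official Python client, assuming that client is installed.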
The code is written for Python 2.7, and Elasticsearch 6.2.4 is used to store the indexed data. Kibana 6.2.4 is used for management and visualisation.
Before installing Elasticsearch, install Java on your system by running the following command:
sudo apt-get install openjdk-8-jdk
- First, update your package index.
sudo apt-get update
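- Download the Elasticsearch Debian package. The command below fetches version 2.3.1; this project was run against version 6.2.4, so substitute the matching package if you need the same version.
wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.3.1/elasticsearch-2.3.1.deb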
- Then install it in the usual Ubuntu way with dpkg.
sudo dpkg -i elasticsearch-2.3.1.deb
This results in Elasticsearch being installed in /usr/share/elasticsearch/ with its configuration files placed in /etc/elasticsearch and its init script added in /etc/init.d/elasticsearch.
To make sure Elasticsearch starts and stops automatically with the server, enable its service with systemd.
sudo systemctl enable elasticsearch.service
To test whether the service is running, run the following command. By default, the Elasticsearch service listens on localhost:9200.
curl -X GET "localhost:9200"
This returns output similar to the following:
{ "name" : "My First Cluster", "cluster_name" : "MyCluster", "cluster_uuid" : "CN-Gtg7rRvai3VAx8TC1dw", "version" : { "number" : "6.2.4", "build_hash" : "ccec39f", "build_date" : "2018-04-12T20:37:28.497551Z", "build_snapshot" : false, "lucene_version" : "7.2.1", "minimum_wire_compatibility_version" : "5.6.0", "minimum_index_compatibility_version" : "5.0.0" }, "tagline" : "You Know, for Search" }
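The same check can be done from Python, which may be handy before running the indexing script. This is a minimal sketch using only the standard library; the helper names `parse_version` and `fetch_version` are hypothetical, and the default URL assumes the service is on localhost:9200 as described above.

```python
import json

try:  # Python 3
    from urllib.request import urlopen
except ImportError:  # Python 2.7
    from urllib2 import urlopen


def parse_version(response_text):
    """Extract the version number from the JSON returned by the root endpoint."""
    info = json.loads(response_text)
    return info["version"]["number"]


def fetch_version(url="http://localhost:9200"):
    """Query a running Elasticsearch node and return its version string."""
    response = urlopen(url, timeout=5)
    try:
        return parse_version(response.read().decode("utf-8"))
    finally:
        response.close()


if __name__ == "__main__":
    # Raises an error (for example, connection refused) if the service is not running.
    print(fetch_version())
```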
If you do not see the output above and instead get an error on port 9200, open /etc/elasticsearch/elasticsearch.yml, set path.logs and path.data to valid directories, and make sure those directories have the correct permissions.
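A quick way to sanity-check the configured log and data directories is sketched below. The helper `check_directory` is hypothetical, and the two example paths are assumptions; use whatever directories your elasticsearch.yml actually names. The check applies to the user running the script, so run it as the user the Elasticsearch service runs under for a meaningful result.

```python
import os


def check_directory(path):
    """Return 'ok', 'missing', or 'not accessible' for the given directory."""
    if not os.path.isdir(path):
        return "missing"
    # Read, write, and traverse permission for the current user.
    if not os.access(path, os.R_OK | os.W_OK | os.X_OK):
        return "not accessible"
    return "ok"


if __name__ == "__main__":
    # Example paths only (assumed); replace with the values from elasticsearch.yml.
    for directory in ["/var/lib/elasticsearch", "/var/log/elasticsearch"]:
        print("%s: %s" % (directory, check_directory(directory)))
```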
To set up Kibana, follow the guide: https://www.elastic.co/guide/en/kibana/current/setup.html