---

copyright:
  years: 2015, 2017
lastupdated: "2017-08-18"

---

{:shortdesc: .shortdesc}
{:new_window: target="_blank"}
{:tip: .tip}
{:pre: .pre}
{:codeblock: .codeblock}
{:screen: .screen}
{:javascript: .ph data-hd-programlang='javascript'}
{:java: .ph data-hd-programlang='java'}
{:python: .ph data-hd-programlang='python'}
{:swift: .ph data-hd-programlang='swift'}

# Adding content with Data Crawler

The Data Crawler lets you automate the upload of content to the {{site.data.keyword.discoveryshort}} Service.
{: shortdesc}

## Crawling data with the Data Crawler

The Data Crawler is a command-line tool that helps you take documents from the repositories where they reside (for example, file shares, databases, and Microsoft SharePoint®) and push them to the cloud, to be used by the {{site.data.keyword.discoveryshort}} Service.

## When to use the Data Crawler

Use the Data Crawler if you want a managed upload of a significant number of files from a remote system, or if you want to extract content from a supported repository (such as a DB2 database).

The Data Crawler is not intended to be a solution for uploading files from your local drive. To upload files from a local drive, use the tooling or direct API calls.
{: tip}
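
If you only need to push a few local files, a direct API call is enough. The following is a minimal sketch (not part of the Data Crawler or the official tooling) that uploads one file to a collection over the {{site.data.keyword.discoveryshort}} REST API; the endpoint, credentials, environment ID, collection ID, and version date shown are placeholders that you must replace with values from your own service instance.

```python
# Minimal sketch: upload one local file with a direct Discovery REST API call.
# All values below are placeholders; substitute your own credentials and IDs.
import requests

DISCOVERY_URL = "https://gateway.watsonplatform.net/discovery/api"  # assumed endpoint
USERNAME = "{username}"            # service credentials (placeholders)
PASSWORD = "{password}"
ENVIRONMENT_ID = "{environment_id}"
COLLECTION_ID = "{collection_id}"
VERSION = "2017-08-01"             # assumed API version date

url = (f"{DISCOVERY_URL}/v1/environments/{ENVIRONMENT_ID}"
       f"/collections/{COLLECTION_ID}/documents")

with open("sample-document.pdf", "rb") as doc:
    response = requests.post(
        url,
        auth=(USERNAME, PASSWORD),
        params={"version": VERSION},
        files={"file": doc},       # multipart form part named "file"
    )

print(response.status_code, response.json())
```
{: codeblock}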

## Using the Data Crawler

  1. Configure the {{site.data.keyword.discoveryshort}} service.
  2. Download and install the Data Crawler on a supported Linux system that has access to the content that you want to crawl.
  3. Connect the Data Crawler to your content.
  4. Configure the Data Crawler to connect to the {{site.data.keyword.discoveryshort}} Service.
  5. Crawl your content. (A sketch for verifying the crawl results follows this list.)
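
After the crawl completes, you can confirm that documents reached your collection by requesting the collection details from the {{site.data.keyword.discoveryshort}} Service. The sketch below is illustrative only and assumes the same placeholder endpoint, credentials, and IDs as the earlier upload example.

```python
# Minimal sketch: check how many documents a crawl delivered to a collection.
# Placeholder credentials and IDs; substitute values for your service instance.
import requests

DISCOVERY_URL = "https://gateway.watsonplatform.net/discovery/api"  # assumed endpoint
USERNAME = "{username}"
PASSWORD = "{password}"
ENVIRONMENT_ID = "{environment_id}"
COLLECTION_ID = "{collection_id}"
VERSION = "2017-08-01"  # assumed API version date

url = (f"{DISCOVERY_URL}/v1/environments/{ENVIRONMENT_ID}"
       f"/collections/{COLLECTION_ID}")

response = requests.get(url, auth=(USERNAME, PASSWORD), params={"version": VERSION})
response.raise_for_status()

# The collection details include per-status document counts.
counts = response.json().get("document_counts", {})
print("available:", counts.get("available", 0))
print("processing:", counts.get("processing", 0))
print("failed:", counts.get("failed", 0))
```
{: codeblock}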

You can get started quickly with the Data Crawler by following the example in Getting started with the Data Crawler.