[ZEPPELIN-773] Adding support for spark interpreter to work with yarn… #1

Open: mfelgamal wants to merge 1 commit into base: master

mfelgamal commented May 18, 2016

What is this PR for?

Adds support for running Zeppelin in yarn-cluster mode through Livy:

  • Creates a new interpreter (livy-spark) with spark, pyspark, and sparkr.
  • Extends the interpreter so that Spark configuration can be changed from the Zeppelin web UI.

What type of PR is it?

Feature

Todos

  • [Test case ] - Task
  • [Documentation ] - Task

What is the Jira issue?

  • [ZEPPELIN-773]

How should this be tested?

  • Install Livy from https://github.com/cloudera/hue/tree/master/apps/spark/java
  • Start the Livy server
  • Start Zeppelin, go to the Interpreter page, and create a livy interpreter with the supported configuration
  • Run the Livy interpreter in any of its three modes (%livy.spark for Spark, %livy.pyspark for Python, %livy.sparkr for R)

_If you run on a cluster, the cluster must support YARN, and livy-server must run in yarn mode._
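
For orientation, the three paragraph prefixes above line up with session kinds that Livy's REST API accepts (`spark`, `pyspark`, `sparkr`). The sketch below is not code from this PR. It only builds the request URL and JSON body for creating a session and submitting a statement, and it never contacts a server. The helper names, the prefix-to-kind table, and the base URL `http://localhost:8998` are assumptions made for illustration.

```python
import json

# Assumed mapping from Zeppelin paragraph prefix to Livy session kind.
LIVY_KINDS = {
    "%livy.spark": "spark",
    "%livy.pyspark": "pyspark",
    "%livy.sparkr": "sparkr",
}

# Assumed address of a locally running Livy server.
DEFAULT_LIVY_URL = "http://localhost:8998"


def session_request(prefix, base_url=DEFAULT_LIVY_URL):
    """Return (url, json_body) for a POST that creates a Livy session."""
    kind = LIVY_KINDS[prefix]  # raises KeyError for an unknown prefix
    return f"{base_url}/sessions", json.dumps({"kind": kind})


def statement_request(session_id, code, base_url=DEFAULT_LIVY_URL):
    """Return (url, json_body) for a POST that runs code in a session."""
    url = f"{base_url}/sessions/{session_id}/statements"
    return url, json.dumps({"code": code})


if __name__ == "__main__":
    print(session_request("%livy.pyspark"))
    print(statement_request(0, "1 + 1"))
```

Sending these bodies with any HTTP client against a running Livy server is a quick way to confirm the server is reachable before configuring the interpreter in Zeppelin.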

Screenshots (if appropriate)


Questions:

  • Does the licenses files need update? no
  • Is there breaking changes for older versions? no
  • Does this needs documentation? yes

ghost pushed a commit that referenced this pull request May 29, 2016
### What is this PR for?
It is an extension of #6 apache#714
It allows users to export the data in a paragraph to a TSV/CSV file.

### What type of PR is it?
Feature

### Todos
* [x] - Improves the current Table features like Search, Fixed Headers, Sorting

### Is there a relevant Jira issue?
[ZEPPELIN-672](https://issues.apache.org/jira/browse/ZEPPELIN-672)

### How should this be tested?
1. Create a paragraph with data in %table view
2. Click on TSV/CSV button to export CSV/TSV file

### Screenshots (if appropriate)
![image](https://cloud.githubusercontent.com/assets/1140475/13525760/4913bd8e-e229-11e5-9cd5-480c8b583d5b.png)

### Questions:
* Does the licenses files need update?
 Need to have MIT license for Datatables.
* Is there breaking changes for older versions?
No
* Does this needs documentation?
No

Author: ankur_jain <[email protected]>
Author: Ankur Jain <[email protected]>
Author: Damien CORNEAU <[email protected]>

Closes apache#761 from ankurmitujjain/master and squashes the following commits:

4ddcc0f [Ankur Jain] Updated testcases for @corneadoug pull request
e6470aa [Ankur Jain] Merge pull request #1 from corneadoug/clean/dataframe
dd8901b [Damien CORNEAU] last fixes
5aca081 [Damien CORNEAU] Last Modifications
9c4412f [Damien CORNEAU] Remove buttons
2561630 [Ankur Jain] Updated for indent
c9b675d [Ankur Jain] Updated for indent
38ee3c3 [Ankur Jain] Updated for indent
b23cab4 [Ankur Jain] Updated for indent
09c87a0 [Ankur Jain] Updated for indent
e4b3abb [ankur_jain] Removed R.md accidentally added
d3aadc6 [ankur_jain] Updated testcase
210b7a6 [ankur_jain] Updates latest code of controller
80bd58c [ankur_jain] Merge branch 'upstream/master'
0ee76b1 [ankur_jain] Update 3 files
0c5f623 [ankur_jain] Revert "Merge branch 'upstream/master'"
adb66a3 [ankur_jain] Merge branch 'upstream/master'
6363e97 [ankur_jain] Merge branch 'master' of https://github.com/ankurmitujjain/incubator-zeppelin
0c94cab [ankur_jain] Merge branch 'master' of https://github.com/ankurmitujjain/incubator-zeppelin
d23202e [ankur_jain] Merge remote-tracking branch 'refs/remotes/origin/master' into apache/master
415c1f5 [ankur_jain] Merge branch 'apache/master'
7901f5e [ankur_jain] Merge branch 'refs/heads/master' into apache/master
6e6587b [ankur_jain] Updating codebase as per @prabhjyotsingh comments
aea8446 [ankur_jain] Merge branch 'apache/master'
df1620c [ankur_jain] Updated testcase as resultant paragraph have text of buttons and search box.
00b36e5 [ankur_jain] Reverted line 117 and 2122 as per previous code
9351a0d [ankur_jain] Committed for Datatables #6
mfelgamal pushed a commit that referenced this pull request Jul 19, 2016
### What is this PR for?
Currently, the list of available interpreters is not shown in the `Creating New Interpreter` section. It seems this bug was introduced after apache#835 was merged, so I temporarily deactivated [3 SerializedName code lines](apache@6d7f1bc).

### What type of PR is it?
Bug Fix

### Todos
* [x] - Fix interpreter listing bug when creating new interpreter

### What is the Jira issue?
[ZEPPELIN-931](https://issues.apache.org/jira/browse/ZEPPELIN-931)

### How should this be tested?
1. Build latest master branch and browse Zeppelin home
2. Create a new interpreter -> the list of available interpreters is missing at this step, as in the "Before" screenshot below
3. Apply this patch
4. Build again and browse  -> You can see the available interpreter list as normal

### Screenshots (if appropriate)
 - **Before**
<img width="1273" alt="screen shot 2016-06-01 at 12 36 42 pm" src="https://cloud.githubusercontent.com/assets/10060731/15723066/9082435e-27f5-11e6-9783-df44638dbbec.png">

 - **After**
<img width="1273" alt="screen shot 2016-06-01 at 12 33 06 pm" src="https://cloud.githubusercontent.com/assets/10060731/15723067/92bcc8ce-27f5-11e6-82f5-6c0db7b4342c.png">

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: AhyoungRyu <[email protected]>
Author: Jongyoul Lee <[email protected]>
Author: Ah young <[email protected]>

Closes apache#945 from AhyoungRyu/ZEPPELIN-931 and squashes the following commits:

711eb54 [Ah young] Merge pull request #2 from jongyoul/ZEPPELIN-931
6121f9b [Jongyoul Lee] - Fixed documentation
6e7dac9 [Ah young] Merge pull request #1 from jongyoul/ZEPPELIN-931
fed1b40 [Jongyoul Lee] - Fixed fieldName in interpreter-setting.json
6d7f1bc [AhyoungRyu] ZEPPELIN-931: fix interpreter listing bug
mfelgamal pushed a commit that referenced this pull request Jul 19, 2016
…Remote Interpreter

### What is this PR for?
Currently, the Zeppelin server starts the interpreter on localhost with a random port. The purpose of this pull request is to allow the Zeppelin server to connect to a remotely executing Zeppelin interpreter that the user may have started as part of their own service. This feature will also help when the cluster manager is implemented (https://cwiki.apache.org/confluence/display/ZEPPELIN/Cluster+Manager+Proposal).
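
The choice described above can be sketched as follows. This is an illustration only, not the Java code from the patch: the function names and the returned `must_launch` flag are hypothetical, and binding to port 0 is used here as one common way to obtain a random free port.

```python
import socket


def pick_free_port():
    """Ask the OS for an unused local port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def interpreter_endpoint(existing_host=None, existing_port=None):
    """Return (host, port, must_launch) for the interpreter process.

    If the user supplied the host and port of an interpreter that is
    already running, connect to it and do not launch anything. Otherwise
    fall back to the default behaviour: localhost on a random port, with
    the server responsible for launching the process.
    """
    if existing_host is not None and existing_port is not None:
        return existing_host, existing_port, False
    return "localhost", pick_free_port(), True


if __name__ == "__main__":
    print(interpreter_endpoint("10.0.0.5", 30000))  # connect to existing
    print(interpreter_endpoint())                   # launch locally
```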

### What type of PR is it?
Improvement

### Todos
* [ ] - Add documentation

### What is the Jira issue?
* [ZEPPELIN-940] https://issues.apache.org/jira/browse/ZEPPELIN-940

### How should this be tested?
Added a JUnit test in RemoteInterpreterProcessTest; it passes.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes for the new properties

Author: Sachin <[email protected]>
Author: SachinJanani <[email protected]>

Closes apache#955 from SachinJanani/master and squashes the following commits:

f279767 [Sachin] Changed the Markdown style for code block in document
f57eb78 [Sachin] Incorporated review comments related to documentation
067a06e [Sachin] Add documentation for connecting to existing remote interpreter
84d2347 [Sachin] Added checkbox for Connecting to existing process and renamed the variables
c7fdc66 [Sachin] Merge remote-tracking branch 'upstream/master'
9762134 [Sachin] Merge branch 'master' of https://github.com/SachinJanani/incubator-zeppelin
4d51cd9 [Sachin] Add UI component for the accepting Host and Port when executing option is selected
2e30e3d [Sachin] [ZEPPELIN-940] Allow zeppelin server to connect to already executing Remote Interpreter
355c1f2 [Sachin] Add UI component for the accepting Host and Port when executing option is selected
7af8112 [Sachin] Merge branch 'master' of https://github.com/SachinJanani/incubator-zeppelin
fbe2346 [Sachin] [ZEPPELIN-940] Allow zeppelin server to connect to already executing Remote Interpreter
53c1eea [SachinJanani] Merge pull request #1 from apache/master
mfelgamal pushed a commit that referenced this pull request Jul 19, 2016
### What is this PR for?
The Zeppelin UI changed in several ways after apache#860, apache#1006, apache#1013, and apache#1081, so this PR updates the screenshots in the documentation accordingly.

### What type of PR is it?
Documentation

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: AhyoungRyu <[email protected]>
Author: Mina Lee <[email protected]>
Author: Mina Lee <[email protected]>

Closes apache#1089 from minahlee/doc/ZEPPELIN-1002 and squashes the following commits:

b237caf [Mina Lee] Merge pull request #1 from AhyoungRyu/doc/ZEPPELIN-1002/again
b18544a [AhyoungRyu] Update screenshot images in interpreters.md
add97fb [AhyoungRyu] Update screenshot images in notebookashomepage.md
cdaeb30 [AhyoungRyu] Update screenshot images in index.md
b21444a [AhyoungRyu] Update screenshot images in notebook_authorization.md
b23f7e4 [AhyoungRyu] Update screenshot images in dependencymanagement.md
e7a85f3 [AhyoungRyu] Update screenshot images in lens.md
cecd161 [AhyoungRyu] Update screenshot images in ignite.md
9f8cb71 [AhyoungRyu] Update screenshot images in elasticsearch.md
0c9a688 [AhyoungRyu] Hide dynamicinterpreterloading.md temporarily
a17f31f [Mina Lee] Update doc image in Explore Zeppelin UI page
mfelgamal pushed a commit that referenced this pull request Aug 16, 2016
### What is this PR for?
Google BigQuery is a popular no-ops data warehouse. This commit enables Apache Zeppelin users to perform BI and analytics on their datasets in BigQuery.

### What type of PR is it?
Feature

### Todos
* Make bigquery interpreter appear in the interpreters section in the UI
* Build SQL completion
* Authorization of non-gcp

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1153

### How should this be tested?
1. Copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml
2. Add org.apache.zeppelin.bigquery.bigQueryInterpreter to the property zeppelin.interpreters in zeppelin-site.xml
3. Start Zeppelin
4. Add the BigQuery interpreter with your project ID
5. Create a new note with %bsql.sql and run your SQL against public datasets in BigQuery

### Screenshots (if appropriate)
![screenshot from 2016-07-12 14 27 30](https://cloud.githubusercontent.com/assets/4242273/16785302/31b104e2-4842-11e6-87c0-b79763dd85c0.png)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Babu Prasad Elumalai <[email protected]>
Author: babupe <[email protected]>
Author: Alexander Bezzubov <[email protected]>

Closes apache#1170 from babupe/babupe-bigquery and squashes the following commits:

ffed801 [Babu Prasad Elumalai] pushing BQ Exception to logs and Interpreter error output
d3c2316 [babupe] Merge pull request #2 from bzz/babupe-add-auth-docs
64525b8 [Alexander Bezzubov] Fix typos in docs
03a777f [Alexander Bezzubov] add docs for BigQuery auth outside of GCE
fcab6b7 [babupe] Merge pull request #1 from bzz/babupe-final
6a95333 [Alexander Bezzubov] Rename Apach2.0 license for google's code to adhere naming conventions
7d4f40b [Alexander Bezzubov] Add exidentaly removed licenses due to merge conflict
3be1912 [Babu Prasad Elumalai] New changes
41e076e [Babu Prasad Elumalai] Fixed formatting with readme file
97874a4 [Babu Prasad Elumalai] Pushing cropped screenshots
64affbb [babupe] Added cropped interpreter screenshot
4a1d29c [Babu Prasad Elumalai] Removed unnecessary dependencies in pom.xml
e520b7b [Babu Prasad Elumalai] Exclude constants.json file for rat plugin since its static config file
69cb724 [Babu Prasad Elumalai] Fixed license header and added manual unit test documentation
bbf26cc [Babu Prasad Elumalai] Added path and specific wording
4a3153f [Babu Prasad Elumalai] removed bad package from import
d0c8e01 [Babu Prasad Elumalai] Added technical description to bigquery.md
b6d181c [Babu Prasad Elumalai] Trying to add screenshot in README
569757f [Babu Prasad Elumalai] Incorporated feedback
764385c [Babu Prasad Elumalai] Interpreter modification, License, doc changes
d85abd2 [Babu Prasad Elumalai] Modified code and license
17f6d89 [Babu Prasad Elumalai] ZEPPELIN-1153 comments committed
8fa647b [Babu Prasad Elumalai] BigQuery Interpreter for Apazhe Zeppelin
22e3487 [babupe] Update LICENSE
e88b017 [babupe] Created a new license file
d90e10f [babupe] Removed BigQuery from notice
aa52553 [Babu Prasad Elumalai] Merge branch 'master' of https://github.com/apache/zeppelin
ae096d2 [Babu Prasad Elumalai] License changes
20962d2 [Babu Prasad Elumalai] Pushing license changes
3d5f8e7 [Babu Prasad Elumalai] Modified license header
5a2e674 [Babu Prasad Elumalai] Added license info for Jackson library and added BQ API source
4db74c1 [Babu Prasad Elumalai] Adding license stuff
31c373f [Babu Prasad Elumalai] Fixed formatting with readme file
287744c [Babu Prasad Elumalai] Merge branch 'babupe-bigquery' of https://github.com/babupe/zeppelin into babupe-bigquery
f318b20 [Babu Prasad Elumalai] Pushing cropped screenshots
17fd4e8 [babupe] Added cropped interpreter screenshot
f872aa0 [Babu Prasad Elumalai] Removed unnecessary dependencies in pom.xml
5983e36 [Babu Prasad Elumalai] Exclude constants.json file for rat plugin since its static config file
11e88dc [Babu Prasad Elumalai] Replaced license header with formatting
4b82abd [Babu Prasad Elumalai] Fixed license header and added manual unit test documentation
87f5efe [Babu Prasad Elumalai] Added path and specific wording
6132d78 [Babu Prasad Elumalai] Fixing License and skipping failing tests
2254a49 [Babu Prasad Elumalai] removed bad package from import
73e3f6d [Babu Prasad Elumalai] Added technical description to bigquery.md
089820b [Babu Prasad Elumalai] Trying to add screenshot in README
a00b48e [Babu Prasad Elumalai] Incorporated feedback
17846f1 [Babu Prasad Elumalai] Interpreter modification, License, doc changes
50c41fc [Babu Prasad Elumalai] Modified code and license
75d8ee6 [Babu Prasad Elumalai] ZEPPELIN-1153 comments committed
2a2bedc [Babu Prasad Elumalai] BigQuery Interpreter for Apazhe Zeppelin
mfelgamal pushed a commit that referenced this pull request Sep 7, 2016
### What is this PR for?
There are 2 issues and their proposed fixes:
1. On a paragraph run, Zeppelin broadcasts every new line of output as a separate event. With thousands of lines of output, the browser(s) hang because of the volume of these append-output events.
2. In the above case, besides the browser hang, the result data is repeated twice (once from the append-output calls and once from the finish-event calls).

The proposed solution for issue 1 is:
- Buffer the append-output event into a queue instead of sending the event immediately.
- In a separate thread, read from the queue periodically and send the append-output event.

The solution for issue 2 is:
- Do not append output to the result if the paragraph is not running.
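
The two fixes above can be sketched together as below. The actual patch is Java (the commit list mentions `AppendOutputRunner` and `ScheduledExecutorService`); this Python version is only an illustration of the idea. The class name `AppendOutputBuffer`, the `send` and `is_running` callbacks, the 0.1 second interval, and placing the "not running" check at enqueue time are all assumptions.

```python
import queue
import threading


class AppendOutputBuffer:
    """Queues append-output events and sends them in periodic batches."""

    def __init__(self, send, is_running):
        self._queue = queue.Queue()
        self._send = send              # callback: (paragraph_id, text) -> None
        self._is_running = is_running  # callback: paragraph_id -> bool

    def append(self, paragraph_id, line):
        # Issue 2: drop output for paragraphs that are not running, so the
        # final result is not duplicated by late append events.
        if not self._is_running(paragraph_id):
            return
        # Issue 1: enqueue instead of broadcasting immediately.
        self._queue.put((paragraph_id, line))

    def flush(self):
        # Drain everything queued so far, grouped per paragraph, and send
        # one combined event per paragraph.
        batches = {}
        while True:
            try:
                paragraph_id, line = self._queue.get_nowait()
            except queue.Empty:
                break
            batches.setdefault(paragraph_id, []).append(line)
        for paragraph_id, lines in batches.items():
            self._send(paragraph_id, "".join(lines))

    def start(self, interval_seconds=0.1):
        # Flush periodically on a background thread. Set the returned
        # event to stop; a final flush runs before the thread exits.
        stop = threading.Event()

        def loop():
            while not stop.wait(interval_seconds):
                self.flush()
            self.flush()

        threading.Thread(target=loop, daemon=True).start()
        return stop
```

With this shape, ten thousand `append` calls between two flushes reach the browser as a single event per paragraph rather than ten thousand separate ones.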

### What type of PR is it?
Improvement + Bug Fix

### Todos

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1292

### How should this be tested?
One way to test is to run a simple paragraph that produces a large result, e.g.:
```
%sh
for i in {1..10000}
do
echo $i
done
```
PS: Clear the browser cache between runs with and without this patch, since there are JavaScript changes as well.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update?
No
* Is there breaking changes for older versions?
No
* Does this needs documentation?
The design may need documentation. Otherwise, I have added code comments explaining the behaviour.

Author: Beria <[email protected]>

Closes apache#1283 from beriaanirudh/ZEPPELIN-1292 and squashes the following commits:

17f0524 [Beria] Use diamond operator
7852368 [Beria] nit
4b68c86 [Beria] fix checkstyle
d168614 [Beria] Remove un-necessary class CheckAppendOutputRunner
2eae38e [Beria] Make AppendOutputRunner non-static
72c316d [Beria] Scheduler service to replace while loop in AppendOutputRunner
599281f [Beria] fix unit tests that run after
dd24816 [Beria] Add license in test file
3984ef8 [Beria] fix tests when ran with other tests
1c893c0 [Beria] Add licensing
1bdd669 [Beria] fix javadoc comment
27790e4 [Beria] Avoid infinite loop in tests
5057bb3 [Beria] Incorporate feedback 1. Synchronize on AppendOutputRunner creation 2. Use ScheduledExecutorService instead of while loop 3. Remove Thread.sleep() from tests
82e9c4a [Beria] Fix comment
7020f0c [Beria] Buffer append output results + fix extra incorrect results
mfelgamal pushed a commit that referenced this pull request Sep 23, 2016
### What is this PR for?
Several changes to the Spark interpreter documentation.

* %spark, %sql, and %pyspark only work when spark is the default interpreter group of the note, so I updated the doc to use the full interpreter name.
* Add SparkSession for 2.0
* Also add comments inline with other changes to explain the reason.

### What type of PR is it?
[Documentation]

### Todos
* [ ] - Task

### What is the Jira issue?
* No jira created.

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: AhyoungRyu <[email protected]>
Author: Jeff Zhang <[email protected]>
Author: Jeff Zhang <[email protected]>

Closes apache#1398 from zjffdu/spark_doc_fix and squashes the following commits:

ac01f2b [Jeff Zhang] Merge pull request #1 from AhyoungRyu/spark_doc_fix/ahyoung
5fa523f [AhyoungRyu] Fix typos
3c0f678 [AhyoungRyu] Add 'R' and refine a sentence
2336900 [AhyoungRyu] Improve spark.md
40d4b11 [Jeff Zhang] [MINOR] Doc fix for spark interpreter