Counting Namenode Servers method #44

Open
clementLe opened this issue Aug 24, 2018 · 3 comments
Comments

@clementLe

The method used to count NameNode (NN) servers, which determines whether HA has to be enabled, does not work when there are several NameNodes and all of them are in the same group.
In the current version, NameNodes are counted as the number of host groups that contain the NAMENODE service:

# ansible-hortonworks/playbooks/set_variables.yml
set_fact:
  namenode_groups: "{{ namenode_groups }} + [ '{{ item.host_group }}' ]"
when: groups[item.host_group] is defined and groups[item.host_group]|length > 0 and 'NAMENODE' in item.services
with_items: "{{ blueprint_dynamic }}"
no_log: True

#ansible-hortonworks/playbooks/roles/ambari-blueprint/templates/blueprint_dynamic.j2  
# Check if we have multiple NN servers:
namenode_groups|length > 1

The problem is that we can have 2 NameNodes in the same group (if both servers host exactly the same services).
Example:

  - role: hdp-namenode-1
    clients: "{{ hdp_namenode_1_client }}" # not relevant here
    services:
      - NAMENODE
      - ZKFC
      - JOURNALNODE
      - RESOURCEMANAGER
      - ZOOKEEPER_SERVER
      - METRICS_MONITOR
   […]
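To make the undercount concrete, here is a minimal Python sketch of the two counting strategies. It is a plain-Python model of the logic, not the playbook itself: the `groups` inventory, the server names and the trimmed `blueprint_dynamic` entries are hypothetical sample data, and the `host_group` key is assumed from the `set_variables.yml` excerpt.

```python
# Hypothetical inventory: both NameNodes live in the same host group.
groups = {
    "hdp-namenode-1": ["server1", "server2"],
    "hdp-worker": ["server3", "server4"],
}

# Trimmed, hypothetical blueprint_dynamic entries.
blueprint_dynamic = [
    {"host_group": "hdp-namenode-1", "services": ["NAMENODE", "ZKFC", "JOURNALNODE"]},
    {"host_group": "hdp-worker", "services": ["DATANODE", "NODEMANAGER"]},
]


def groups_with_service(service):
    """Host groups that exist in the inventory, are non-empty, and run `service`."""
    return [
        entry["host_group"]
        for entry in blueprint_dynamic
        if groups.get(entry["host_group"]) and service in entry["services"]
    ]


namenode_groups = groups_with_service("NAMENODE")

# Current approach: one per group, so two NameNodes in one group count as 1.
count_by_group = len(namenode_groups)

# Proposed approach: sum the hosts of every matching group.
count_by_host = sum(len(groups[g]) for g in namenode_groups)

print(count_by_group, count_by_host)  # prints "1 2"
```

With this sample data the group-based count stays at 1, so an HA check of the form `> 1` would not trigger even though two NameNodes exist.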

To handle this case, we need to change how the NameNodes are counted:

    - name: Initialize the control variables
      set_fact:
        namenode_groups: []
        namenode_count: 0
   [...]

    - name: Populate the namenode groups list
      set_fact:
        namenode_groups: "{{ namenode_groups }} + [ '{{ item.host_group }}' ]"
        # Add the number of hosts in this group, not just 1 per group
        namenode_count: "{{ namenode_count | int + groups[item.host_group]|length }}"
      when: groups[item.host_group] is defined and groups[item.host_group]|length > 0 and 'NAMENODE' in item.services
      with_items: "{{ blueprint_dynamic }}"
      no_log: True

And, accordingly, the checks for whether we have at least 2 NameNodes:

"xasecure.audit.destination.hdfs.dir" : "hdfs://{% if namenode_count | int > 1 %}{{ hdfs_ha_name }}{% else %}{{ hostvars[groups[namenode_groups.0]|sort|list|first]['ansible_fqdn'] }}:8020{% endif %}/ranger/audit",

[…]

{% if namenode_count | int > 1 -%}
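The branch in the Ranger audit property can be modelled in a few lines of Python. This is only a sketch of the decision, assuming the count is derived from a list of NameNode hosts; the function name `ranger_audit_dir`, the `fqdn_of` mapping (standing in for `hostvars[...]['ansible_fqdn']`) and the nameservice `mycluster` are hypothetical.

```python
def ranger_audit_dir(namenode_hosts, hdfs_ha_name, fqdn_of):
    """Return the HDFS audit directory URL for Ranger.

    namenode_hosts: inventory names of all NameNode hosts
    hdfs_ha_name:   HA nameservice name, used when there is more than one NameNode
    fqdn_of:        hypothetical mapping from inventory name to FQDN
    """
    if len(namenode_hosts) > 1:
        # HA: address the nameservice, no host or port.
        authority = hdfs_ha_name
    else:
        # Single NameNode: address the first (only) host on port 8020.
        authority = f"{fqdn_of[sorted(namenode_hosts)[0]]}:8020"
    return f"hdfs://{authority}/ranger/audit"


fqdn_of = {"server1": "server1.example.com", "server2": "server2.example.com"}

print(ranger_audit_dir(["server1", "server2"], "mycluster", fqdn_of))
# prints "hdfs://mycluster/ranger/audit"
print(ranger_audit_dir(["server1"], "mycluster", fqdn_of))
# prints "hdfs://server1.example.com:8020/ranger/audit"
```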

The same applies to other HA components such as Ranger KMS, RM, …

Do you agree with this approach?

@alexandruanghel
Contributor

Hi, sorry about my late reply!

You are absolutely correct, this specific configuration (using more than 1 node in the master groups) would not work, as I haven't accounted for it so far (mainly because it's very rare for 2 masters to hold the exact same services - but not impossible, as your use case demonstrates).

The fix is however twofold:

  1. There is one fix on the Ansible side, as you identified, although I'd probably build a namenode_hosts list and then use namenode_hosts|length in the if statements. Like in this kafka example: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/set_variables.yml#L183 and https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/roles/ambari-blueprint/templates/blueprint_dynamic.j2#L570

It's essentially the same end result, but namenode_hosts would have more information which might be useful in the future (and the length can always be used when needed).

  2. However, my main concern is not the Ansible fix, but Ambari itself, mainly how Ambari uses the HOSTGROUP variable: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/roles/ambari-blueprint/templates/blueprint_dynamic.j2#L445 .
    I would need to do extensive testing to make sure Ambari knows how to use such a blueprint, as this isn't limited to the NameNode but applies to all services that use the HOSTGROUP variables.

For example, Zookeeper: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/roles/ambari-blueprint/templates/blueprint_dynamic.j2#L488 (what happens if you have 3 zookeepers but 2 groups: 1 group with 1 unique node and 1 group with 2 identical nodes?).

All of this might work just fine, but it needs testing...

I won't be able to do this next week, so it will have to be after that. You're also free to test whenever you can and see if you get any issues from Ambari.

Thanks for reporting this!

@clementLe
Author

Hi Alexandru,

I ran some tests and you are totally right about how Ambari handles HOSTGROUP.

I first tried to create namenode_hostgroups like this (same for ZK):

    - name: Populate the namenode lists
      set_fact:
        namenode_groups: "{{ namenode_groups }} + [ '{{ item.role }}' ]"
        namenode_hosts: "{{ namenode_hosts }} + {{ groups[item.role] }}"
        namenode_hostgroups: "{{ namenode_hostgroups }} + [ '{{ item.role }}' ]*{{ groups[item.role]|length }}"
      when: groups[item.role] is defined and groups[item.role]|length > 0 and 'NAMENODE' in item.services
      with_items: "{{ blueprint_dynamic }}"

With this inventory:

[hdp-namenode] #NN;RM;ZK
server1
server2
[hdp-master] #ZK
server3

At this step:
namenode_hostgroups: [ 'hdp-namenode', 'hdp-namenode' ]
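The lists this task builds for that inventory can be reproduced with a short Python model. It is a sketch rather than the playbook: the `collect` helper is hypothetical, and the service lists are inferred from the inventory comments (`#NN;RM;ZK` and `#ZK`).

```python
# Inventory from the example: two hosts in hdp-namenode, one in hdp-master.
groups = {"hdp-namenode": ["server1", "server2"], "hdp-master": ["server3"]}

# Service lists inferred from the inventory comments (assumption).
blueprint_dynamic = [
    {"role": "hdp-namenode", "services": ["NAMENODE", "RESOURCEMANAGER", "ZOOKEEPER_SERVER"]},
    {"role": "hdp-master", "services": ["ZOOKEEPER_SERVER"]},
]


def collect(service):
    """Return (group names, hosts, one group name per host) for `service`."""
    names, hosts, hostgroups = [], [], []
    for entry in blueprint_dynamic:
        role = entry["role"]
        members = groups.get(role, [])
        if members and service in entry["services"]:
            names.append(role)
            hosts.extend(members)
            # Mirrors "[ role ] * groups[role]|length" in the task.
            hostgroups.extend([role] * len(members))
    return names, hosts, hostgroups


namenode_groups, namenode_hosts, namenode_hostgroups = collect("NAMENODE")
print(namenode_groups)      # prints "['hdp-namenode']"
print(namenode_hosts)       # prints "['server1', 'server2']"
print(namenode_hostgroups)  # prints "['hdp-namenode', 'hdp-namenode']"
```

The host list distinguishes the two servers, while the per-host group list repeats the same group name, which is why both `nn1` and `nn2` end up pointing at the same `%HOSTGROUP::...%` placeholder.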

So:

"dfs.namenode.http-address.{{ hdfs_ha_name }}.nn1" : "%HOSTGROUP::{{ namenode_hostgroups[0] }}%:50070",
"dfs.namenode.http-address.{{ hdfs_ha_name }}.nn2" : "%HOSTGROUP::{{ namenode_hostgroups[1] }}%:50070",

will give:

"dfs.namenode.http-address.{{ hdfs_ha_name }}.nn1" : "%HOSTGROUP::hdp-namenode%:50070",
"dfs.namenode.http-address.{{ hdfs_ha_name }}.nn2" : "%HOSTGROUP::hdp-namenode%:50070",

The blueprint installation creates an hdfs-site.xml with nn1=nn2=server1.example.com.
I had to put the FQDNs of the NameNodes directly into nn1 and nn2 to get a functional configuration.

"dfs.namenode.http-address.{{ hdfs_ha_name }}.nn1" : "server1.example.com:50070",
"dfs.namenode.http-address.{{ hdfs_ha_name }}.nn2" : "server2.example.com:50070",
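Rather than hard-coding the FQDNs, the same properties could be generated from a host list. The following Python sketch shows the idea under stated assumptions: the function name `namenode_http_addresses`, the `fqdn_of` mapping (standing in for `hostvars[...]['ansible_fqdn']`) and the nameservice `mycluster` are hypothetical, and sorting the hosts is just one way to get a stable nn1/nn2 assignment.

```python
def namenode_http_addresses(namenode_hosts, hdfs_ha_name, fqdn_of, port=50070):
    """Build one dfs.namenode.http-address property per NameNode host."""
    props = {}
    # Sort so that nn1, nn2, ... are assigned deterministically.
    for index, host in enumerate(sorted(namenode_hosts), start=1):
        key = f"dfs.namenode.http-address.{hdfs_ha_name}.nn{index}"
        props[key] = f"{fqdn_of[host]}:{port}"
    return props


fqdn_of = {"server1": "server1.example.com", "server2": "server2.example.com"}
props = namenode_http_addresses(["server2", "server1"], "mycluster", fqdn_of)

for key, value in props.items():
    print(f'"{key}" : "{value}",')
# prints:
# "dfs.namenode.http-address.mycluster.nn1" : "server1.example.com:50070",
# "dfs.namenode.http-address.mycluster.nn2" : "server2.example.com:50070",
```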

FYI, I didn't have this problem with the ZK configuration: even though zookeeper_hostgroups was [ 'hdp-namenode', 'hdp-namenode', 'hdp-master' ], the generated ZK quorum was server1,server2,server3.

I'm wondering whether it would be a good idea to use the FQDNs directly instead of Ambari's %HOSTGROUP::%.

We don't really need HOSTGROUP when we deploy a blueprint from a Jinja template and the Ansible inventory/group_vars.

@alexandruanghel
Contributor

Hi, sorry I didn't get the chance to come back on this.

What you say makes sense, and there are instances where Ambari doesn't resolve HOSTGROUP, like the audit solr zookeepers (https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/roles/ambari-blueprint/templates/blueprint_dynamic.j2#L110).

But I don't know all the implications of this... I've been using HOSTGROUP as this was the property of the exported blueprint.
In theory it should work fine if we manage this from Ansible, but for example, you won't be able to build a new cluster from an exported blueprint of a cluster built like that (because the exported blueprint will contain static hosts rather than dynamic HOSTGROUP).

Alternatively, we could just accept this limitation and live with a slightly more complex blueprint_dynamic variable...
