Enhance KubeVirtNodeDriver Compute Driver #1983
No more hangs when using ex_disk. The boot disk should be the first one in disks (and volumes) list (/dev/vda), otherwise the vm will not boot.
@cdfmlr Thanks for the contribution and good PR description. I will have a look as soon as I get a chance. In the meantime, would you mind documenting breaking changes (disks, ports) in
Sure, I will add it soon.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## trunk #1983 +/- ##
==========================================
- Coverage 83.26% 83.25% -0.00%
==========================================
Files 353 353
Lines 81305 81445 +140
Branches 8565 8606 +41
==========================================
+ Hits 67692 67807 +115
+ Misses 10823 10814 -9
- Partials 2790 2824 +34
# check if new node is present
# But why not just use the resp from the POST request?
# Or self.get_node()?
# I don't think a for loop over list_nodes is necessary.
Yeah, I agree, using either list_nodes() or get_node(), which calls list_nodes() underneath, is not really great or efficient...
libcloud/compute/drivers/kubevirt.py
Outdated
@@ -391,63 +772,162 @@ def create_node(
)
raise KeyError(msg)

claim_name = disk["volume_spec"]["claim_name"]

if claim_name not in self.ex_list_persistent_volume_claims(namespace=namespace):
The method would be a bit more readable if it was refactored into multiple smaller functions (e.g. one for creating a volume, etc.).
libcloud/compute/drivers/kubevirt.py
Outdated
    "size" not in disk["volume_spec"]
    or "storage_class_name" not in disk["volume_spec"]
):
    msg = (
Do we have test cases which cover this scenario + all other regular + edge cases?
libcloud/compute/drivers/kubevirt.py
Outdated
elif isinstance(auth, NodeAuthPassword):
    password = auth.password
    cloud_init_config = (
        """#cloud-config\n"""
It would be a bit more readable if we stored cloud init configs in template files and then load those + render them here.
@@ -1231,3 +1719,151 @@ def ex_delete_service(self, namespace, service_name):
        except Exception:
            raise
        return result.status in VALID_RESPONSE_CODES


def _deep_merge_dict(source: dict, destination: dict) -> dict:
It would be good to add some test cases for this function.
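As a rough illustration of what such tests could look like, here is a minimal sketch. The `deep_merge_dict` below is a hypothetical stand-in written for this example, not the PR's `_deep_merge_dict`; it assumes that values from `source` win on conflict and that `destination` is mutated and returned, which may differ from the real helper.

```python
import copy


def deep_merge_dict(source: dict, destination: dict) -> dict:
    """Hypothetical stand-in: recursively merge ``source`` into ``destination``.

    Assumption: on conflict the value from ``source`` wins. The helper in the
    PR may resolve conflicts differently.
    """
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(destination.get(key), dict):
            # Both sides hold a dict: merge them key by key.
            deep_merge_dict(value, destination[key])
        else:
            # Copy so later changes to ``destination`` never leak into ``source``.
            destination[key] = copy.deepcopy(value)
    return destination


def test_nested_keys_are_merged():
    base = {"spec": {"running": False, "template": {"cpu": 1}}}
    override = {"spec": {"running": True, "template": {"memory": "1Gi"}}}
    merged = deep_merge_dict(override, base)
    assert merged is base  # destination is mutated and returned
    assert merged == {
        "spec": {"running": True, "template": {"cpu": 1, "memory": "1Gi"}}
    }


def test_source_is_not_aliased():
    source = {"a": {"b": [1]}}
    destination = deep_merge_dict(source, {})
    destination["a"]["b"].append(2)
    assert source == {"a": {"b": [1]}}
```

Both test functions run under pytest or can be called directly.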
    return destination


def _memory_in_MB(memory):  # type: (Union[str, int]) -> int
Same here, some tests for this function would be good.
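For a memory-parsing helper, the interesting cases are the unit suffixes and values above 1 GiB. The sketch below is only an illustration: `memory_to_mib` is a hypothetical function, not the PR's `_memory_in_MB`. It assumes Kubernetes-style quantities (a plain integer means bytes, `Ki`/`Mi`/`Gi`/`Ti` are powers of 1024, `k`/`M`/`G`/`T` are powers of 1000) and does not handle fractional values such as `1.5Gi`.

```python
import re

# Bytes per unit for the suffixes this sketch supports.
_BYTES_PER_UNIT = {
    "": 1,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
}


def memory_to_mib(memory):
    """Hypothetical helper: convert a quantity such as "2Gi" to whole MiB."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", str(memory).strip())
    if match is None or match.group(2) not in _BYTES_PER_UNIT:
        raise ValueError("unsupported memory quantity: {!r}".format(memory))
    number, suffix = match.groups()
    # Convert to bytes first, then round down to whole MiB.
    return int(number) * _BYTES_PER_UNIT[suffix] // 1024**2


def test_memory_to_mib():
    assert memory_to_mib("512Mi") == 512
    assert memory_to_mib("1Gi") == 1024
    assert memory_to_mib("2Gi") == 2048      # more than 1 GiB
    assert memory_to_mib(1048576) == 1       # plain integers are bytes
    assert memory_to_mib("1G") == 953        # decimal unit, rounded down
```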
@cdfmlr Thanks. I had a look again. It mostly looks good, but there are a couple of improvements which can be made:
@Kami Thank you for the review. Indeed, the Also, at the moment, I'm unable to add more test cases due to these priorities. However, there are some existing test cases for Despite the problems with code readability and test coverage, we've been using this code in a production environment for several months. It has been performing well and handling a considerable number of situations that aren't covered by the test cases.
@cdfmlr Thanks. Do you happen to have any ETA for when you will be able to add some more tests + refactor the code a bit? I'm asking since I'm planning to do a v3.9.0 release this week.
cherry-pick 3a4fe39e8e this feature is required by our internal e2e test.
@Kami sorry for my delay. I have added more tests covering the helper functions ( |
"""
# size -> cpu and memory limits / requests

ex_memory_limit = ex_memory_request = ex_cpu_limit = ex_cpu_request = None
It would be safer to do:
ex_memory_limit, ex_memory_request, ex_cpu_limit, ex_cpu_request = None, None, None, None
Right now the code above works fine since we are defaulting to None, but if this ever changed to default to a dictionary or a list, this could have unintended side effects.
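The side effect in question is aliasing: chained assignment binds every name to the same object, which only matters once that object is mutable. A small standalone sketch (the variable names are made up for the example):

```python
# Chained assignment: both names refer to the SAME list object.
limits = requests = []
limits.append("cpu")
shared_requests = list(requests)  # snapshot: requests sees the append too

# Tuple unpacking with separate literals: two independent list objects.
limits_2, requests_2 = [], []
limits_2.append("cpu")

print(limits is requests)      # True
print(limits_2 is requests_2)  # False
```

With `None` as the default both forms behave identically, since `None` is immutable.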
@staticmethod
def _create_node_size(
    vm, size=None, ex_cpu=None, ex_memory=None
):  # type: (dict, NodeSize, int, int) -> None
In the future when you are refactoring code, you can move type annotations from comment directly to the function signature (we don't support Python 2 anymore).
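A sketch of what that could look like for the signature above. `NodeSize` is replaced by an empty stand-in class so the snippet runs without Libcloud installed, the function is shown at module level under a hypothetical name, and the body is deliberately a placeholder.

```python
from typing import Optional


class NodeSize:
    """Stand-in for libcloud.compute.base.NodeSize, for illustration only."""


def create_node_size(
    vm: dict,
    size: Optional[NodeSize] = None,
    ex_cpu: Optional[int] = None,
    ex_memory: Optional[int] = None,
) -> None:
    """Same parameters as the diff, with annotations in the signature."""
    # Placeholder body: the real method fills in CPU and memory on ``vm``.
    return None
```

The `Optional[...]` wrappers are an assumption that follows from the `None` defaults; the original type comment lists the bare types.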
public_key = auth.pubkey
cloud_init_config = (
    """#cloud-config\n""" """ssh_authorized_keys:\n""" """ - {}\n"""
).format(public_key)
Could there be any surprises if auth.pubkey contains a leading or a trailing line break? Aka do we need to call .strip() on the value (ideally that would already happen in the base NodeAuth class, but I need to verify that is indeed the case).
password = auth.password
cloud_init_config = (
    """#cloud-config\n"""
    """password: {}\n"""
Same here - do we need to perform any additional cleaning and sanitization of the password value?
To be on the safe side and to prevent possible YAML injections, we should escape / quote those values (password, pubkey).
Since everything except the header looks like YAML, probably the safest way is to define a dictionary for the other fields, call yaml.dump() on it, and append the result to the static header value.
EDIT: I forgot we don't have a dependency on pyyaml yet, so we would need to add one, which is not that great.
One option which would probably work is to just use json.dumps() on the actual pubkey / password value to take care of the value escaping / quoting, but we would need to verify it works correctly.
In [4]: s = """#cloud-config\n""" """ssh_authorized_keys:\n""" """ - {}\n"""
In [5]: print(yaml.safe_load(s.format(json.dumps("key with \" quotes ' bar"))))
{'ssh_authorized_keys': ['key with " quotes \' bar']}
It looks like it should indeed do the trick, but more unit tests + testing it with the actual cloud init would be better.
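Put together, the idea from this thread could be sketched as follows. The two function names are hypothetical, the templates only contain the fields visible in the diff above, and the `.strip()` on the key is the suggestion from the earlier comment rather than confirmed behaviour of the merged code. As noted above, this still needs checking against real cloud-init.

```python
import json


def render_ssh_key_cloud_init(public_key: str) -> str:
    """Hypothetical helper: cloud-config with a JSON-quoted SSH key."""
    # json.dumps() emits a double-quoted string with quotes, backslashes
    # and control characters escaped.
    quoted = json.dumps(public_key.strip())
    return "#cloud-config\nssh_authorized_keys:\n  - {}\n".format(quoted)


def render_password_cloud_init(password: str) -> str:
    """Hypothetical helper: cloud-config with a JSON-quoted password."""
    return "#cloud-config\npassword: {}\n".format(json.dumps(password))


if __name__ == "__main__":
    # A password carrying an embedded line break stays on a single line,
    # so it cannot start a new top-level key such as ``runcmd``.
    print(render_password_cloud_init("hunter2\nruncmd:\n  - rm -rf /"))
```

The password is not stripped here, on the assumption that leading or trailing whitespace may be intentional.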
Agree. json.dumps seems like a great workaround here. Introduced in 7d7c102; I also added test cases for it.
@cdfmlr Thanks for adding those changes. I added a couple more comments. It's mostly a couple of small things, plus the potential security issue with possible YAML injection in case the pub key or password is supplied by the end user and not sanitized before being passed to the Libcloud code.
Merged into trunk. Thanks.
Enhance KubeVirtNodeDriver Compute Driver

Description
This pull request brings several improvements to the create_node method and related functions within the KubeVirtNodeDriver (libcloud/compute/drivers/kubevirt.py).

Features added to KubeVirtNodeDriver.create_node:
- NodeDriver class:
  - size: NodeSize parameter, while retaining legacy compatibility with ex_cpu and ex_memory.
  - auth: NodeAuthSSHKey|NodeAuthPassword parameter.
- deploy_node): KubeVirtNodeDriver now supports node deployment automatically, benefiting from the above compatibility changes.
- ex_disks parameter to align with the related KubeVirt API, making it compatible with any volume types rather than hardcoded support for limited volume or disk types (previously only persistentVolumeClaim was supported).
- ex_template parameter, allowing users to customize the entire Kubernetes object declaring the virtual machine. This is particularly useful for:

Fixes:
- _to_node: Improved the logic for parsing memory, eliminating crashes on virtual machines with more than 1 GiB RAM.
- create_node: (==Breaking change==) Renamed the parameter from ports to ex_ports.
- Addressed various bugs in the create_node method, which were eliminated during the refactor and implementation of new features.

Other Changes:
- libcloud/compute/drivers/kubevirt.py:
  - KubeVirtNodeSize function to assist in constructing NodeSize instances for KubeVirtNodeDriver.
  - KubeVirtNodeImage function to help construct NodeImage instances for KubeVirtNodeDriver.
  - DISK_TYPES out of the create_node and exported it, enabling users to access a list of supported disk types.
- libcloud/test/compute/test_kubevirt.py:
  - create_node method: This method was not tested previously.

Status

Checklist (tick everything that applies)