Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Add validation to prevent numeric values in patient names #2605

Closed

Conversation

areebahmeddd
Copy link
Contributor

@areebahmeddd areebahmeddd commented Nov 19, 2024

Proposed Changes

  • Added backend validation to disallow numeric values in patient names.
  • Added test cases for valid and invalid patient names.

Associated Issue

Merge Checklist

  • Tests added/fixed
  • Update docs in /docs
  • Linting Complete
  • Any other necessary step

@ohcnetwork/care-backend-maintainers @ohcnetwork/care-backend-admins

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Enhanced validation for patient names to ensure they consist solely of alphabetic characters and spaces.
    • Improved checks to require either year of birth or date of birth during patient record creation and updates.
    • Updated patient names in the dataset for clarity and organization.
    • Updated user details for improved identification and compliance.
  • Bug Fixes

    • Clearer error messages for validation failures regarding vaccination details.
  • Tests

    • Added tests to verify successful registration for valid patient names and appropriate error responses for invalid names.
  • Chores

    • Introduced automated code quality checks using Qodana in the CI/CD pipeline.
    • Added configuration for Qodana to ensure compliance with dependency licenses.

Copy link

coderabbitai bot commented Nov 19, 2024

📝 Walkthrough
📝 Walkthrough

Walkthrough

The changes made in this pull request enhance the validation logic within the PatientDetailSerializer class found in the patient.py file. A new validation method for patient names has been introduced, ensuring names are neither None nor contain numeric characters. Additionally, existing validation methods have been updated to enforce the presence of either year_of_birth or date_of_birth when applicable. Corresponding tests have been added to verify these new rules, ensuring that invalid names are correctly rejected during patient registration.

Changes

File Path Change Summary
care/facility/api/serializers/patient.py - Added validate_name method to PatientDetailSerializer for name validation.
- Updated validate method to check for either year_of_birth or date_of_birth and improved error messages for is_vaccinated.
care/facility/models/tests/test_patient.py - Added test_valid_patient_name and test_invalid_patient_name methods to PatientRegistrationTest to validate name constraints.
.github/workflows/qodana_code_quality.yml - Introduced a new workflow for automated code quality checks using Qodana.
data/dummy/facility.json - Updated patient names to a sequential naming convention and modified facility and resource request data.
data/dummy/users.json - Altered user details, including first name, last name, and added new fields.
qodana.yaml - Created a new configuration file for Qodana with specific settings for code quality checks.

Assessment against linked issues

Objective Addressed Explanation
Disallow numeric values in patient names (#2602)

🎉 In the realm of care, we strive to be clear,
With names that are pure, and numbers not near.
Validation is key, for each patient we greet,
Ensuring their records are truly complete!
So here’s to the changes, both clever and bright,
Making patient data a wonderful sight! 🌟

Possibly related PRs

Suggested reviewers

  • rithviknishad
  • sainak

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
care/facility/models/tests/test_patient.py (2)

69-87: Consider adding more test cases for valid names.

While the basic test case is good, it might be nice if we could also test these totally reasonable edge cases:

  • Names with spaces (e.g., "John Doe")
  • Names with hyphens (e.g., "Mary-Jane")
  • Names with apostrophes (e.g., "O'Connor")
  • Non-English names with accents (e.g., "José", "François")

71-84: Reduce duplication in test data setup.

I couldn't help but notice there's quite a bit of duplicated sample data between the test methods. Perhaps we could move this to a helper method or class attribute? Just a thought...

@classmethod
def setUpTestData(cls):
    super().setUpTestData()
    cls.base_patient_data = {
        "gender": 1,
        "facility": cls.facility.external_id,
        "blood_group": "AB+",
        "date_of_birth": "2000-01-01",
        # ... other common fields ...
    }

def get_patient_data(self, **kwargs):
    data = self.base_patient_data.copy()
    data.update(kwargs)
    return data

Also applies to: 91-104

care/facility/api/serializers/patient.py (1)

313-315: Consider validating name only when it's being updated.

The current approach might miss validation if name is None. A clearer approach would be to explicitly check if name is being updated.

-        name = validated_data.get("name")
-        if name:
-            self.validate_name(name)
+        if "name" in validated_data:
+            self.validate_name(validated_data["name"])
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5b2b304 and 46c9e39.

📒 Files selected for processing (2)
  • care/facility/api/serializers/patient.py (3 hunks)
  • care/facility/models/tests/test_patient.py (1 hunks)
🧰 Additional context used
🪛 Ruff
care/facility/api/serializers/patient.py

215-215: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)


217-217: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

🔇 Additional comments (1)
care/facility/models/tests/test_patient.py (1)

69-108: Verify validation coverage across endpoints.

While these tests cover the patient creation endpoint, we should ensure the name validation is also tested for other endpoints that might modify patient names.

✅ Verification successful

Name validation is consistently applied, though it would have been nice to mention where it lives

The validation is properly implemented in the serializer layer through the validate_name method, which means it's automatically applied to all endpoints that use any of the patient serializers. Since both PatientListSerializer and PatientDetailSerializer are used across the viewsets, the validation is enforced for all patient-related operations.

  • The validation checks for:
    • Name presence (required field)
    • Absence of numeric values
    • How thoughtful of them to keep it simple
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other patient-related endpoints that might need similar validation tests
rg -t py "class.*Patient.*ViewSet" 
rg -t py "def (put|patch|update).*patient" -A 5

Length of output: 1818


Script:

#!/bin/bash
# Let's check the actual patient update/patch methods and their validation
ast-grep --pattern 'def update($$$) {
  $$$
}'  care/facility/api/viewsets/patient.py

# And let's see if there are any serializers involved in patient updates
rg -t py "class.*PatientUpdateSerializer" -A 10

# Check for name validation in serializers
rg -t py "name.*validators" -A 5 care/facility/api/serializers/patient.py

Length of output: 212


Script:

#!/bin/bash
# Let's try a different approach to find patient-related serializers and their validation
rg -t py "class.*Patient.*Serializer" -A 15 care/facility/api/serializers/patient.py

# And check for any update methods in the patient viewset
rg -t py "def (update|partial_update)" -A 10 care/facility/api/viewsets/patient.py

# Just to be thorough, let's see what serializer the viewset uses
rg -t py "serializer_class.*Patient.*Serializer" care/facility/api/viewsets/patient.py

Length of output: 6349


Script:

#!/bin/bash
# Let's see where this name validation actually lives, shall we?
rg -t py "name.*=.*serializers\." -A 3 care/facility/api/serializers/patient.py

# And check the base serializer that others might be inheriting from
rg -t py "class PatientRegistration.*:|class.*PatientBaseSerializer" -B 5 -A 15 care/facility/api/serializers/patient.py

# Just to be extra certain about validation rules
rg -t py "def validate.*name" -A 5 care/facility/api/serializers/patient.py

Length of output: 718

Comment on lines +89 to +108
def test_invalid_patient_name(self):
dist_admin = self.create_user("dist_admin", self.district, user_type=30)
sample_data = {
"name": "Rithvik123",
"gender": 1,
"facility": self.facility.external_id,
"blood_group": "AB+",
"date_of_birth": "2000-01-01",
"year_of_birth": 2000,
"disease_status": "NEGATIVE",
"emergency_phone_number": "+919000000666",
"is_vaccinated": "false",
"number_of_doses": 0,
"phone_number": "+919000044343",
"is_antenatal": False,
}
self.client.force_authenticate(user=dist_admin)
response = self.client.post("/api/v1/patient/", sample_data, format="json")
self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertIn("name", response.data)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance invalid name test coverage.

The test could be more thorough by checking these delightful invalid scenarios:

  • Names with special characters (e.g., "@#$%")
  • Empty string
  • Only numbers
  • Very long names (if there's a length limit)

Also, it might be helpful to assert the specific error message instead of just checking if "name" exists in response.data. Something like:

-        self.assertIn("name", response.data)
+        self.assertEqual(response.data["name"][0], "Name should not contain numeric characters")

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 213 to 218
def validate_name(self, value):
if value is None:
raise serializers.ValidationError("Patient name is required.")
if any(char.isdigit() for char in value):
raise serializers.ValidationError("Patient name cannot contain numeric values.")
return value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider enhancing the name validation.

While the numeric validation is good, there are a few improvements we could make to this validation method:

+    # Error messages as class constants
+    NAME_REQUIRED_ERROR = "Patient name is required."
+    NAME_NUMERIC_ERROR = "Patient name cannot contain numeric values."
+    NAME_EMPTY_ERROR = "Patient name cannot be empty or just whitespace."
+
     def validate_name(self, value):
-        if value is None:
-            raise serializers.ValidationError("Patient name is required.")
-        if any(char.isdigit() for char in value):
-            raise serializers.ValidationError("Patient name cannot contain numeric values.")
+        if value is None:
+            raise serializers.ValidationError(self.NAME_REQUIRED_ERROR)
+        if not value.strip():
+            raise serializers.ValidationError(self.NAME_EMPTY_ERROR)
+        if any(char.isdigit() for char in value):
+            raise serializers.ValidationError(self.NAME_NUMERIC_ERROR)
         return value.strip()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def validate_name(self, value):
if value is None:
raise serializers.ValidationError("Patient name is required.")
if any(char.isdigit() for char in value):
raise serializers.ValidationError("Patient name cannot contain numeric values.")
return value
# Error messages as class constants
NAME_REQUIRED_ERROR = "Patient name is required."
NAME_NUMERIC_ERROR = "Patient name cannot contain numeric values."
NAME_EMPTY_ERROR = "Patient name cannot be empty or just whitespace."
def validate_name(self, value):
if value is None:
raise serializers.ValidationError(self.NAME_REQUIRED_ERROR)
if not value.strip():
raise serializers.ValidationError(self.NAME_EMPTY_ERROR)
if any(char.isdigit() for char in value):
raise serializers.ValidationError(self.NAME_NUMERIC_ERROR)
return value.strip()
🧰 Tools
🪛 Ruff

215-215: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)


217-217: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

Comment on lines 250 to 252
name = validated_data.get("name")
if name:
self.validate_name(name)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove unnecessary conditional check for name validation.

The name field should be required for patient creation, making this conditional check redundant and potentially dangerous.

-        name = validated_data.get("name")
-        if name:
-            self.validate_name(name)
+        name = validated_data["name"]
+        self.validate_name(name)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
name = validated_data.get("name")
if name:
self.validate_name(name)
name = validated_data["name"]
self.validate_name(name)

- Ensuring no empty space
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 46c9e39 and e0c8351.

📒 Files selected for processing (1)
  • care/facility/api/serializers/patient.py (3 hunks)
🧰 Additional context used
🪛 Ruff
care/facility/api/serializers/patient.py

215-215: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)


217-217: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)


219-219: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

🔇 Additional comments (1)
care/facility/api/serializers/patient.py (1)

252-254: 🛠️ Refactor suggestion

I see we're being extra cautious with our validation...

The conditional check is redundant since name is a required field. The validate_name method already handles None values appropriately.

-        name = validated_data.get("name")
-        if name:
-            self.validate_name(name)
+        name = validated_data["name"]
+        self.validate_name(name)

Likely invalid or redundant comment.

Comment on lines 315 to 317
name = validated_data.get("name")
if name:
self.validate_name(name)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ah, the silent validator strikes again...

The current implementation might allow invalid names during partial updates by silently skipping validation when name is not provided.

-        name = validated_data.get("name")
-        if name:
-            self.validate_name(name)
+        if "name" in validated_data:
+            validated_data["name"] = self.validate_name(validated_data["name"])
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
name = validated_data.get("name")
if name:
self.validate_name(name)
if "name" in validated_data:
validated_data["name"] = self.validate_name(validated_data["name"])

Comment on lines 213 to 220
def validate_name(self, value):
if value is None:
raise serializers.ValidationError("Patient name is required.")
if not value.strip():
raise serializers.ValidationError("Patient name cannot be empty.")
if any(char.isdigit() for char in value):
raise serializers.ValidationError("Patient name cannot contain numeric values.")
return value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Perhaps we could make this validation a tad more... elegant?

While the validation logic is correct, there are a few improvements we could make:

+    # Error messages as class constants
+    NAME_REQUIRED_ERROR = "Patient name is required."
+    NAME_EMPTY_ERROR = "Patient name cannot be empty."
+    NAME_NUMERIC_ERROR = "Patient name cannot contain numeric values."
+
-    def validate_name(self, value):
+    def validate_name(self, value: str | None) -> str:
         if value is None:
-            raise serializers.ValidationError("Patient name is required.")
+            raise serializers.ValidationError(self.NAME_REQUIRED_ERROR)
         
         value = value.strip()
         if not value:
-            raise serializers.ValidationError("Patient name cannot be empty.")
+            raise serializers.ValidationError(self.NAME_EMPTY_ERROR)
         
         if any(char.isdigit() for char in value):
-            raise serializers.ValidationError("Patient name cannot contain numeric values.")
+            raise serializers.ValidationError(self.NAME_NUMERIC_ERROR)
         
-        return value
+        return value  # Already stripped
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def validate_name(self, value):
if value is None:
raise serializers.ValidationError("Patient name is required.")
if not value.strip():
raise serializers.ValidationError("Patient name cannot be empty.")
if any(char.isdigit() for char in value):
raise serializers.ValidationError("Patient name cannot contain numeric values.")
return value
# Error messages as class constants
NAME_REQUIRED_ERROR = "Patient name is required."
NAME_EMPTY_ERROR = "Patient name cannot be empty."
NAME_NUMERIC_ERROR = "Patient name cannot contain numeric values."
def validate_name(self, value: str | None) -> str:
if value is None:
raise serializers.ValidationError(self.NAME_REQUIRED_ERROR)
value = value.strip()
if not value:
raise serializers.ValidationError(self.NAME_EMPTY_ERROR)
if any(char.isdigit() for char in value):
raise serializers.ValidationError(self.NAME_NUMERIC_ERROR)
return value # Already stripped
🧰 Tools
🪛 Ruff

215-215: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)


217-217: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)


219-219: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

Comment on lines 214 to 217
if value is None:
raise serializers.ValidationError("Patient name is required.")
if not value.strip():
raise serializers.ValidationError("Patient name cannot be empty.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the model will handle this by default, no need to add these validations in the serializer.

raise serializers.ValidationError("Patient name is required.")
if not value.strip():
raise serializers.ValidationError("Patient name cannot be empty.")
if any(char.isdigit() for char in value):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will prevent numbers, but what about dots, commas, underscores? I think we should use regex here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts exactly but then i decided to stick with what the issue title was haha

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
care/facility/api/serializers/patient.py (1)

214-217: Oh, I see we're only checking for alphabets and spaces...

While the validation prevents numeric values as required, it might be worth considering other special characters that could make their way into patient names. The current regex ^[a-zA-Z\s]+$ is a bit... basic.

Consider this slightly more comprehensive validation:

+    # Error messages as class constants
+    INVALID_NAME_ERROR = "Patient name should only contain alphabets, spaces, hyphens, and apostrophes"
+
     def validate_name(self, value):
-        if not re.match(r'^[a-zA-Z\s]+$', value):
-            raise serializers.ValidationError("Patient name should contain only alphabets and spaces")
+        if not value or not value.strip():
+            raise serializers.ValidationError("Patient name cannot be empty")
+        if not re.match(r"^[a-zA-Z\s'\-]+$", value.strip()):
+            raise serializers.ValidationError(self.INVALID_NAME_ERROR)
         return value.strip()

This would allow for names like "Mary-Jane" or "O'Connor" while still preventing numeric values.

🧰 Tools
🪛 Ruff

215-215: Single quotes found but double quotes preferred

Replace single quotes with double quotes

(Q000)


216-216: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e0c8351 and dd57b7f.

📒 Files selected for processing (1)
  • care/facility/api/serializers/patient.py (4 hunks)
🧰 Additional context used
🪛 Ruff
care/facility/api/serializers/patient.py

215-215: Single quotes found but double quotes preferred

Replace single quotes with double quotes

(Q000)


216-216: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

🔇 Additional comments (2)
care/facility/api/serializers/patient.py (2)

312-314: Same passive validation in the update method...

This is essentially the same issue as in the create method. The validation should be more proactive when handling name updates.


249-251: 🛠️ Refactor suggestion

The validation could be a bit more... assertive.

The current implementation silently skips validation when name is not provided. Since name is a required field for patient registration, we should be more strict here.

Consider this improvement:

-        name = validated_data.get("name")
-        if name:
-            self.validate_name(name)
+        if "name" in validated_data:
+            validated_data["name"] = self.validate_name(validated_data["name"])

Likely invalid or redundant comment.

Comment on lines 249 to 251
name = validated_data.get("name")
if name:
self.validate_name(name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name = validated_data.get("name")
if name:
self.validate_name(name)

this is not required as create and update already get the validated data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, understood :)
Could you confirm the latest commit? @sainak

Comment on lines 312 to 314
name = validated_data.get("name")
if name:
self.validate_name(name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name = validated_data.get("name")
if name:
self.validate_name(name)

this is not required as create and update already get the validated data

@areebahmeddd areebahmeddd requested a review from a team as a code owner November 21, 2024 13:33
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (7)
.github/workflows/qodana_code_quality.yml (2)

1-8: Consider adding main/master branch to the trigger specification

I couldn't help but notice you've only included the 'develop' branch. While I'm sure you have your reasons, it might be worth considering adding your production branch as well... just saying.

  push:
    branches:
      - develop
+     - main  # or master, depending on your preference

17-20: Consider updating checkout action to v4

The checkout configuration looks perfect, though there's a slightly newer version available... not that I'm judging or anything.

- uses: actions/checkout@v3
+ uses: actions/checkout@v4
data/dummy/users.json (2)

904-904: Username format deviates from established pattern.

All other doctor usernames follow the pattern devdoctorN or dummydoctorN, but this one uses hyphens. I mean, if you really want to be different... 🤷

Consider maintaining consistency with other usernames:

-          "username": "dev-doctor2",
+          "username": "devdoctor2",

Line range hint 919-922: New doctor-specific fields need validation rules.

The newly added fields look good, but I couldn't help but notice we might want to add some validation rules for:

  • qualification: Should probably be from a predefined list
  • doctor_experience_commenced_on: Should not be in the future
  • doctor_medical_council_registration: Should follow a specific format

Consider adding these validations to the serializer. Would you like me to help with implementing these validation rules?

data/dummy/facility.json (3)

4167-4167: Interesting choice of patient name patterns...

The patient names follow a consistent pattern of "Dummy Patient One/Two/Three" or "Discharge Patient One/Two/Three". While this maintains the PR's goal of avoiding numeric values, perhaps we could make the test data a bit more realistic with actual name patterns?

Consider using more realistic test data like:

-      "name": "Dummy Patient One",
+      "name": "John Smith"

Also applies to: 4250-4250, 4333-4333, 4416-4416, 4499-4499, 4582-4582, 4665-4665, 4748-4748, 4831-4831, 4914-4914, 4997-4997, 5080-5080, 5163-5163, 5246-5246, 5329-5329, 5412-5412, 5495-5495, 5578-5578


Line range hint 1-6500: Consider adding data validation tests

While the data looks structurally sound, it would be helpful to have automated tests to validate:

  • No numeric values in patient names
  • Required fields are present
  • Foreign key relationships are valid

Would you like me to help create a test suite for data validation?


Data relationships are mostly intact, with one tiny discrepancy

Patient 18 has multiple consultations (19-26) but their last_consultation field still points to consultation 18. I suppose someone forgot to update that reference, but who am I to judge?

  • Patient 18's last_consultation should be 26 (the latest consultation) instead of 18
  • All other patient-consultation relationships are properly maintained
🔗 Analysis chain

Line range hint 4167-5578: Verify data integrity across related records

The patient records have consistent relationships with consultations, but let's verify there are no orphaned records.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for consultations without valid patient references
# Expected: No output indicates all consultations have valid patient references

# Get all patient IDs
patient_ids=$(cat data/dummy/facility.json | jq -r '.[] | select(.model == "facility.patientregistration") | .pk')

# Check consultations for invalid patient references
cat data/dummy/facility.json | jq -r '.[] | 
  select(.model == "facility.patientconsultation") | 
  select(.fields.patient as $pid | ($patient_ids | contains([$pid]) | not)) |
  "Consultation \(.pk) references non-existent patient \(.fields.patient)"'

Length of output: 616


Script:

#!/bin/bash
# Let's try a different approach to check consultation-patient relationships
# Using a more reliable jq query structure

# First check if there are any consultations at all
echo "=== Checking consultation records ==="
cat data/dummy/facility.json | jq -r '.[] | select(.model == "facility.patientconsultation") | .pk' | wc -l

# Then check the patient references in consultations
echo -e "\n=== Checking patient references in consultations ==="
cat data/dummy/facility.json | jq -r '.[] | 
  select(.model == "facility.patientconsultation") | 
  {consultation: .pk, patient: .fields.patient}'

# Also check if all patients have their last_consultation field pointing to valid consultations
echo -e "\n=== Checking last_consultation references in patients ==="
cat data/dummy/facility.json | jq -r '.[] | 
  select(.model == "facility.patientregistration") | 
  {patient: .pk, last_consultation: .fields.last_consultation}'

Length of output: 2666

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between dd57b7f and 0f02ac3.

📒 Files selected for processing (4)
  • .github/workflows/qodana_code_quality.yml (1 hunks)
  • data/dummy/facility.json (18 hunks)
  • data/dummy/users.json (1 hunks)
  • qodana.yaml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • qodana.yaml
🔇 Additional comments (2)
data/dummy/users.json (2)

897-898: LGTM: Name changes align with validation requirements.

The changes to remove numeric values from names ("Dev" → "Tester", "Doctor Two" → "Doctor") align perfectly with the PR objective.


906-906: Verify creator user existence.

The created_by field is changed from 1 to 2. Let's just make absolutely sure user 2 exists and has sufficient privileges, shall we?

✅ Verification successful

Well, look who we found - a perfectly valid District Admin

User 2 exists and is actually a District Admin (user_type: 30) with appropriate system access. They're active in the system and, surprisingly enough, everything seems to be in perfect order. How refreshing.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify user 2 exists and has appropriate privileges
rg -A 5 '"pk": 2,' | rg "user_type"

Length of output: 35


Script:

#!/bin/bash
# Let's try a more comprehensive search for user with ID 2
rg -U "\"pk\": 2,[\s\S]{0,200}" data/dummy/users.json

# Also search for any user type definitions that might help us understand the privileges
rg -U "\"user_type\":" data/dummy/users.json

Length of output: 1064


Script:

#!/bin/bash
# Let's see if we can find what user_type user 2 has
rg -U -C 10 '"pk": 2,' data/dummy/users.json | rg "user_type"

# And maybe check what these user types actually mean
fd "types?.py|constants?.py" | xargs rg "user_type"

Length of output: 379


Script:

#!/bin/bash
# Let's check those migrations and models for user type definitions
rg -U "class.*User|USER_TYPE_" care/users/

# And see the full context of user 2 one more time
rg -U -C 15 '"pk": 2,' data/dummy/users.json

Length of output: 3721

Comment on lines 12 to 15
permissions:
contents: write
pull-requests: write
checks: write
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Restrict permissions to minimum required scope

The current permission set seems a tad... generous. Qodana typically only needs write access to checks and pull-requests to report results.

  permissions:
-   contents: write
    pull-requests: write
    checks: write
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
permissions:
contents: write
pull-requests: write
checks: write
permissions:
pull-requests: write
checks: write

Comment on lines 21 to 27
- name: 'Qodana Scan'
uses: JetBrains/[email protected]
with:
pr-mode: false
env:
QODANA_TOKEN: ${{ secrets.QODANA_TOKEN_1210838162 }}
QODANA_ENDPOINT: 'https://qodana.cloud'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

🛠️ Refactor suggestion

Consider disabling PR mode for this repository

Based on the scan results, this repository contains numerous sensitive configuration files with tokens, secrets, and credentials in the aws/ directory and dummy user data. While these might be development configurations, it's safer to keep Qodana scans private by disabling PR mode to avoid exposing potential security findings in PR comments.

  • The aws/ directory contains multiple sensitive environment variables and AWS credentials
  • Dummy user data with password hashes in data/dummy/users.json
  • Various API keys and tokens in AWS task definitions

I would actually recommend the opposite of the original suggestion:

  uses: JetBrains/[email protected]
  with:
+   pr-mode: false
+   timeout: 30

The timeout addition is still valuable to prevent long-running jobs, but PR mode should remain disabled to avoid potential security implications.

🔗 Analysis chain

Add timeout and enable PR mode for better feedback

The scan configuration could use a few tweaks to make it more... robust. Also, enabling PR mode might help with those pesky code review discussions.

  uses: JetBrains/[email protected]
  with:
-   pr-mode: false
+   pr-mode: true
+   timeout: 30

Also, I noticed you're using the cloud endpoint. Just make sure your team is aware of the data being sent to Qodana Cloud.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if repository contains sensitive patterns that shouldn't be sent to Qodana Cloud
rg -i "password|secret|key|token" --type yaml --type json

Length of output: 32247

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 0f02ac3 and 9972c9d.

📒 Files selected for processing (1)
  • care/facility/api/serializers/patient.py (2 hunks)
🧰 Additional context used
🪛 Ruff
care/facility/api/serializers/patient.py

215-215: Single quotes found but double quotes preferred

Replace single quotes with double quotes

(Q000)


216-216: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

🔇 Additional comments (2)
care/facility/api/serializers/patient.py (2)

5-5: LGTM!

The re module import is correctly placed with other standard library imports.


214-217: Verify test coverage for name validation.

Let's ensure the test cases cover various name formats and edge cases.

🧰 Tools
🪛 Ruff

215-215: Single quotes found but double quotes preferred

Replace single quotes with double quotes

(Q000)


216-216: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

Comment on lines +214 to +217
def validate_name(self, value):
if not re.match(r'^[a-zA-Z\s]+$', value):
raise serializers.ValidationError("Patient name should contain only alphabets and spaces")
return value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

The validation could be a bit more... accommodating.

While the current implementation prevents numeric values, it might be a bit too restrictive for real-world names. Also, the error handling could use some polish.

Consider this enhanced implementation:

+    # Error messages as class constants
+    NAME_REQUIRED_ERROR = "Patient name is required"
+    NAME_EMPTY_ERROR = "Patient name cannot be empty or contain only whitespace"
+    NAME_INVALID_ERROR = "Patient name can only contain letters, spaces, hyphens, apostrophes, and periods"
+
     def validate_name(self, value):
-        if not re.match(r'^[a-zA-Z\s]+$', value):
-            raise serializers.ValidationError("Patient name should contain only alphabets and spaces")
+        if value is None:
+            raise serializers.ValidationError(self.NAME_REQUIRED_ERROR)
+        
+        value = value.strip()
+        if not value:
+            raise serializers.ValidationError(self.NAME_EMPTY_ERROR)
+        
+        # Allow letters (including international characters), spaces, hyphens, apostrophes, and periods
+        if not re.match(r'^[\p{L}\s\-\'\.]+$', value, re.UNICODE):
+            raise serializers.ValidationError(self.NAME_INVALID_ERROR)
+
         return value

This implementation:

  1. Handles international characters (e.g., "José", "François")
  2. Allows common name punctuation (e.g., "O'Connor", "Mary-Jane", "Jr.")
  3. Properly validates None/empty values
  4. Uses class constants for error messages
  5. Strips leading/trailing whitespace
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def validate_name(self, value):
if not re.match(r'^[a-zA-Z\s]+$', value):
raise serializers.ValidationError("Patient name should contain only alphabets and spaces")
return value
# Error messages as class constants
NAME_REQUIRED_ERROR = "Patient name is required"
NAME_EMPTY_ERROR = "Patient name cannot be empty or contain only whitespace"
NAME_INVALID_ERROR = "Patient name can only contain letters, spaces, hyphens, apostrophes, and periods"
def validate_name(self, value):
if value is None:
raise serializers.ValidationError(self.NAME_REQUIRED_ERROR)
value = value.strip()
if not value:
raise serializers.ValidationError(self.NAME_EMPTY_ERROR)
# Allow letters (including international characters), spaces, hyphens, apostrophes, and periods
if not re.match(r'^[\p{L}\s\-\'\.]+$', value, re.UNICODE):
raise serializers.ValidationError(self.NAME_INVALID_ERROR)
return value
🧰 Tools
🪛 Ruff

215-215: Single quotes found but double quotes preferred

Replace single quotes with double quotes

(Q000)


216-216: Exception must not use a string literal, assign to variable first

Assign to variable; remove string literal

(EM101)

@rithviknishad
Copy link
Member

Closing as per ohcnetwork/care_fe#9143 (comment)

@areebahmeddd areebahmeddd deleted the patient-name-validation branch November 22, 2024 10:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Disallow numeric values in patient names
5 participants