SNOW-1859608: Add missing docs for DataFrame.map (#2766)
<!---
Please answer these questions before creating your pull request. Thanks!
--->

1. Which Jira issue is this PR addressing? Make sure that there is an
accompanying issue to your PR.

   <!---
   In this section, please add a Snowflake Jira issue number.

   Note that if a corresponding GitHub issue exists, you should still include
   the Snowflake Jira issue number. For example, for GitHub issue #1400, you
   should add "SNOW-1335071" here.
   --->

   Fixes SNOW-1859608

2. Fill out the following pre-review checklist:

   - [ ] I am adding a new automated test(s) to verify correctness of my new code
   - [ ] If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
   - [ ] I am adding new logging messages
   - [ ] I am adding a new telemetry message
   - [ ] I am adding new credentials
   - [ ] I am adding a new dependency
   - [ ] If this is a new feature/behavior, I'm adding the Local Testing parity changes.
   - [ ] I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: [Thread-safe Developer Guidelines](https://github.com/snowflakedb/snowpark-python/blob/main/CONTRIBUTING.md#thread-safe-development)

3. Please describe how your code solves the related issue.

   Add missing docs for DataFrame.map.
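
   For reference, a minimal usage sketch of the `DataFrame.map` API these docs describe, mirroring the doctests added below. It assumes Snowpark pandas is installed and a Snowflake connection is already configured; the same calls also run under plain pandas.

   ```python
   # Minimal sketch of the documented DataFrame.map API.
   # Assumes Snowpark pandas is installed and a Snowflake connection is configured;
   # the same calls also work with plain `import pandas as pd`.
   import modin.pandas as pd
   import snowflake.snowpark.modin.plugin  # noqa: F401  # registers the Snowflake backend

   df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])

   # Apply a scalar-to-scalar function to every element.
   print(df.map(lambda x: len(str(x))))

   # Extra keyword arguments are forwarded to func.
   print(df.map(round, ndigits=1))

   # NA values can be skipped with na_action='ignore'; note the docstring marks
   # this doctest as skipped, so behavior under Snowpark pandas may differ.
   df_copy = df.copy()
   df_copy.iloc[0, 0] = pd.NA
   print(df_copy.map(lambda x: len(str(x)), na_action="ignore"))
   ```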
sfc-gh-helmeleegy authored Dec 18, 2024
1 parent 97e178a commit 3c38e82
Showing 2 changed files with 72 additions and 1 deletion.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -26,6 +26,7 @@
#### Improvements
- Improve performance of `DataFrame.map`, `Series.apply` and `Series.map` methods by mapping numpy functions to snowpark functions if possible.
- Updated integration testing for `session.lineage.trace` to exclude deleted objects
- Added documentation for `DataFrame.map`.

## 1.26.0 (2024-12-05)

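The changelog context above also notes that `DataFrame.map`, `Series.apply` and `Series.map` map numpy functions to Snowpark functions when possible. From the user's side that improvement is transparent; below is a hedged sketch of the kind of call it applies to (the numpy-to-Snowpark translation itself is internal to the plugin and not shown here).

```python
# Sketch of a call that, per the changelog, may be translated to a native
# Snowpark function instead of a row-wise Python UDF when possible.
# Assumes Snowpark pandas is installed and a Snowflake connection is configured.
import numpy as np
import modin.pandas as pd
import snowflake.snowpark.modin.plugin  # noqa: F401

df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])

# np.exp is a numpy ufunc; callables without a Snowpark equivalent fall back
# to the regular UDF path.
print(df.map(np.exp))
```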
72 changes: 71 additions & 1 deletion src/snowflake/snowpark/modin/plugin/docstrings/dataframe.py
@@ -4818,7 +4818,77 @@ def value_counts():

def map():
"""
Apply a function to the `DataFrame` elementwise.
Apply a function to a Dataframe elementwise.

Added in version 2.1.0: DataFrame.applymap was deprecated and renamed to DataFrame.map.

This method applies a function that accepts and returns a scalar to every element of a DataFrame.

Parameters
----------
func : callable
    Python function, returns a single value from a single value.
na_action : {None, 'ignore'}, default None
    If 'ignore', propagate NaN values, without passing them to func.
**kwargs
    Additional keyword arguments to pass as keyword arguments to func.

Returns
-------
DataFrame
    Transformed DataFrame.

See also
--------
DataFrame.apply
    Apply a function along input axis of DataFrame.
DataFrame.replace
    Replace values given in to_replace with value.
Series.map
    Apply a function elementwise on a Series.

Examples
--------
>>> df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])
>>> df
       0      1
0  1.000  2.120
1  3.356  4.567

>>> df.map(lambda x: len(str(x)))
   0  1
0  3  4
1  5  5

Like Series.map, NA values can be ignored:

>>> df_copy = df.copy()
>>> df_copy.iloc[0, 0] = pd.NA
>>> df_copy.map(lambda x: len(str(x)), na_action='ignore') # doctest: +SKIP
     0  1
0  NaN  4
1  5.0  5

It is also possible to use map with functions that are not lambda functions:

>>> df.map(round, ndigits=1)
     0    1
0  1.0  2.1
1  3.4  4.6

Note that a vectorized version of func often exists, which will be much faster. You could square each number elementwise.

>>> df.map(lambda x: x**2)
           0          1
0   1.000000   4.494400
1  11.262736  20.857489

But it's better to avoid map in that case.

>>> df ** 2
           0          1
0   1.000000   4.494400
1  11.262736  20.857489
"""

def mask():
