Commit

Updates from featuretools v0.26.0 (#1539)
* Change to use GitHub Token rather than GitHub PAT (#1402)

* Update dependency_check.yml

* Update release_notes.rst

* Update dependency_check.yml

* Use builtin secret token with create pull request (#1407)

* Use builtin secret token with create pull request

* Update release_notes.rst

* Use repo scoped token again (#1409)

* Use repo scoped token again

* Update release_notes.rst

* Update latest_dependencies.txt (#1410)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Lower max depth to 1 if single entity (#1412)

* lower DFS max depth to 1 if single entity

* add test for including seed features with depth greater than max depth

* test max depth=0 doesn't create depth 1 features on single table

* update release notes

* change log entry to user warning and test for warning

* lint

* fix max depth warning in docs

* remove outdated comment

* rework single table assertions to be more readable

* use feature_with_name helper in seed_features test

* lint

* add max_depth=None and max_depth=-1 cases to single table test

* move helper function def out of loop; remove invalid max_depth=None case

* lint

* Drop Python 3.6 support (#1413)

* remove py36 from CI test matrix

* remove warning when importing featuretools about dropping 3.6 support

* remove python 3.6 from setup.py

* remove py36 from list of supported version in installation docs

* remove py36 constraint on dependency

* update release notes

* v0.24.0 (#1414)

* bump version number

* update release notes

* Update latest_dependencies.txt (#1415)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Separate workflows and unit tests (#1422)

* separate workflows

* update release notes

* fix incorrect word

* update numpy req

* separate link check

* remove dask separation

* copy from main

* release notes

* Add minimum dependency generator GitHub Action (#1428)

* add min deps checker

* update release notes

* fix filename

* generate auto PR

* update latest dep check

* file rename

* better release notes

* move to 1 folder

* fix fastparquet?

* Update minimum dependencies (#1431)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Bump pyyaml from 3.12 to 5.4 in /featuretools/tests/requirement_files (#1433)

* Bump pyyaml from 3.12 to 5.4 in /featuretools/tests/requirement_files

Bumps [pyyaml](https://github.com/yaml/pyyaml) from 3.12 to 5.4.
- [Release notes](https://github.com/yaml/pyyaml/releases)
- [Changelog](https://github.com/yaml/pyyaml/blob/master/CHANGES)
- [Commits](yaml/pyyaml@3.12...5.4)

Signed-off-by: dependabot[bot] <[email protected]>

* Update requirements.txt

* Update release_notes.rst

* Update release_notes.rst

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gaurav Sheni <[email protected]>

* Update nbsphinx version to resolve docs build issue (#1436)

* update release note for test

* update release notes

* pin markupsafe version

* update nbsphinx version and remove markupsafe

* update release notes

* Update latest dependencies (#1437)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update latest dependencies (#1439)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Bump psutil requirement (#1438)

* Bump psutil requirement

* Update release_notes.rst

* Update minimum dependencies (#1443)

* Add unit tests against minimum dependencies (#1432)

* Fix numpy installation for minimum unit tests (#1445)

* Update latest dependencies (#1446)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update latest dependencies (#1448)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* v0.24.1 (#1450)

* Update latest dependencies (#1454)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update latest dependencies (#1455)

* Bump urllib3 from 1.26.4 to 1.26.5 in /featuretools/tests/requirement_files (#1457)

* Bump urllib3 in /featuretools/tests/requirement_files

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.4 to 1.26.5.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](urllib3/urllib3@1.26.4...1.26.5)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update test-requirements.txt

* Update release_notes.rst

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gaurav Sheni <[email protected]>

* Update alteryx_open_src_update_checker to 2.0.0 (#1460)

* Update setup.py

* Update __init__.py

* Update release_notes.rst

* Update setup.py

* Update install_test.yml

* double for loop

* Update latest dependencies (#1464)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Add get_valid_primitives function (#1462)

* add function skeleton

* add tests

* add get_valid_primitives and update tests

* add test and fix typo

* update release notes

* add test for non-str invalid primitive

* remove unused code from custom primitives

* lint

* remove unused var names and avoid erroring due to compatibility

* rework compatibility check

* make ft.get_valid_primitives callable, add to API reference, add note to docstring

* make get_entityset_type private

* Bump minimum pip from 19.0.2 to 21.1.2 (#1475)

* Bump pip from 19.0.2 to 19.2 in /featuretools/tests/requirement_files

Bumps [pip](https://github.com/pypa/pip) from 19.0.2 to 19.2.
- [Release notes](https://github.com/pypa/pip/releases)
- [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst)
- [Commits](pypa/pip@19.0.2...19.2)

---
updated-dependencies:
- dependency-name: pip
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update test-requirements.txt

* Update test-requirements.txt

* Update minimum_test_requirements.txt

* Update release_notes.rst

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gaurav Sheni <[email protected]>
Co-authored-by: Roy Wedge <[email protected]>

* Add dataframe_type property to EntitySet (#1473)

* add dataframe_type property

* remove _get_entityset_type

* update if not pandas entityset checks in tests

* add docstring to dataframe_type

* update release notes

* rework dataframe_type logic

* add test cases

* use dataframe_type in more tests

* remove some unused ks imports

* more test updates

* fix faulty comparison in tests

* v0.25.0 (#1485)

* bump version number

* update release notes

* Update latest dependencies (#1487)

* Update latest dependencies (#1499)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fix docs to avoid logging demos (#1498)

* set testing header to prevent logging

* add library to url

* release notes

* release notes

Co-authored-by: Gaurav Sheni <[email protected]>

* Update latest dependencies (#1500)

* Update latest dependencies (#1502)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update latest dependencies (#1503)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Add replace_inf_values util function (#1505)

* add replace_inf_values util function

* update release notes

* fix release notes

* add optional columns parameter to function

* lint fix

* Test compatibility with upcoming pandas release 1.3.0 (#1492)

* update requirements

* comment at local error

* fix test_transform error

* fix boolean conversion error

* remove requirements change

* fix timezone warning

* fix astype warning and use view

* Add release note

* Update latest dependencies (#1520)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Add URL and Email Address primitives (#1508)

* Update latest dependencies (#1524)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Primitive options include entities overrides exclude entities (#1518)

* update ignore_entity for primitive

* update variable_filter to return True if entity in include_entities

* update release notes

* Update TLD list and add license for email file (#1531)

* add license to primitive data

* update TLD list

* update release notes

* typo

* update TLD list

* v0.26.0 (#1525)

* bump version

* update release notes

* make underline longer

* alphabetize contributors

* Update docs/source/release_notes.rst

* Update docs/source/release_notes.rst

Co-authored-by: Gaurav Sheni <[email protected]>

Co-authored-by: Gaurav Sheni <[email protected]>

* Update latest dependencies (#1534)

* uncomment future release

* replace target_entity in a few tests

* delete test_entity.py again

* fix include_over_exclude test

* put Fixes section back in the changelog

Co-authored-by: Gaurav Sheni <[email protected]>
Co-authored-by: machineFL <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nate Parsons <[email protected]>
Co-authored-by: Jeff Hernandez <[email protected]>
Co-authored-by: Frances Hartwell <[email protected]>
Co-authored-by: Tamar Grey <[email protected]>
Co-authored-by: Ethan Tu <[email protected]>
10 people authored Jul 20, 2021
1 parent fbd56b7 commit 2a3106b
Showing 31 changed files with 5,407 additions and 48 deletions.
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -2,5 +2,7 @@ include *.txt
 include LICENSE
 include README.md
 include featuretools/primitives/data/featuretools_unit_test_example.csv
+include featuretools/primitives/data/free_email_provider_domains.txt
+include featuretools/primitives/data/free_email_provider_domains_license
 recursive-exclude * __pycache__
 recursive-exclude * *.py[co]
31 changes: 20 additions & 11 deletions docs/source/api_reference.rst
@@ -251,6 +251,26 @@ Feature encoding

     encode_features

+Feature Selection
+~~~~~~~~~~~~~~~~~
+.. currentmodule:: featuretools.selection
+.. autosummary::
+    :toctree: generated/
+
+    remove_low_information_features
+    remove_highly_correlated_features
+    remove_highly_null_features
+    remove_single_value_features
+
+Feature Matrix utils
+~~~~~~~~~~~~~~~~~~~~
+.. currentmodule:: featuretools.computational_backends
+.. autosummary::
+    :toctree: generated/
+
+    replace_inf_values
+
+
 Saving and Loading Features
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 .. currentmodule:: featuretools
@@ -384,14 +404,3 @@ Data Type Util Methods

     list_logical_types
     list_semantic_tags
-
-Feature Selection
-------------------
-.. currentmodule:: featuretools.selection
-.. autosummary::
-    :toctree: generated/
-
-    remove_low_information_features
-    remove_highly_correlated_features
-    remove_highly_null_features
-    remove_single_value_features
7 changes: 7 additions & 0 deletions docs/source/conf.py
@@ -15,6 +15,8 @@
 import os
 import subprocess
 import sys
+import shutil
+from pathlib import Path

 import featuretools

@@ -348,4 +350,9 @@
 napoleon_use_rtype = True

 def setup(app):
+    home_dir = os.environ.get('HOME', '/')
+    ipython_p = Path(home_dir + "/.ipython/profile_default/startup")
+    ipython_p.mkdir(parents=True, exist_ok=True)
+    file_p = os.path.abspath(os.path.dirname(__file__))
+    shutil.copy(file_p + "/set-headers.py", home_dir + "/.ipython/profile_default/startup")
     app.add_css_file("style.css")
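
The `setup` hook in this hunk copies `set-headers.py` into the default IPython profile's startup directory; IPython runs the `.py` files it finds there when a session starts. A minimal sketch of that copy step, assuming a temporary directory in place of `$HOME`. The helper name `install_startup_script` and the placeholder script body are hypothetical and are not part of the commit:

```python
import shutil
import tempfile
from pathlib import Path


def install_startup_script(home_dir, script_path):
    """Copy script_path into <home_dir>/.ipython/profile_default/startup."""
    startup_dir = Path(home_dir) / ".ipython" / "profile_default" / "startup"
    # Create the whole directory chain; do nothing if it already exists.
    startup_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the copied file inside startup_dir.
    return Path(shutil.copy(script_path, startup_dir))


# Use a throwaway directory as a stand-in for the real home directory.
with tempfile.TemporaryDirectory() as fake_home:
    script = Path(fake_home) / "set-headers.py"
    script.write_text("print('startup hook ran')\n")  # placeholder content
    installed = install_startup_script(fake_home, script)
    installed_name = installed.name
    installed_exists = installed.exists()
```

Building the target with `Path` joins rather than string concatenation is only a stylistic choice in this sketch; the commit itself concatenates strings.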
2 changes: 1 addition & 1 deletion docs/source/guides/using_dask_entitysets.rst
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ We can pass the ``EntitySet`` we created above to ``featuretools.dfs`` in order
.. ipython:: python
feature_matrix, features = ft.dfs(entityset=es,
target_dataframe="dask_entity",
target_dataframe_name="dask_entity",
trans_primitives=["negate"],
max_depth=1)
feature_matrix
Expand Down
2 changes: 1 addition & 1 deletion docs/source/guides/using_koalas_entitysets.rst
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ We can pass the ``EntitySet`` we created above to ``featuretools.dfs`` in order
.. ipython:: python
feature_matrix, features = ft.dfs(entityset=es,
target_dataframe="koalas_entity",
target_dataframe_name="koalas_entity",
trans_primitives=["negate"],
max_depth=1)
feature_matrix
Expand Down
22 changes: 16 additions & 6 deletions docs/source/release_notes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

Release Notes
-------------

Future Release
==============
* Enhancements
@@ -12,7 +13,6 @@ Future Release
         * Remove ``add_interesting_values`` from ``Entity`` (:pr:`1269`)
         * Move ``set_secondary_time_index`` method from ``Entity`` to ``EntitySet`` (:pr:`1280`)
         * Refactor Relationship creation process (:pr:`1370`)
-        * Add auto assign bot on GitHub (:pr:`1380`)
         * Replaced ``Entity.update_data`` with ``EntitySet.update_dataframe`` (:pr:`1398`)
         * Move validation check for uniform time index to ``EntitySet`` (:pr:`1400`)
         * Replace ``Entity`` objects in ``EntitySet`` with Woodwork dataframes (:pr:`1405`)
@@ -25,12 +25,7 @@ Future Release
         * Update ``EntitySet.concat`` to work with Woodwork DataFrames (:pr:`1490`)
         * Add function to list semantic tags (:pr:`1486`)
     * Documentation Changes
-        * Improve formatting of release notes (:pr:`1396`)
     * Testing Changes
-        * Update Dask/Koalas test fixtures (:pr:`1382`)
-        * Update Spark config in test fixtures and docs (:pr:`1387`, :pr:`1389`)
-        * Don't cancel other CI jobs if one fails (:pr:`1386`)
-        * Update boto3 and urllib3 version requirements (:pr:`1394`)

     Thanks to the following people for contributing to this release:
     :user:`gsheni`, :user:`jeff-hernandez`, :user:`rwedge`, :user:`tamargrey`, :user:`thehomebrewnerd`
@@ -157,6 +152,21 @@ You can list all the available semantic tags by calling ``featuretools.list_sema
     >>> ft.list_semantic_tags()
+v0.26.0 Jul 15, 2021
+====================
+    * Enhancements
+        * Add ``replace_inf_values`` utility function for replacing ``inf`` values in a feature matrix (:pr:`1505`)
+        * Add URLToProtocol, URLToDomain, URLToTLD, EmailAddressToDomain, IsFreeEmailDomain as transform primitives (:pr:`1508`, :pr:`1531`)
+    * Fixes
+        * ``include_entities`` correctly overrides ``exclude_entities`` in ``primitive_options`` (:pr:`1518`)
+    * Documentation Changes
+        * Prevent logging on build (:pr:`1498`)
+    * Testing Changes
+        * Test featuretools on pandas 1.3.0 release candidate and make fixes (:pr:`1492`)
+
+    Thanks to the following people for contributing to this release:
+    :user:`frances-h`, :user:`gsheni`, :user:`rwedge`, :user:`tamargrey`, :user:`thehomebrewnerd`, :user:`tuethan1999`
+
 v0.25.0 Jun 11, 2021
 ====================
     * Enhancements
5 changes: 5 additions & 0 deletions docs/source/set-headers.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
import urllib.request

opener = urllib.request.build_opener()
opener.addheaders = [("Testing", "True")]
urllib.request.install_opener(opener)
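
`set-headers.py` installs a process-wide opener, so every later `urllib.request.urlopen` call in the same interpreter sends a `Testing: True` header. The sketch below checks that behaviour against a throwaway loopback server. The handler class, the function name, and the empty `ProxyHandler` (added so the loopback request ignores proxy settings) are assumptions for illustration and are not part of the commit:

```python
import http.server
import threading
import urllib.request


class EchoTestingHeader(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler: replies with the value of the Testing header."""

    def do_GET(self):
        body = self.headers.get("Testing", "<missing>").encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_with_installed_opener():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = http.server.HTTPServer(("127.0.0.1", 0), EchoTestingHeader)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Same three steps as set-headers.py, plus an empty ProxyHandler.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        opener.addheaders = [("Testing", "True")]
        urllib.request.install_opener(opener)
        url = "http://127.0.0.1:{}/".format(server.server_port)
        # Plain urlopen now goes through the installed opener.
        with urllib.request.urlopen(url) as response:
            return response.read().decode()
    finally:
        server.shutdown()
        server.server_close()


header_seen_by_server = fetch_with_installed_opener()
```

Note that assigning to `addheaders` replaces the opener's default header list, so the default `User-agent` entry is no longer sent.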
1 change: 0 additions & 1 deletion featuretools/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,6 @@
import pkg_resources
import sys
import traceback
import warnings
from woodwork import list_logical_types, list_semantic_tags

logger = logging.getLogger('featuretools')
Expand Down
6 changes: 5 additions & 1 deletion featuretools/computational_backends/api.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,4 +3,8 @@
approximate_features,
calculate_feature_matrix
)
from .utils import bin_cutoff_times, create_client_and_cluster
from .utils import (
bin_cutoff_times,
create_client_and_cluster,
replace_inf_values
)
21 changes: 21 additions & 0 deletions featuretools/computational_backends/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
from functools import wraps

import dask.dataframe as dd
import numpy as np
import pandas as pd
import psutil

@@ -271,3 +272,23 @@ def _check_cutoff_time_type(cutoff_time, es_time_type):
     if es_time_type == "datetime_time_index" and not is_datetime:
         raise TypeError("cutoff_time times must be datetime type: try casting "
                         "via pd.to_datetime()")
+
+
+def replace_inf_values(feature_matrix, replacement_value=np.nan, columns=None):
+    """Replace all ``np.inf`` values in a feature matrix with the specified replacement value.
+
+    Args:
+        feature_matrix (DataFrame): DataFrame whose columns are feature names and rows are instances
+        replacement_value (int, float, str, optional): Value with which ``np.inf`` values will be replaced
+        columns (list[str], optional): A list specifying which columns should have values replaced. If None,
+            values will be replaced for all columns.
+
+    Returns:
+        feature_matrix
+
+    """
+    if columns is None:
+        feature_matrix = feature_matrix.replace([np.inf, -np.inf], replacement_value)
+    else:
+        feature_matrix[columns] = feature_matrix[columns].replace([np.inf, -np.inf], replacement_value)
+    return feature_matrix
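
A short usage sketch for `replace_inf_values`, assuming pandas and NumPy are installed. The function body is repeated from the hunk above only so that the block runs on its own; in a featuretools install it would presumably be imported from `featuretools.computational_backends`, as the `api.py` change suggests. The sample frame and its column names are invented for illustration:

```python
import numpy as np
import pandas as pd


def replace_inf_values(feature_matrix, replacement_value=np.nan, columns=None):
    # Same logic as the function added in this commit.
    if columns is None:
        feature_matrix = feature_matrix.replace([np.inf, -np.inf], replacement_value)
    else:
        feature_matrix[columns] = feature_matrix[columns].replace([np.inf, -np.inf], replacement_value)
    return feature_matrix


# Made-up feature matrix with positive and negative infinities.
fm = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [-np.inf, 2.0, np.inf]})

# Default call: every +/-inf in every column becomes NaN; fm is not modified.
all_replaced = replace_inf_values(fm)

# Restricting to column "a": only that column changes, and because the
# function assigns into the frame it was given, the input is modified too.
fm_copy = fm.copy()
only_a = replace_inf_values(fm_copy, replacement_value=0.0, columns=["a"])
```

The two branches differ in one respect worth knowing: without `columns` the caller gets a new frame back, while with `columns` the frame passed in is updated in place, so pass a copy if the original must stay intact.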
2 changes: 1 addition & 1 deletion featuretools/demo/flight.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ def load_flight(month_filter=None,
filename, csv_length = get_flight_filename(demo=demo)

print('Downloading data ...')
url = "https://api.featurelabs.com/datasets/{}?version={}".format(filename, ft.__version__)
url = "https://api.featurelabs.com/datasets/{}?library=featuretools&version={}".format(filename, ft.__version__)

chunksize = math.ceil(csv_length / 99)
pd.options.display.max_columns = 200
Expand Down
4 changes: 2 additions & 2 deletions featuretools/demo/retail.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,8 +59,8 @@ def load_retail(id='demo_retail_data', nrows=None, return_single_table=False):
'''
es = ft.EntitySet(id)
csv_s3_gz = "https://api.featurelabs.com/datasets/online-retail-logs-2018-08-28.csv.gz?version=" + ft.__version__
csv_s3 = "https://api.featurelabs.com/datasets/online-retail-logs-2018-08-28.csv?version=" + ft.__version__
csv_s3_gz = "https://api.featurelabs.com/datasets/online-retail-logs-2018-08-28.csv.gz?library=featuretools&version=" + ft.__version__
csv_s3 = "https://api.featurelabs.com/datasets/online-retail-logs-2018-08-28.csv?library=featuretools&version=" + ft.__version__
# Try to read in gz compressed file
try:
df = pd.read_csv(csv_s3_gz,
Expand Down