Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataset: fix bug revealed by pandas 1.0 #561

Merged
merged 9 commits into from
Feb 8, 2020

Conversation

mattwelborn
Copy link
Contributor

Description

CI is failing. This seems to be caused by df = df.groupby(["name"])["return_result"].sum(skipna=True). It appears that skipna was never a valid kwarg of groupby().sum(), but that it was not checked until pandas 1.0. This PR removes skipna=True; the default behavior is to skip NaNs already.

Changelog description

Fixed a bug that caused errors with pandas v1.0.

Status

  • Code base linted
  • Ready to go

@dgasmith
Copy link
Contributor

dgasmith commented Feb 6, 2020

Can we bump minimum versions?

@mattwelborn
Copy link
Contributor Author

Can we bump minimum versions?

Of pandas? This PR should work with pre-1.0 pandas just fine. skipna was silently ignored in pre-1.0 pandas, so the behavior is the same.

@mattwelborn
Copy link
Contributor Author

@dgasmith Any idea what might be going on here:

_____________________________ test_mol_pagination ______________________________
storage_socket = <qcfractal.storage_sockets.sqlalchemy_socket.SQLAlchemySocket object at 0x7f5a1daab518>
    def test_mol_pagination(storage_socket):
        """
            Test Molecule pagination
        """
    
        assert len(storage_socket.get_molecules()["data"]) == 0
        mol_names = [
            "water_dimer_minima.psimol",
            "water_dimer_stretch.psimol",
            "water_dimer_stretch2.psimol",
            "neon_tetramer.psimol",
        ]
    
        total = len(mol_names)
        molecules = []
        for mol_name in mol_names:
            mol = ptl.data.get_molecule(mol_name)
            molecules.append(mol)
    
        inserted = storage_socket.add_molecules(molecules)
    
        assert inserted["meta"]["n_inserted"] == total
    
>       ret = storage_socket.get_molecules(skip=1)
qcfractal/tests/test_storage.py:1198: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
qcfractal/storage_sockets/sqlalchemy_socket.py:732: in get_molecules
    MoleculeORM, query, limit=limit, skip=skip, exclude=["molecule_hash", "molecular_formula"]
qcfractal/storage_sockets/sqlalchemy_socket.py:382: in get_query_projection
    rdata = [dict(zip(_projection, row)) for row in data]
qcfractal/storage_sockets/sqlalchemy_socket.py:382: in <listcomp>
    rdata = [dict(zip(_projection, row)) for row in data]
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/orm/loading.py:101: in instances
    util.raise_from_cause(err)
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/util/compat.py:398: in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/util/compat.py:153: in reraise
    raise value
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/orm/loading.py:85: in instances
    for row in fetch
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/orm/loading.py:85: in <listcomp>
    for row in fetch
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/orm/loading.py:84: in <listcomp>
    keyed_tuple([proc(row) for proc in process])
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/sql/type_api.py:1266: in process
    return process_value(impl_processor(value), dialect)
qcfractal/storage_sockets/models/sql_base.py:24: in process_result_value
    return msgpackext_loads(value)
../../../miniconda/envs/test/lib/python3.6/site-packages/qcelemental/util/serialization.py:113: in msgpackext_loads
    return msgpack.loads(data, object_hook=msgpackext_decode, raw=False)
msgpack/_unpacker.pyx:184: in msgpack._cmsgpack.unpackb
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>   ???
E   TypeError: a bytes-like object is required, not 'NoneType'
msgpack/_unpacker.pyx:135: TypeError

@dgasmith
Copy link
Contributor

dgasmith commented Feb 6, 2020

@mattwelborn Yea, sparse molecule bug.

@doaa-altarawy What do you think of the fix?

@mattwelborn
Copy link
Contributor Author

@dgasmith @doaa-altarawy I think we're back to database bugs now.

@codecov
Copy link

codecov bot commented Feb 7, 2020

Codecov Report

Merging #561 into master will decrease coverage by 3.16%.
The diff coverage is 94.11%.

@dgasmith
Copy link
Contributor

dgasmith commented Feb 7, 2020

@mattwelborn Can you reproduce the single failure? I cannot seem to manage it locally.

Copy link
Contributor

@dgasmith dgasmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging, this moves the bar forward. I will see if I can track down the remaining error.

@dgasmith dgasmith merged commit 01e3553 into MolSSI:master Feb 8, 2020
@dgasmith dgasmith mentioned this pull request Feb 8, 2020
@dgasmith dgasmith added this to the v0.13.1 milestone Feb 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants