Investigate the most common errors #3053

Open
severo opened this issue Aug 28, 2024 · 0 comments

All cache entries that have a cause exception (excluding entries copied from another artifact), grouped by error code and cause exception and sorted by number of occurrences:

error_code cause_exception count
EmptyDatasetError EmptyDatasetError 17886
DataFilesNotFoundError DataFilesNotFoundError 9233
ComputationError StatisticsComputationError 6534
ComputationError TypeError 3063
DatasetGenerationCastError DatasetGenerationCastError 2403
PreviousStepStillProcessingError CachedArtifactNotFoundError 2160
DatasetGenerationError ArrowInvalid 2127
FeaturesError ArrowInvalid 1543
InfoError HfHubHTTPError 1354
DatasetGenerationError UnicodeDecodeError 1211
DatasetGenerationError TypeError 1176
UnexpectedError TypeError 1033
FeaturesError ValueError 969
DatasetGenerationError ArrowNotImplementedError 943
UnexpectedError HfHubHTTPError 896
FeaturesError UnicodeDecodeError 868
DatasetGenerationError ValueError 867
SplitNamesFromStreamingError SplitsNotFoundError 844
UnexpectedError ValueError 822
InfoError BrokenPipeError 817
InfoError SplitsNotFoundError 708
ConfigNamesError ImportError 651
UnexpectedError ParserException 621
UnexpectedError ReadTimeout 463
UnexpectedError BinderException 462
DatasetGenerationError ParserError 445
FeaturesError ParserError 436
ComputationError ZeroDivisionError 421
DatasetGenerationError SchemaInferenceError 387
UnexpectedError FileNotFoundError 343
ConfigNamesError ValueError 312
PolarsParquetReadError FileNotFoundError 273
FeaturesError ZstdError 266
StreamingRowsError ValueError 266
FileFormatMismatchBetweenSplitsError ValueError 221
UnexpectedError RuntimeError 207
UnexpectedError PermissionError 202
UnexpectedError UnidentifiedImageError 195
UnexpectedError BadZipFile 192
PolarsParquetReadError ComputeError 180
DatasetGenerationError FileNotFoundError 174
RowsPostProcessingError ValueError 168
ComputationError UnidentifiedImageError 166
UnexpectedError UnicodeDecodeError 163
UnexpectedError EntryNotFoundError 162
FeaturesError ArrowTypeError 134
UnexpectedError ConnectionError 133
StreamingRowsError CastError 132
DatasetGenerationError KeyError 131
ComputationError SchemaError 124
ConfigNamesError BadZipFile 111
DatasetGenerationError ArrowTypeError 108
UnexpectedError ArrowInvalid 104
DatasetGenerationError CastError 104
RowsPostProcessingError KeyError 103
StreamingRowsError OSError 94
UnexpectedError ColumnNotFoundError 93
StreamingRowsError RuntimeError 89
RowsPostProcessingError UnidentifiedImageError 86
UnexpectedError SchemaError 83
ConfigNamesError FileNotFoundError 82
StreamingRowsError ArrowInvalid 80
UnexpectedError ReadError 79
StreamingRowsError KeyError 78
UnexpectedError ClientResponseError 77
UnexpectedError ComputeError 74
InfoError DatasetWithScriptNotSupportedError 70
RowsPostProcessingError TypeError 68
InfoError DatasetNotFoundError 67
CreateCommitError RepositoryNotFoundError 65
FeaturesError EmptyDataError 65
UnexpectedError InvalidInputException 58
UnexpectedError ServerDisconnectedError 54
ComputationError ValueError 54
UnexpectedError IOException 52
UnexpectedError ConversionException 50
ConfigNamesError TypeError 47
InfoError ReadTimeout 42
DatasetGenerationError EmptyDataError 42
StreamingRowsError TypeError 40
UnexpectedError KeyError 38
PolarsParquetReadError ColumnNotFoundError 38
DatasetGenerationError ConnectionError 38
DatasetGenerationError BadZipFile 37
StreamingRowsError UnicodeDecodeError 36
UnexpectedError NonMatchingSplitsSizesError 36
UnexpectedError OSError 34
DatasetGenerationError ArrowCapacityError 33
UnexpectedError ArrowTypeError 33
SplitNamesFromStreamingError FileNotFoundError 33
DatasetGenerationError GatedRepoError 33
InfoError ConnectionError 31
RowsPostProcessingError CouldntDecodeError 30
DatasetGenerationError OverflowError 26
StreamingRowsError UnidentifiedImageError 25
FeaturesError OverflowError 22
StreamingRowsError LibsndfileError 21
UnexpectedError ArrowCapacityError 20
UnexpectedError DecompressionBombError 20
StreamingRowsError FileNotFoundError 19
UnexpectedError NotImplementedError 18
RowsPostProcessingError OSError 18
UnexpectedError ZeroDivisionError 18
ComputationError DecompressionBombError 18
NormalRowsError DatasetGenerationError 18
DatasetGenerationError ReadError 17
UnexpectedError Error 17
FeaturesError RuntimeError 16
UnexpectedError DatasetGenerationError 15
UnexpectedError IndexError 14
DatasetGenerationError RuntimeError 13
UnexpectedError JSONDecodeError 13
UnexpectedError ParserError 12
DatasetGenerationError NotImplementedError 12
ComputationError ArrowInvalid 12
StreamingRowsError NotImplementedError 11
StreamingRowsError ParserError 11
ConfigNamesError InvalidConfigName 11
DatasetGenerationError ArrowIndexError 11
ComputationError DuplicateError 11
StreamingRowsError ArrowNotImplementedError 10
FeaturesError HfHubHTTPError 10
FeaturesError AttributeError 9
CreateCommitError BadRequestError 8
StreamingRowsError HfHubHTTPError 8
NormalRowsError FileNotFoundError 8
FeaturesError UnsupportedOperation 8
UnexpectedError FSTimeoutError 8
ConfigNamesError AttributeError 8
UnexpectedError AttributeError 8
DatasetGenerationError OSError 7
FeaturesError ArrowCapacityError 7
DatasetGenerationError AttributeError 7
ConfigNamesError ReadTimeout 7
CreateCommitError EntryNotFoundError 6
FeaturesError BadGzipFile 6
FeaturesError BadZipFile 6
NormalRowsError DatasetGenerationCastError 6
ComputationError InvalidOperationError 6
ComputationError OverflowError 5
UnexpectedError HTTPError 5
InfoError ValueError 5
DatasetGenerationError EOFError 4
InfoError FileNotFoundError 4
FeaturesError HTTPError 4
UnexpectedError TypeMismatchException 4
NormalRowsError HfHubHTTPError 4
UnexpectedError DatasetGenerationCastError 4
UnexpectedError InvalidOperationError 4
UnexpectedError ClientConnectorError 4
StreamingRowsError ReadError 3
ConfigNamesError UnicodeDecodeError 3
FeaturesError NotImplementedError 3
PolarsParquetReadError error 3
NormalRowsError OSError 3
UnexpectedError ExpectedMoreSplits 3
FeaturesError FileNotFoundError 3
DatasetGenerationError HfHubHTTPError 3
UnexpectedError error 3
UnexpectedError UnpicklingError 3
StreamingRowsError AssertionError 3
StreamingRowsError EmptyDataError 3
StreamingRowsError EntryNotFoundError 2
DatasetGenerationError UnsupportedOperation 2
UnexpectedError InternalException 2
DatasetGenerationError error 2
ConfigNamesError ScannerError 2
StreamingRowsError ArrowCapacityError 2
RetryableConfigNamesError HfHubHTTPError 2
StreamingRowsError DecompressionBombError 2
SplitNamesFromStreamingError HfHubHTTPError 2
ConfigNamesError JSONDecodeError 2
ConfigNamesError KeyError 2
InfoError DataFilesNotFoundError 2
DatasetGenerationError EntryNotFoundError 2
InfoError BadZipFile 2
UnexpectedError IsADirectoryError 2
UnexpectedError DuplicateError 2
FeaturesError KeyError 1
RowsPostProcessingError SyntaxError 1
CreateCommitError HfHubHTTPError 1
FeaturesError ConnectionError 1
StreamingRowsError error 1
StreamingRowsError OverflowError 1
StreamingRowsError HTTPError 1
StreamingRowsError ArrowTypeError 1
StreamingRowsError AttributeError 1
UnexpectedError ChunkedEncodingError 1
UnexpectedError EmptyDatasetError 1
RowsPostProcessingError LibsndfileError 1
UnexpectedError TransactionException 1
UnexpectedError EOFError 1
SplitNamesFromStreamingError ConnectionError 1
InfoError JSONDecodeError 1
UnexpectedError ClientPayloadError 1
FeaturesError ReadTimeout 1
UnexpectedError EmptyDataError 1
ConfigNamesError IsADirectoryError 1
DatasetGenerationError AssertionError 1
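
MongoDB aggregation used to produce this list:

```js
db.cachedResponsesBlue.aggregate([
  {
    // Keep only entries that were not copied from another artifact
    // and that record a cause exception.
    $match: {
      "details.copied_from_artifact": { $exists: false },
      "details.cause_exception": { $exists: true },
    },
  },
  {
    // Count entries per (error_code, cause_exception) pair.
    $group: {
      _id: {
        error_code: "$error_code",
        cause_exception: "$details.cause_exception",
      },
      count: { $sum: 1 },
    },
  },
  {
    // Most frequent pairs first.
    $sort: { count: -1 },
  },
]);
```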
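
The same cause exception often appears under several error codes (for example `TypeError` or `ArrowInvalid`), so it may help to roll the grouped output up by `cause_exception` alone. The following is a minimal sketch in plain JavaScript, assuming the aggregation result has been loaded as an array of `{ _id: { error_code, cause_exception }, count }` documents. The `rollUpByCause` helper is hypothetical, and the sample rows are only a small subset of the list above.

```javascript
// Hypothetical helper (not part of the project): sums the counts of each
// cause_exception across all error codes.
function rollUpByCause(rows) {
  const totals = new Map();
  for (const { _id, count } of rows) {
    const cause = _id.cause_exception;
    totals.set(cause, (totals.get(cause) ?? 0) + count);
  }
  // Sort by descending total, mirroring the $sort stage of the query.
  return [...totals.entries()]
    .map(([cause_exception, count]) => ({ cause_exception, count }))
    .sort((a, b) => b.count - a.count);
}

// Subset of the list above, in the shape produced by the $group stage.
const sampleRows = [
  { _id: { error_code: "ComputationError", cause_exception: "TypeError" }, count: 3063 },
  { _id: { error_code: "DatasetGenerationError", cause_exception: "ArrowInvalid" }, count: 2127 },
  { _id: { error_code: "FeaturesError", cause_exception: "ArrowInvalid" }, count: 1543 },
  { _id: { error_code: "DatasetGenerationError", cause_exception: "TypeError" }, count: 1176 },
  { _id: { error_code: "UnexpectedError", cause_exception: "TypeError" }, count: 1033 },
];

const rolledUp = rollUpByCause(sampleRows);
console.log(rolledUp);
```

On the full result set, the same roll-up could also be done server-side by grouping on `"$details.cause_exception"` only; the in-memory version is shown here because it works on an exported copy of the list.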

severo added the P2 Nice to have error handling labels on Aug 28, 2024