-
-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Catch Xapian::DatabaseError in libzim code. #873
Conversation
Either throw a ZimFileFormatError or directly handle the error (mainly returning default value or ignoring corrupted databases)
e109756
to
75c47ab
Compare
What about xapian::Error? It should be caught as weel!? |
If we have Xapian error, it is because of a bug in our code. Not because of an external cause (as a corrupted database). |
} catch( Xapian::DatabaseError& e ) { | ||
// [TODO] Ignore the database or raise a error ? | ||
// As we already ignore the database if `getDbFromAccessInfo` "detects" a DatabaseError, | ||
// we also ignore here. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (as well as the pre-existing handling of database errors in getDbFromAccessInfo
) is a little inconsistent with the henceforth explicit handling of database errors in the rest of this PR.
Is there some kind of an implicit rule of when errors should be suppressed and when they should be allowed to bubble up? If yes, it must be made explicit. Otherwise, we'd better first come up with such a rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, we'd better first come up with such a rule.
May be time to do so.
I tend to say that we should suppres the "Format" error (corrupted archives...) and we should bubble the error (potentially using only one exception class (ZimFileFormatError)).
(But I'm open to your opinion)
But in this Pr, I haven't try to enforce this rule and try to keep the same "API" as before : when errors were suppressed I continue suppress it and when exceptions were bubble up, I have just throw a ZimFileFormatError instead.
@@ -278,6 +285,8 @@ int Search::getEstimatedMatches() const | |||
return mset.get_matches_estimated(); | |||
} catch(Xapian::QueryParserError& e) { | |||
return 0; | |||
} catch(Xapian::DatabaseError& e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be Xapian::DatabaseCorruptError
? Although the latter seems to be the only subclass of Xapian::DatabaseError
that we may expect to catch here (and in most other locations changed by this PR), if another type of a Xapian::DatabaseError
is introduced in the future it may be wrongly converted to zim::ZimFileFormatError
. The question then may seem to be what should we do with such a new type of a Xapian exception, but it can be answered right away since there already exist other types of Xapian exceptions not derived from Xapian::DatabaseError
. If we let such exceptions propagate outside libzim
, doing the same for a future new type of error shouldn't be a problem. However, under such logic why do we have to make an exception (pun intended) for DatabaseCorruptError
exceptions and convert them to zim::ZimFileFormatError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should catch at least Xapian::DatabaseCorruptError
and Xapian::DatabaseVersionError
. The common parent exception is Xapian::DatabaseError
.
It is a trade of between simplification of code (and maybe catch too many things) and specific catch (but doubled catch
block)
@veloman-yunkan This PR is the last blocking for next release. Do you agree with the following proposition[1]: Libzim must always throw exception inheriting from When we return a default value or throw an exception is open to discussion (and can involve in time). If both @veloman-yunkan and @kelson42 agree, I suggest we merge this PR soon (assuming it actually fix Xapian exception handling to comply with [1]) to make the release and discuss about dubious default value in separated issue. |
I don't have any objections against merging this PR if it eliminates some roadblock. My immediate concern is that these changes - though quite simple - are untested (at least, not by the unit tests). Regarding the conversion of Having written the above in general terms I tried to check how it applies to other 3rd party dependencies. Consider that a compressed cluster is corrupted. Currently 3rd party decompression code ( |
Agree. Even if not easy to test. (We should probably investigate fuzz testing in a general way)
Well, it is the same case for any of our third party dependencies. If zstd is buggy and cannot uncompress valid content, we cannot really recover from that. The only solution to detect if data is corrupted would be to check the zim archive (using checksum), but we will not handle it at the decompression (nor xapian search) level. I see two ways here:
I tend to follow the first path. We all know that all code can be buggy, but when we use it we first assume it is not. And we already take the responsibility to use it, we can (and should) take the responsibility on the error handling and labeling.
I agree here. We should rethink this too to be consistent. Please note that the original issue is that Xapian::Error do not inherit from std::exception. I have my preference about what to throw (explain above) but in itself, it is not such important. What is important is that user don't have to depend/link on xapian to be able to handle errors coming from our code (or called by our code). |
Either throw a ZimFileFormatError or directly handle the error (mainly returning default value or ignoring corrupted databases)
Fix #870