Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HSP-based heuristics are producing bad BGP orderings #123

Open
kasei opened this issue Dec 22, 2014 · 4 comments
Open

HSP-based heuristics are producing bad BGP orderings #123

kasei opened this issue Dec 22, 2014 · 4 comments
Assignees

Comments

@kasei
Copy link
Owner

kasei commented Dec 22, 2014

The HSP BGP ordering introduced in pull request #114 seem to be producing bad triple orderings. In this query:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?server (count(?server) AS ?count)
WHERE {
    GRAPH ?g {
        ?site <urn:app:freshtime:hard> ?hard ;
            <urn:app:hasrequest> ?request .
        ?request <http://www.w3.org/2007/ont/http#hasResponse> ?response .
        ?response <http://www.w3.org/2007/ont/httph#server> ?server .
        FILTER (xsd:integer(?hard) > 0)
    }
}
GROUP BY ?server
ORDER BY ?count

the triple ordering produces a cartesian product between (1⋈2) and 3:

(1) ?site <urn:app:freshtime:hard> ?hard
(2) ?site <urn:app:hasrequest> ?request
(3) ?response <http://www.w3.org/2007/ont/httph#server> ?server
(4) ?request <http://www.w3.org/2007/ont/http#hasResponse> ?response
@kasei
Copy link
Owner Author

kasei commented Dec 22, 2014

I think it might be good for the built-in BGP planner to do simple analysis and to either error or emit loud warnings when a cartesian product occurs when it's possible to avoid it.

@kjetilk
Copy link
Contributor

kjetilk commented Dec 22, 2014

Ouch, my bad!

Could you please write the warnings, and then revert the HSP code?

I'll see if I can find some time to make a real fix. I have an idea on how to do it, but don't know when...

kjetilk added a commit to kjetilk/perlrdf that referenced this issue Dec 22, 2014
Note that it may not be the final tests
kjetilk added a commit to kjetilk/perlrdf that referenced this issue Dec 22, 2014
kjetilk added a commit to kjetilk/perlrdf that referenced this issue Dec 22, 2014
@kjetilk
Copy link
Contributor

kjetilk commented Dec 22, 2014

I tried as a quick-fix to reintroduce the code that was in sort_for_join_variables into merge_patterns. It seems harder to test, because it produces different plans for every invocation, it seems. But also, it seems to remove some of the benefit of the heuristics, as some of the plans are rather bushy, and it moves some of the single-variable patterns around.

It does produce the right plan for my case though. I don't know if you want to merge this, or just revert for now? I fear it may take some before I can find time to implement the last heuristic.

@kjetilk
Copy link
Contributor

kjetilk commented Dec 22, 2014

I'm thinking about simplifying the problem further. First, it should be the case that the first block of triple patterns the merge_pattern method gets all have a common variable. There may be more than one such block. The problem may be that between these blocks, the common variables aren't adjacent, in the sense that the triple pattern that has a common variable between the blocks may not be the last in the first block and first in the second block.

So, the question is how this fact can be used in a quick-fix...

At the very least, it should be possible to just take the triple patterns with less than 2 variables, and not touch them in the first pattern, I suppose. But is it possible to do something more clever than that for a quick-fix?

@kasei kasei self-assigned this Jan 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants