BUG Empty sitemap files possible with multi-paged objects #61

Open
tractorcow opened this issue Jan 5, 2014 · 0 comments

When a registered DataObject class overrides the canView method (for example, when posts must be moderated before viewing, or expire automatically), it's possible for a sitemap file to be generated that's completely empty.

For example, I have a DataObject, Classified, that expires after a set number of days.

My sitemap.xml file looks like this:

<sitemapindex>
    <sitemap>
        <loc>http://www.mysite.co.nz/sitemap.xml/sitemap/SiteTree/1</loc>
        <lastmod>2013-09-19</lastmod>
    </sitemap>
    <sitemap>
        <loc>http://www.mysite.co.nz/sitemap.xml/sitemap/Classified/1</loc>
        <lastmod>2013-12-09</lastmod>
    </sitemap>
    <sitemap>
        <loc>http://www.mysite.co.nz/sitemap.xml/sitemap/Classified/2</loc>
        <lastmod>2014-01-01</lastmod>
    </sitemap>
    <sitemap>
        <loc>http://www.mysite.co.nz/sitemap.xml/sitemap/Classified/3</loc>
        <lastmod>2014-01-05</lastmod>
    </sitemap>
</sitemapindex>

In this case the older classifieds still exist as DataObjects but return false from canView (as they should), so they do not appear in the individual sitemaps. The index still lists three pages, though, which means the last two sitemaps are blank: only Classified/1 has any entries, and Google reports errors for /2 and /3.
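The effect can be reproduced outside the module with a small, framework-independent Python sketch. Everything in it is an assumption made for illustration (the page size of 2, the `can_view` flag, the helper names); it is not the GoogleSitemap code, only a model of segmenting an unfiltered list into pages and applying the view check afterwards.

```python
from dataclasses import dataclass


@dataclass
class Classified:
    """Hypothetical stand-in for the Classified DataObject."""
    id: int
    can_view: bool  # models the result of canView()


def paginate(items, page_size):
    """Split a list into consecutive pages of at most page_size items."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


def pages_filtered_after_paging(items, page_size):
    """Page count comes from the unfiltered list; the view check runs per page."""
    return [[c for c in page if c.can_view] for page in paginate(items, page_size)]


# Two viewable classifieds and four expired ones (illustrative data only).
classifieds = [Classified(i, can_view=(i <= 2)) for i in range(1, 7)]
pages = pages_filtered_after_paging(classifieds, page_size=2)

for number, page in enumerate(pages, start=1):
    print(f"Classified/{number}: {[c.id for c in page]}")
```

Three pages are produced because six records exist, but only the first page holds any viewable entries.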

A fix could either be:

  • The list of items is filtered by canIncludeInGoogleSitemap (least efficient, since it is O(n): every item has to be checked individually).
  • Set some kind of extension in GoogleSitemap::get_sitemaps and GoogleSitemap::get_items to augment the DataList prior to page segmentation. Maybe something like singleton($class)->filterGoogleSitemapItems($list);
  • As a variation of the second option, there could perhaps be some kind of query parameter on the DataList that's picked up by GoogleSitemapExtension.

The first option is the simplest, but least efficient, while the second allows items to be pre-filtered at the point of query, but requires additional code on the user's part.
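As a rough illustration of the second option, here is a framework-independent Python sketch in which an optional pre-filter runs before the list is segmented into pages. The names `filter_sitemap_items` and `sitemap_pages`, the `expired` field, and the page size are all hypothetical; in the module itself the hook would presumably narrow the DataList query rather than loop over records as this toy version does.

```python
def paginate(items, page_size):
    """Split a list into consecutive pages of at most page_size items."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


def filter_sitemap_items(items):
    """Hypothetical hook, analogous to the proposed filterGoogleSitemapItems($list)."""
    return [item for item in items if not item["expired"]]


def sitemap_pages(items, page_size, pre_filter=None):
    """Apply the optional pre-filter before segmenting, so every page has entries."""
    if pre_filter is not None:
        items = pre_filter(items)
    return paginate(items, page_size)


# Two live records and four expired ones (illustrative data only).
records = [{"id": i, "expired": i > 2} for i in range(1, 7)]
pages = sitemap_pages(records, page_size=2, pre_filter=filter_sitemap_items)

print([[r["id"] for r in page] for page in pages])
```

Because the page count is derived from the already-filtered list, the index would only ever reference pages that contain at least one item.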

As an aside, I apologise for the lack of attention to my other issue; I'll hopefully get more time in the coming months to review my outstanding PRs and issues. :)
