GTFS-ride aggregation and GTFS versioning #25
Comments
Why not just use git? WRI and AFD are experimenting with this by proposing a GitLab instance as data infrastructure for GTFS-producing projects in developing countries. Versioning of the GTFS is built in! (Example for Accra)
Re versioning: It seems to me that versioning "history" (GTFS-RIDE's main function) is a problem with fewer dimensions than versioning different potential futures. Just two that I can think of, in fact:
In the instance of a specification change, I would say that it should be up to the interpreting software to verify and accept different specification versions; AND, per GTFS theory, any new specification should be backward compatible. Therefore I see less of a reason to maintain previous specification versions. In the case of correcting an error, I do believe it is very important to keep old versions because people will have referenced them. Therefore, it is fairly straightforward to keep a file in a single git repo with a single branch, advancing (and appropriately tagging) for whichever type of commit you are making (error vs. format). BUT... GTFS-RIDE is also a format that can be used for "the future", and here is where I think it gets really dicey (as you can see in the preso that @antrim referenced). SO many potential dimensions. Le sigh.
W.r.t. "best practice" around file management, I think we should consider that smaller file sizes are better in general because:
AKA: we should be storing files at the size at which they are created (likely one per day, IMHO) and then aggregating them at regular intervals.
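To make the "store daily, aggregate at intervals" idea concrete, here is a minimal sketch, not a reference implementation. It assumes each daily file is a CSV shaped like GTFS-ride's board_alight.txt with `trip_id`, `stop_id`, `boardings`, and `alightings` columns. The function name `aggregate_daily` and the sample rows are hypothetical, and a real feed would carry more fields.

```python
import csv
import io
from collections import defaultdict


def aggregate_daily(daily_files):
    """Sum boardings and alightings per (trip_id, stop_id) across daily files.

    daily_files: iterable of open text handles, one per day (assumed layout).
    Returns a dict mapping (trip_id, stop_id) -> (boardings, alightings).
    """
    totals = defaultdict(lambda: [0, 0])
    for handle in daily_files:
        for row in csv.DictReader(handle):
            key = (row["trip_id"], row["stop_id"])
            totals[key][0] += int(row["boardings"])
            totals[key][1] += int(row["alightings"])
    return {key: tuple(counts) for key, counts in totals.items()}


# Two hypothetical daily files, held in memory for the example.
day1 = io.StringIO("trip_id,stop_id,boardings,alightings\nt1,s1,5,0\nt1,s2,2,3\n")
day2 = io.StringIO("trip_id,stop_id,boardings,alightings\nt1,s1,4,1\n")

weekly = aggregate_daily([day1, day2])
```

Keeping the daily files as the stored unit means a correction only touches one small file, and the aggregate can be regenerated from them at any time.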
As in the presentation from @e-lo, this is an issue for which GTFS-ride doesn't have a solution. The current best practice, which the project team has been using to create pilot GTFS-ride feeds, follows the process in the comment from @ODOT-RPTD-mb referenced above. The most glaring issue arises when a new feed is published to correct an error in a previous feed. There is currently no mechanism to indicate which feed the ridership data should be associated with when dates overlap. The most recently published feed is assumed "active" from its date until a subsequently published feed supersedes it. It seems this issue stems from the fact that GTFS is intended to be a forward-looking plan for anticipated services, whereas GTFS-ride needs a historical account of the services that were actually offered. The frequent publishing of new GTFS feeds is another issue, contributing to the cumbersome need to handle many large GTFS-ride feeds. It seems a merged, corrected "GTFS-retro" feed is what is desired. Using GTFS-realtime together with GTFS to create such a dataset was an intriguing idea, but it is probably still far off. I like the git idea as well, but this sounds like a broader issue with GTFS practices than one that can be solved here. @antrim should this issue be closed, or do you feel that more action is needed here?
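The "most recently published feed is active until superseded" rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: feeds are represented as hypothetical `(published_on, feed_id)` tuples, and the function name `active_feed` and the feed IDs are made up for the example.

```python
from bisect import bisect_right
from datetime import date


def active_feed(feeds, service_date):
    """Return the ID of the feed assumed active on service_date, or None.

    feeds: list of (published_on, feed_id) tuples (hypothetical layout).
    The active feed is the latest one published on or before service_date.
    """
    ordered = sorted(feeds)
    published_dates = [published for published, _ in ordered]
    idx = bisect_right(published_dates, service_date)
    return ordered[idx - 1][1] if idx else None


feeds = [
    (date(2019, 1, 1), "feed-a"),
    (date(2019, 3, 1), "feed-b"),
    (date(2019, 2, 1), "feed-a-corrected"),
]

print(active_feed(feeds, date(2019, 2, 15)))
```

Note the limitation this makes visible: ridership recorded in January still resolves to "feed-a", even though "feed-a-corrected" was published to fix it, because nothing in the rule lets a later feed apply retroactively.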
@carletop I have been thinking about assessing demand based on GTFS-ride data. One thing that would help to estimate demand is an update on the demand for the previous vehicle to pass the same stop. This led me to think about the issue you describe here, and in particular your comments about GTFS being forward-looking and GTFS-ride looking back. I have in the past used CapMetrics as a source of data for working on predictions. The way they gathered the data from GTFS-realtime vehicle locations and posted it to Git was very useful. It would be very useful if there were a standard way of providing the data corresponding to a row of board_alight.txt in real time (on doors close). This could then be archived in Git along with the currently active GTFS, and the Git repository and the realtime feed could in turn be used to further inform demand predictions. Is this something that would be possible using current APC systems?
@scrudden One key challenge for archiving occupancy from GTFS-realtime today is that GTFS-rt only supports a high-level enumeration of occupancy, with values like "MANY_SEATS_AVAILABLE", "FEW_SEATS_AVAILABLE", etc. There is currently a proposal being drafted that would allow more details about a vehicle, including more granular quantitative occupancy, to be expressed in GTFS and GTFS-rt. I'd welcome comments and ideas from everyone on the current draft spec.
@barbeau Where is the best place to comment on gtfs-vehicles? Directly in the Google Doc? From what I have read so far, the proposal seems to capture occupancy well, but for my intended purpose I would like to know the number of passengers boarding and alighting at each stop.
Yes, just comment in the Google Doc right now.
More documentation and consideration of how this data should/would be managed, stored, and amended over the course of time would be useful.