-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Request pg upgrade for Data 101 Hub #5999
Comments
@cycomachead can you confirm if this is for the client, server, or both? Also in the case of the server, do you have any concerns about upgrading from 14->16 with the existing volumes created using 14? Typically when you upgrade postgresql major versions there's a dump and restore procedure that must be performed. We wouldn't be doing that in this case so I'm not sure if that would result in the volumes being corrupted. The solution for fixing a corrupted volume in this case would likely be to delete it and let K8s provision an empty one. |
For a client upgrade, the client is currently installed via apt. In order to get the postgresql16 client we'd likely be pulling the package installation out of apt.txt and we'd place it in environment.yml to be deployed via conda which provides 16.4. |
Ideally both the client and the server — I’ve always kept them in sync. Mostly when it comes to any pg specific course materials, it’d be nice to have things updated. (And supposedly COPY is quite improved. since that’s how students get data into their DBs for each assignment, it’d be nice to have.) plus there’s a few features around exploring performance that could be nice. As far as current volumes, I wasn’t aware that there were any that students use? Do you know what persistent DBs exist? I’ve been operating under the model that the dbs themselves are transient. (Earlier slack discussions not withstanding about different assignments.)-- Michael BallFrom my iPhonehttps://mball.coLecturer UC Berkeley EECS | Travel Advisor Fora On Aug 22, 2024, at 5:43 PM, felder ***@***.***> wrote:
https://docs.datahub.berkeley.edu/admins/howto/rebuild-postgres-image.html
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>
|
@cycomachead so the way the student postgresql servers work is there is another container that runs in the pod which hosts each student's postgresql server. Additionally, a separate 4GB volume is provisioned for postgresql for each student. Those volumes are persistent (we delete them at the end of each semester). Mongodb works the same as well and it gets a 2GB volume. For example I do this:
Then restart my server and do this, we can see the information persists:
My concern here is if you or your teaching staff have already done some prep work within postgresql, upgrading the server might not work smoothly because the persistent volume contains a postgresql 14 db. Until I try this in staging, I won't really know what it's going to do. The remedy for an issue here would be removal of the existing persistent volumes so another, empty one, would be provisioned. As for how many there are, whenever someone signs into data101.datahub.berkeley.edu these volumes are provisioned for them if they do not exist. Last week as part of our routine maintenance I removed all of the volumes except ones owned by you and lisa. Currently, there are 19 of them. If your course treats these as ephemeral (which IMO is a very good idea since we don't archive them and support for an issue with a specific volume would be limited), then I'd assume there's not an issue with me deleting all of data101s volumes as part of the db upgrade. Can you confirm that you and lisa don't have any data within your postgresql volumes that you'd miss? If so, I'll proceed. |
In the meantime I can work on upgrading the client since the client is backwards compatible with the server. |
I see, thanks for the explanation! The persistence certainly makes sense
for ease / performance, but it's not a strict requirement. The new
volumes are likely from the accessibility workshop earlier in the week
where folks were using the a11y tool. I don't think there's anyone doing pg
stuff.
So, yes, they're safe to delete! So far, the work we've been doing is all
testing, many (?, at least some) of the assignments that students do
explicitly recreate the db if one exists, since in the process of testing,
etc. it's not too hard for a student to hose their db. Basically, the
current setup provides students with a compressed *.sql file which is
loaded. (Perhaps we should come up with a db naming scheme so the uses
would be more clear, but my understanding is that we never ask students to
directly use the jovyan or postgres dbs, and for us, there's nothing
specific to the template0 or 1 dbs that would matter)
[Separately, from the other Slack discussion, I think there's a use for
limited persistent / shared DBs, but given that'd be a somewhat different
use case of just rapidly getting students working on writing queries
without thinking about pg, connecting to a db, etc. But that's a future
deal, and a specific type of case where I'd be fine just recreating those
as needed]
--
Michael Ball | michaelball.c <https://michaelball.co/>
Lecturer | UC Berkeley Computer Science
Software Engineer | Gradescope
<https://www.gradescope.com/?utm_source=michael_ball>
m: (909) 993-3988 | CV <https://www.michaelball.co/resume>
…On Fri, Aug 23, 2024 at 12:23 PM felder ***@***.***> wrote:
In the meantime I can work on upgrading the client since the client is
backwards compatible with the server.
—
Reply to this email directly, view it on GitHub
<#5999 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AALPU43M7YUITKRIQZUCFN3ZS6D3XAVCNFSM6AAAAABM7EJ65SVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMBXGY4DGMZUGA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@cycomachead unfortunately I'm encountering significant resistance with the client install. Currently there is no easy mechanism for us to pull in postgresql 16 from the pgdg repo (https://www.postgresql.org/download/linux/ubuntu/) via apt and so I was trying to see I could do it via conda. However, I'm having significant dependency conflicts when attempting to run The way to do this via apt would be to convert the data101 hub build process to use a Dockerfile so that we could add the repo and install from there. However, that is a somewhat significant task which would also require significant testing and that is not really practical this close to the start of instruction. I'm going to open up an internal ticket for this and invite some others on the team to take a look at it, but I'm not super optimistic. One technique is to attempt to do this on a running server and modify the installed packages accordingly to try and fix the dependency conflicts. However, I do not feel comfortable uninstalling packages or trying a bunch of different versions of packages because it's indeterminate what the impact of doing that would be on the notebooks your course relies on that use these packages. Here is sample output of what I see when attempting to install postgresql:
One possibility is to have someone on your course staff attempt to resolve the dependencies while testing to make sure all of your course notebooks continue to work, and then submitting a PR with a new list of packages and their locked versions. However, let's see if anyone else on the team has better success or other ideas. |
@cycomachead We will be discussing the a) issues reported by Jon F and b) the next steps for this request during our planning meeting on Sep 3rd. Sorry that we couldn't offer any resolution before the start of the semester. |
No worries, appreciate the update! |
@cycomachead During our discussion on Tuesday, we determined that, based on our current bandwidth, we will only be able to address this request before the start of SP 25. If you and Lisa are planning to offer Data 101 during SP 25, we can schedule a quick meeting in November or December to discuss the requirements and plan the implementation details. Let us know! |
November / December are totally fine |
Package Name
postgres 16 latest (upgrade from 14.5)
Hub URL
data101.datahub.berkeley.edu
Course Name
DATA 101
https://classes.berkeley.edu/content/2024-fall-data-101-001-lec-001
Semester Details
Fa24 and forward
Installation Deadline
Ideally 8/30/2024
but this change should be safe.
The text was updated successfully, but these errors were encountered: