-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: [log retention improvements pt. 3] impl and use per hour file layout #2570
base: main
Are you sure you want to change the base?
Conversation
baseLogsFilePath string | ||
} | ||
|
||
func NewPerHourFileLayout(time logs_clock.LogsClock, baseLogsFilePath string) *PerHourFileLayout { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation of FileLayout
stores log files per hour so that logs can be removed per hour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put this in a code comment so it doesn't get lost when this PR is closed?
...ent_implementations/persistent_volume/stream_logs_strategy/stream_logs_strategy_impl_test.go
Show resolved
Hide resolved
logsDatabaseClient = persistent_volume.NewPersistentVolumeLogsDatabaseClient(kurtosisBackend, osFs, logFileManager, perWeekStreamLogsStrategy) | ||
perHourFileLayout := file_layout.NewPerHourFileLayout(realTime, volume_consts.LogsStorageDirpath) | ||
logFileManager := log_file_manager.NewLogFileManager(kurtosisBackend, osFs, perHourFileLayout, realTime, logRetentionPeriod, volume_consts.LogsStorageDirpath) | ||
if logFileManager.GetLogFileLayoutFormat() != vector_consts.VectorLogsFilepathFormat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check ensures that the log file format thats being used to read log files matches the format the logs aggregator is writing log files - if anyone changes the write format or the read format without updating the other, this should fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice context! Can this be a comment to capture that this is an extra sanity check?
internal_testsuites/golang/testsuite/startlark_user_passing_test/starlark_user_passing_test.go
Show resolved
Hide resolved
var paths []string | ||
retentionPeriodInHours := DurationToHours(retentionPeriod) | ||
|
||
if retentionPeriodIntervals < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe could find a way to merge this getLogFilePathsFromNowTillRetentionPeriod
and getLogFilePathsBeyondRetentionPeriod
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems sensible to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made it through!
// To construct the filepath, we utilize vectors template syntax that allows us to reference fields in log events | ||
// https://vector.dev/docs/reference/configuration/template-syntax/ | ||
baseLogsFilepath = "\"" + logsStorageDirpath + "%%Y/%%V/" | ||
baseLogsFilepath = "\"" + logsStorageDirpath + "%%Y/%%V/%%u/%%H/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth a comment explaining what this does, equivalent to the one removed
perHourFilePathFmtSt = perHourDirPathFmtStr + "%s/%s%s" | ||
) | ||
|
||
type PerHourFileLayout struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this struct actually represent?
} | ||
} | ||
|
||
func (phf *PerHourFileLayout) GetLogFileLayoutFormat() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this seems to be very vector-specific, should this be called GetVectorLayoutFormat
or something like that?
func (phf *PerHourFileLayout) GetLogFileLayoutFormat() string { | ||
// Right now this format is specifically made for Vector Logs Aggregators format | ||
// This wil be used my Vector LogsAggregator to determine the path to output to | ||
return fmt.Sprintf("\"%s%%%%Y/%%%%V/%%%%u/%%%%H/{{ enclave_uuid }}/{{ service_uuid }}.json\"", volume_consts.LogsStorageDirpath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this case, rather than wrestling with all the crazy escaping, might be cleaner to read with a simple "
+ volume_consts.LogsStorageDirpath + ...the rest...
} | ||
|
||
func (phf *PerHourFileLayout) GetLogFilePath(time time.Time, enclaveUuid, serviceUuid string) string { | ||
year, week, day, hour := TimeToWeekDayHour(time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny little sixth sense wondering why this is a public function, and if it should be private first (but will wait to see what happens in the rest of the PR)
"time" | ||
) | ||
|
||
func TestConvertWeeksToDuration(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love how methodical you are with testing 😍
logsDatabaseClient = persistent_volume.NewPersistentVolumeLogsDatabaseClient(kurtosisBackend, osFs, logFileManager, perWeekStreamLogsStrategy) | ||
perHourFileLayout := file_layout.NewPerHourFileLayout(realTime, volume_consts.LogsStorageDirpath) | ||
logFileManager := log_file_manager.NewLogFileManager(kurtosisBackend, osFs, perHourFileLayout, realTime, logRetentionPeriod, volume_consts.LogsStorageDirpath) | ||
if logFileManager.GetLogFileLayoutFormat() != vector_consts.VectorLogsFilepathFormat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice context! Can this be a comment to capture that this is an extra sanity check?
) | ||
|
||
// PerWeekStreamLogsStrategy pulls logs from filesystem where there is a log file per year, per week, per enclave, per service | ||
// StreamLogsStrategyImpl pulls logs from filesystem where there is a log file per year, per week, per enclave, per service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment still true? From what I understand, it's now transparent to whether it's per-week or per-hour right?
logRetentionPeriodInWeeks int | ||
type StreamLogsStrategyImpl struct { | ||
time logs_clock.LogsClock | ||
logRetentionPeriod time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to figure out why this struct cares about the retention period and fileLayout. From what I understood above, it's the LogManager
that's responsible for retention, and interfacing with the underlying fileLayout, so wouldn't that mean that StreamLogsStrategy
should interface with LogsManager
instead?
Otherwise, isn't StreamLogsStrategy then going to have to reimplement logic for retention-y stuff? (Or, maybe this is why FileLayout has the retention-y stuff in it, that confused me earlier..? 🤔)
@@ -154,6 +157,56 @@ func TestStreamUserServiceLogsPerWeek_WithFilters(t *testing.T) { | |||
require.NoError(t, testEvaluationErr) | |||
} | |||
|
|||
func TestStreamUserServiceLogsPerHour_WithFilters(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, nice job with these tests! I know that testing something as complex as this can feel like a drag, but it makes the code sooo much safer 👍
Description
Log Retention Improvements:
LogFileLayout
andPerWeekLogFileLayout
and adjustLogFileManager
andLogsDatabaseClient
to use this for retrieving log file pathsPerHourFileLayout
and swap storage format fromPerWeekFileLayout
toPerHourFileLayout
Is this change user facing?
YES
References
#2536