-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: [log retention improvements pt. 3] impl and use per hour file layout #2570
base: main
Are you sure you want to change the base?
Changes from all commits
ee510cd
f881126
487e5d1
f316abe
0614976
75fd409
42f7a30
5fe30ff
929f4b2
fdd8bf3
0883f39
dc9d1d1
d2b9f86
813c98b
6349c61
9510e22
3b73af1
470c61f
2c489e2
2c1f0bf
815eded
ccd49c3
684cbd9
fff5ff0
d00bb2b
8734578
3febce2
7d3dd38
11449c1
aacffb6
2b73907
8daf34d
0b75005
b191925
c6bd4f7
bf45ed2
f589e52
7890081
62f08c3
ae5019b
99a2c58
581c3fd
a3418c3
ddd274a
a8761a2
a07b1ac
02e0bb3
703e13c
a35a8ac
fcf9a22
0fe7f9e
80dfc94
878b65e
ea3206f
c0d6509
ee54acd
e4667a4
d0b9481
f8806c2
518ccd4
b1d5ef2
646fa15
6234fc2
936b4ad
6fb7218
6edc128
84a31f2
d0061b0
a5f7c46
b4124e2
3cb70fa
0c4d4ea
e724d90
1d6ff1e
9102416
b8ce98e
304ade0
642e7f3
24ecb4a
5c32a08
8f83284
85012df
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,139 @@ | ||
package file_layout | ||
|
||
import ( | ||
"fmt" | ||
"github.com/kurtosis-tech/kurtosis/engine/server/engine/centralized_logs/client_implementations/persistent_volume/logs_clock" | ||
"github.com/kurtosis-tech/kurtosis/engine/server/engine/centralized_logs/client_implementations/persistent_volume/volume_consts" | ||
"github.com/kurtosis-tech/kurtosis/engine/server/engine/centralized_logs/client_implementations/persistent_volume/volume_filesystem" | ||
"golang.org/x/exp/slices" | ||
"math" | ||
"os" | ||
"strconv" | ||
"time" | ||
) | ||
|
||
const ( | ||
// basepath year/week/day/hour/ | ||
perHourDirPathFmtStr = "%s%s/%s/%s/%s/" | ||
|
||
// ... enclave-uuid/service-uuid<filetype> | ||
perHourFilePathFmtSt = perHourDirPathFmtStr + "%s/%s%s" | ||
) | ||
|
||
type PerHourFileLayout struct { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What does this struct actually represent? |
||
time logs_clock.LogsClock | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Likewise, what is this guy? I initially thought that a "PerHourFileLayout" had one instance per each hour, or something similar. But now I'm seeing that's not the case, so curious about this prop |
||
baseLogsFilePath string | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this a filepath, or a dirpath? I'd expect it to be a directory, and all logs go under a single dir.. right? |
||
} | ||
|
||
func NewPerHourFileLayout(time logs_clock.LogsClock, baseLogsFilePath string) *PerHourFileLayout { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This implementation of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we put this in a code comment so it doesn't get lost when this PR is closed? |
||
return &PerHourFileLayout{ | ||
time: time, | ||
baseLogsFilePath: baseLogsFilePath, | ||
} | ||
} | ||
|
||
func (phf *PerHourFileLayout) GetLogFileLayoutFormat() string { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given that this seems to be very vector-specific, should this be called |
||
// Right now this format is specifically made for Vector Logs Aggregators format | ||
// This wil be used my Vector LogsAggregator to determine the path to output to | ||
return fmt.Sprintf("\"%s%%%%Y/%%%%V/%%%%u/%%%%H/{{ enclave_uuid }}/{{ service_uuid }}.json\"", volume_consts.LogsStorageDirpath) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in this case, rather than wrestling with all the crazy escaping, might be cleaner to read with a simple |
||
} | ||
|
||
func (phf *PerHourFileLayout) GetLogFilePath(time time.Time, enclaveUuid, serviceUuid string) string { | ||
year, week, day, hour := TimeToWeekDayHour(time) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tiny little sixth sense wondering why this is a public function, and if it should be private first (but will wait to see what happens in the rest of the PR) |
||
return phf.getHourlyLogFilePath(year, week, day, hour, enclaveUuid, serviceUuid) | ||
} | ||
|
||
func (phf *PerHourFileLayout) GetLogFilePaths( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Curious about the purpose of this function. At first blush, I'd expect that Also curious why I need to pass in a |
||
filesystem volume_filesystem.VolumeFilesystem, | ||
retentionPeriod time.Duration, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My gut is saying that it's odd I need to tell the |
||
retentionPeriodIntervals int, | ||
enclaveUuid, serviceUuid string, | ||
) ([]string, error) { | ||
var paths []string | ||
retentionPeriodInHours := DurationToHours(retentionPeriod) | ||
|
||
if retentionPeriodIntervals < 0 { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe could find a way to merge this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That seems sensible to me! |
||
return phf.getLogFilePathsFromNowTillRetentionPeriod(filesystem, retentionPeriodInHours, enclaveUuid, serviceUuid) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a missing errcheck & |
||
} else { | ||
paths = phf.getLogFilePathsBeyondRetentionPeriod(filesystem, retentionPeriodInHours, retentionPeriodIntervals, enclaveUuid, serviceUuid) | ||
} | ||
|
||
return paths, nil | ||
} | ||
|
||
func (phf *PerHourFileLayout) getLogFilePathsFromNowTillRetentionPeriod(fs volume_filesystem.VolumeFilesystem, retentionPeriodInHours int, enclaveUuid, serviceUuid string) ([]string, error) { | ||
var paths []string | ||
currentTime := phf.time.Now() | ||
|
||
// scan for first existing log file | ||
firstHourWithLogs := 0 | ||
for i := 0; i < retentionPeriodInHours; i++ { | ||
year, week, day, hour := TimeToWeekDayHour(currentTime.Add(time.Duration(-i) * time.Hour)) | ||
filePathStr := phf.getHourlyLogFilePath(year, week, day, hour, enclaveUuid, serviceUuid) | ||
if _, err := fs.Stat(filePathStr); err == nil { | ||
paths = append(paths, filePathStr) | ||
firstHourWithLogs = i | ||
break | ||
} else { | ||
// return if error is not due to nonexistent file path | ||
if !os.IsNotExist(err) { | ||
return paths, err | ||
} | ||
} | ||
} | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Bug question: if we don't find any log files in the loop above, won't it mean that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. mmm yes, will add a test for this and fix : ) |
||
// scan for remaining files as far back as they exist before the retention period | ||
for i := firstHourWithLogs + 1; i < retentionPeriodInHours; i++ { | ||
year, week, day, hour := TimeToWeekDayHour(currentTime.Add(time.Duration(-i) * time.Hour)) | ||
filePathStr := phf.getHourlyLogFilePath(year, week, day, hour, enclaveUuid, serviceUuid) | ||
if _, err := fs.Stat(filePathStr); err != nil { | ||
break | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Won't this mean that a single missing file blocks the return of all log files after that (even if they're successful)? Maybe an error would be better here |
||
} | ||
paths = append(paths, filePathStr) | ||
} | ||
|
||
// reverse for oldest to most recent | ||
slices.Reverse(paths) | ||
|
||
return paths, nil | ||
} | ||
|
||
func (phf *PerHourFileLayout) getLogFilePathsBeyondRetentionPeriod(fs volume_filesystem.VolumeFilesystem, retentionPeriodInHours int, retentionPeriodIntervals int, enclaveUuid, serviceUuid string) []string { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Reading this signature, I'm confused about the relationship between |
||
var paths []string | ||
currentTime := phf.time.Now() | ||
|
||
// scan for log files just beyond the retention period | ||
for i := 0; i < retentionPeriodIntervals; i++ { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm sort of confused about this loop - why are we adding |
||
numHoursToGoBack := retentionPeriodInHours + i | ||
year, week, day, hour := TimeToWeekDayHour(currentTime.Add(time.Duration(-numHoursToGoBack) * time.Hour)) | ||
filePathStr := phf.getHourlyLogFilePath(year, week, day, hour, enclaveUuid, serviceUuid) | ||
if _, err := fs.Stat(filePathStr); err != nil { | ||
continue | ||
} | ||
paths = append(paths, filePathStr) | ||
} | ||
|
||
return paths | ||
} | ||
|
||
func (phf *PerHourFileLayout) getHourlyLogFilePath(year, week, day, hour int, enclaveUuid, serviceUuid string) string { | ||
// match the format in which Vector outputs week, hours, days | ||
formattedWeekNum := fmt.Sprintf("%02d", week) | ||
formattedHourNum := fmt.Sprintf("%02d", hour) | ||
return fmt.Sprintf(perHourFilePathFmtSt, phf.baseLogsFilePath, strconv.Itoa(year), formattedWeekNum, strconv.Itoa(day), formattedHourNum, enclaveUuid, serviceUuid, volume_consts.Filetype) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Small suggestion: might make it easier to read if you newlined each of these |
||
} | ||
|
||
func TimeToWeekDayHour(time time.Time) (int, int, int, int) { | ||
year, week := time.ISOWeek() | ||
hour := time.Hour() | ||
day := int(time.Weekday()) | ||
// convert sunday in golang's time(0) to sunday (0) in strftime/Vector log aggregator time(7) | ||
if day == 0 { | ||
day = 7 | ||
} | ||
return year, week, day, hour | ||
} | ||
|
||
func DurationToHours(duration time.Duration) int { | ||
return int(math.Ceil(duration.Hours())) | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay finishing this class, I think I'm still a bit confused what the purpose of this struct is. I think the confusion is coming from the idea of retention being sort-of baked into this struct, but not fully handled by it. I think my gut would expect either:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth a comment explaining what this does, equivalent to the one removed